Language Model

Customer Feedback Tagging with NLP

Overview

Figure 1: Language Model in AI

Project

The data pipeline below collects and pre-processes the data, engineers features, and applies NLP algorithms to produce a dashboard for analysis and follow-up actions.

Steps


Data Collection:

Data mining or ETL (extract-transform-load) process to collect a corpus of unstructured data.


Data Preprocessing:

  • Tokenization: Segmentation of running text into words.

  • Lemmatization: Removal of inflectional endings to return the base form.

  • Parts-of-speech tagging: Identification of words as nouns, verbs, adjectives etc.

  • Language detection: Identification of the language of a text from one or more sentences, even short ones.
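
As a rough illustration of the tokenization step above, the following is a minimal, library-free sketch. The `tokenize` helper and its regular expression are assumptions made for this example, not part of WinkNLP or spaCy; a production pipeline would normally rely on the tokenizer, lemmatizer, and tagger shipped with the chosen NLP library.

```python
import re

# Hypothetical pattern: runs of letters/digits, optionally followed by an
# apostrophe suffix so that contractions such as "don't" stay in one token.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def tokenize(text: str) -> list[str]:
    """Segment running text into lowercase word tokens."""
    return [match.group(0).lower() for match in TOKEN_PATTERN.finditer(text)]


feedback = "The app crashes when I upload photos. Please fix it!"
print(tokenize(feedback))
```

This simple pattern only handles ASCII text; feedback in other scripts would need a Unicode-aware tokenizer.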


Feature Engineering (NLP visualization):
  • Word Embeddings: Transforming text into a meaningful vector or array of numbers.

  • N-grams: Unigrams are the individual words within a document; bigrams are sequences of 2 adjacent words within a document.

  • TF-IDF values: Term-Frequency-Inverse-Document-Frequency is a numerical statistic representing how important a word is to a document within a collection of documents.
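
The n-gram and TF-IDF features described above can be sketched in plain Python. This sketch assumes the unsmoothed textbook definition, tf = term count / document length and idf = log(N / document frequency); the function names are illustrative, and NLP libraries often use smoothed variants that give slightly different numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (as tuples) of adjacent tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Compute a {term: weight} dict for each tokenized document.

    tf  = occurrences of the term in the document / tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Count each term once per document to get its document frequency.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical, already tokenized feedback messages.
docs = [
    ["app", "crashes", "on", "login"],
    ["login", "page", "is", "slow"],
    ["love", "the", "new", "app"],
]
print(ngrams(docs[0], 2))
print(tf_idf(docs)[0])
```

Terms that appear in every document get a weight of zero under this definition, which is why very common words contribute little to the feature vector.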


Application of NLP Algorithms:
  • Latent Dirichlet Allocation: Topic modeling algorithm for detecting abstract themes from a collection of documents.

  • Support Vector Machine: Classification algorithm for detection of underlying consumer sentiment.

  • Long Short-Term Memory Network: Type of recurrent neural network suited to sequence tasks such as machine translation; it has been used in Google Translate.

Scope

  • Topic Modeling: How to automatically categorize customer complaints or classify customer intent?

  • Sentiment analysis
    • How to detect sentiment in customer feedback, i.e. whether it is a complaint or positive feedback?
    • How to detect urgency?

Implementation: WinkNLP

Customize tagging keyword

Implementation: spaCy

Customize tagging keyword
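
To make the idea of customizable tagging keywords concrete, here is a library-free sketch. It does not use the spaCy or WinkNLP APIs; the `TAG_KEYWORDS` map and the `tag_feedback` helper are hypothetical names chosen for illustration, and the keyword lists are placeholders rather than values from the project.

```python
import re

# Hypothetical tag-to-keyword map; replace with terms from your own feedback data.
TAG_KEYWORDS = {
    "BILLING": ["refund", "invoice", "charge"],
    "BUG": ["crash", "error", "freeze"],
    "URGENT": ["asap", "immediately", "urgent"],
}


def tag_feedback(text, tag_keywords=TAG_KEYWORDS):
    """Return the sorted tags whose keywords occur as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        tag for tag, keywords in tag_keywords.items() if words & set(keywords)
    )


print(tag_feedback("Urgent: the app shows an error after the last charge."))
```

Exact word matching misses inflected forms such as "crashes", which is one reason the lemmatization step in the preprocessing stage matters before keyword tagging.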

A high-level view of the generic model and the refined model across the whole process.

The detailed NLP refinement process below uses user feedback to improve NER tagging in the spaCy model.

A better alternative is active learning, shown below.

The whole data pipeline for user feedback tagging is shown below.
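
One common way to realize the active-learning idea mentioned above is least-confidence sampling: the model scores unlabeled feedback, and the items it is least sure about are sent to annotators first. The sketch below assumes that the model exposes per-label probabilities; the function name, message ids, and probabilities are hypothetical, and the source does not state which query strategy the project uses.

```python
def select_for_annotation(predictions, batch_size):
    """Least-confidence sampling.

    predictions maps an item id to a dict of label -> predicted probability.
    Returns the ids of the batch_size items whose top label has the lowest
    probability, i.e. the items the model is least certain about.
    """
    def confidence(item_id):
        return max(predictions[item_id].values())

    return sorted(predictions, key=confidence)[:batch_size]


# Hypothetical model outputs for three feedback messages.
predictions = {
    "msg-1": {"complaint": 0.95, "praise": 0.05},
    "msg-2": {"complaint": 0.55, "praise": 0.45},
    "msg-3": {"complaint": 0.30, "praise": 0.70},
}
print(select_for_annotation(predictions, 2))
```

The selected items would then be labeled by a human and added to the training set before the model is retrained, which concentrates annotation effort where the model gains the most.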

References

spaCy NER

NLP Kits

Label Annotation

ML Backend