Language Model

May 13, 2021
in AI, Language Model
1 min read

Draft

NLP: Word Understanding

Overview

Figure 1: Language Model in AI


Feb 23, 2021
in AI, Language Model
6 min read

Draft

Customer Feedback Tagging with NLP

Overview

Project

The data pipeline below collects, pre-processes, and feature-engineers customer feedback text, then applies NLP algorithms to produce a dashboard for analysis and follow-up actions.

Steps

Data Collection:

Data mining or ETL (extract-transform-load) process to collect a corpus of unstructured data.

Data Preprocessing:

Tokenization: Segmentation of running text into words.
Lemmatization: Removal of inflectional endings to return the base form.
Part-of-speech tagging: Identification of words as nouns, verbs, adjectives, etc.
Language detection: Identification of the language of a text from one or more sentences, even short ones.
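
The preprocessing steps above can be sketched in plain Python. This is only a minimal illustration, not the project's actual pipeline: the `STOPWORDS` and `LEMMAS` tables are hypothetical, hand-made stand-ins for the trained models and dictionaries a real NLP library would provide, and language detection is approximated by counting stopword hits. Part-of-speech tagging is left out because it needs a trained tagger.

```python
import re

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration).
STOPWORDS = {
    "en": {"the", "is", "and", "was", "it", "of", "to"},
    "de": {"der", "die", "das", "ist", "und", "war", "nicht"},
    "fr": {"le", "la", "les", "est", "et", "pas", "une"},
}
LEMMAS = {"was": "be", "were": "be", "boxes": "box", "damaged": "damage"}


def tokenize(text):
    """Tokenization: split running text into lowercase word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def lemmatize(tokens):
    """Lemmatization by dictionary lookup; unknown words are kept as-is."""
    return [LEMMAS.get(tok, tok) for tok in tokens]


def detect_language(tokens):
    """Language detection heuristic: pick the language with most stopword hits."""
    scores = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


feedback = "The delivery was late, and the boxes were damaged."
tokens = tokenize(feedback)
print(tokens)
print(lemmatize(tokens))
print(detect_language(tokens))
```

The stopword heuristic is weak on very short inputs, which is exactly where trained language-identification models earn their keep.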

Feature Engineering (NLP visualization):

Word Embeddings: Transforming text into a meaningful vector or array of numbers.
N-grams: A unigram is a single word within a document; a bigram is a sequence of 2 adjacent words within a document. In general, an n-gram is a sequence of n adjacent words.
TF-IDF values: Term-Frequency-Inverse-Document-Frequency is a numerical statistic representing how important a word is to a document within a collection of documents.
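
To make n-grams and TF-IDF concrete, here is a small pure-Python sketch. It assumes one common textbook variant (term frequency as a relative count, inverse document frequency as log(N / df) with no smoothing); libraries such as scikit-learn use smoothed and normalized variants, so their numbers will differ. The function names and sample documents are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n adjacent tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    ["great", "service", "great", "price"],
    ["slow", "service"],
    ["great", "app"],
]
print(ngrams(docs[0], 2))
for weights in tf_idf(docs):
    print(weights)
```

Under this variant a term that appears in every document gets a weight of zero, which is the intended behavior: it does nothing to tell documents apart.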

Application of NLP Algorithms:

Latent Dirichlet Allocation: Topic modeling algorithm for detecting abstract themes from a collection of documents.
Support Vector Machine: Classification algorithm for detection of underlying consumer sentiment.
Long Short-Term Memory Network: Type of recurrent neural network suited to sequence tasks such as machine translation; used in Google Translate's neural machine translation system.
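
As an illustration of the classification step, the sketch below trains a linear Support Vector Machine on bag-of-words counts using plain subgradient descent on the hinge loss. It is a toy under stated assumptions, not a production setup: the vocabulary, the four labeled reviews, and the hyperparameters (`epochs`, `lr`, `lam`) are all invented for the example, and a real project would more likely use a library implementation over TF-IDF features. Topic modeling and LSTM networks are not shown here.

```python
# Hypothetical vocabulary and training data (assumptions for illustration).
VOCAB = ["great", "love", "slow", "broken"]
TRAIN = [
    ("great app love it", 1),
    ("love the great price", 1),
    ("slow and broken", -1),
    ("broken screen slow support", -1),
]


def vectorize(text):
    """Bag-of-words: count how often each vocabulary word occurs."""
    tokens = text.lower().split()
    return [tokens.count(word) for word in VOCAB]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Minimize L2-regularized hinge loss; labels must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wj - lr * lam * wj for wj in w]
            # Hinge-loss step: only examples inside the margin push the weights.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, text):
    """Return +1 (positive sentiment) or -1 (negative sentiment)."""
    score = sum(wj * xj for wj, xj in zip(w, vectorize(text))) + b
    return 1 if score >= 0 else -1


X = [vectorize(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_linear_svm(X, y)
print(dict(zip(VOCAB, w)), b)
print(predict(w, b, "love this great product"))
print(predict(w, b, "slow and broken again"))
```

Words outside the vocabulary contribute nothing to the score, so a review made up entirely of unseen words falls back to the sign of the bias term.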