Transfer Learning for Languages
The aim of this article is to illustrate how transfer learning can be applied to NLP tasks.
In a conventional NLP task we generally perform the following steps:
- Text Preprocessing
- Tokenization
- Normalize data using stemming/lemmatization
- Create an embedding vector
- Develop the final model
Rather than performing all these steps ourselves, is there an easier way where we simply feed in the incoming text and a pretrained model takes care of the subsequent processing?
Here we make use of TensorFlow Hub, which allows pretrained models to be easily loaded into TensorFlow. To install TensorFlow Hub, use the following command:
pip install tensorflow_hub
It is also necessary to install TensorFlow Datasets. Use the following command to install it:
pip install tensorflow_datasets
Import Libraries
Load Amazon Personal Care Appliances review data from TensorFlow Datasets
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, Amazon released this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, the dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
The dataset contains the following columns:
marketplace - 2 letter country code of the marketplace where the review was written.
customer_id - Random identifier that can be used to aggregate reviews written by a single author.
review_id - The unique ID of the review.
product_id - The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
product_parent - Random identifier that can be used to aggregate reviews for the same product.
product_title - Title of the product.
product_category - Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
star_rating - The 1-5 star rating of the review.
helpful_votes - Number of helpful votes.
total_votes - Number of total votes the review received.
vine - Review was written as part of the Vine program.
verified_purchase - The review is on a verified purchase.
review_headline - The title of the review.
review_body - The review text.
review_date - The date the review was written
We are concerned with review_body as the predictor, used to predict the label star_rating.
Consuming the loaded data
Create a function to generate the labels
Define a function that takes a tensor object as input and retrieves the text and label
Apply the function to the sample dataset to fetch the review text and corresponding label
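A hedged sketch of these two helpers, assuming a binary sentiment target derived from star_rating (4-5 stars → positive, 1-3 → negative) and the nested `"data"` feature layout used by the TFDS Amazon reviews builder — both the labeling rule and the field layout are assumptions:

```python
import tensorflow as tf


def get_label(star_rating):
    # Assumed rule: 4-5 stars are positive (1), 1-3 stars negative (0)
    return 1 if star_rating >= 4 else 0


def extract_text_and_label(example):
    # example: one TFDS element; fields assumed nested under "data"
    text = example["data"]["review_body"]
    label = tf.cast(example["data"]["star_rating"] >= 4, tf.int32)
    return text, label
```

Mapping `extract_text_and_label` over the dataset (e.g. `dataset.map(extract_text_and_label)`) yields `(text, label)` pairs ready for training.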
Fetch a pretrained model from tensorflow hub
Pretrained Model: Token-based text embedding trained on the English Google News 200B corpus (https://tfhub.dev/google/nnlm-en-dim128/2)
Steps:
- Takes a 1D tensor of strings as input
- Splits each input string on spaces
- Looks up an embedding for each word, combines them into a single sentence embedding, and returns it
Add top layers to the pretrained model base
Compile the Model
Shuffle the data and create batches of 512 examples
Train the model for 5 epochs
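The shuffle, batch, and fit steps above can be sketched as follows; the 10,000-element shuffle buffer is an assumption:

```python
import tensorflow as tf


def train(model, train_ds, val_ds, epochs=5, batch_size=512):
    # Shuffle with an assumed 10k-element buffer, then batch to 512
    train_batches = (
        train_ds.shuffle(10_000).batch(batch_size).prefetch(tf.data.AUTOTUNE)
    )
    val_batches = val_ds.batch(batch_size)
    return model.fit(train_batches, validation_data=val_batches, epochs=epochs)
```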
Visualize Training and validation Accuracy per epoch
Visualize Training and validation loss
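Both plots can be produced from the History object returned by `model.fit`; the sketch below assumes the model was compiled with an accuracy metric:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot training/validation accuracy and loss per epoch."""
    epochs = range(1, len(history.history["accuracy"]) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, history.history["accuracy"], label="training")
    ax1.plot(epochs, history.history["val_accuracy"], label="validation")
    ax1.set_title("Accuracy per epoch")
    ax1.legend()
    ax2.plot(epochs, history.history["loss"], label="training")
    ax2.plot(epochs, history.history["val_loss"], label="validation")
    ax2.set_title("Loss per epoch")
    ax2.legend()
    return fig
```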
Evaluate the trained model for the Test Data
The model does not overfit, as is evident from the validation accuracy of 85%. Note that we have only built a base model.
Make Predictions
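Since the model outputs a raw logit, predictions need a sigmoid and a threshold; the helper below is a sketch, and the 0.5 cutoff is an assumption:

```python
import numpy as np
import tensorflow as tf


def predict_sentiment(model, inputs):
    # The model outputs logits; sigmoid maps them to positive-class probabilities
    logits = model.predict(np.array(inputs), verbose=0)
    probs = tf.sigmoid(logits).numpy().ravel()
    return ["positive" if p >= 0.5 else "negative" for p in probs]
```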
Conclusion
We have seen that transfer learning is applicable to NLP tasks as well, not just image classification.
We did not perform any of the usual text preprocessing before feeding the data to the model.
The text processing part was handled by the pretrained model.
Note: The accuracy on the current problem can still be improved by adding a Conv1D layer or introducing an LSTM layer, but the main goal here was to apply transfer learning to text data.
Resources Referred:
Deep learning tutorials by Jeff Heaton