Natural Language Processing (NLP for short) is an exciting and useful field of Data Science. Some applications of NLP include:
- Text Classification
- Sentiment Analysis
- Machine Translation
- Chatbot Creation
- Keyword Extraction
As the volume of textual data grows, so does the number of performant, fine-tuned state-of-the-art (SoTA) models. These models can take several days and considerable compute to train. Fortunately, people in IT and Data Science are generous about open-sourcing their models and work for others to use. In this tutorial we will explore a simple NLP package created by John Snow Labs that makes several pretrained NLP models available through a simple API.
By the end of this tutorial you will learn:
- Difference between Spark NLU and Spark NLP
- How to Perform Sentiment Analysis with Spark NLU
- Text Classification with Spark NLU
- Question Classification & Q and A with Spark NLU
Let us start.
Difference Between Spark NLU and Spark NLP
Both Spark NLU and Spark NLP were developed by the same company, John Snow Labs, and both build on Apache Spark. The difference is that Spark NLU is a simplified one-liner API that gives you access to several pretrained language models for NLP tasks such as text classification, sentiment analysis, Q&A, NER, etc. It can be installed with pip via the command
pip install nlu pyspark
Spark NLP, on the other hand, is the full, robust NLP library. In Python it goes under the name sparknlp.
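Before loading any model, it can help to confirm that the packages are actually importable in your environment. The snippet below is a minimal sketch that uses only the Python standard library; check_environment is a hypothetical helper name, not part of Spark NLU or Spark NLP. It also looks for a java executable, on the assumption that Spark needs a Java runtime on your PATH.

```python
import importlib.util
import shutil


def check_environment(modules=("nlu", "pyspark", "sparknlp")):
    """Report which modules are importable and whether `java` is on PATH.

    Hypothetical helper for illustration; it does not import the packages,
    it only asks Python whether it could find them.
    """
    report = {name: importlib.util.find_spec(name) is not None for name in modules}
    # Assumption: Spark runs on the JVM, so a `java` executable should be reachable.
    report["java"] = shutil.which("java") is not None
    return report


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

If anything is reported as MISSING, rerun the pip command above (or install Java) before continuing.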
Overview of Spark NLU
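As stated earlier, Spark NLU is a simple one-liner API that gives you access to several pretrained models for your task. The usage follows one pattern: load a model by name with nlu.load(), then call .predict() on your text. The sections that follow apply this pattern to different tasks.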
Text Classification with Spark NLU
Spark NLU from John Snow Labs has several models for different text classification tasks such as sentiment analysis, emotion classification, spam classification, etc. There is no separate download step: the same command that loads a model also downloads it. Whenever you load a model that is not yet available on your system, it is downloaded from their servers/model hub. Hence you will need an internet connection and enough disk space the first time you load a model.
Let us see a simple example
# Load Pkgs
import nlu
# Usage
nlu.load('sentiment').predict('I like coding and writing')
Alternatively, you can store the loaded model in a variable and reuse it
sentiment_model = nlu.load('sentiment')
sentiment_model.predict('I like coding and writing')
To check for all the various components and supported language models you can use
import nlu
nlu.print_components()
nlu.print_all_models()
NER and Visualizing NER with Spark NLU
You can also perform Named Entity Recognition (NER) with Spark NLU. John Snow Labs also offers powerful clinical NER models for clinical NLP, which are part of their enterprise versions. To perform NER you can use
nlu.load('ner').predict('John lives in Accra but works in a remote job at London')
To visualize your NER results, Spark NLU offers two options: .viz() and .viz_streamlit().
nlu.load('ner').viz('John lives in Accra but works in a remote job at London')
There are other tasks we can perform with John Snow Labs NLU, but we will limit ourselves to these for now.
To conclude, Spark NLU makes it easy to perform several NLP tasks with state-of-the-art language and ML models.
Thanks for your time
By Jesse E. Agbe (JCharis)