Practical Natural Language Processing

In this series of tutorials and blog posts we will be exploring Natural Language Processing (NLP for short), a branch of Machine Learning and Data Science. We will be taking short and exciting tutorials on most of the concepts used in NLP and how they are applied in real life. We will take each concept and apply the 5 W+H approach to it. (The 5 W+H approach represents the questions we mostly ask in any endeavor: the five Ws, namely what, why, when, who and where, and then how.)

Let us see the list of topics we will be discussing over time.

  • Textual Data Acquisition
  • Text Preprocessing
  • Basic NLP Tasks
  • Sentiment Analysis
  • Text Summarization
  • Text Classification
  • Text Generation
  • Topic Modelling
  • Author Attribution
  • Named Entity Recognition and Its Applications
  • Plagiarism Detection
  • The unknown
  • NLP with Deep Learning
  • etc

The basic tools and libraries we will be using range from Python to JavaScript, R and Julia where necessary. Let us start.

What is Natural Language Processing?

To understand what we mean by Natural Language Processing, let us first break the term down piece by piece.

What is a Language? And What is Natural Language?

A language is simply a system for communicating ideas from one person to another. It is a way we communicate, and in our context, the way humans communicate among themselves. It is one of the most important gifts and abilities from our Creator God. With the advancement of civilization and the emergence of intelligent machines, humans designed a way to communicate not only among themselves but also with the machines they work with, and vice versa. Therefore we can group languages as:

  • Human Languages: languages used by humans in everyday activity and not designed for machines to understand easily. These are also called natural languages.
  • Machine Languages: languages designed artificially by humans to give instructions to machines so that they carry out activities. This is what is termed a programming language. Almost all programming languages are ultimately translated into machine code, that is, 1s and 0s (bits). They are formal languages deliberately constructed by humans.

So what you and I naturally speak as humans (or aliens, angels, etc.) is termed a natural language. An example of a natural language is your native tongue or second language, e.g. English, Twi, French, Chinese, Russian, Spanish, Hindi, Yoruba or German.

Let us see how Wikipedia defines natural language for more detail. It says:

“natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation.” [source]

Why Natural Language Processing?

As civilization advanced, we designed programming languages to help us work with machines and computers. But there arose a need for machines and computers to understand our natural language the way we do, that is, our everyday human language in addition to the languages we constructed for them. Enabling machines and computers to understand and process human natural language such as English, French or Twi is, in simple terms, what is termed Natural Language Processing. Hence the aim of NLP is to process human language, mostly text and speech, as a human would, but this time by a computer or a machine.

But there is a challenge: natural (human) language has a lot of ambiguity and many rules. We humans easily understand one another and know what we mean, but machines cannot do that easily. This makes NLP a challenging task, especially as there are several petabytes of human-generated text and speech. There is the challenge of text (written language) and another challenge of speech (spoken language). Based on these two challenges we can classify NLP as:

  • NLP for Textual Data: understanding natural language written in text form and generating meaningful text.
  • NLP for Speech: recognizing spoken words and then converting them to text so they can be processed as above.
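
To make the ambiguity challenge a little more concrete, here is a minimal Python sketch that uses only the standard library. The two sentences and the helper name `tokenize` are made up for illustration; they are not part of any NLP library. The sketch shows that, after naive splitting into words, a program sees the word "bank" as the same string in both sentences, even though a human reader immediately recognizes two different meanings.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive approach)."""
    return re.findall(r"[a-z']+", text.lower())


# Two illustrative sentences that use "bank" in different senses.
river_sentence = "We sat on the bank of the river."
money_sentence = "She deposited the cheque at the bank."

river_tokens = tokenize(river_sentence)
money_tokens = tokenize(money_sentence)

# Words that appear in both sentences. The program treats them as identical
# strings and knows nothing about their meaning in context.
shared = sorted(set(river_tokens) & set(money_tokens))
print(shared)
```

Telling the two senses of "bank" apart requires context, which is exactly what makes text harder for a machine than it looks.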

In another post we will briefly explore NLP in the light of these challenges.

Applications of Natural Language Processing (How NLP Is Used)

NLP has several applications, mostly in parallel to how we humans use our natural language. For example, we easily summarize a speech or a text, and we can identify and recognize certain entities when someone is talking with us. These two tasks can also be performed using natural language processing (document summarization and named entity recognition respectively).
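
As a rough illustration of the summarization task, the sketch below picks the sentences whose words occur most often in the whole text. This is only one simple, frequency-based way to do extractive summarization, not how production systems work. The function names `split_sentences` and `summarize`, the tiny `STOPWORDS` set and the sample text are all assumptions made for this example, and only the Python standard library is used.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption for this sketch;
# real projects use much longer lists).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in",
             "it", "that", "this", "for", "on", "with", "as", "by"}


def split_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Return the lowercase words of the text, without stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        # A sentence scores higher when it contains words that are frequent overall.
        return sum(freq[w] for w in content_words(sentence))

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return [s for s in sentences if s in top]


sample = ("NLP helps computers process language. "
          "Language is hard for computers because language is ambiguous. "
          "Cats sleep.")
print(summarize(sample, max_sentences=1))
```

Named entity recognition is harder to sketch without a trained model, so it is left for the dedicated tutorial in this series.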

The applications span diverse fields. These include:

  • Text Analysis (Semantic and Syntactic)
  • Information extraction
  • Building chatbots and Personal Assistants
  • Sentiment Analysis
  • Classifying Text
  • Topic Modelling
  • Summarization
  • Author Attribution and Literature Forensics
  • Detection of Plagiarism and Spam
  • Search Engines
  • Spell and Grammar Checking
  • etc

In the upcoming tutorials we will be doing more exploration on Natural Language Processing.

See you in the next tutorial.

Thanks For Your Time

Jesus Saves

By Jesse E. Agbe (JCharis)
