Natural Language Processing, or NLP for short, is a branch of artificial intelligence focused on understanding everyday human language, hence the term "natural" language.
In this article we will learn how to do basic text analysis in Julia.
So what is Julia? Julia is a next-generation programming language that is easy to learn like Python, yet powerful and fast like C. It is a high-level, high-performance dynamic language for numerical computing that harnesses multiple dispatch, which allows built-in and user-defined functions to be overloaded for different combinations of argument types.
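To make the multiple-dispatch idea concrete, here is a minimal sketch (the `describe` function and its methods are illustrative names, not part of any library): the same function name gets a different method for each combination of argument types.

```julia
# Multiple dispatch: Julia picks the method based on the
# types of *all* arguments, not just the first one.
describe(x::Number) = "a number: $x"
describe(x::String) = "a string: \"$x\""
describe(x::Number, y::Number) = "two numbers summing to $(x + y)"

println(describe(42))        # dispatches to the Number method
println(describe("Julia"))   # dispatches to the String method
println(describe(1, 2.5))    # dispatches to the two-argument method
```

User-defined types participate in dispatch on exactly the same footing as built-in ones, which is why overloading feels so natural in Julia.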
Let's start.
Since Julia is quite a young programming language (5+ years old), there are not yet many fully developed, stand-alone native libraries or packages for NLP. But don't underestimate the power of Julia: it can tap into the features of other programming languages such as Python, R, Java, and C through its foreign-call packages, e.g. PyCall for Python, RCall for R, and JavaCall for Java.
Hence we can still import fully developed NLP libraries such as NLTK or word2vec into Julia to do our natural language processing.
First of all, you will need to install these packages:
- Pkg.add("TextAnalysis")
- Pkg.clone("WordTokenizers")
- Pkg.add("PyCall") # Helps us to use Python packages
- Pkg.add("Conda") # Helps us to use conda (Anaconda) to download Python packages easily
using TextAnalysis
mystr = """The best error message is the one that never shows up.
You Learn More From Failure Than From Success.
The purpose of software engineering is to control complexity, not to create it"""
# Basic Way
sd1 = Document(mystr)
# Best Way
sd2 = StringDocument(mystr)
# Reading from a file
filepath = "samplefile.txt"
# Basic Way
filedoc = Document("samplefile.txt")
# Best Way
fd = FileDocument("samplefile.txt")
There are also:
- TokenDocument()
- NGramDocument()
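As a quick illustration of those two, here is a minimal sketch (assuming the TextAnalysis package installed above): both take a plain string, but store it in a different internal form.

```julia
using TextAnalysis

s = "The best error message is the one that never shows up."

# TokenDocument stores the text already split into word tokens
td = TokenDocument(s)

# NGramDocument stores n-gram counts instead of the raw text
nd = NGramDocument(s)
```

Both behave like the StringDocument above, the difference being that `tokens(td)` and `ngrams(nd)` return the pre-computed form directly.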
# Working With Our Document
text(sd1)
What language is it?
# Getting the Base Info About it
language(sd1)
Tokenization With TextAnalysis
- Word Tokens
- Sentence Tokens
text(sd1)
# Word Tokens from a String Document
tokens(sd1)
text(fd)
# Word Tokens from a File Document
tokens(fd)
Tokenization With WordTokenizers
- Word Tokens
- Sentence Tokens
using WordTokenizers
sd1
# Must convert from TextAnalysis Type to String Type
tokenize(text(sd1))
tokenize("Hello world this is Julia")
Sentence Tokenization
First, solve the problem. Then, write the code. Fix the cause, not the symptom. Simplicity is the soul of efficiency. Good design adds value faster than it adds cost. In theory, theory and practice are the same. In practice, they’re not. There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
# Read a file with sentences
sent_files = FileDocument("quotesfiles.txt")
text(sent_files)
# Sentence Tokenization
split_sentences(text(sent_files))
for sentence in split_sentences(text(sent_files))
    println(sentence)
end
for sentence in split_sentences(text(sent_files))
    wordtokens = tokenize(sentence)
    println("Word token=> $wordtokens")
end
N-Grams
- Combinations of multiple words
- Useful for creating features during language modeling
mystr
sd3 = StringDocument(mystr)
# Unigram
ngrams(sd3)
# Bigrams
ngrams(sd3,2)
# Trigram
for trigram in ngrams(sd3,3)
    println(trigram)
end
# Creating an NGram
my_ngrams = Dict{String, Int}("To" => 1, "be" => 2,
                              "or" => 1, "not" => 1,
                              "to" => 1, "be..." => 1)
ngd = NGramDocument(my_ngrams)
# Detecting the n-gram order (complexity) of the document
ngram_complexity(ngd)
my_ngrams2 = Dict{AbstractString,Int64}(
"that never" => 1,"is to" => 1,"create" => 1,"that" => 1,"best" => 1,"Than From" => 1,"shows up." => 1,
"purpose" => 1,"of" => 1,"purpose of" => 1,"More" => 1,"to" => 2,"the one" => 1,
"is" => 2,"never" => 1,"complexity,"=> 1,"software" => 1,"one that" => 1)
ngd2 = NGramDocument(my_ngrams2)
ngram_complexity(ngd2)
Using Other Libraries for Performing NLP in Julia
First, you will need to use pip to install NLTK on your system:
- pip install nltk
Open your Python REPL and type the following:
- import nltk
- nltk.download()
A dialog box will pop up where you can select which NLTK modules to download.
After that, run the following in your Julia environment:
- using Conda
- Conda.add("nltk")
Part-of-Speech Tagging in Julia
- We will be using nltk.tag via PyCall for this task.
using PyCall
# Importing Part of Speech Tag from NLTK
@pyimport nltk.tag as ptag
# Using TextAnalysis to tokenize or WordTokenizer to do the same
ex = StringDocument("Julia is very fast but it is still young")
# TextAnalysis.tokens()
mytokens = tokens(ex)
# Using NLTK tags for finding the part of speech of our tokens
ptag.pos_tag(mytokens)
Word Inflection == word formation by adding to a base/root word
- Stemming (basics): stem!()
- Lemmatizing
How do we do these? PyCall to the rescue!
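As a quick preview of the stemming side, here is a minimal sketch using TextAnalysis's built-in stem!(), which replaces each token with its root form in place (the sample sentence is illustrative); lemmatizing is where PyCall and NLTK can step in, as shown with the tagger above.

```julia
using TextAnalysis

doc = StringDocument("The runners were running quickly")
# stem!() mutates the document, reducing tokens to their stems
stem!(doc)
println(text(doc))
```

Note that stemming is a crude, rule-based truncation, so the resulting stems (e.g. from "running") are not always dictionary words; that is precisely the gap lemmatizing fills.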
# Exploring what TextAnalysis exports
whos(TextAnalysis)
Stay tuned for more! Thanks for reading.