NLP In Julia

Natural Language Processing In Julia (Text Analysis)

Natural Language Processing, or NLP for short, is a branch of artificial intelligence focused on understanding everyday human language; hence the term "natural" language.

In this post we will learn how to do basic text analysis in Julia.

So what is Julia? Julia is a next-generation programming language that is easy to learn like Python, yet powerful and fast like C. It is a high-level, high-performance dynamic programming language for numerical computing, and it harnesses multiple dispatch, which allows built-in and user-defined functions to be overloaded for different combinations of argument types.
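As a quick, self-contained illustration of multiple dispatch (a toy example of my own, not part of the text-analysis workflow below):

```julia
# Multiple dispatch: the same function name gets a separate method
# for each combination of argument types.
describe(x::Int) = "an integer: $x"
describe(x::String) = "a string: $x"
describe(x::Int, y::Int) = "two integers summing to $(x + y)"

println(describe(42))       # an integer: 42
println(describe("Julia"))  # a string: Julia
println(describe(2, 3))     # two integers summing to 5
```

Julia picks the method to run based on the types of all the arguments, not just the first one.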

Let's start.

Since Julia is quite a young programming language (just over five years old), there are not yet many fully developed, stand-alone native libraries or packages for NLP. But don't underestimate the power of Julia: it can use the features of other programming languages such as Python, R, Java, and C via its foreign-call packages, e.g. PyCall for Python, RCall for R, and JavaCall for Java.

Hence we can still bring fully developed NLP libraries such as NLTK and word2vec into Julia to do our natural language processing.
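For example, here is a minimal sketch of calling an ordinary Python module from Julia via PyCall (assuming PyCall is installed and configured to see a Python installation):

```julia
using PyCall

# Import Python's built-in math module into Julia.
@pyimport math

# Call a Python function as if it were a Julia function.
println(math.sqrt(16.0))  # 4.0
```

The same `@pyimport` mechanism is what we will use later to reach NLTK.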

First of all, you will need to install these packages:

  • Pkg.add("TextAnalysis")
  • Pkg.clone("WordTokenizers")
  • Pkg.add("PyCall") # Helps us to use Python packages
  • Pkg.add("Conda") # Helps us use conda (Anaconda) to install Python packages easily.
In [1]:
using TextAnalysis
In [2]:
mystr = """The best error message is the one that never shows up.
You Learn More From Failure Than From Success. 
The purpose of software engineering is to control complexity, not to create it"""
Out[2]:
"The best error message is the one that never shows up.\nYou Learn More From Failure Than From Success. \nThe purpose of software engineering is to control complexity, not to create it"
In [3]:
# Basic Way
sd1 = Document(mystr)
Out[3]:
A TextAnalysis.StringDocument
In [4]:
# Best Way
sd2 = StringDocument(mystr)
Out[4]:
A TextAnalysis.StringDocument
In [5]:
# Reading from a file
filepath = "samplefile.txt"
Out[5]:
"samplefile.txt"
In [6]:
# Basic Way
filedoc = Document("samplefile.txt")
Out[6]:
A TextAnalysis.FileDocument
In [7]:
# Best Way
fd = FileDocument("samplefile.txt")
Out[7]:
A TextAnalysis.FileDocument

There are also:

  • TokenDocument()
  • NGramDocument()
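Here is a quick sketch of constructing each, using a throwaway sentence of my own:

```julia
using TextAnalysis

# TokenDocument stores the text as an already-tokenized word list.
td = TokenDocument("The quick brown fox jumps over the lazy dog")
println(tokens(td))

# NGramDocument stores n-gram counts instead of the raw text.
ngd = NGramDocument("The quick brown fox jumps over the lazy dog")
println(ngrams(ngd))
```

Which type you pick determines what is kept: a TokenDocument can no longer give you the exact original string, and an NGramDocument keeps only counts.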
In [8]:
# Working  With Our Document
text(sd1)
Out[8]:
"The best error message is the one that never shows up.\nYou Learn More From Failure Than From Success. \nThe purpose of software engineering is to control complexity, not to create it"

What language is it?

In [9]:
# Getting the Base Info About it
language(sd1)
Out[9]:
Languages.EnglishLanguage

Tokenization With TextAnalysis

  • Word Tokens
  • Sentence Tokens
In [10]:
text(sd1)
Out[10]:
"The best error message is the one that never shows up.\nYou Learn More From Failure Than From Success. \nThe purpose of software engineering is to control complexity, not to create it"
In [11]:
# Word Tokens from a String Document
tokens(sd1)
Out[11]:
32-element Array{SubString{String},1}:
 "The"        
 "best"       
 "error"      
 "message"    
 "is"         
 "the"        
 "one"        
 "that"       
 "never"      
 "shows"      
 "up."        
 "You"        
 "Learn"      
 ⋮            
 "purpose"    
 "of"         
 "software"   
 "engineering"
 "is"         
 "to"         
 "control"    
 "complexity,"
 "not"        
 "to"         
 "create"     
 "it"
In [12]:
text(fd)
Out[12]:
"The best error message is the one that never shows up.\r\nYou Learn More From Failure Than From Success. \r\nThe purpose of software engineering is to control complexity, not to create it"
In [13]:
# Word Tokens from a File Document
tokens(fd)
Out[13]:
32-element Array{SubString{String},1}:
 "The"        
 "best"       
 "error"      
 "message"    
 "is"         
 "the"        
 "one"        
 "that"       
 "never"      
 "shows"      
 "up."        
 "You"        
 "Learn"      
 ⋮            
 "purpose"    
 "of"         
 "software"   
 "engineering"
 "is"         
 "to"         
 "control"    
 "complexity,"
 "not"        
 "to"         
 "create"     
 "it"

Tokenization With WordTokenizers

  • Word Tokens
  • Sentence Tokens
In [15]:
using WordTokenizers
In [16]:
sd1
Out[16]:
A TextAnalysis.StringDocument
In [17]:
# Must convert from TextAnalysis Type to String Type
tokenize(text(sd1))
Out[17]:
33-element Array{SubString{String},1}:
 "The"        
 "best"       
 "error"      
 "message"    
 "is"         
 "the"        
 "one"        
 "that"       
 "never"      
 "shows"      
 "up."        
 "You"        
 "Learn"      
 ⋮            
 "of"         
 "software"   
 "engineering"
 "is"         
 "to"         
 "control"    
 "complexity" 
 ","          
 "not"        
 "to"         
 "create"     
 "it"
In [18]:
tokenize("Hello world this is Julia")
Out[18]:
5-element Array{SubString{String},1}:
 "Hello"
 "world"
 "this" 
 "is"   
 "Julia"

Sentence Tokenization

First, solve the problem. Then, write the code. Fix the cause, not the symptom. Simplicity is the soul of efficiency. Good design adds value faster than it adds cost. In theory, theory and practice are the same. In practice, they’re not. There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.

In [19]:
# Read a file with sentences
sent_files = FileDocument("quotesfiles.txt")
Out[19]:
A TextAnalysis.FileDocument
In [20]:
text(sent_files)
Out[20]:
"\ufeffFirst, solve the problem. Then, write the code.\r\nFix the cause, not the symptom.\r\nSimplicity is the soul of efficiency.\r\nGood design adds value faster than it adds cost.\r\nIn theory, theory and practice are the same. In practice, they’re not.\r\nThere are two ways of constructing a software design.\r\nOne way is to make it so simple that there are obviously no deficiencies.\r\nAnd the other way is to make it so complicated that there are no obvious deficiencies."
In [23]:
# Sentence Tokenization
split_sentences(text(sent_files))
Out[23]:
17-element Array{SubString{String},1}:
 "\ufeffFirst, solve the problem."                                                       
 "Then, write the code."                                                                 
 ""                                                                                      
 "Fix the cause, not the symptom."                                                       
 ""                                                                                      
 "Simplicity is the soul of efficiency."                                                 
 ""                                                                                      
 "Good design adds value faster than it adds cost."                                      
 ""                                                                                      
 "In theory, theory and practice are the same."                                          
 "In practice, they’re not."                                                             
 ""                                                                                      
 "There are two ways of constructing a software design."                                 
 ""                                                                                      
 "One way is to make it so simple that there are obviously no deficiencies."             
 ""                                                                                      
 "And the other way is to make it so complicated that there are no obvious deficiencies."
In [24]:
for sentence in split_sentences(text(sent_files))
    println(sentence)
end
First, solve the problem.
Then, write the code.

Fix the cause, not the symptom.

Simplicity is the soul of efficiency.

Good design adds value faster than it adds cost.

In theory, theory and practice are the same.
In practice, they’re not.

There are two ways of constructing a software design.

One way is to make it so simple that there are obviously no deficiencies.

And the other way is to make it so complicated that there are no obvious deficiencies.
In [25]:
for sentence in split_sentences(text(sent_files))
    wordtokens = tokenize(sentence)
    println("Word token=> $wordtokens")
end
Word token=> SubString{String}["\ufeffFirst", ",", "solve", "the", "problem", "."]
Word token=> SubString{String}["Then", ",", "write", "the", "code", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["Fix", "the", "cause", ",", "not", "the", "symptom", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["Simplicity", "is", "the", "soul", "of", "efficiency", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["Good", "design", "adds", "value", "faster", "than", "it", "adds", "cost", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["In", "theory", ",", "theory", "and", "practice", "are", "the", "same", "."]
Word token=> SubString{String}["In", "practice", ",", "they", "’", "re", "not", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["There", "are", "two", "ways", "of", "constructing", "a", "software", "design", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["One", "way", "is", "to", "make", "it", "so", "simple", "that", "there", "are", "obviously", "no", "deficiencies", "."]
Word token=> SubString{String}[]
Word token=> SubString{String}["And", "the", "other", "way", "is", "to", "make", "it", "so", "complicated", "that", "there", "are", "no", "obvious", "deficiencies", "."]

N-Grams

  • Combinations of multiple words
  • Useful for creating features during language modeling
In [26]:
mystr
Out[26]:
"The best error message is the one that never shows up.\nYou Learn More From Failure Than From Success. \nThe purpose of software engineering is to control complexity, not to create it"
In [27]:
sd3 = StringDocument(mystr)
Out[27]:
A TextAnalysis.StringDocument
In [28]:
# Unigram
ngrams(sd3)
Out[28]:
Dict{SubString{String},Int64} with 28 entries:
  "engineering" => 1
  "Learn"       => 1
  "is"          => 2
  "From"        => 2
  "not"         => 1
  "one"         => 1
  "never"       => 1
  "up."         => 1
  "complexity," => 1
  "create"      => 1
  "software"    => 1
  "that"        => 1
  "it"          => 1
  "You"         => 1
  "Failure"     => 1
  "best"        => 1
  "shows"       => 1
  "purpose"     => 1
  "error"       => 1
  "the"         => 1
  "Success."    => 1
  "The"         => 2
  "Than"        => 1
  "of"          => 1
  "More"        => 1
  ⋮             => ⋮
In [29]:
# Bigrams
ngrams(sd3,2)
Out[29]:
Dict{AbstractString,Int64} with 59 entries:
  "that never"     => 1
  "is to"          => 1
  "create"         => 1
  "that"           => 1
  "best"           => 1
  "Than From"      => 1
  "shows up."      => 1
  "purpose"        => 1
  "of"             => 1
  "purpose of"     => 1
  "More"           => 1
  "to"             => 2
  "the one"        => 1
  "is"             => 2
  "never"          => 1
  "complexity,"    => 1
  "software"       => 1
  "one that"       => 1
  "shows"          => 1
  "From Success."  => 1
  "The purpose"    => 1
  "message is"     => 1
  "engineering is" => 1
  "engineering"    => 1
  "not"            => 1
  ⋮                => ⋮
In [41]:
# Trigram
for trigram in ngrams(sd3,3)
    println(trigram)
end
Pair{AbstractString,Int64}("that never", 1)
Pair{AbstractString,Int64}("Success. The purpose", 1)
Pair{AbstractString,Int64}("You Learn More", 1)
Pair{AbstractString,Int64}("From Failure Than", 1)
Pair{AbstractString,Int64}("is to", 1)
Pair{AbstractString,Int64}("purpose of software", 1)
Pair{AbstractString,Int64}("create", 1)
Pair{AbstractString,Int64}("the one that", 1)
Pair{AbstractString,Int64}("software engineering is", 1)
Pair{AbstractString,Int64}("that", 1)
Pair{AbstractString,Int64}("control complexity, not", 1)
Pair{AbstractString,Int64}("best", 1)
Pair{AbstractString,Int64}("Than From", 1)
Pair{AbstractString,Int64}("shows up.", 1)
Pair{AbstractString,Int64}("purpose", 1)
Pair{AbstractString,Int64}("of", 1)
Pair{AbstractString,Int64}("purpose of", 1)
Pair{AbstractString,Int64}("More", 1)
Pair{AbstractString,Int64}("to", 2)
Pair{AbstractString,Int64}("the one", 1)
Pair{AbstractString,Int64}("is", 2)
Pair{AbstractString,Int64}("is the one", 1)
Pair{AbstractString,Int64}("one that never", 1)
Pair{AbstractString,Int64}("never", 1)
Pair{AbstractString,Int64}("complexity,", 1)
Pair{AbstractString,Int64}("message is the", 1)
Pair{AbstractString,Int64}("software", 1)
Pair{AbstractString,Int64}("error message is", 1)
Pair{AbstractString,Int64}("shows up. You", 1)
Pair{AbstractString,Int64}("one that", 1)
Pair{AbstractString,Int64}("shows", 1)
Pair{AbstractString,Int64}("that never shows", 1)
Pair{AbstractString,Int64}("From Success.", 1)
Pair{AbstractString,Int64}("The purpose", 1)
Pair{AbstractString,Int64}("to control complexity,", 1)
Pair{AbstractString,Int64}("message is", 1)
Pair{AbstractString,Int64}("engineering is", 1)
Pair{AbstractString,Int64}("engineering", 1)
Pair{AbstractString,Int64}("not", 1)
Pair{AbstractString,Int64}("best error message", 1)
Pair{AbstractString,Int64}("is to control", 1)
Pair{AbstractString,Int64}("not to create", 1)
Pair{AbstractString,Int64}("to create it", 1)
Pair{AbstractString,Int64}("The best", 1)
Pair{AbstractString,Int64}("Failure Than From", 1)
Pair{AbstractString,Int64}("software engineering", 1)
Pair{AbstractString,Int64}("best error", 1)
Pair{AbstractString,Int64}("More From", 1)
Pair{AbstractString,Int64}("From Failure", 1)
Pair{AbstractString,Int64}("not to", 1)
Pair{AbstractString,Int64}("Failure", 1)
Pair{AbstractString,Int64}("up. You", 1)
Pair{AbstractString,Int64}("of software", 1)
Pair{AbstractString,Int64}("error", 1)
Pair{AbstractString,Int64}("error message", 1)
Pair{AbstractString,Int64}("Failure Than", 1)
Pair{AbstractString,Int64}("Success. The", 1)
Pair{AbstractString,Int64}("The", 2)
Pair{AbstractString,Int64}("up. You Learn", 1)
Pair{AbstractString,Int64}("The purpose of", 1)
Pair{AbstractString,Int64}("complexity, not to", 1)
Pair{AbstractString,Int64}("message", 1)
Pair{AbstractString,Int64}("create it", 1)
Pair{AbstractString,Int64}("never shows up.", 1)
Pair{AbstractString,Int64}("control complexity,", 1)
Pair{AbstractString,Int64}("control", 1)
Pair{AbstractString,Int64}("The best error", 1)
Pair{AbstractString,Int64}("Learn", 1)
Pair{AbstractString,Int64}("From", 2)
Pair{AbstractString,Int64}("one", 1)
Pair{AbstractString,Int64}("up.", 1)
Pair{AbstractString,Int64}("Learn More From", 1)
Pair{AbstractString,Int64}("to create", 1)
Pair{AbstractString,Int64}("of software engineering", 1)
Pair{AbstractString,Int64}("Learn More", 1)
Pair{AbstractString,Int64}("it", 1)
Pair{AbstractString,Int64}("You", 1)
Pair{AbstractString,Int64}("You Learn", 1)
Pair{AbstractString,Int64}("to control", 1)
Pair{AbstractString,Int64}("the", 1)
Pair{AbstractString,Int64}("Success.", 1)
Pair{AbstractString,Int64}("More From Failure", 1)
Pair{AbstractString,Int64}("Than From Success.", 1)
Pair{AbstractString,Int64}("is the", 1)
Pair{AbstractString,Int64}("Than", 1)
Pair{AbstractString,Int64}("From Success. The", 1)
Pair{AbstractString,Int64}("complexity, not", 1)
Pair{AbstractString,Int64}("never shows", 1)
Pair{AbstractString,Int64}("engineering is to", 1)
In [30]:
# Creating an NGram 
my_ngrams = Dict{String, Int}("To" => 1, "be" => 2,
                                "or" => 1, "not" => 1,
                                "to" => 1, "be..." => 1)
Out[30]:
Dict{String,Int64} with 6 entries:
  "or"    => 1
  "be..." => 1
  "not"   => 1
  "to"    => 1
  "To"    => 1
  "be"    => 2
In [31]:
ngd = NGramDocument(my_ngrams)
Out[31]:
A TextAnalysis.NGramDocument
In [32]:
# Detecting Which NGram it is
ngram_complexity(ngd)
Out[32]:
1
In [33]:
my_ngrams2 = Dict{AbstractString,Int64}(
  "that never" => 1,"is to" => 1,"create" => 1,"that" => 1,"best" => 1,"Than From" => 1,"shows up." => 1,
    "purpose" => 1,"of" => 1,"purpose of" => 1,"More" => 1,"to" => 2,"the one" => 1,
    "is" => 2,"never" => 1,"complexity,"=> 1,"software" => 1,"one that" => 1)
Out[33]:
Dict{AbstractString,Int64} with 18 entries:
  "that never"  => 1
  "shows up."   => 1
  "the one"     => 1
  "purpose"     => 1
  "is"          => 2
  "never"       => 1
  "complexity," => 1
  "is to"       => 1
  "of"          => 1
  "create"      => 1
  "purpose of"  => 1
  "software"    => 1
  "that"        => 1
  "More"        => 1
  "one that"    => 1
  "to"          => 2
  "best"        => 1
  "Than From"   => 1
In [34]:
ngd2 = NGramDocument(my_ngrams2)
Out[34]:
A TextAnalysis.NGramDocument
In [35]:
ngram_complexity(ngd2)
Out[35]:
1

Using Other Libraries for Performing NLP in Julia

First, you will need to use pip to install NLTK on your system:
  • pip install nltk

Then open your Python REPL and type the following:

  • import nltk
  • nltk.download()

A dialog box will pop up where you can select and download the NLTK modules and corpora you need.

After that you can continue in your Julia environment.

In Julia

  • using Conda
  • Conda.add("nltk")

Parts of Speech Tagging In Julia

  • We will be using NLTK's tag module via PyCall for this task.
In [38]:
using PyCall
In [39]:
# Importing Part of Speech Tag from NLTK
@pyimport nltk.tag as ptag
In [41]:
# Using TextAnalysis to tokenize or WordTokenizer to do the same
ex = StringDocument("Julia is very fast but it is still young")
Out[41]:
A TextAnalysis.StringDocument
In [43]:
# TextAnalysis.tokens()
mytokens = tokens(ex)
Out[43]:
9-element Array{SubString{String},1}:
 "Julia"
 "is"   
 "very" 
 "fast" 
 "but"  
 "it"   
 "is"   
 "still"
 "young"
In [44]:
# Using NLTK tags for finding the part of speech of our tokens
ptag.pos_tag(mytokens)
Out[44]:
9-element Array{Tuple{String,String},1}:
 ("Julia", "NNP")
 ("is", "VBZ")   
 ("very", "RB")  
 ("fast", "RB")  
 ("but", "CC")   
 ("it", "PRP")   
 ("is", "VBZ")   
 ("still", "RB") 
 ("young", "JJ")

Word Inflection == word formation by modifying a base/root word

  • Stemming (Basics) stem!()
  • Lemmatizing
    • How do we do these?
    • PyCall To the Rescue
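Here is a minimal sketch of TextAnalysis's stem!() on a throwaway sentence of my own. stem!() mutates the document in place, reducing each word to its stem, so the exact output depends on the underlying stemmer:

```julia
using TextAnalysis

# stem!() mutates the document in place, replacing words with stems.
doc = StringDocument("They are running and jumping")
stem!(doc)
println(text(doc))
```

Lemmatization has no native TextAnalysis equivalent, which is where PyCall and NLTK come to the rescue, just as with part-of-speech tagging above.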
In [45]:
whos(TextAnalysis)
              AbstractDocument     92 bytes  DataType
                        Corpus     40 bytes  UnionAll
               DirectoryCorpus      0 bytes  TextAnalysis.#DirectoryCorpus
                      Document      0 bytes  TextAnalysis.#Document
            DocumentTermMatrix    136 bytes  DataType
                  FileDocument    124 bytes  DataType
               GenericDocument     48 bytes  Union
                 NGramDocument    136 bytes  DataType
                       Stemmer    136 bytes  DataType
                StringDocument    124 bytes  DataType
                  TextAnalysis  23658 KB     Module
              TextHashFunction    124 bytes  DataType
                 TokenDocument    124 bytes  DataType
                        author      0 bytes  TextAnalysis.#author
                       author!      0 bytes  TextAnalysis.#author!
                   cardinality      0 bytes  TextAnalysis.#cardinality
                     documents      0 bytes  TextAnalysis.#documents
                           dtm      0 bytes  TextAnalysis.#dtm
                           dtv      0 bytes  TextAnalysis.#dtv
                      each_dtv      0 bytes  TextAnalysis.#each_dtv
                 each_hash_dtv      0 bytes  TextAnalysis.#each_hash_dtv
                frequent_terms      0 bytes  TextAnalysis.#frequent_terms
                      hash_dtm      0 bytes  TextAnalysis.#hash_dtm
                      hash_dtv      0 bytes  TextAnalysis.#hash_dtv
                 hash_function      0 bytes  TextAnalysis.#hash_function
                hash_function!      0 bytes  TextAnalysis.#hash_function!
                      hash_tdm      0 bytes  TextAnalysis.#hash_tdm
                    index_hash      0 bytes  TextAnalysis.#index_hash
                    index_size      0 bytes  TextAnalysis.#index_size
                 inverse_index      0 bytes  TextAnalysis.#inverse_index
                      language      0 bytes  TextAnalysis.#language
                     language!      0 bytes  TextAnalysis.#language!
                           lda      0 bytes  TextAnalysis.#lda
             lexical_frequency      0 bytes  TextAnalysis.#lexical_frequency
                       lexicon      0 bytes  TextAnalysis.#lexicon
                  lexicon_size      0 bytes  TextAnalysis.#lexicon_size
                           lsa      0 bytes  TextAnalysis.#lsa
                          name      0 bytes  Languages.#name
                         name!      0 bytes  TextAnalysis.#name!
              ngram_complexity      0 bytes  TextAnalysis.#ngram_complexity
                        ngrams      0 bytes  TextAnalysis.#ngrams
                       ngrams!      0 bytes  TextAnalysis.#ngrams!
                      prepare!      0 bytes  TextAnalysis.#prepare!
              remove_articles!      0 bytes  TextAnalysis.#remove_articles!
                   remove_case      0 bytes  TextAnalysis.#remove_case
                  remove_case!      0 bytes  TextAnalysis.#remove_case!
           remove_corrupt_utf8      0 bytes  TextAnalysis.#remove_corrupt_utf8
          remove_corrupt_utf8!      0 bytes  TextAnalysis.#remove_corrupt_utf8!
     remove_definite_articles!      0 bytes  TextAnalysis.#remove_definite_arti…
        remove_frequent_terms!      0 bytes  TextAnalysis.#remove_frequent_term…
              remove_html_tags      0 bytes  TextAnalysis.#remove_html_tags
             remove_html_tags!      0 bytes  TextAnalysis.#remove_html_tags!
   remove_indefinite_articles!      0 bytes  TextAnalysis.#remove_indefinite_ar…
            remove_nonletters!      0 bytes  TextAnalysis.#remove_nonletters!
               remove_numbers!      0 bytes  TextAnalysis.#remove_numbers!
               remove_patterns      0 bytes  TextAnalysis.#remove_patterns
              remove_patterns!      0 bytes  TextAnalysis.#remove_patterns!
          remove_prepositions!      0 bytes  TextAnalysis.#remove_prepositions!
              remove_pronouns!      0 bytes  TextAnalysis.#remove_pronouns!
           remove_punctuation!      0 bytes  TextAnalysis.#remove_punctuation!
          remove_sparse_terms!      0 bytes  TextAnalysis.#remove_sparse_terms!
            remove_stop_words!      0 bytes  TextAnalysis.#remove_stop_words!
            remove_whitespace!      0 bytes  TextAnalysis.#remove_whitespace!
                 remove_words!      0 bytes  TextAnalysis.#remove_words!
                  sparse_terms      0 bytes  TextAnalysis.#sparse_terms
                  standardize!      0 bytes  TextAnalysis.#standardize!
                          stem      0 bytes  TextAnalysis.#stem
                         stem!      0 bytes  TextAnalysis.#stem!
                    stem_words      4 bytes  UInt32
                 stemmer_types      0 bytes  TextAnalysis.#stemmer_types
                strip_articles      4 bytes  UInt32
                    strip_case      4 bytes  UInt32
            strip_corrupt_utf8      4 bytes  UInt32
       strip_definite_articles      4 bytes  UInt32
          strip_frequent_terms      4 bytes  UInt32
               strip_html_tags      4 bytes  UInt32
     strip_indefinite_articles      4 bytes  UInt32
             strip_non_letters      4 bytes  UInt32
                 strip_numbers      4 bytes  UInt32
                strip_patterns      4 bytes  UInt32
            strip_prepositions      4 bytes  UInt32
                strip_pronouns      4 bytes  UInt32
             strip_punctuation      4 bytes  UInt32
            strip_sparse_terms      4 bytes  UInt32
               strip_stopwords      4 bytes  UInt32
              strip_whitespace      4 bytes  UInt32
            tag_part_of_speech      4 bytes  UInt32
                      tag_pos!      0 bytes  TextAnalysis.#tag_pos!
                           tdm      0 bytes  TextAnalysis.#tdm
                          text      0 bytes  TextAnalysis.#text
                         text!      0 bytes  TextAnalysis.#text!
                            tf      0 bytes  TextAnalysis.#tf
                           tf!      0 bytes  TextAnalysis.#tf!
                        tf_idf      0 bytes  TextAnalysis.#tf_idf
                       tf_idf!      0 bytes  TextAnalysis.#tf_idf!
                     timestamp      0 bytes  TextAnalysis.#timestamp
                    timestamp!      0 bytes  TextAnalysis.#timestamp!
                        tokens      0 bytes  TextAnalysis.#tokens
                       tokens!      0 bytes  TextAnalysis.#tokens!
         update_inverse_index!      0 bytes  TextAnalysis.#update_inverse_index!
               update_lexicon!      0 bytes  TextAnalysis.#update_lexicon!

Stay tuned for more! Thanks.

