word_ngrams(array<string> words, int minSize, int maxSize) – Returns the list of n-grams for the given words, where `minSize <= n <= maxSize`.

Example:
SELECT word_ngrams(tokenize('Machine learning is fun!', true), 1, 2);

Result:
["machine","machine learning","learning","learning is","is","is fun","fun"]
Platforms: WhereOS, Spark, Hive
Class: hivemall.tools.text.WordNgramsUDF
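
To make the result ordering in the example easier to follow, here is a minimal Python sketch that reproduces the documented output. It is not the Hivemall implementation: the function name `word_ngrams_sketch` is hypothetical, the input is assumed to be already tokenized and lowercased (as `tokenize(..., true)` does in the example), and the loop order (each start position, then each size from `minSize` to `maxSize`) is inferred only from the sample result above.

```python
from typing import List


def word_ngrams_sketch(words: List[str], min_size: int, max_size: int) -> List[str]:
    """Illustrative stand-in for word_ngrams; not the actual UDF code."""
    if min_size < 1 or max_size < min_size:
        raise ValueError("expected 1 <= min_size <= max_size")

    ngrams = []
    # Walk over every start position in the token list.
    for start in range(len(words)):
        # For each start, emit n-grams of every allowed size that still fits.
        for n in range(min_size, max_size + 1):
            end = start + n
            if end > len(words):
                break  # larger sizes cannot fit from this start either
            ngrams.append(" ".join(words[start:end]))
    return ngrams


# Tokens as they would look after tokenize('Machine learning is fun!', true).
tokens = ["machine", "learning", "is", "fun"]
result = word_ngrams_sketch(tokens, 1, 2)
print(result)
```

With `min_size=1` and `max_size=2`, the sketch yields the same seven n-grams, in the same order, as the SQL example.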

More functions can be added to WhereOS via Python or R bindings, or as Java and Scala extensions: UDFs (user-defined functions), UDAFs (user-defined aggregation functions), and UDTFs (user-defined table-generating functions). Custom libraries can be added via the Settings page or installed from the WhereOS Store.
