
normalize_unicode


normalize_unicode(string str [, string form]) – Normalizes `str` to the specified Unicode normalization form. `form` is one of NFC (default), NFD, NFKC, or NFKD.

SELECT normalize_unicode('ﾊﾝｶｸｶﾅ','NFKC');
ハンカクカナ

SELECT normalize_unicode('㈱㌧㌦Ⅲ','NFKC');
(株)トンドルIII
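
To see what the four forms do outside of SQL, the same normalization can be sketched with Python's standard `unicodedata` module. This is only an illustration of the Unicode normalization forms, not the Hivemall implementation; the Python function `normalize_unicode` below is a hypothetical stand-in, and its NULL handling and error on unknown forms are assumptions.

```python
import unicodedata

# Forms accepted by the SQL function, per the description above.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_unicode(text, form="NFC"):
    """Hypothetical Python stand-in for the SQL normalize_unicode function."""
    if text is None:
        # Assumption: a NULL input is passed through unchanged.
        return None
    if form not in FORMS:
        # Assumption: unknown forms are rejected rather than ignored.
        raise ValueError(f"unsupported normalization form: {form}")
    return unicodedata.normalize(form, text)


# Compatibility forms (NFKC/NFKD) fold half-width kana and enclosed or
# squared symbols into their ordinary equivalents.
kana = normalize_unicode("ﾊﾝｶｸｶﾅ", "NFKC")
symbols = normalize_unicode("㈱㌧㌦Ⅲ", "NFKC")

# Canonical forms (NFC/NFD) only compose or decompose combining sequences.
composed = normalize_unicode("e\u0301")          # default NFC: single code point
decomposed = normalize_unicode("\u00e9", "NFD")  # base letter + combining accent

print(kana, symbols, len(composed), len(decomposed))
```

NFKC is usually the form to pick when cleaning text for search or joins, because it removes width and presentation variants; NFC leaves such compatibility characters untouched.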

Platforms: WhereOS, Spark, Hive
Class: hivemall.tools.text.NormalizeUnicodeUDF

More functions can be added to WhereOS via Python or R bindings, or as Java and Scala UDF (user-defined function), UDAF (user-defined aggregate function), and UDTF (user-defined table-generating function) extensions. Custom libraries can be added via the Settings page or installed from the WhereOS Store.
