Location:
Search - ThesaurusAnalyz
Description: Chinese word segmentation code for Lucene, with a 190,000-word dictionary.
The quality of the segmentation depends on the dictionary. You can replace the bundled dictionary with your own. The dictionary is a text file named word.txt with one word per line; lines beginning with # are skipped. Save the file as UTF-8 text.
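The dictionary format described above (one word per line, # lines skipped, UTF-8) could be read along the following lines. This is only a sketch and is not taken from the package: the class name WordListLoader and its methods are hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions the description does not state.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the ThesaurusAnalyz package.
class WordListLoader {

    // Turns raw dictionary lines into a set of words, keeping first-seen order.
    static Set<String> parse(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            String word = line.trim(); // assumption: surrounding whitespace is not significant
            if (word.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            if (word.startsWith("#")) {
                continue; // lines starting with # are skipped, as the description says
            }
            words.add(word); // the set silently drops duplicate entries
        }
        return words;
    }

    // Reads the dictionary file as UTF-8, the encoding the description asks for.
    static Set<String> load(Path file) throws IOException {
        return parse(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

A custom dictionary saved as word.txt could then be loaded with WordListLoader.load(java.nio.file.Paths.get("word.txt")). How the package itself reads the file is not shown in this listing, so its actual behaviour may differ.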
Platform:
Size: 1402880
Author: 陈锦
Hits: