Description: Source package for IKAnalyzer, a Chinese word segmentation toolkit. (1) It uses its own "forward-iteration, finest-grained segmentation" algorithm and processes about 600,000 characters per second. (2) It analyzes text with several sub-processors: letters (IP addresses, email addresses, URLs), numbers (dates, common Chinese numerals and quantifiers, Roman numerals, scientific notation), and Chinese vocabulary (including handling of personal and place names). (3) Dictionary storage is optimized for a smaller memory footprint, and user-defined extension dictionaries are supported. (4) It includes IKQueryParser, a query analyzer optimized for Lucene full-text search, which uses an ambiguity-analysis algorithm to optimize the permutations and combinations of query keywords and can greatly improve the Lucene search hit rate.
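The idea behind "forward-iteration, finest-grained segmentation" can be illustrated with a small, self-contained sketch. This is not the code from IKSegmentation.java or CJKSegmenter.java; the class name FineGrainedSegmenterSketch, the method segment, and the in-memory Set dictionary are all hypothetical, and the real package uses a dictionary tree (DictSegment) plus separate letter and quantifier segmenters. The sketch only shows the general technique: scan forward from every start position, emit every dictionary word that begins there, and fall back to single characters for positions that no word covers. Latin letters are used in the sample so the result does not depend on source file encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of the IKAnalyzer package.
class FineGrainedSegmenterSketch {

    // Emits every dictionary word found at every start position (finest granularity).
    // maxLen is an assumed upper bound on the length of a dictionary word.
    static List<String> segment(String text, Set<String> dict, int maxLen) {
        List<String> out = new ArrayList<>();
        boolean[] covered = new boolean[text.length()];
        for (int start = 0; start < text.length(); start++) {
            int limit = Math.min(text.length(), start + maxLen);
            // Forward iteration: try every candidate that begins at 'start'.
            for (int end = start + 1; end <= limit; end++) {
                String candidate = text.substring(start, end);
                if (dict.contains(candidate)) {
                    out.add(candidate);
                    Arrays.fill(covered, start, end, true);
                }
            }
            // A character that no dictionary word covers is emitted on its own.
            if (!covered[start]) {
                out.add(text.substring(start, start + 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("no", "now", "where", "he", "here"));
        // Overlapping words are all kept: [no, now, where, he, here]
        System.out.println(segment("nowhere", dict, 7));
    }
}
```

Keeping all overlapping words favors recall in a search index: a query for either "now" or "here" would match the indexed text. Choosing among ambiguous candidates is then left to the query side, which is the role the description assigns to IKQueryParser.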
File list (Check if you may need any files):
IKAnalyzer3.2.8 source
......................\src
......................\...\ext_stopword.dic
......................\...\IKAnalyzer.cfg.xml
......................\...\org
......................\...\...\wltea
......................\...\...\.....\analyzer
......................\...\...\.....\........\cfg
......................\...\...\.....\........\...\Configuration.java
......................\...\...\.....\........\Context.java
......................\...\...\.....\........\dic
......................\...\...\.....\........\...\Dictionary.java
......................\...\...\.....\........\...\DictSegment.java
......................\...\...\.....\........\...\Hit.java
......................\...\...\.....\........\...\main.dic
......................\...\...\.....\........\...\preposition.dic
......................\...\...\.....\........\...\quantifier.dic
......................\...\...\.....\........\...\stopword.dic
......................\...\...\.....\........\...\suffix.dic
......................\...\...\.....\........\...\surname.dic
......................\...\...\.....\........\help
......................\...\...\.....\........\....\CharacterHelper.java
......................\...\...\.....\........\IKSegmentation.java
......................\...\...\.....\........\Lexeme.java
......................\...\...\.....\........\lucene
......................\...\...\.....\........\......\IKAnalyzer.java
......................\...\...\.....\........\......\IKQueryParser.java
......................\...\...\.....\........\......\IKSimilarity.java
......................\...\...\.....\........\......\IKTokenizer.java
......................\...\...\.....\........\sample
......................\...\...\.....\........\......\IKAnalyzerDemo.java
......................\...\...\.....\........\seg
......................\...\...\.....\........\...\CJKSegmenter.java
......................\...\...\.....\........\...\ISegmenter.java
......................\...\...\.....\........\...\LetterSegmenter.java
......................\...\...\.....\........\...\QuantifierSegmenter.java
......................\...\...\.....\........\solr
......................\...\...\.....\........\....\IKTokenizerFactory.java
......................\test
......................\....\CH_stopword.dic
......................\....\mydict.dic
......................\....\org
......................\....\...\wltea
......................\....\...\.....\analyzer
......................\....\...\.....\........\test
......................\....\...\.....\........\....\CfgTester.java
......................\....\...\.....\........\....\CharacterTest.java
......................\....\...\.....\........\....\DictionaryTester.java
......................\....\...\.....\........\....\IKTokenerTest.java
......................\....\...\.....\........\....\NumberSegmenter.java
......................\....\...\.....\........\....\SegmentorTester.java
......................\....\...\.....\........\....\SimpleQuantifierSegmenter.java
......................\....\...\.....\........\....\StandardAnalyzerTest.java