Description: A small Chinese word segmentation program for the VC (Visual C++) platform. It covers the most basic task in computational linguistics and is suitable for beginners. Platform: |
Size: 28692 |
Author:刘志 |
Hits:
Description: A Chinese word segmentation algorithm that works reasonably well and is adapted to short strings. Using a lexicon, it finds the top N keywords among a large number of newline-separated titles and updates the lexicon with them. It is a classification-oriented segmentation algorithm. Platform: |
Size: 8751 |
Author:刘红周 |
Hits:
Description: ICTCLAS, the Chinese lexical analysis system from the Institute of Computing Technology. Segmentation accuracy reaches 97.58% (as evaluated by the 973 expert group). Recall for unknown-word recognition is above 90% in every category, and recall for Chinese personal names is close to 98%. Processing speed is 31.5 Kbytes/s. ICTCLAS can also output multiple high-probability results on demand, offers several output formats, and supports both the Peking University part-of-speech tag set and the tag set defined by the 973 expert group. Platform: |
Size: 3140608 |
Author:站长 |
Hits:
Description: Segments Chinese sentences using the maximum matching method. Maximum matching is the most commonly used word segmentation algorithm; it is simple and practical, and its accuracy can exceed 80%. Platform: |
Size: 73728 |
Author:廖剑 |
Hits:
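
The maximum matching method mentioned in the entry above can be illustrated with a minimal sketch of its forward variant. This is not the code from the listed package: the function name `forward_max_match`, the `max_len` parameter, and the tiny lexicon are all hypothetical and chosen only for illustration. At each position the sketch takes the longest lexicon word that matches and falls back to a single character when nothing matches.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching over `text` using the words in `lexicon`.

    Hypothetical helper for illustration; not taken from the listed package.
    """
    if max_len is None:
        # Longest lexicon entry bounds the window; never let it drop below 1.
        max_len = max(1, max((len(w) for w in lexicon), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan keeps advancing.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Assumed toy lexicon, for demonstration only.
lexicon = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", lexicon))
# prints ['研究生', '命', '起源']
```

The example input shows the known weakness of greedy matching: the intended reading is 研究 / 生命 / 起源, but the longest-first rule commits to 研究生 and leaves 命 stranded. Errors of this kind are one reason the method's accuracy is limited.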
Description: A statistics-based word segmenter for Chinese. It can be pointed directly at a directory. It includes a test run against Lianhe Zaobao, and the segmentation statistics can cover any directory (as long as the system can handle the load). Written as a homework assignment for a classmate :) using ASP.NET + XML. Platform: |
Size: 43008 |
Author: |
Hits:
Description: A small Chinese word segmentation program for the VC (Visual C++) platform. It covers the most basic task in computational linguistics and is suitable for beginners. Platform: |
Size: 28672 |
Author:刘志 |
Hits:
Description: A Chinese word segmentation algorithm that works reasonably well and is adapted to short strings. Using a lexicon, it finds the top N keywords among a large number of newline-separated titles and updates the lexicon with them. It is a classification-oriented segmentation algorithm. Platform: |
Size: 8192 |
Author:刘红周 |
Hits:
Description: A Chinese word segmentation tool that integrates the segmentation technology of the Chinese Academy of Sciences. It can be used for document processing. Platform: |
Size: 2204672 |
Author:hanwangzhang |
Hits:
Description: imdict-chinese-analyzer is the intelligent Chinese word segmentation module of the imdict smart dictionary. Its algorithm is based on the Hidden Markov Model (HMM). It is a Java reimplementation of the ICTCLAS Chinese word segmentation program from the Institute of Computing Technology, Chinese Academy of Sciences, and can directly provide Simplified Chinese segmentation support for the Lucene search engine. Platform: |
Size: 3256320 |
Author:王同 |
Hits:
Description: Source code for a Chinese word segmenter that I wrote myself in VC++, with complete documentation and a standard segmentation database. Platform: |
Size: 8994816 |
Author:tanyi |
Hits: