Search - Unknown - List
Introduction to ICTCLAS, the Chinese lexical analysis system from the Institute of Computing Technology. A word is the smallest meaningful unit of language that can be used independently. Chinese, however, is written with the character as its basic unit, and there are no explicit delimiters between words, so word segmentation is the foundation and key of Chinese information processing. Building on years of research, the Institute of Computing Technology of the Chinese Academy of Sciences spent one year developing ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). Its functions are Chinese word segmentation, part-of-speech tagging, and unknown-word recognition. Segmentation accuracy is above 97%, recall rates for unknown-word recognition are all above 90%, recall for Chinese personal names is close to 98%, and processing speed is 31.5 Kbytes/s. ICTCLAS can also output multiple high-probability results as needed, offers several output formats, and supports both the Peking University part-of-speech tag set and the tag set defined by the 973 expert group. The system has been well received by experts, and several papers on it have been published in China and abroad. ICTCLAS also ships with a complete dynamic link library, ICTCLAS.dll, and the corresponding probability dictionary. Developers can ignore the details of Chinese lexical analysis and call ICTCLAS directly from their own systems; the output format can be customized, and higher-level applications can be built on top of the segmentation and part-of-speech tagging results.
Date : 2008-10-13 Size : 110.58kb User : 郑昀

KTDictSeg overview: KTDictSeg is a simple dictionary-based Chinese and English word segmentation algorithm developed by KaiToo Search. * Main functions: Chinese and English word segmentation, unknown-word recognition, automatic recognition of multi-way ambiguity, full-width character recognition. * Main performance indicators: * Segmentation accuracy: above 90% (pending an authoritative expert evaluation) * Processing speed: 600 KBytes/s
Date : 2025-12-29 Size : 504kb User : tz1985
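
The full-width character recognition mentioned for KTDictSeg can be illustrated with a minimal sketch. This is not KTDictSeg's code; the function name `to_halfwidth` is hypothetical, and the sketch only assumes that full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) should be mapped to their ASCII counterparts before dictionary lookup.

```python
def to_halfwidth(text: str) -> str:
    """Map full-width ASCII variants and the ideographic space to plain ASCII.

    Hypothetical helper for illustration; not part of KTDictSeg.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic (full-width) space becomes an ordinary space.
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width forms sit at a fixed offset of 0xFEE0 from ASCII.
            out.append(chr(code - 0xFEE0))
        else:
            # Everything else, including Chinese characters, is left alone.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(to_halfwidth("ＫＴＤｉｃｔＳｅｇ　６００ＫＢｙｔｅｓ"))
```

Note that this mapping also converts full-width punctuation such as "，" to ",", which may or may not be wanted depending on how the segmenter treats punctuation.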

A word segmentation system based on the reverse maximum matching algorithm, combined with a part-of-speech tagging system based on a hidden Markov model (HMM). It includes unknown-word recognition and support for adding entries to the database. (The database path must be edited manually before the program will run.)
Date : 2025-12-29 Size : 1.2mb User : 张莉娟
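
For readers unfamiliar with reverse maximum matching, the idea can be sketched as follows. This is a generic illustration, not the code from the package above: the function name `reverse_max_match`, the in-memory `set` used as a dictionary, and the sample words are all assumptions. The scan starts at the end of the text, tries the longest candidate first, and falls back to a single character when nothing in the dictionary matches.

```python
def reverse_max_match(text, dictionary, max_len=None):
    """Segment `text` right to left, preferring the longest dictionary word.

    `dictionary` is any container supporting `in` (a set here).
    Characters with no dictionary match are emitted as single-character tokens.
    """
    if max_len is None:
        # Longest dictionary entry bounds the matching window.
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    end = len(text)
    while end > 0:
        start = max(0, end - max_len)
        # Shrink the window from the left until it matches
        # or only one character is left.
        while start < end - 1 and text[start:end] not in dictionary:
            start += 1
        tokens.append(text[start:end])
        end = start
    tokens.reverse()  # tokens were collected back to front
    return tokens


if __name__ == "__main__":
    words = {"研究", "研究生", "生命", "的", "起源"}
    print(reverse_max_match("研究生命的起源", words))
```

With this toy dictionary the sentence is split as 研究 / 生命 / 的 / 起源, whereas a left-to-right longest match would first take 研究生 and leave 命 stranded. That difference is the usual reason given for scanning in reverse.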

ICTCLAS, the Chinese lexical analysis system from the Institute of Computing Technology. Segmentation accuracy reaches 97.58% (as evaluated by the 973 expert group), recall rates for unknown-word recognition are all above 90%, recall for Chinese personal names is close to 98%, and processing speed is 31.5 Kbytes/s. ICTCLAS can also output multiple high-probability results as needed, offers several output formats, and supports both the Peking University part-of-speech tag set and the tag set defined by the 973 expert group. This package is the latest version of the API documentation, with detailed examples.
Date : 2025-12-29 Size : 54kb User : 王同
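
The recall figures quoted for unknown-word recognition can be read as "the share of gold-standard unknown words that the system found". The sketch below shows that calculation only; it is not the 973 expert group's evaluation procedure, the function name `recall` is hypothetical, and the character spans are made-up illustration data.

```python
def recall(gold_spans, predicted_spans):
    """Fraction of gold-standard spans that also appear in the prediction.

    Each span is a (start, end) character offset pair; a span counts as
    found only if both boundaries match exactly.
    """
    gold = set(gold_spans)
    if not gold:
        raise ValueError("recall is undefined without gold-standard spans")
    return len(gold & set(predicted_spans)) / len(gold)


if __name__ == "__main__":
    # Made-up spans for personal names in a hypothetical test text.
    gold = [(0, 3), (10, 12), (20, 23), (30, 32)]
    predicted = [(0, 3), (10, 12), (20, 22), (40, 42)]
    print(recall(gold, predicted))  # 2 of 4 gold spans matched exactly
```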