Search - WORD - List
BBS source code built on Chinese word segmentation, with good site-wide information retrieval.
Date : 2008-10-13 Size : 507.5kb User : sasf

A Lucene web interface that uses XML as a lightweight protocol. Developers can convert a data source (text, DB, MS Word, PDF, etc.) into XML format, index it with the Lucene engine, and get full-text search results over HTTP as XML output. Users can easily integrate it with a JSP, ASP, or PHP front end, or use XSLT on the server side to transform the output.
Date : 2008-10-13 Size : 2.76mb User : 张和

Word segmentation module by 小叮呼.
Date : 2008-10-13 Size : 716.1kb User : 侯沛东

Reportedly the Chinese word segmentation dictionary that Baidu used in the past. Hopefully it is of some help to everyone; download it while you can.
Date : 2008-10-13 Size : 406.73kb User : 王国金

A simple Chinese word segmenter implemented in Delphi.
Date : 2008-10-13 Size : 493.89kb User : 鸿飞

Java version of a Chinese word segmenter with a basic dictionary; the segmentation results are quite good.
Date : 2008-10-13 Size : 1.54mb User : 许文强

This program implements a word-capture function and can retrieve the specified related information. A sample program is included.
Date : 2008-10-13 Size : 62.39kb User : 易林

Implements dictionary-based Chinese word segmentation for Nutch; this part contains the DLL files.
Date : 2008-10-13 Size : 2.29mb User : 冯凡立

Introduction to FirteX.
Features: supports incremental indexing, differential indexing, and multi-field indexing, with three forward-index modes; supports plain text, HTML, PDF, and other file formats; provides fast Chinese word segmentation; offers several index access interfaces, from low level to high level, so index files can be used flexibly; provides a rich query syntax with multi-field search, date range search, and custom sorting of results.
Performance: indexes more than 200Mb per minute on a Pentium 4 2.8G machine with 2G RAM; searching a nearly 7G index (built from 100G of web pages, 11G of plain text) returns results within a few milliseconds while using only a dozen or so M of memory; supports text indexing and retrieval on the Tb scale.
Date : 2008-10-13 Size : 13.16mb User : 阮正

This Internet lexicon comes from a statistical analysis of the Chinese Internet corpus indexed by the SOGOU search engine. The statistics were collected in October 2006 and cover a corpus of more than 100 million pages. The result is about 150,000 high-frequency entries; each entry is marked with its word frequency and with its common parts of speech.
Significance of the corpus statistics: they reflect word frequency and part-of-speech usage in the Chinese-language Internet.
Applications: Chinese part-of-speech tagging, word frequency analysis, and similar tasks.
POS categories: N noun, V verb, ADJ adjective, ADV adverb, CLAS classifier (measure word), ECHO onomatopoeia, STRU structural particle, AUX particle, COOR coordinating conjunction, CONJ conjunction, SUFFIX suffix, PREFIX prefix, PREP preposition, PRON pronoun, QUES interrogative, NUM numeral, IDIOM idiom.
Date : 2008-10-13 Size : 1.2mb User : 17521

A research paper on search engine technology. It explains the basic principles of search engines and focuses on the design and implementation of Chinese word segmentation.
Date : 2008-10-13 Size : 44.61kb User : 季节

Chinese-enabled CLucene. Today I modified the CLucene code so that it supports Chinese characters. 1. Compiles under VC 6. 2. Word segmentation is not supported yet, but Chinese characters are; words in the text to be indexed must be separated by spaces. 3. The changes were made in a hurry; see demo/IndexFiles.cpp. Contact me if you run into problems. I will polish it when I have time.
Date : 2008-10-13 Size : 376.52kb User : lucence12

This system is an automated search engine. Starting from a list of URLs, it automatically discovers the next-level pages of those sites, so that small and medium-sized websites can have a search engine of their own. It is suited to searching sites within one specific field, for example medical websites only. It uses SQL Server 2000 as the database. The web spider automatically collects page data starting from entry URLs set by the user. Other features: powerful, complete back-end administration; full use of .NET performance, searching millions of records instantly; polished front-end web pages comparable to professional search engines; a Chinese word segmentation interface.
Date : 2008-10-13 Size : 956.69kb User : your name

Lucene Chinese word segmentation source code; handy when building a search engine.
Date : 2008-10-13 Size : 1.49mb User : 杨流

A Chinese word segmentation method used in web page classification. A very useful reference; everyone can...
Date : 2008-10-13 Size : 204.2kb User : show

The most complete library for a Chinese word segmentation system. Anyone who aspires to work on search can use it as a reference; very valuable material.
Date : 2008-10-13 Size : 77.55kb User : zyb

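1. Designed for search engines: uses URL rewriting to increase the chance of being indexed by search engines.
2. Friendly interface and simple operation; performance has been optimized and the system is fast.
3. All Aspx files use codebehind to separate code from markup, so the interface is easy to modify.
4. The system administrator can set the default style, and users can freely choose a system style.
5. Recommended news and home page news can be set.
6. Scrolling pictures can be set for the home page, with thumbnails generated automatically.
7. News pictures can be uploaded online conveniently.
8. News review and view counting.
9. A Word-like editing mode makes it easy to mix text and pictures in news items.
10. Keyword search over news.
11. News categories can be managed dynamically.
12. Three permission levels: system administrator, news reviewer, and news entry clerk; the system administrator can set the permissions of reviewers and entry clerks.
13. Add, modify, and delete friendly links.
14. System settings can be changed easily by editing the configuration file.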
Date : 2008-10-13 Size : 729.4kb User : gfgdf

A Lucene web interface that uses XML as a lightweight protocol. Developers can convert a data source (text, DB, MS Word, PDF, etc.) into XML format, index it with the Lucene engine, and get full-text search results over HTTP as XML output. Users can easily integrate it with a JSP, ASP, or PHP front end, or use XSLT on the server side to transform the output.
Date : 2025-12-19 Size : 2.76mb User : 张和

Lucene Chinese word segmentation code with a 190,000-word dictionary. The quality of the segmentation depends on the dictionary. You can replace the bundled dictionary with your own. The dictionary is a text file named word.txt with one word per line; a line that starts with # is skipped. Save the file as UTF-8 text.
Date : 2025-12-19 Size : 1.34mb User : 陈锦
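
The dictionary format described for this entry (one word per line, lines starting with # skipped, UTF-8 text) can be sketched in Python as below. This is only an illustration under assumptions: the function names `load_dictionary` and `segment` are hypothetical, skipping blank lines is an added assumption, and forward maximum matching is a common dictionary-based technique chosen for the example; the listing does not say which algorithm the package itself uses.

```python
import io


def load_dictionary(lines):
    """Collect words from an iterable of text lines in the word.txt format."""
    words = set()
    for line in lines:
        word = line.strip()
        # Skip lines starting with '#', as the description says.
        # Skipping blank lines is an extra assumption of this sketch.
        if not word or word.startswith("#"):
            continue
        words.add(word)
    return words


def segment(text, words):
    """Forward maximum matching (assumed here, not confirmed for the package)."""
    max_len = max((len(w) for w in words), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in words:
                tokens.append(piece)
                i += size
                break
    return tokens


# In-memory stand-in for a UTF-8 word.txt file (the contents are made up).
sample = io.StringIO("# comment line\n中文\n分词\n搜索引擎\n")
words = load_dictionary(sample)
print(segment("中文分词搜索引擎", words))
```

To read a real file instead of the in-memory sample, pass `open("word.txt", encoding="utf-8")` to `load_dictionary`. Because matching is driven entirely by the word set, swapping in a different dictionary changes the segmentation, which matches the note that results depend on the dictionary.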

web search engine: refactored to search Word, PDF and more
Date : 2025-12-19 Size : 370kb User : rico
CodeBus is one of the largest source code repositories on the Internet!