CodeBus
www.codebus.net
Search - DATA CLUSTERING - List
[Other resource] MyKmeans
DL: 0
An implementation of the K-means clustering algorithm. K-means: given the number of clusters K, assign n objects to K clusters so that similarity among objects within a cluster is maximized and similarity between clusters is minimized. Drawbacks: the resulting clusters tend to be similar in size, and the algorithm is very sensitive to dirty data.
An improved algorithm is the k-medoids method. It selects an object called a medoid to play the role of the center above, so that each medoid identifies its cluster. Steps:
1. Arbitrarily choose K objects as medoids (O1, O2, ..., Oi, ..., Ok).
The following steps repeat:
2. Assign each remaining object to the cluster of its nearest medoid.
3. For each cluster (Oi), take each object Or in turn and compute the cost E(Or) of replacing Oi with Or. Replace Oi with the Or that has the smallest E. The K medoids have now changed, so go back to step 2.
4. Repeat until the K medoids no longer change.
This algorithm is not sensitive to dirty data or outliers, but it is clearly more expensive to compute than K-means and is generally suitable only for small data sets.
Date: 2008-10-13
Size: 1.35kb
User: 阿兜
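The k-medoids steps in the MyKmeans description can be sketched in Python as follows. This is a minimal illustration, not the uploaded code: the names `assign` and `k_medoids`, the `init` parameter, and the choice of cost E(Or) as the sum of distances from Or to the other members of its cluster are all assumptions, since the listing does not define E.

```python
import math
import random


def assign(points, medoids, distance):
    """Step 2: put every point into the cluster of its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: distance(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, distance=math.dist, init=None, max_iter=100, seed=0):
    """Return (medoid indices, {medoid index: member indices})."""
    # Step 1: arbitrary initial medoids (or caller-supplied indices).
    medoids = list(init) if init is not None else random.Random(seed).sample(range(len(points)), k)
    for _ in range(max_iter):
        clusters = assign(points, medoids, distance)
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can only happen with duplicate points
                new_medoids.append(m)
                continue
            # Step 3: assumed cost E(Or) = total distance from Or to its cluster.
            best = min(members, key=lambda c: sum(distance(points[c], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:  # Step 4: medoids are fixed, stop.
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, distance)
```

Because step 3 evaluates every member of every cluster against every other member, each pass costs on the order of the squared cluster sizes, which is consistent with the remark that the method suits small data sets.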
[Other resource] gmeans
DL: 1
gmeans: clustering with first variation and splitting. Gmeans is a text clustering algorithm that offers three similarity functions: cosine, Euclidean, and KL. Text data is represented as a sparse matrix.
Date: 2008-10-13
Size: 69.89kb
User: 修宇
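As a rough illustration of the three measures named in the gmeans entry, the sketch below computes cosine similarity, Euclidean distance, and KL divergence on sparse vectors. It does not reproduce the gmeans code: the function names, the dict-based sparse representation (`{column index: weight}`), and the `eps` floor used to avoid dividing by zero in the KL term are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def euclidean_distance(a, b):
    """Euclidean distance; missing entries count as zero."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) after normalising both vectors to sum to 1.

    Entries missing from q are floored at eps (an assumption of this sketch).
    """
    total_p = sum(p.values())
    total_q = sum(q.values())
    result = 0.0
    for key, value in p.items():
        if value <= 0:
            continue  # terms with p_i = 0 contribute nothing
        pi = value / total_p
        qi = max(q.get(key, 0.0) / total_q, eps)
        result += pi * math.log(pi / qi)
    return result
```

Only the non-zero entries are visited, which is the main reason to keep term-document data in sparse form.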
[Other resource] CART
DL: 1
Data mining algorithm: K-means clustering source code for cluster analysis.
Date: 2008-10-13
Size: 1.79kb
User: sah
[Other resource] fuzzy_k_means
DL: 1
Data mining algorithm: fuzzy K-means clustering source code for fuzzy cluster analysis.
Date: 2008-10-13
Size: 1.15kb
User: sah
[Other resource] IRIS数据 (IRIS data)
DL: 0
The IRIS data set, for use with clustering methods; mainly applied in pattern recognition, image segmentation, and similar tasks.
Date: 2008-10-13
Size: 3.94kb
User: 王超
[Other resource] dynamic_kmeans
DL: 0
This algorithm clusters data with fairly good results. Tested and running under MATLAB 7.0.
Date: 2008-10-13
Size: 1.11kb
User: yinweidong
[Other resource] DBSCAN-csharp
DL: 0
Program notes: Form1.cs is an example application of the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm; the clustering can be tuned through two parameters, EPS and MinPts. DBSCAN.cs is the implementation file. For more about the algorithm, see "Data Mining" or related books. The sample data comes from sxdb.mdb, an Access database.
Known issues and suggestions for improvement:
Problem: at line 64 of dbscan.cs, SortedList does not support duplicate keys, so two data points at the same distance cannot both be added to the collection.
Fix: subtract a tiny amount by hand so that the distances differ without affecting the clustering result.
Problem with that fix: subtracting double.Epsilon is too small a change to make SortedList treat the two distances as different.
Fix: use a tiny amount that grows exponentially, retrying until SortedList treats the distances as different.
Further suggestions: reinterpreting the double as bytes in memory (assuming a double is 8 bytes) and subtracting 0x01 from the last byte might solve the problem more elegantly, but I do not know how to do that in C#. Another option is to implement a SortedList that supports duplicate keys, although that looks like work Microsoft should have done.
Eric Guo <http://www.cnblogs.com/ericguo/>
Date: 2008-10-13
Size: 15.29kb
User: Huang Yi
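The "subtract 0x01 from the last byte" idea in the DBSCAN-csharp notes can be illustrated in Python rather than C#. The sketch below reinterprets a double as a 64-bit integer and steps it to the next smaller representable value, then uses that to keep equal distances as distinct keys. The names `prev_double` and `insert_unique` are hypothetical, a plain `dict` stands in for SortedList, and the code assumes finite, non-NaN inputs. Whether such a nudge really leaves the clustering unchanged is the original author's claim, not something this sketch verifies.

```python
import struct


def prev_double(x):
    """Return the largest double strictly smaller than x (x finite, not NaN)."""
    if x == 0.0:
        return -5e-324  # smallest-magnitude negative subnormal
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # For positive doubles a smaller bit pattern is a smaller value;
    # for negative doubles a larger bit pattern is a more negative value.
    bits += -1 if x > 0 else 1
    return struct.unpack("<d", struct.pack("<q", bits))[0]


def insert_unique(table, distance, point):
    """Store point under distance, nudging the key down until it is unused."""
    key = distance
    while key in table:
        key = prev_double(key)
    table[key] = point
    return key


# Subtracting the smallest subnormal (what C# calls double.Epsilon) from a
# typical distance changes nothing, which is the failure the notes describe.
unchanged = (1.0 - 5e-324 == 1.0)  # True
```

A simpler alternative, when the container allows it, is to key on the pair `(distance, point_index)`, so that ties are broken without altering any distance.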
[Other resource] clusterinquest
DL: 0
The cluster in quest clustering algorithm is based on both density and grids. It is intended for clustering high-dimensional data in large databases.
Date: 2008-10-13
Size: 4.34kb
User: 陈妍
[Other resource] clusterds
DL: 0
Distance-based, density-based, and improved data clustering algorithms implemented in VC++.
Date: 2008-10-13
Size: 72.51kb
User: lixiaoqing
[Other resource] curec
DL: 0
Source code for CURE-based data clustering, implemented in C.
Date: 2008-10-13
Size: 42.21kb
User: lixiaoqing
[Other resource] cluster-hyper-dim
DL: 0
This paper studies the problem of categorical data clustering, especially for transactional data characterized by high dimensionality and large volume. Starting from a heuristic method of increasing the height-to-width ratio of the cluster histogram, we develop a novel algorithm, CLOPE, which is very fast and scalable, while being quite effective. We demonstrate the performance of our algorithm on two real world
Date: 2008-10-13
Size: 105.82kb
User: hanzhang
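To make the "height-to-width ratio of the cluster histogram" in the cluster-hyper-dim abstract concrete, here is a small Python sketch. It assumes the usual reading of the histogram: the width W is the number of distinct items in a cluster, the size S is the total number of item occurrences, and the height is H = S / W. The function name `histogram_stats` is hypothetical, and this is only the per-cluster measurement, not the CLOPE algorithm itself.

```python
from collections import Counter


def histogram_stats(transactions):
    """Return (S, W, H, H / W) for a non-empty cluster of transactions.

    Each transaction is an iterable of items; the cluster histogram counts
    how often each distinct item occurs across the cluster.
    """
    occurrences = Counter(item for t in transactions for item in t)
    size = sum(occurrences.values())   # S: total item occurrences
    width = len(occurrences)           # W: number of distinct items
    height = size / width              # H: average occurrences per item
    return size, width, height, height / width


# Transactions that share many items give a taller, narrower histogram.
tight = histogram_stats([{"a", "b"}, {"a", "b", "c"}, {"a", "c", "d"}])
loose = histogram_stats([{"a", "b"}, {"c", "d"}, {"e", "f"}])
```

In this example `tight` has a ratio of 0.5 while `loose` has a ratio of about 0.17, which matches the intuition that clusters with more overlap between transactions score higher.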
[Other resource] cluster-3.6.5
DL: 0
Source code for a data clustering algorithm; can be tried out in pattern recognition and image processing.
Date: 2008-10-13
Size: 461.62kb
User: 刘中华
[Other resource] denoise
DL: 0
I developed an algorithm that uses local ICA to denoise multidimensional data. It uses a delay-embedded version of the data, clustering, and ICA to separate the data from the noise.
Date: 2008-10-13
Size: 142.77kb
User: sunny
[Other resource] Jx_KClustering
DL: 0
A graphical demo of the K-means algorithm in which the number of clusters can be set. Written with MFC, it includes a complete K-means class that can run K-means on multidimensional data.
Date: 2025-12-20
Size: 100kb
User:
[Other resource] DBSCAN-csharp
DL: 0
Program notes: Form1.cs is an example application of the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm; the clustering can be tuned through two parameters, EPS and MinPts. DBSCAN.cs is the implementation file. For more about the algorithm, see "Data Mining" or related books. The sample data comes from sxdb.mdb, an Access database.
Known issues and suggestions for improvement:
Problem: at line 64 of dbscan.cs, SortedList does not support duplicate keys, so two data points at the same distance cannot both be added to the collection.
Fix: subtract a tiny amount by hand so that the distances differ without affecting the clustering result.
Problem with that fix: subtracting double.Epsilon is too small a change to make SortedList treat the two distances as different.
Fix: use a tiny amount that grows exponentially, retrying until SortedList treats the distances as different.
Further suggestions: reinterpreting the double as bytes in memory (assuming a double is 8 bytes) and subtracting 0x01 from the last byte might solve the problem more elegantly, but I do not know how to do that in C#. Another option is to implement a SortedList that supports duplicate keys, although that looks like work Microsoft should have done.
Eric Guo <http://www.cnblogs.com/ericguo/>
Date: 2025-12-20
Size: 26kb
User: Huang Yi
[Other resource] Weka-3-2
DL: 0
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. The software can classify objects by their attributes using a variety of methods: decision trees, distance, density, and so on.
Date: 2025-12-20
Size: 14.73mb
User: 马何坛
[Other resource] kmeans
DL: 0
A basic data clustering algorithm that clusters data quickly and effectively and is useful for data mining.
Date: 2025-12-20
Size: 6kb
User: 盛荣芬
[Other resource] kmeans
DL: 0
The k-means clustering method. Given a data set of n objects, a partitioning clustering technique divides the data into k partitions, each representing one cluster, with k less than or equal to n.
Date: 2025-12-20
Size: 3kb
User: 尚云
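The partitioning described in the kmeans entry above can be sketched as a plain Lloyd-style loop in Python. This is an illustrative sketch, not the uploaded code: the function name `k_means`, the requirement that the caller supply the initial centers, and the use of Euclidean distance are assumptions.

```python
import math


def k_means(points, centers, max_iter=100):
    """Partition points into len(centers) clusters; return (centers, clusters)."""
    if len(centers) > len(points):
        raise ValueError("k must be less than or equal to n")
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return centers, clusters
```

The result depends on the initial centers, so in practice the loop is often run from several starting points and the partition with the lowest within-cluster distance is kept.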
[Other resource] 2014-Science-clustering
DL: 0
The 2014 Science article "Clustering by fast search and find of density peaks", together with the data sets and MATLAB source code used by the authors.
Date: 2025-12-20
Size: 16.35mb
User: 张博舒
[Other resource] clustering
DL: 0
A clustering algorithm in R for data mining; runs as-is. Downloads welcome.
Date: 2025-12-20
Size: 1kb
User: cc