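The classical competitive agglomeration (CA) algorithm determines the number of clusters automatically, which avoids the influence of preset parameters on the clustering result. During clustering, however, the algorithm makes no use of the small amount of known information that commonly exists in sample data, even though such information can often benefit the whole clustering process. In addition, the algorithm adopts the most common similarity measure, the Euclidean distance, which is suitable only for spherical clusters and tends toward equal partitioning; this restricts the algorithm's range of application. To address these problems, a semi-supervised term with semi-supervised learning ability is introduced to strengthen the partitioning power of the membership matrix, and the point density information of the sample data is used to generate a distance adjustment factor that corrects the Euclidean distance, yielding a point-density-based semi-supervised CA algorithm. Clustering segmentation results on synthetic and real images, together with performance comparisons against other algorithms, show that the resulting algorithm obtains more accurate center values and better clustering results.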
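To make the idea of a density-based distance adjustment concrete, the following is a minimal sketch and not the formula used in this work, which the abstract does not state. It assumes that point density is the number of neighbours within a fixed radius and that the adjustment factor is `1 / (1 + density / max_density)`; the function names `point_density` and `adjusted_sq_distances` and the parameter `radius` are hypothetical.

```python
import math


def point_density(points, radius):
    """Count, for each sample, the other samples within `radius` (assumed density)."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def adjusted_sq_distances(points, centers, radius):
    """Squared Euclidean distances scaled by a hypothetical density factor.

    The factor lies in [0.5, 1]: samples in dense regions get a smaller
    factor, so their distances to every center are shrunk. This is an
    illustrative choice, not the adjustment factor from the paper.
    """
    dens = point_density(points, radius)
    top = max(max(dens), 1)  # guard against division by zero
    factors = [1.0 / (1.0 + d / top) for d in dens]
    return [
        [f * math.dist(p, c) ** 2 for c in centers]
        for p, f in zip(points, factors)
    ]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    print(point_density(pts, radius=0.5))
    print(adjusted_sq_distances(pts, [(0.0, 0.0)], radius=0.5))
```

In this sketch the isolated sample keeps its plain squared Euclidean distance, while samples in the dense group have theirs halved; any monotone function of density could be substituted for the assumed factor.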
Competitive agglomeration (CA) is a classic clustering algorithm. It can determine the number of clusters automatically: during the iterative process it identifies and discards spurious cluster centers until the remaining number of clusters best fits the sample data. In this way it avoids the effect of incorrectly preset parameters on the clustering results and does not require the exact number of clusters to be specified for the sample data. During clustering, however, it fails to take into account the known information that is scarce but prevalent in the sample data. This known information is important for the clustering results, and making proper use of it helps improve the clustering accuracy. Moreover, the algorithm uses the Euclidean distance as its similarity function. Even though this distance formula is simple to compute and is widely used in common algorithms, the distance is only applic