BIG DATA
the density associated with a point by
counting the number of points in a region of
a specified radius around a point. Points
with a density above a threshold are classified as core points, while noise points
are defined as non-core points that don’t
have core points within the specified radius. Noise points are discarded and clusters are formed around core points. This
very idea of density-based identification
of a cluster helps in creating clusters of
various shapes.
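
The density-based classification described above can be illustrated with a minimal Python sketch. It is not a full clustering implementation, only the labelling step. The function name `classify_points` and the parameters `radius` and `min_points` are hypothetical names chosen for illustration. The sketch assumes Euclidean distance, counts a point as its own neighbour, and treats a density of at least `min_points` as meeting the threshold. Non-core points that do have a core point within the radius are labelled "border", the usual term in density-based clustering for points that are neither core nor noise.

```python
from math import dist  # Euclidean distance, Python 3.8+


def classify_points(points, radius, min_points):
    """Label each point as 'core', 'border', or 'noise' (illustrative sketch)."""
    # For every point, collect the indices of all points within `radius`.
    # Assumption: a point counts as its own neighbour.
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= radius]
        for p in points
    ]

    # Assumption: a point is a core point when its neighbourhood holds
    # at least `min_points` points.
    is_core = [len(n) >= min_points for n in neighbours]

    labels = []
    for i in range(len(points)):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbours[i]):
            # Non-core point with a core point inside its radius.
            labels.append("border")
        else:
            # Non-core point with no core point inside its radius.
            labels.append("noise")
    return labels


# Toy data: a small dense group, one point on its edge, one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (8.0, 8.0)]
labels = classify_points(points, radius=1.0, min_points=3)
for point, label in zip(points, labels):
    print(point, label)
```

In this toy data set the isolated point at (8.0, 8.0) has no core point within the radius, so it would be discarded as noise, while the remaining points would form a single cluster around the core points. The pairwise distance computation is quadratic in the number of points, which is acceptable for a sketch but would likely need a spatial index for large data sets.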
CURE (Clustering with Representatives) [2] also does well at capturing clusters of various shapes and sizes, since
only the representative points of a cluster
are used to compute its distance from other clusters. The clustering algorithm starts
with each input point as a separate cluster, and at each successive st