ClusterOptimizer

A GridSearchCV-like object for clustering in scikit-learn

MIT License


Cluster Optimizer

ClusterOptimizer is a simple object that mimics the GridSearchCV object from scikit-learn (sklearn), but for clustering only. Instead of estimating predictive performance on a held-out test fold, it computes unsupervised scores such as silhouette_score or davies_bouldin_score.

The object is instantiated with an sklearn clustering algorithm from sklearn.cluster (e.g. KMeans, HDBSCAN, or similar) and a set of parameter options. Different scoring approaches can be supplied as a list of scoring functions (silhouette_score, davies_bouldin_score, calinski_harabasz_score from sklearn.metrics).

Calling ClusterOptimizer.optimize() performs a grid search over the supplied parameter space. The score from every supplied scoring function is stored for every parameter combination.

The results are available through ClusterOptimizer.results, which should return a pandas DataFrame.
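The grid search described above can be sketched with plain scikit-learn and pandas. This is a minimal illustration of the idea, not the package's own implementation: the helper name grid_search_clustering, its arguments, and the column layout of the returned DataFrame are assumptions made for this example, and the synthetic data comes from make_blobs.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import ParameterGrid


def grid_search_clustering(estimator_cls, param_grid, scorers, X):
    """Fit one estimator per parameter combination and score its labels.

    Hypothetical helper for illustration only; not part of ClusterOptimizer.
    """
    rows = []
    for params in ParameterGrid(param_grid):
        # No test fold: the model is fitted and scored on the same data.
        labels = estimator_cls(**params).fit_predict(X)
        row = dict(params)
        for scorer in scorers:
            # Each scorer takes (X, labels) and returns a single float.
            row[scorer.__name__] = scorer(X, labels)
        rows.append(row)
    return pd.DataFrame(rows)


# Synthetic data, used only so the example runs on its own.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

results = grid_search_clustering(
    KMeans,
    {"n_clusters": [2, 3, 4, 5, 6], "n_init": [10], "random_state": [0]},
    [silhouette_score, davies_bouldin_score, calinski_harabasz_score],
    X,
)
print(results)
```

When reading such a table, note that higher values are better for silhouette_score and calinski_harabasz_score, while lower values are better for davies_bouldin_score.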

For one or two parameters, the result DataFrame can be used together with seaborn for visualisation.
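As a rough sketch of the two-parameter case, the example below builds a stand-in results DataFrame directly with scikit-learn and draws it as a seaborn heatmap. The column names (n_clusters, linkage, silhouette_score) and the output file name are assumptions for this illustration; the actual layout of ClusterOptimizer.results may differ, in which case the pivot call needs adjusting.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script also runs headless

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import ParameterGrid

# Synthetic data, used only so the example runs on its own.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Stand-in for a results table: one row per parameter combination.
grid = ParameterGrid(
    {"n_clusters": [2, 3, 4, 5], "linkage": ["ward", "average", "complete"]}
)
rows = []
for params in grid:
    labels = AgglomerativeClustering(**params).fit_predict(X)
    rows.append({**params, "silhouette_score": silhouette_score(X, labels)})
results = pd.DataFrame(rows)

# Reshape to a linkage x n_clusters matrix and plot one score as a heatmap.
heat = results.pivot(index="linkage", columns="n_clusters", values="silhouette_score")
ax = sns.heatmap(heat, annot=True, fmt=".2f", cmap="viridis")
ax.set_title("Silhouette score by linkage and n_clusters")
plt.savefig("cluster_grid.png", bbox_inches="tight")
```

With a single parameter, a line plot such as sns.lineplot(data=results, x="n_clusters", y="silhouette_score") is usually enough.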
