Install required dependencies for matplotlib GUI frontend and all pip other packages for this project ```bash sudo apt install python3-tk python3.9 -m pip install -r requirements.txt ``` CLI to run K-Means clustering algorithm on a set of data. Data can be provided or randomly generated for testing. ```bash python3.9 k-means.py -h usage: k-means.py [-h] [--data [X,Y ...]] [--seeds [X,Y ...]] [--silent] [--verbose] [--random] [--radius [RADIUS]] [--lock-radius] [--file [FILE_PATH]] [CLUSTER_COUNT] [CENTROID_SHIFT] [LOOP_COUNT] K-means clustering program for clustering data read from a file, terminal, or randomly generated positional arguments: CLUSTER_COUNT Total number of desired clusters (default: '2') CENTROID_SHIFT Centroid shift threshold. If cluster centroids move less-than this value, clustering is finished (default: '1.0') LOOP_COUNT Maximum count of loops to perform clustering (default: '3') optional arguments: -h, --help show this help message and exit --data [X,Y ...], -d [X,Y ...] A list of data points separated by spaces as: x,y x,y x,y ... (default: '[(1.0, 2.0), (2.0, 3.0), (2.0, 2.0), (5.0, 6.0), (6.0, 7.0), (6.0, 8.0), (7.0, 11.0), (1.0, 1.0)]') --seeds [X,Y ...], --seed [X,Y ...], -s [X,Y ...] A list of seed points separated by spaces as: x,y x,y x,y ... Number of seeds provided must match CLUSTER_COUNT, or else CLUSTER_COUNT will be overriden. --silent When this flag is set, scatter plot visualizations will not be shown (default: 'False') --verbose, -v When this flag is set, cluster members will be shown in output (default: 'False') --random, -r When this flag is set, data will be randomly generated (default: 'False') --radius [RADIUS] Initial radius to use for clusters (default: 'None') --lock-radius, -l When this flag is set, centroid radius will not be recalculated (default: 'False') --file [FILE_PATH], -f [FILE_PATH] Optionally provide file for data to be read from. Each point must be on it's own line with format x,y ``` Running k-means clustering program ```bash python3.9 k-means.py --file ./input.txt --silent Finding K-means clusters for given data [(1.0, 2.0), (2.0, 3.0), (2.0, 2.0), (5.0, 6.0), (6.0, 7.0), (6.0, 8.0), (7.0, 11.0), (1.0, 1.0), (5.0, 5.0), (10.0, 10.0), (15.0, 15.0), (25.0, 25.0), (20.0, 20.0), (21.0, 21.0), (22.0, 22.0)] Using 2 clusters, 1.0 max centroid shift, and 3 iterations Clustering iteration 0 Updating cluster membership using cluster seeds, radius: ((5.0000, 5.0000), 10.6066) ((20.0000, 20.0000), 10.6066) Outliers present: set() Updated clusters ([(5.0, 5.0), (20.0, 20.0)]) with new centroids [(4.5, 5.5), (20.6, 20.6)] New centroids [(4.5, 5.5), (20.6, 20.6)] shifted [0.7071, 0.8485] respectively Showing final cluster result... Initial cluster at (5.0000, 5.0000) moved to (4.5000, 5.5000) Total shift: 0.7071 Final radius: 11.0365 Initial radius: 10.6066 Initial cluster at (20.0000, 20.0000) moved to (20.6000, 20.6000) Total shift: 0.8485 Final radius: 11.0365 Initial radius: 10.6066 Stopping... Cluster centroids have not shifted at least 1.0, clusters are stable ``` Running k-means clustering program on some random example data shows the following visual output ```bash python3.9 k-means.py --random # Output removed for GUI example ``` ![](screenshot.png)