K-Means clustering algorithm using Apache Flink. Project for the course: Middleware Technologies for Distributed Systems.
This repository has been archived on 2019-08-08. You can view files and clone it, but cannot push or open issues or pull requests.

Usage

Build the job JAR

mvn package

Generate vectors to cluster

./genVectors.py $DIMENSION $NUMBER > $FILE

(example: ./genVectors.py 2 15 > myInput.csv)
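For readers without the script at hand, the following is a minimal sketch of what a generator with the same command line could look like. It is not the code of genVectors.py: the function names gen_vectors and to_csv, the comma-separated layout, the six-decimal formatting and the default range [0, 1] are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for a vector generator: DIMENSION NUMBER -> CSV on stdout."""
import random
import sys


def gen_vectors(dimension, number, low=0.0, high=1.0, seed=None):
    """Return `number` vectors, each with `dimension` floats drawn uniformly from [low, high]."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(dimension)] for _ in range(number)]


def to_csv(vectors):
    """One vector per line, components separated by commas (assumed layout)."""
    return "\n".join(",".join(f"{x:.6f}" for x in v) for v in vectors)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Same argument order as the usage line above: DIMENSION, then NUMBER.
        print(to_csv(gen_vectors(int(sys.argv[1]), int(sys.argv[2]))))
    else:
        print("usage: gen.py DIMENSION NUMBER", file=sys.stderr)
```

Passing a fixed seed makes the generated input reproducible, which helps when comparing clustering runs.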

Run

flink run target/project-*.jar --input $INPUT --output $OUTPUT
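As a point of reference for what a K-Means job computes, here is a small single-machine sketch of the standard Lloyd iteration in plain Python. It is not the Flink job in this repository; the function name kmeans, the parameter k, the fixed iteration count and the random initialisation are assumptions chosen for illustration.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length vectors.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct input points (an assumed strategy).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

A distributed version would typically parallelise the assignment step across the input and aggregate per-cluster sums to update the centroids, but how this project does so is not described here.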