What a Java programmer needs to know and be able to do in a typical Big Data + ML project:
- how to choose features;
- how to encode features;
- how to scale them;
- how to clean the data and fill in missing values;
- how to evaluate clustering quality;
- what to do when one tree is not enough;
- how to do cross-validation. And all of this in Scala + Spark!
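The preprocessing steps from the list (encoding, scaling, filling gaps) can be sketched as a single Spark ML pipeline. This is a minimal sketch, not the presentation's actual code; the file name and the column names (`Age`, `Fare`, `Embarked`) are assumptions:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{Imputer, OneHotEncoder, StandardScaler, StringIndexer, VectorAssembler}
import org.apache.spark.sql.SparkSession

object PreprocessingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("preprocessing").master("local[*]").getOrCreate()

    // Hypothetical dataset: "Embarked" is categorical, "Age" and "Fare" are numeric with gaps
    val df = spark.read.option("header", "true").option("inferSchema", "true").csv("train.csv")

    // Fill gaps in the numeric columns with the column mean
    val imputer = new Imputer()
      .setInputCols(Array("Age", "Fare"))
      .setOutputCols(Array("AgeFilled", "FareFilled"))

    // Encode a categorical feature: string -> index -> one-hot vector
    val indexer = new StringIndexer()
      .setInputCol("Embarked").setOutputCol("EmbarkedIdx").setHandleInvalid("keep")
    val encoder = new OneHotEncoder()
      .setInputCol("EmbarkedIdx").setOutputCol("EmbarkedVec")

    // Assemble everything into one vector and scale it
    val assembler = new VectorAssembler()
      .setInputCols(Array("AgeFilled", "FareFilled", "EmbarkedVec")).setOutputCol("raw")
    val scaler = new StandardScaler().setInputCol("raw").setOutputCol("features")

    val pipeline = new Pipeline().setStages(Array(imputer, indexer, encoder, assembler, scaler))
    val prepared = pipeline.fit(df).transform(df)
    prepared.select("features").show(5, truncate = false)
    spark.stop()
  }
}
```

Chaining the stages in a `Pipeline` keeps the whole preprocessing reproducible: the same fitted pipeline can later be applied to test data without leaking statistics from it.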
Everything listed above will be explained using a popular dataset from Kaggle as an example.
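For the last two list items, "one tree is not enough" typically means moving to an ensemble such as a random forest, and model selection is done with `CrossValidator`. A hedged sketch; the label column `Survived` and the parameter grid values are assumptions:

```scala
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// When a single decision tree overfits, train a forest of trees instead
val rf = new RandomForestClassifier()
  .setLabelCol("Survived")
  .setFeaturesCol("features")

// Candidate hyperparameters to search over
val grid = new ParamGridBuilder()
  .addGrid(rf.numTrees, Array(50, 100, 200))
  .addGrid(rf.maxDepth, Array(3, 5, 7))
  .build()

// 5-fold cross-validation over the grid, scored by area under ROC
val cv = new CrossValidator()
  .setEstimator(rf)
  .setEvaluator(new BinaryClassificationEvaluator().setLabelCol("Survived"))
  .setEstimatorParamMaps(grid)
  .setNumFolds(5)

// val model = cv.fit(trainDf)  // trainDf: a DataFrame with "features" and "Survived" columns
```

`CrossValidator` retrains the best configuration on the full training set after the folds are scored, so the returned model is ready to use for predictions.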
Go to presentation