Posts

Classic spam classification using Spark MLlib
Big Data
Using MLlib naive Bayes for spam classification.

Spark GraphFrames basics
Graph Analytics
GraphFrames on Spark for the clueless.

Diverse Dataiku tricks
Dataiku
Assorted tricks I collected while developing solutions on top of Dataiku.

Basic dataviz with Apache Zeppelin
Data Science
About Zeppelin and the fun, great, and useful things you can do with it.

The multi-cultural aspect of Spark
Data Science
Spark speaks multiple languages, letting developers use whichever is most appropriate to the task at hand.