Archive 2019

Custom data visualisation with d3.js

Published at October 21, 2019 ·  8 min read

When it comes to quickly creating good-looking visualisations, there's no better package than ggplot (in my personal opinion). Implementing the grammar of graphics, it's concise and intuitive, allowing you to produce advanced plots in only a few lines of code. This is extremely helpful when performing EDA, where I tend to produce a large number of visualisations in order to familiarise myself with the data. If you want something interactive, though, you have to turn elsewhere....


XGBoost: prediction contributions

Published at March 10, 2019 ·  9 min read

In my most recent post I had a look at the XGBoost model object. I went through the calculations behind Quality and Cover with the purpose of gaining a better intuition for how the algorithm works, but also to set the stage for how prediction contributions are calculated. Since November 2018 this has been implemented as a feature in the R interface. With predcontrib = TRUE set, the predict function returns a table containing each feature's contribution to the final prediction....


XGBoost: Quality & Cover

Published at March 7, 2019 ·  12 min read

I was going to write a post about how prediction contributions in XGBoost are calculated, but I quickly came to realize that it would be logical to go through a few other things first, namely Quality and Cover. Although this is all well described in the documentation, a practical example is sometimes useful. It has been very helpful for me, at least, in gaining a better understanding of how the algorithm works....