Some data algorithms

Thu, Mar 17, 2022 tags: [ Julia Programming Datascience ]

In the past half year, I’ve implemented various data analytics algorithms for my own understanding. They are mostly toy-level, all written in Julia as Pluto notebooks, and I hope they are helpful as a reference (if only for me). Here’s the line-up:

AdaBoost – how to use enough shitty classifiers to make up a not-so-shitty classifier. (Note: the last plot is supposed to contain a heatmap, which sometimes only shows up after refreshing the page)
Expectation Maximization – k-means clustering, but better
GLM – generalized linear models for multi-class classification
logistic – logistic regression; similar to GLM (but less general)
SVM – the old-but-gold classic, here implemented in four flavors.

I hope they are useful to you (although probably not). Either way, note how easy Julia makes it to implement these (admittedly simple) algorithms compared to, e.g., Python with NumPy. For example, being able to take the gradient of a function without further trouble (such as pulling in a framework) is often very useful for quickly optimizing some function, and matrix arithmetic is easier to use anyway.
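
To make the gradient point concrete, here is a minimal sketch (not taken from the notebooks) of logistic regression fitted by plain gradient descent. It assumes the ForwardDiff package for automatic differentiation, which has to be installed first; the toy data, the names `loss` and `descend`, and the step size and iteration count are all made up for illustration.

```julia
using ForwardDiff  # assumption: one of several AD packages; install with `] add ForwardDiff`

# Toy data (hypothetical): four points on a line, labels in {-1, +1}.
# The first column of X is a constant bias feature.
X = [1.0 -2.0;
     1.0 -1.0;
     1.0  1.0;
     1.0  2.0]
y = [-1.0, -1.0, 1.0, 1.0]

# Mean logistic loss, written with broadcasting and one matrix-vector product.
loss(w) = sum(log1p.(exp.(-y .* (X * w)))) / length(y)

# Plain gradient descent; the gradient is computed by automatic differentiation,
# so no derivative is written by hand.
function descend(f, w0; rate = 0.5, steps = 200)
    w = copy(w0)
    for _ in 1:steps
        w -= rate * ForwardDiff.gradient(f, w)
    end
    return w
end

w = descend(loss, zeros(2))

# Predicted labels: the sign of the linear score for each row of X.
predictions = sign.(X * w)
```

On this linearly separable toy set, `predictions` should match `y`. The whole model is one line of matrix arithmetic, and swapping in a different loss only means changing `loss`; the descent loop stays the same.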