I Found My Toy Problem
Every so often I find myself tempted to learn a new programming language, especially one that follows a different paradigm than the languages I already know (which are, unfortunately, not that many). But learning a new language is a tedious task: you have to read through loads of tutorials, introductions and how-tos, most of which are pitched at a very basic level and are thus utterly boring. Without a reasonably clear goal in mind, I usually just give up and visit Reddit to look at pictures of cats.
Without knowing which problems you might want to apply your new knowledge to, it is hard to stick with learning. And finding a problem that needs solving, is interesting, and is suitable as a first exercise in a newly learned programming language is tricky in its own right: it mustn't be so advanced and complicated that a freshly learned tool can't handle it, but at the same time not so simple that it becomes boring and trivial.
I finally found my personal toy problem which I will use to learn new programming languages: Implementing three of the most important algorithms for inference on Hidden Markov Models: the forward algorithm, the forward-backward algorithm and the Viterbi algorithm. For simplicity, I will only consider the case of discrete emission distributions. The code will not contain any kind of error checking whatsoever. It will be just some ugly hack, but I can live with that. Moreover, I will only use the standard libraries of each programming language. Let’s see where this will lead me.
As a starting point, I implemented the algorithms using my current language of choice, Python. I’ll just show the basics and the forward algorithm here, the rest can be found at my GitHub. So, here it goes:
```python
class HMM:
    def __init__(self, pi, A, B):
        self.pi = pi  # initial state distribution
        self.A = A    # transition matrix: A[j][i] = P(next state i | state j)
        self.B = B    # emission matrix: B[i][o] = P(observation o | state i)

def normalise(l):
    # Rescale a list so it sums to one; also return the normalisation
    # constant, which the forward algorithm needs for the log-likelihood.
    norm_const = sum(l)
    return [x / norm_const for x in l], norm_const
```
So here we just have the data structure (which is basically just a struct) and a function to normalise values in a list. And now for the forward algorithm:
```python
from math import log

def forward(model, observations):
    state_idxs = range(len(model.pi))
    log_prob = 0.
    # Initialisation: alpha_1(i) = pi_i * B_i(o_1)
    alphas = [[model.pi[i] * model.B[i][observations[0]] for i in state_idxs]]
    alphas[0], nc = normalise(alphas[0])
    log_prob += log(nc)
    # Induction: alpha_t(i) = (sum_j alpha_{t-1}(j) * A_ji) * B_i(o_t),
    # rescaling at every step to avoid numerical underflow
    for obs in observations[1:]:
        alphas += [[sum([alphas[-1][j] * model.A[j][i] for j in state_idxs])
                    * model.B[i][obs] for i in state_idxs]]
        alphas[-1], nc = normalise(alphas[-1])
        log_prob += log(nc)
    return alphas, log_prob
```
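A quick sanity check never hurts. The snippet below restates `normalise` and `forward` so it runs on its own, and feeds them the classic two-state weather model (Rainy/Sunny, emitting walk/shop/clean); the numbers are purely illustrative, not anything from my repository:

```python
from math import exp, log
from types import SimpleNamespace

def normalise(l):
    norm_const = sum(l)
    return [x / norm_const for x in l], norm_const

def forward(model, observations):
    state_idxs = range(len(model.pi))
    alphas = [[model.pi[i] * model.B[i][observations[0]] for i in state_idxs]]
    alphas[0], nc = normalise(alphas[0])
    log_prob = log(nc)
    for obs in observations[1:]:
        alphas.append([sum(alphas[-1][j] * model.A[j][i] for j in state_idxs)
                       * model.B[i][obs] for i in state_idxs])
        alphas[-1], nc = normalise(alphas[-1])
        log_prob += log(nc)
    return alphas, log_prob

# States: 0 = Rainy, 1 = Sunny; observations: 0 = walk, 1 = shop, 2 = clean.
model = SimpleNamespace(
    pi=[0.6, 0.4],
    A=[[0.7, 0.3], [0.4, 0.6]],            # A[j][i] = P(next state i | state j)
    B=[[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]],  # B[i][o] = P(observation o | state i)
)
alphas, log_prob = forward(model, [0, 1, 2])
print(round(exp(log_prob), 6))  # likelihood of [walk, shop, clean] → 0.033612
```

Exponentiating the accumulated log normalisation constants recovers the plain observation likelihood, which you can verify by summing the unnormalised alphas at the last time step.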
Simple as that! As I wrote before, check out the rest of the code at my GitHub.
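Since the Viterbi algorithm is one of the three on my list, here is a minimal sketch of it in the same style, an illustrative version rather than the exact code from my repository. It works in log space, which conveniently sidesteps the per-step rescaling that `forward` needs, and it assumes strictly positive probabilities (no error checking, as promised):

```python
from math import log
from types import SimpleNamespace

def viterbi(model, observations):
    # Most probable state sequence; log space avoids numerical underflow.
    # Assumes all probabilities are strictly positive (log(0) would blow up).
    state_idxs = range(len(model.pi))
    deltas = [[log(model.pi[i]) + log(model.B[i][observations[0]])
               for i in state_idxs]]
    backpointers = []
    for obs in observations[1:]:
        prev = deltas[-1]
        row, bps = [], []
        for i in state_idxs:
            best = max(state_idxs, key=lambda j: prev[j] + log(model.A[j][i]))
            bps.append(best)
            row.append(prev[best] + log(model.A[best][i]) + log(model.B[i][obs]))
        deltas.append(row)
        backpointers.append(bps)
    # Backtrack from the best final state.
    path = [max(state_idxs, key=lambda i: deltas[-1][i])]
    for bps in reversed(backpointers):
        path.append(bps[path[-1]])
    return list(reversed(path))

# Same illustrative weather model: 0 = Rainy, 1 = Sunny; 0 = walk, 1 = shop, 2 = clean.
model = SimpleNamespace(
    pi=[0.6, 0.4],
    A=[[0.7, 0.3], [0.4, 0.6]],
    B=[[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]],
)
print(viterbi(model, [0, 1, 2]))  # → [1, 0, 0]: one sunny walk, then rain
```

Note how close the structure is to `forward`: the only real change is swapping the sum over previous states for a max plus a backpointer.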
The next language I want to implement this in is Haskell. I have started reading the fantastic "Learn You a Haskell for Great Good!" by Miran Lipovača, and I'm really looking forward to using a functional language.