Hey! I have not written on my blog for some time, as a lot of other things came up. This is a short article that introduces a new library/framework I have released, called GeneBench.

For the last few years I have been doing research in bioinformatics as part of my PhD. While doing that work I found I needed something that was unavailable: a way to benchmark my new algorithm against others. To have a proper benchmark we needed a “standardized” dataset and a reliable, reproducible way of testing performance in detecting differentially expressed genes (DEGs) in microarray data. Detecting differentially expressed genes is very important for scientists who are trying to find the genetic cause of an illness, yet there was no easy way to compare the methods that do this.

Besides the benchmark itself, the library contains a bunch of new machine learning algorithms and an easy way to extend everything. It can be installed with pip (just run pip install genebench) and has samples and tutorials for everything. It also supports algorithms written in R.

I will not get into the details of how everything works in this blog post, as I have a scientific paper in review. This article is more for people who, like me, were searching for a benchmarking tool/data set for DEG detection on microarray data and did not find one. It is also for Google indexing :D.
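To give a rough idea of what "benchmarking a DEG detection method" means, here is a tiny self-contained sketch. It does not use the GeneBench API: the function names, the toy expression values, and the choice of Welch's t statistic and precision at k are all assumptions made up for illustration, not how the library works internally.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples of expression values."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(control, treated):
    """Rank genes by |t|, most differentially expressed first (hypothetical method)."""
    scores = {gene: abs(welch_t(control[gene], treated[gene])) for gene in control}
    return sorted(scores, key=scores.get, reverse=True)


def precision_at_k(ranking, true_degs, k):
    """Fraction of the top-k ranked genes that are known to be truly differential."""
    return sum(gene in true_degs for gene in ranking[:k]) / k


# Toy data (invented): four genes, four samples per condition.
control = {
    "geneA": [5.1, 5.0, 4.9, 5.2],
    "geneB": [7.0, 7.1, 6.9, 7.2],
    "geneC": [3.0, 3.1, 2.9, 3.0],
    "geneD": [6.0, 6.2, 5.9, 6.1],
}
treated = {
    "geneA": [8.0, 8.2, 7.9, 8.1],  # clearly up-regulated
    "geneB": [7.1, 7.0, 7.2, 6.9],  # unchanged
    "geneC": [3.1, 3.0, 3.0, 2.9],  # unchanged
    "geneD": [4.0, 4.1, 3.9, 4.2],  # clearly down-regulated
}
true_degs = {"geneA", "geneD"}  # the "ground truth" a benchmark dataset would provide

ranking = rank_genes(control, treated)
print("ranking:", ranking)
print("precision@2:", precision_at_k(ranking, true_degs, 2))
```

A real benchmark does the same thing at scale: many datasets with known outcomes, many methods, and the same metric applied to all of them so the results are comparable.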

You can also check the code and small tutorials on GitHub.

I was thinking about doing some video tutorials about how to use it, but I don’t know if I will have the time.