Saturday, May 29, 2010

Rrrrrrrrrrrrr

It has been over three months since my last post, and I'm feeling that inner conflict that says "need to post to express things; can't post because there's too much to express." Life has felt very busy since then. It's not that I've been working excessively, but the class I took required a lot of work, and right at the end of the semester, while I was working hard on the semester project, I landed in a research project with an imminent deadline. Fortunately, the research project was enjoyable, and it reminded me how good R is for statistical analysis. Somewhere around 40 lines of R were enough to:
1) Import a CSV data matrix
2) Fix missing values
3) Cull out features that don't agree with a target feature
4) Iteratively perform full hierarchical cluster analyses, including calculating quality statistics and graphing results
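
For anyone curious what those four steps can look like, here is a rough sketch in base R. It is not the code from the research project: the made-up data set, the mean imputation, the 0.5 correlation cutoff, the choice of linkage methods to iterate over, and the cophenetic correlation as the quality statistic are all assumptions picked for illustration.

```r
# Hypothetical stand-in data; the real data set is not shown in this post.
set.seed(1)
n <- 30
target <- rnorm(n)
demo <- data.frame(
  target = target,
  f1 = target + rnorm(n, sd = 0.3),      # tracks the target closely
  f2 = 2 * target + rnorm(n, sd = 0.3),  # also tracks the target
  f3 = rnorm(n)                          # unrelated noise
)
demo$f1[c(3, 7)] <- NA                   # plant a few missing values
csv_path <- tempfile(fileext = ".csv")
write.csv(demo, csv_path, row.names = FALSE)

# 1) Import a CSV data matrix
dat <- read.csv(csv_path)

# 2) Fix missing values (assumption: replace each NA with its column mean)
for (j in seq_along(dat)) {
  dat[[j]][is.na(dat[[j]])] <- mean(dat[[j]], na.rm = TRUE)
}

# 3) Cull features that don't agree with the target feature
#    (assumption: "agree" means absolute correlation of at least 0.5)
agreement <- sapply(dat, function(col) cor(col, dat$target))
kept <- dat[, abs(agreement) >= 0.5, drop = FALSE]
features <- kept[, setdiff(names(kept), "target"), drop = FALSE]

# 4) Iterate over hierarchical cluster analyses, recording a quality
#    statistic (cophenetic correlation) and plotting each dendrogram
d <- dist(scale(features))
methods <- c("average", "complete", "ward.D2")
quality <- sapply(methods, function(m) {
  hc <- hclust(d, method = m)
  plot(hc, main = paste("Linkage:", m))
  cor(d, cophenetic(hc))
})
print(quality)
```

Everything used here (read.csv, cor, dist, hclust, cophenetic, plot) ships with a standard R installation, which is a big part of why so few lines go so far.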

That's a lot for 40 lines to do! I also used R for the semester project, combining mixture models and boosting, and had a similarly good experience there. So, in the process of being written, this post has become dedicated to the praise of R. R, I'm glad you exist.

Using open source projects like R and OpenSceneGraph (which I'm currently looking at for work) has made me want to get involved in one. With what time, I don't know, but I'm going to set that as a goal. Surely there's something I could do!