Snippet: The Popularity of Data Analysis Software

Posted: April 5th, 2011 | Filed under: Snippets | 6 Comments

We’re often asked what our tool stack looks like. Robert Muenchen over at r4stats has a study of the most popular data analysis software.

He looks at factors as varied as traffic on the language mailing lists, number of search results and web site popularity, sales, and finally surveys of use. For example:

[Figure: mailing list traffic over time]

It’s interesting to think which of these factors indicate greater adoption. Don’t let me spoil it for you, but R comes out looking good across the board.


  • Dani Arribas-Bel

    It’s interesting to see Python isn’t even in the chart (yet)… Thanks for the pointer!

  • http://twitter.com/davekincaid Dave Kincaid

    Interesting to think about. I am curious, though, as a fairly new R user, how you deal with R’s performance issues. I’ve recently been playing around with a basic binary classification problem and exploring different algorithms using the caret package. I’m finding R to be extremely slow when training these models. In a cursory comparison I found Weka much, much faster, although I had other problems with Weka.

    So I know you use R quite a bit and would be interested in your experiences using R for machine learning problems and how you get around the speed issues. I’m truly interested because I see the power in R and really love the ease of using the caret package.

    Thanks.

  • http://twitter.com/iiijb urology

    I wonder if Stata is any good. It seems the statisticians at MSKCC use it too.

  • http://www.consultingstatistics.org Basil

    I wonder how much the cost of software plays a part in these trends. It seems SAS dropped significantly in 2009, which we know correlates with the current depression. Agencies may have started switching programs because of tightening budgets. R’s growth seemed to slow somewhat from 2009 to 2010, but we know it’s free…. This definitely makes me want to try out Stata, though. Looks like it’s on the rise.

  • http://pulse.yahoo.com/_3NOMIDA6UZ6LNXUGCTEZSWBQOY BuggyFunBunny

    Interesting. One factor with regard to Minitab use: it’s commonly used by Six Sigma consultancies and for training in Six Sigma. That puts it in Big Corp land, of course.

  • http://www.pietutors.com/ Ashish Soni

    R is open-source software, which makes it easily available to everyone and easy to use. So I would say R is the best choice for any kind of statistical analysis, simple or complex. You can integrate R with Tableau to get complex analysis done along with easy-to-understand visuals.