Jeromy Anglim's Blog: Psychology and Statistics


Monday, February 21, 2011

R versus Matlab in Mathematical Psychology

I recently attended the 2011 Australasian Mathematical Psychology Conference. This post summarises a few thoughts I had on the use of R, Matlab and other tools in mathematical psychology flowing from discussions with researchers at the conference.

I wanted to get a sense of the software used by researchers in mathematical psychology. What was popular? Why was it popular? From the small-n, non-random sample of conference attendees that I spoke to over coffee and cake, I concluded:

  • Many experienced math psych researchers knew a bit of both R and Matlab, but most specialised in one.
  • Matlab seemed to be substantially more popular than R in math psych.
  • The general attitude seemed to be that both tools offered similar functionality.
  • Reasons given for using Matlab:
    • Consistency: several researchers commented that functions are highly consistent in Matlab, making it easier to return to coding in Matlab after a break.
    • Superior built-in documentation: There was a sense that Matlab documentation was more user-friendly.
    • Historical precedent: researchers grew up on Matlab and then taught it to their graduate students.
    • Existing packages and models: Matlab seems to be well established in cognitive psychology, where there is substantial existing code to guide subsequent researchers.
    • University pays: while R is free, Matlab is effectively free to the academic if the academic's university has a site licence.
    • User friendly IDE: In R it seems that most users pretty quickly start playing around with alternative editors, whether it be ESS, Vim and R, Eclipse, Tinn-R or something else. In Matlab, the built-in IDE seemed popular. While these external editors can be configured to create a really powerful data analytic environment, Matlab users appreciated having something that was productive out-of-the-box.
    • Matlab is user friendly for implementing matrix algebra based calculations.
  • Reasons given for using R:
    • Free (as in beer)
    • Open source: A few people talked about this. However, I got the sense that the ideology of open source technology could be encouraged further.
    • Sweave: Even amongst Matlab users, there was respect for, and interest in, the idea of Sweave in R.
    • R's packages: The sheer number of packages particularly for statistics is one of R's great strengths.
    • Superior graphics
  • A few people also spoke positively of Python (see this summary of useful Python packages for statistics by Christophe Lalanne).
  • All of the above ties into the general discussion of the relative merits of R, Matlab, and Python on Stack Overflow.

From my discussions, I saw no need for me to personally switch from R to Matlab. Sweave, graphics, and all the R packages are fantastic. The community around R is also one of its great strengths.

Finally, open source just aligns better with science.

  • Open and freely modifiable source code
  • Freely available psychological measurement tools
  • Freely available data
  • Reproducible research documents using technologies such as Sweave
  • Open-access journals

It all combines to support scientific disciplines in sharing and building knowledge through accountability and trust. This applies both to sharing between researchers and to communicating with the broader community.

I get a bad feeling when I think of researchers and interested community members who can't afford Matlab being excluded from research.

However, it was interesting to consider how issues like user-friendly documentation, development environments, and consistency could be facilitated in a massive and distributed open source project such as R.

...END RANT...

6 comments:

  1. I frequently use both R and Matlab, and have a strong preference for R. The main sticking point is that--other than having packages for just about everything, which you mentioned--R seems to have much better support for everything other than actual numerical computation. E.g., reading in data can be a nightmare in Matlab, whereas read.table() rarely fails me. And similarly for string operations, data type conversion, etc. Not to say you can't manipulate strings or data types in Matlab, just that it always seems to require more work. R is just a better language.

    That said, there are also things that drive me crazy about R, and force me back to Matlab at times. The biggest one is slowness; R just chokes on very large datasets. I know there are ways around this, but out of the box, R can't handle datasets in the gigabytes, whereas Matlab just chugs along. And I much prefer Matlab's style of indexing arrays; R doesn't have nearly as good support for logical indexing. But that's kind of offset by the fact that you can't nest calls very well in Matlab.

    Bottom line, I use R for everything I can, but at least for what I do, Matlab is sometimes necessary. I do think that in the long run, there's a good chance NumPy/SciPy will end up being a better solution than either, because coding in Python is just a much better all-around experience. But I'm not planning to switch over until there's much better stats and plotting support.

  2. Doesn't R have pretty significant quantitative psychology support? The package "psych" comes to mind.

  3. Hi Tal,
    Thanks for sharing your first hand experience.
    My knowledge of Matlab is pretty fuzzy.
    So far my data analysis needs haven't required any specialist toolboxes in Matlab (e.g., http://psychtoolbox.org).
    And I suppose if I truly wanted to stay open source, there's always Octave.

    And yes, I'll have to check out NumPy/SciPy at some point.

  4. @Anonymous
    Yes, R has a lot of support for quantitative psychology, and it's getting better all the time.

    * Psychometrics Task View: http://cran.r-project.org/web/views/Psychometrics.html
    * Social Science Task View: http://cran.r-project.org/web/views/SocialSciences.html

    And of course, a lot of everyday analyses of psychological datasets can be done with Base R packages.


    However, I don't think there are equivalent packages for estimating the parameters of various computational models. For example, I don't know if there is something like DMAT (http://ppw.kuleuven.be/okp/software/dmat/) for R.
    You could write such a package yourself, but that's a whole other level.

  5. Hi Jeromy,

    Thanks for that interesting comparison. I don't use MATLAB that much but I have great respect for it. While I was initially confused by several aspects of R's syntax, MATLAB's seemed very easy and natural from the start. However, the mass of packages (recently exceeding 4,000) keeps me in R.

    Have you heard much about Scilab or Octave? They're open source, with syntax similar to MATLAB's. A company named Equalis now offers support for Scilab, as Revolution Analytics does for R. That made me think it must be fairly widely used.

    It's interesting that Tal reports such a big difference between R and MATLAB regarding the size of data they handle. They're both limited to holding data in RAM aren't they? (i.e. without resorting to special tricks.)

    Cheers,
    Bob Muenchen
    http://www.r4stats.com

  6. Hi Bob,
    Thanks for the useful ideas.
    I haven't dived into Matlab much.
    I've had a quick look at Octave.
    It's great to see an open source alternative.
