The controversy about R: epic fail or epic success?

by Erin Vang on Apr. 28, 2010, under JMP & JSL

Statisticians and data analysts are in a kerfuffle about the recent remarks of AnnMaria De Mars, Ph.D. (President of The Julia Group and a SAS Global Forum attendee) in her blog that the open source statistical analysis tool R is an “epic fail,” or to put it in Twitterese, #epicfail:

I know that R is free and I am actually a Unix fan and think Open Source software is a great idea. However, for me personally and for most users, both individual and organizational, the much greater cost of software is the time it takes to install it, maintain it, learn it and document it. On that, R is an epic fail.

And oh, how the hashtags and comments and teeth-gnashing began!

Nathan Yau’s excellent FlowingData blog recaps the kerfuffle nicely, and both his post and Dr. De Mars’ have accumulated thoughtful comment threads. I added my thoughts to both, and I expand on them here:

To make my prejudices clear, I’ve spent several decades in commercial statistical software development (working in a variety of R&D roles at SYSTAT, StatView, JMP, SAS, and Predictum), and I now do custom JMP scripting, etc., for Global Pragmatica LLC.

I can say with hard-won authority that:

– good statistical software development is difficult and expensive
– good quality assurance is more difficult and expensive
– designing a good graphical user interface is difficult and expensive
– a good GUI is worthwhile, because the easier it is to try more things, the more things you will try, and
– creative insight is worth a lot more than programming skill

Even commercial software tends to be under-supported, and I’ll be the first to admit that my own programming is as buggy as anybody else’s, but if I’m making life-and-death or world-changing decisions, I want to be sure that I’m not the only one who’s looked at my code, tested border cases, considered the implications of missing values, controlled for underflow and overflow errors, done smart things with floating point fuzziness, and generally thought about any given problem in a few more directions than I have. I want to know that when serious bugs are discovered, the knowledge will be disseminated and somebody’s job is on the line to fix them.

For all these reasons, I temper my sincere enthusiasm about the wide open frontiers of open source products like R with a conservative appreciation for software that has a big company’s reputation and future riding on its accuracy, and preferably a big company that has been in the business long enough to develop the paranoia that drives a fierce QA program.

R is great for what it is, as long as you bear in mind what it isn’t. Your own R code or R code that you find sitting around is only as good as your commitment to testing and understanding of thorny computational gotchas.
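To make a few of those gotchas concrete, here is a minimal sketch in Python (chosen only for illustration; the same traps exist in R, JSL, or any language that uses ordinary double-precision arithmetic). The function names and the data are hypothetical and are not taken from any particular package.

```python
import math

def naive_variance(xs):
    """One-pass textbook formula: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)

def two_pass_variance(xs):
    """Two-pass formula: subtract the mean first, then square the deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def drop_missing(xs):
    """Treat NaN as a missing value and drop it explicitly."""
    return [x for x in xs if not math.isnan(x)]

# Border case: a large offset with a small spread. The true sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(naive_variance(data))     # cancellation error: not 30
print(two_pass_variance(data))  # 30.0

# Missing values: one NaN silently poisons the whole result unless handled.
with_gap = [1.0, float("nan"), 3.0]
print(two_pass_variance(with_gap))                # nan
print(two_pass_variance(drop_missing(with_gap)))  # 2.0

# Floating point fuzziness: compare with a tolerance, not with ==.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Underflow: a product of tiny probabilities collapses to 0.0; a sum of logs does not.
print(1e-200 * 1e-200)                      # 0.0
print(math.log(1e-200) + math.log(1e-200))  # finite, about -921
```

Both variance functions are algebraically identical, which is exactly the point: code that looks right on paper can still give a wrong answer on unlucky data, and only deliberate testing finds it.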

I share the apparently common opinion that R’s interface leaves a lot to be desired. Confidentiality agreements prevent me from confirming or denying the rumors about JMP 9 interfacing with R, but I will say that if they turn out to be true, both products would benefit from it. JMP, like any commercial product, improves when it faces stiff competition and attends to it, and R, like most open source products, could use a better front end.

And now let me make my case for R being an epic success.

I like open source software. I use a bunch of it, and I do what I can for the cause (which isn’t much more than evangelism, unfortunately). For me, the biggest win with open source software is that it makes tools available to me, and others, who don’t need them enough to justify much of a price, but who can benefit from them when they’re affordable or free. When an open source tool gets something done for me, or eases some pain at least, I’m not that picky about its interface, and I’m willing to do my own validation (where applicable).

I can’t say that I love using Linux, but as a long-time UNIX geek and Mac OS X bigot, I am glad Linux is available, I use it for certain things, and I think it’s a whole lot better than Windows and other OSes, especially when Ubuntu builds work out. (I’ve had trouble getting JMP for Linux installed on Ubuntu, but that’s probably due to my own incompetence.) OpenOffice is kind of a pain, but it’s better than paying Microsoft for the privilege of enduring the epic fail that is Office, and it has much better support than Office for import/export of other formats. I love it that any number of open source projects are developing such fabulous tools as bzr version control, which I use daily, and that the FINK project is porting a whole bunch of great open source UNIX widgets to Mac OS X.

I think it’s wonderful that some of the world’s greatest analytical minds are using R to create publicly available routines for power-analysts. I love it that students and people who can’t afford commercial stats software, or who won’t use it enough to justify buying a license, have a high-quality open source option, if they’re willing to work at it a bit. I think it’s great that people who think Excel is good enough can’t make a price objection to upgrading to R.

I believe that democratizing innovation and proliferating analytical competence are good for us all. I count on projects like R and Linux to push commercial developers to make better products, and to force pricing and licensing of those products to remain reasonable. Monopolies are good for nobody, including monopolists.

Long live the proponents of R!

What do you think? Do you trust open source stats code? Do you think R’s interface is good enough? Is JMP’s any better? How heavily do you factor quality of documentation into decisions about software?

