THE POWER OF R

When do you think the widely used, free statistical software package we know as R was released?

Since the field of big data and analytics is itself an ‘invention’ of this millennium, it is hard to believe that its most widely used statistical package was launched more than two decades ago. But it’s true. R first appeared in 1996, when statistics professors Ross Ihaka and Robert Gentleman of the University of Auckland in New Zealand released the code as a free software package. The professors wanted technology that was better suited and more easily accessible to their statistics students, who needed to analyze data and produce graphical models of the information. Most comparable software, such as SAS, had been designed by computer scientists, carried expensive licensing fees, and was not user-friendly. Lacking deep computer science training, the professors considered their coding efforts more of an academic game than anything else. Nonetheless, starting in about 1991, they worked on R full time.

Today, companies like Google and Pfizer use the software extensively in their businesses. Google, for example, taps R for help understanding trends in ad pricing and for illuminating patterns in the search data it collects. Pfizer has created customized packages for R that let its scientists manipulate their own data during nonclinical drug studies rather than send the information off to a statistician. At Facebook, the data science team finds that visualizations built in R give it the best overview of the kind of data it is dealing with. That data can range from News Feed numbers to correlations with the number of Facebook friends a user has.

Over the years, the user community has developed more than 1,600 specialized packages for the R platform, covering a wide range of applications. For instance, one package, called BiodiversityR, offers a graphical interface aimed at making calculations of environmental trends easier. Another package, called Emu, analyzes speech patterns, while GenABEL is used to study the human genome. The financial services community has demonstrated a particular affinity for R; dozens of packages exist for derivatives analysis alone.