When I first started getting into data analysis and statistics I didn’t really have a clue with what i was doing, but as then as now the default tool was Excel, must have been Excel 97. Now the spreadsheet model for data analysis isn’t the best but never the less thanks to the ubiquitous nature of Excel it still endures today for all types of data in all types of businesses.
After bumping in to the old Excel limit of 512 columns and 32 000 rows and having used applications and Add Ins built for a specific analysis method, frustration would set in as inevitably the tool wouldn’t do what you wanted, or expected your data in a specific format. This frustration got me into programming with VBA and then tools like R and MATLAB. Now things got interesting as all sorts of analysis methods where now available. Once you realise that if the way you want to look at at your data isn’t possible with your chosen toy you could develop your own method and build a script to do it over and over.
This was fantastic, sophisticated repeatable analysis with publication quality charts, but again there was an issue. That R script that processed the experimental results in minutes rather than the couple of hours in Excel, was difficult for others to use. Typically as soon as someone was shown the command prompt their interest in using R was extinguished.
Now there are GUI based R tools and web based systems for building apps but they all require some configuration and take the analysis out of Excel. This has the potential for confusion as the data and results are spread across separate files.
Typically in Excel you may have a few hundred rows of experimental results, or a few thousand even 10’s of thousand data points sampled from a database. If you are playing with 100 of thousands or more rows in Excel then you really should be thinking about leaving the spreadsheet behind.
R analysis on Excel data
Having used R for some years now and knowing the potential of the amount of packages available I decided that better access to R from Excel could be a very useful tool. This is why Sharp-R was developed. The main purpose is to allow access to R from Excel by any Excel user whether they know about R or not. The user should just be able to select the data and click a button as they would with any other Excel tool.
Rather than building a fixed GUI in Excel I wanted a dynamic system. Any R Script can be embellished with some XML and then anyone with Sharp-R can access R from Excel. No command prompt or transferring data with files..
This approach means that you never have to worry about a change in your data format or need for a new method. Adding a new function file gives access to the new methods without having to install yet another Add In.
The current version is a beta version but is comprehensive enough to be a useful tool. You can download it from here. If you do give it a go please let us know what you think. It will be developed further so any feedback on what works and what doesn’t will be incorporated into future versions.
Currently the documentation is sparse and there is still the possibility of things changing over the next few months, but the above video shows the basic setup. I will be posting some more articles on Sharp-R and what can be done, if you want to see anything in particular let us know. If you want to keep up to date sign up to our newsletter.