Friday, 6 March 2009

Dynamic Languages: R in Ruby

One of the "hot" languages of the past few years is Ruby. My impression is that Ruby may have peaked along with the buzz around "Ruby on Rails". The good ideas from Rails are being implemented in plenty of other frameworks such as MonoRail and Grails.

But hey, I may be wrong. For Ruby proponents interested in statistics, a cool project is RinRuby, which lets Ruby invoke R as an embedded statistics engine.

Here's an example creating a graph and printing a correlation value from Ruby:

# Count how often each word occurs in the text
tally = Hash.new(0)
File.foreach('gettysburg.txt') do |line|
  line.downcase.split(/\W+/).each { |w| tally[w] += 1 }
end
total = tally.values.inject(0) { |sum,count| sum + count }
# Keep only words that occur at least 3 times and have at least 4 letters
tally.delete_if { |key,count| count < 3 || key.length < 4 }

require "rinruby"
R.keys, R.counts = tally.keys, tally.values

R.eval <<EOF
names(counts) <- keys
barplot(rev(sort(counts)),main="Frequency of Non-Trivial Words",las=2)
mtext("Among the #{total} words in the Gettysburg Address",3,0.45)
rho <- round(cor(nchar(keys),counts),4)
EOF

puts "The correlation between word length and frequency is #{R.rho}."
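
For readers without R at hand, the correlation step can be cross-checked in plain Ruby. This is only a sketch: `pearson` is a helper name I made up, it computes the Pearson coefficient (the default method of R's `cor`), and the tally below is made-up illustrative data rather than counts from the Gettysburg Address.

```ruby
# Hypothetical helper: Pearson correlation of two equal-length numeric arrays.
def pearson(xs, ys)
  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  # Sum of products of deviations, and the two sums of squared deviations
  cov   = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }
  cov / Math.sqrt(var_x * var_y)
end

# Made-up word counts standing in for the tally built from gettysburg.txt
sample_tally = { "alpha" => 9, "gamma" => 7, "epsilon" => 4, "omicron" => 3, "beta" => 12 }
lengths = sample_tally.keys.map(&:length)
puts "Correlation: #{pearson(lengths, sample_tally.values).round(4)}"
```

With the real tally hash from the example, `pearson(tally.keys.map(&:length), tally.values)` should agree with the value R reports, up to rounding.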



One of the cool things about RinRuby is that it is implemented in pure Ruby, i.e. there's no C code behind the scenes. The Implementation Details explain:

RinRuby is a program which allows the user to run R commands from Ruby. An instance of R is created as a new object within Ruby, which allows R to remain open and running until the user closes the connection. There is no software that needs to be installed in R. Ruby sends data to R over TCP/IP network sockets, while commands and text are passed through the pipe. The pipe avoids compatibility issues on differing operating systems, platforms, and versions of Ruby and R, while the socket can handle large amounts of data quickly while avoiding rounding issues for doubles.
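
The remark about rounding can be illustrated with a small experiment. The sketch below is not RinRuby's actual wire format; it just assumes doubles are sent as raw 8-byte IEEE 754 values over a local TCP socket, and compares that with passing them as text at limited precision. The port, the thread standing in for the receiving side, and the 7-digit text format are all my own assumptions.

```ruby
require "socket"

values = [0.1 + 0.2, 1.0 / 3.0, Math::PI]

# Port 0 asks the operating system for any free port
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Thread standing in for the receiving side (R, in RinRuby's case)
receiver = Thread.new do
  conn = server.accept
  bytes = conn.read(8 * values.length)  # 8 bytes per double
  conn.close
  bytes.unpack("G*")                    # decode big-endian doubles
end

client = TCPSocket.new("127.0.0.1", port)
client.write(values.pack("G*"))         # encode as big-endian doubles
client.close

via_socket = receiver.value
server.close

# The same numbers passed as text with 7 significant digits
via_text = values.map { |v| ("%.7g" % v).to_f }

puts "binary round trip exact? #{via_socket == values}"
puts "text round trip exact?   #{via_text == values}"
```

The binary path hands back bit-identical doubles, while the text path only does so if enough digits are printed, which is presumably the kind of issue the socket is meant to sidestep.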

A benefit of the pure-Ruby approach is that RinRuby is expected to work in JRuby and IronRuby as well.

RinRuby installed easily for me but didn't detect the location of R, probably because the location of R isn't in my registry. As a testament to the benefits of a scripting language, I could fix this by editing the appropriate line in the "rinruby.rb" file to hardcode the location of R for my machine.
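
Before hardcoding anything, it can help to check where (or whether) an R executable is visible from Ruby. The helper below is hypothetical and not part of RinRuby; the candidate file names are my guesses at common R executable names, and it only searches the directories it is given (the PATH entries by default).

```ruby
# Hypothetical helper, not part of RinRuby: return the first R executable
# found in the given directories, or nil if none is found.
def locate_r(dirs = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR),
             names = ["R", "R.exe", "Rterm.exe"])
  dirs.each do |dir|
    names.each do |name|
      candidate = File.join(dir, name)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

r_path = locate_r
puts(r_path ? "Found R at #{r_path}" : "R not found on PATH")
```

If this prints a path, that is the value to hardcode; if not, R's directory is not on the PATH either, and the full path has to be looked up by hand.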

It ran properly under the Windows console application "irb". It gave an error under the "fxri" application which provides a console along with help information. I suspect that when running in a GUI the "rinruby" code is fighting it out with the GUI regarding redirection of stdout. Or it might be that a Windows console application has the stdout pipe and a Windows GUI application doesn't.
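
One way to test the stdout hypothesis is to ask Ruby whether its standard streams are attached to a console at all. This is a hypothetical diagnostic of my own, not something RinRuby provides; it relies only on `IO#tty?`, and the wording of the status strings is arbitrary.

```ruby
# Hypothetical diagnostic: report whether each stream is attached to a console.
def stream_report(streams = { "stdin" => $stdin, "stdout" => $stdout, "stderr" => $stderr })
  streams.map do |name, io|
    attached = io.respond_to?(:tty?) && io.tty?
    [name, attached ? "console (tty)" : "redirected or not a console"]
  end.to_h
end

stream_report.each { |name, status| puts "#{name}: #{status}" }
```

Running this in "irb" and again inside the GUI host would show whether the two environments really differ in how stdout is wired up.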

1 comment:

  1. Hi Charles,

    Did you find a solution to the above problem?

    I am facing the same issue. RinRuby works well in irb but fails when invoked from the Eclipse IDE.

    Kindly post your solution if you have managed to crack it.

    Regards
    Radha
