the eighteenth letter of the alphabet
the letter before S
a language and environment for statistical computing and graphics
simple procedural programming language with functions
extensible through user-submitted packages
C, C++, and Fortran code can be linked in and called
C and Java code can call into R
your Portfolio Manager procrastinated on this week’s topic
interest in combining R’s strength in data manipulation, statistics, and data visualization with big data in Hadoop
there are clues about how Google and Facebook use R
Revolution Analytics offers a product for connecting R to Hadoop
also available as R packages on GitHub
> x = 3.14 # assign 3.14 to x
> x # print the value of x
[1] 3.14
> x = 3.14
> class(x) # print the class name of x
[1] "numeric" # numeric is essentially “double”
> y = as.integer(3.14)
> y
[1] 3
> class(y)
[1] "integer"
> is.integer(y)
[1] TRUE
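A minimal sketch of how numeric and integer values differ, written as a script rather than a console session. The variable names are illustrative and not from the slides.

```r
# Sketch only: variable names are made up for illustration.
price <- 3.14              # `<-` is the conventional assignment operator; `=` also works here
count <- 3L                # the L suffix creates an integer literal

class(price)               # "numeric"
typeof(price)              # "double" (the underlying storage type)
class(count)               # "integer"

# as.integer() truncates toward zero; it does not round
down <- as.integer(3.99)   # 3
up   <- as.integer(-3.99)  # -3

# round() rounds to the nearest value, but the result is still "numeric"
r <- round(3.99)           # 4
class(r)                   # "numeric"
```

Use `as.integer(round(x))` when a rounded integer is wanted.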
> t = TRUE
> f = FALSE
> class(t)
[1] "logical"
> t
[1] TRUE
> !t
[1] FALSE
> t & f # t AND f
[1] FALSE
> t | f # t OR f
[1] TRUE
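The logical operators above also work element by element on longer vectors. A small sketch, with vector names chosen only for illustration:

```r
# Sketch only: v and w are illustrative vectors, not from the slides.
v <- c(TRUE, FALSE, TRUE)
w <- c(TRUE, TRUE, FALSE)

both   <- v & w            # element-wise AND: TRUE FALSE FALSE
either <- v | w            # element-wise OR:  TRUE TRUE TRUE
differ <- xor(v, w)        # exclusive OR:     FALSE TRUE TRUE

any(v)                     # TRUE  (at least one element is TRUE)
all(v)                     # FALSE (not every element is TRUE)
n_true <- sum(v)           # 2, because TRUE counts as 1 and FALSE as 0

# && and || compare single values and short-circuit; typical inside if()
scalar <- TRUE && FALSE    # FALSE
```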
> n = 'Nancy'
> x = as.character(3.14)
> class(x)
[1] "character"
> paste(n, x) # combine strings
[1] "Nancy 3.14"
> sprintf("%s has %.2f dollars", n, as.double(x)) # %d only accepts whole numbers; use %f for doubles
[1] "Nancy has 3.14 dollars"
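A short sketch of the common string helpers, assuming the same kind of values as above. The variable names are illustrative.

```r
# Sketch only: name and amount are illustrative values.
name   <- "Nancy"
amount <- 3.14

spaced <- paste(name, amount)               # "Nancy 3.14" (default separator is a space)
custom <- paste(name, amount, sep = ": ")   # "Nancy: 3.14"
joined <- paste0(name, amount)              # "Nancy3.14" (no separator)

# sprintf() format codes: %s string, %d integer, %.2f double with 2 decimals
money <- sprintf("%s has %.2f dollars", name, amount)   # "Nancy has 3.14 dollars"
items <- sprintf("%s has %d items", name, 3L)           # "Nancy has 3 items"

len   <- nchar(name)       # 5
upper <- toupper(name)     # "NANCY"
```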
an ordered collection of the same type
> c(1, 2, 3)
[1] 1 2 3
> c("a", "bb", "ccc", "dddd")
[1] "a" "bb" "ccc" "dddd"
> c(1, "a", TRUE)
[1] "1" "a" "TRUE"
> a = c(1, 2, 3)
> b = c(10, 20, 30, 40, 50, 60)
> a + b
[1] 11 22 33 41 52 63
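The result of a + b comes from recycling: R repeats the shorter vector until it matches the longer one. It warns when the longer length is not a multiple of the shorter. A sketch of recycling, indexing, and type coercion, using the same a and b:

```r
# Sketch only: the helper names (recycled, explicit, ...) are illustrative.
a <- c(1, 2, 3)
b <- c(10, 20, 30, 40, 50, 60)

recycled <- a + b                               # 11 22 33 41 52 63
explicit <- rep(a, length.out = length(b)) + b  # the same computation, spelled out

doubled <- a * 2        # 2 4 6 (a single value is recycled too)
second  <- b[2]         # 20 (indexing starts at 1)
large   <- b[b > 30]    # 40 50 60 (logical indexing)

# Mixed types are coerced to one common type: logical < numeric < character
class(c(1, TRUE))       # "numeric"
class(c(1, "a", TRUE))  # "character"
```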
an ordered collection of objects
> a = c(1, 2, 3)
> l = list(a, TRUE, "ccc")
> l
[[1]]
[1] 1 2 3
[[2]]
[1] TRUE
[[3]]
[1] "ccc"
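A sketch of how list elements can be read back. The element names (nums, flag, label) are assumptions added for illustration; the list above is unnamed.

```r
# Sketch only: element names are assumed, not from the slides.
a <- c(1, 2, 3)
l <- list(nums = a, flag = TRUE, label = "ccc")

first   <- l[[1]]       # the element itself: numeric vector 1 2 3
sublist <- l["nums"]    # single brackets return a list of length 1
label   <- l$label      # "ccc" ($ accesses an element by name)
inner   <- l[[1]][2]    # 2 (second value of the first element)

names(l)                # "nums" "flag" "label"
length(l)               # 3
str(l)                  # compact summary of the structure
```

Double brackets return the element; single brackets return a smaller list.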
> p = c(1.99, 2.99, 4.99)
> c = c(51, 24, 36)
> data.frame(p, c)
     p  c
1 1.99 51
2 2.99 24
3 4.99 36
> df = data.frame(price=p, count=c)
> rownames(df) = c("squash", "cucumber", "tomato")
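Once the rows and columns are named, values can be looked up by name. A sketch that rebuilds the same data frame; the total column is an illustrative addition, not from the slides.

```r
# Same data as above, rebuilt so the block runs on its own.
df <- data.frame(price = c(1.99, 2.99, 4.99), count = c(51, 24, 36))
rownames(df) <- c("squash", "cucumber", "tomato")

prices   <- df$price                # one column as a vector: 1.99 2.99 4.99
tomatoes <- df["tomato", "count"]   # one cell by row and column name: 36
pricey   <- df[df$price > 2, ]      # rows matching a condition: cucumber, tomato

# Derived column (illustrative): price times count for each row
df$total <- df$price * df$count

nrow(df)                            # 3
ncol(df)                            # 3 (price, count, total)
```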
visit Graph API Explorer to generate an Access Token
> source('facebook_mining.r')
> source('facebook_friendship1.r')
take some data and use R to derive something
frequent words (Twitter example)
relationships between items (Facebook example)
possible data sources
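A minimal sketch of the frequent-words idea in base R. The sample sentences are made up for illustration; this is not the Twitter example from the talk, and real text would need extra cleanup (punctuation, stop words).

```r
# Sketch only: made-up sample text stands in for real data.
text <- c("R is fun",
          "R is a language for statistics",
          "statistics is fun")

# Split each sentence on whitespace and normalize to lower case
words <- tolower(unlist(strsplit(text, "\\s+")))

# Count occurrences and sort, most frequent first
freq <- sort(table(words), decreasing = TRUE)

top_word <- names(freq)[1]   # "is"
head(freq, 3)                # the three most frequent words
```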