An Introduction to R
A. Dhandapani
Outline
What is R?
Installation
Take a Look
Data Manipulation
Some Statistical Analysis
Web Interface
Pros and Cons
Further Details
What is R?
R is a statistical computing environment
R is open source software, i.e. free
Initial developers: Ross Ihaka & Robert Gentleman
Developed by scores of volunteers
Complete source code is also available
What is R? (Continued)
Available at www.r-project.org
R is closely based on the S and S-Plus languages
Ideally suited for statistical computations such as carrying out simulation studies
Ability to write functions
Installation
Download the latest R release
Go to http://cran.r-project.org/mirrors.html
Select any mirror site
Navigate to bin/windows/base
Download R-2.6.0-win32.exe (~30 MB)
For packages, go to bin/windows/contrib, download the packages (in zip format) and install them using "Install packages from local zip files" in the R menu
Take a look
Data Manipulation
Interaction is through the command line
Commands are typed at the > prompt
A command may span multiple lines; R shows + as the continuation prompt
ENTER sends the command to the interpreter
Eg.
> 4+4
[1] 8
Data types
Basic storage:
> p <- 2
> p                    #Print contents of p
[1] 2                  #Content of p
> strName <- "IASRI"
> strName
[1] "IASRI"
NOTE: R is cAsE sensitive; strName & strname are different
Vectors
The easiest way to assign a vector is using the function "c"
Eg.
> j <- c(1, 1, 1, 1)
> j
[1] 1 1 1 1
> r <- ...
> r
[1] 1 1 1 1 1
> strvec <- c("first", "second", "third")
> strvec
[1] "first"  "second" "third"
To get help on any function, type
> help(seq)
Function c
Function c can be used in many ways
> a <- c(1:3, 2:1)
> a
[1] 1 2 3 2 1
":" does the trick: 1:3 expands to 1 2 3 and 2:1 to 2 1
Function c is an abbreviated form of concatenate (compare cat in UNIX)
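A short sketch of the same idea, for trying at the prompt. The names ones and steps are illustrative only and do not come from the slides; rep() and seq() are base R helpers that pair naturally with c.

```r
# Combine two ranges into one vector with c()
a <- c(1:3, 2:1)
print(a)

# rep() repeats a value; seq() builds a regular sequence
ones  <- rep(1, 5)             # five copies of 1
steps <- seq(2, 10, by = 2)    # from 2 to 10 in steps of 2
print(ones)
print(steps)

# c() also joins existing vectors end to end
joined <- c(ones, steps)
print(length(joined))
```

help(rep) and help(seq) describe the remaining arguments.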
Matrices
Matrices can be created in many ways in R
> A = matrix(c(1,4,3,2,1,2),nrow=3,ncol=2,
+ byrow=FALSE)
> A
     [,1] [,2]
[1,]    1    2
[2,]    4    1
[3,]    3    2
Matrix
Another way (easier?) to create it
> H2 = rbind(c(1,1),c(1,-1))
> H2        #print H2
     [,1] [,2]
[1,]    1    1
[2,]    1   -1
rbind: row bind
cbind: column bind
Matrix Multiplication
> A = matrix(c(1,4,3,2,1,2),nrow=3,ncol=2,
+ byrow=FALSE)
> B = matrix(c(1,0,0,1),nrow=2,ncol=2)
> C = A %*% B
> C
     [,1] [,2]
[1,]    1    2
[2,]    4    1
[3,]    3    2
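The product above returns A unchanged because B is the 2 x 2 identity. A minimal sketch contrasting %*% with the element-wise * operator; the names E and G are illustrative and not from the slides.

```r
A <- matrix(c(1, 4, 3, 2, 1, 2), nrow = 3, ncol = 2, byrow = FALSE)
B <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2)   # 2 x 2 identity

# Matrix product: (3 x 2) %*% (2 x 2) gives a 3 x 2 result
C <- A %*% B
print(dim(C))
print(all(C == A))   # multiplying by the identity leaves A unchanged

# Element-wise product: each entry is multiplied by the matching entry
E <- A * A
print(E)

# t() transposes, so t(A) %*% A is a 2 x 2 matrix
G <- t(A) %*% A
print(G)
```

%*% requires the number of columns of the left matrix to equal the number of rows of the right one; * requires both matrices to have the same shape.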
Some more matrices
Run matrix.R
sink(filename) sends the output directly to a file, and sink() resumes normal output
The Kronecker product is computed by the %x% operator
The inverse of A is obtained by solving the equation Ax = B, where B is an identity matrix
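The contents of matrix.R are not shown in these slides. The following is only a sketch of the three operations named above, using a hypothetical 2 x 2 matrix and a temporary file name chosen for illustration.

```r
# Hypothetical invertible matrix (not taken from matrix.R)
A  <- matrix(c(2, 1, 1, 3), nrow = 2)
I2 <- diag(2)                      # 2 x 2 identity matrix

# Inverse: solve A %*% X = I2 for X
A_inv <- solve(A, I2)
print(A_inv %*% A)                 # identity, up to rounding error

# Kronecker product of I2 and A: a 4 x 4 block matrix
K <- I2 %x% A
print(dim(K))

# Redirect printed output to a file, then resume normal output
out_file <- tempfile(fileext = ".txt")
sink(out_file)
print(K)
sink()
print(readLines(out_file))
```

Calling solve(A) with a single argument gives the same inverse.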
factor
R behaves differently when you make a vector a factor.
Eg.
> x <- ...
> x
[1] 1 1 1 2 2 2 3 3 3
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       1       2       2       3       3
factor (contd)
> xfactor <- ...
> xfactor
[1] 1 1 1 2 2 2 3 3 3
Levels: 1 2 3
> summary(xfactor)
1 2 3
3 3 3        (frequencies of each level)
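A runnable sketch of the comparison on these two slides. The way x is built here (with c) and the use of factor() for the conversion are assumptions; the slides show only the printed values.

```r
# Assumed construction of x; the slides show only its printed values
x <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# As a numeric vector, summary() reports quartiles and the mean
print(summary(x))

# As a factor, the same values are treated as category labels
xfactor <- factor(x)
print(levels(xfactor))
print(summary(xfactor))   # counts per level instead of quartiles

# table() gives the same frequencies
print(table(xfactor))
```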
factor
A factor can be used to group the values of another variable with tapply.
Eg. Run factor.R
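factor.R itself is not reproduced in the slides. As an illustration of tapply only, here is a sketch with made-up data; treatment and yield are hypothetical names.

```r
# Hypothetical data: three treatments, three observations each
treatment <- factor(c("A", "A", "A", "B", "B", "B", "C", "C", "C"))
yield     <- c(10, 12, 14, 20, 22, 24, 30, 31, 32)

# Apply mean() to yield separately within each level of treatment
group_means <- tapply(yield, treatment, mean)
print(group_means)

# Any function of a vector can be used, e.g. the group sizes
group_sizes <- tapply(yield, treatment, length)
print(group_sizes)
```

The result is a named vector with one entry per factor level.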
Data type - Lists
An ordered collection of objects
Think of a list as a specialized vector whose components can be of different types
Eg.
> Lst <- ...
> Lst[1]                  #First element of the list
$name
[1] "Fred"
> Lst$wife                #Access by name
[1] "Mary"
> Lst$child.ages[1]       #Yet another way
[1] 4
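The definition of Lst is not visible on the slide. The sketch below assumes a list in the style of the example in the "An Introduction to R" manual; the no.children component and the ages after the first are assumptions made so that the code runs.

```r
# Assumed definition: only name, wife and child.ages[1] appear on the slide
Lst <- list(name = "Fred", wife = "Mary",
            no.children = 3, child.ages = c(4, 7, 9))

# Single brackets return a sub-list that still carries the name
print(Lst[1])

# Double brackets return the component itself
print(Lst[[1]])

# Components can be accessed by name in two ways
print(Lst$wife)
print(Lst[["wife"]])

# A component that is a vector can be indexed further
print(Lst$child.ages[1])
```

The difference between Lst[1] (a list) and Lst[[1]] (its content) is a common source of confusion.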
Data type - time series
Data can be given extra information
> month_exp =
+ c(1200,2100,2000,4000,2000,2140)
> tsmonth_exp = ts(month_exp,start=
+ c(2002,9),frequency=12)
> tsmonth_exp
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
2002                                         1200 2100 2000 4000
2003 2000 2140
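The extra information stored by ts() can be read back and used for subsetting. A small sketch using the same data; y2002 is an illustrative name.

```r
month_exp   <- c(1200, 2100, 2000, 4000, 2000, 2140)
tsmonth_exp <- ts(month_exp, start = c(2002, 9), frequency = 12)

# The series remembers its start, end and frequency
print(start(tsmonth_exp))       # year and month of the first value
print(end(tsmonth_exp))         # year and month of the last value
print(frequency(tsmonth_exp))   # 12 observations per year

# window() extracts a sub-period by date rather than by position
y2002 <- window(tsmonth_exp, end = c(2002, 12))
print(y2002)
```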
Data type - frames
The most commonly used data type
Data is arranged in a rectangular layout, with columns identifying variables
Eg.
> desig = c("Principal Scientist","Senior Scientist","Scientist-SS","Scientist")
> basicpay = c(16400,12000,10000,8000)
> salary_structure = data.frame(designation=desig,basic=basicpay)
Data type - frames
> salary_structure
          designation basic
1 Principal Scientist 16400
2    Senior Scientist 12000
3        Scientist-SS 10000
4           Scientist  8000
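Once the data frame exists, columns and rows can be picked out in several ways. A sketch using the same data; the 12000 cut-off and the name senior are arbitrary choices for illustration.

```r
desig    <- c("Principal Scientist", "Senior Scientist", "Scientist-SS", "Scientist")
basicpay <- c(16400, 12000, 10000, 8000)
salary_structure <- data.frame(designation = desig, basic = basicpay)

# A column is accessed by name with $
print(salary_structure$basic)
print(mean(salary_structure$basic))

# Rows and columns are accessed with [row, column]
print(salary_structure[2, ])

# A logical condition selects the matching rows
senior <- salary_structure[salary_structure$basic >= 12000, ]
print(senior)

# str() summarises the structure: dimensions and column types
str(salary_structure)
```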
Reading from Files
Data can be read easily from files
>egframe