School of Mathematics, Statistics and Computer Science University of New England

R - Statistical and Graphical Software Notes
R Murison
Printed at the University of New England, June 18, 2003

Chapter 1

Introduction

These notes are a guide for getting started with R. The intention is to help you reach the stage where you can recognise the fundamentals of the program and so follow the examples and exercises in the statistics units you encounter. These notes are not intended to be a comprehensive manual for R. The computing of models is fundamental to statistics, but the computing is learnt in conjunction with the statistics. These notes merely augment the unit notes in statistics.

At first glance, the volume of new material may appear daunting. The way to learn computing is to practise. Errors are to be expected, but they provide the feedback which leads to better understanding. If you try each idea, one at a time, the big picture will soon emerge and increasing familiarity will simplify the tasks. For many, it will only be necessary to cover chapter 1 at first. The other chapters might be useful adjuncts when you encounter exercises in your statistics units.

You may have to let go of some of the obsolete notions with which you are comfortable in order to make progress in modern statistical and computing thinking, in line with the "Attributes of a UNE Graduate" (Communication Skills, Information Literacy, Problem Solving and Social Responsibility). A full description of these attributes can be found at http://www.une.edu.au/offsect/une_grad_attributes.htm

The R code for the examples can be obtained from http://mcs.une.edu.au/~rmurison/Rnotes/Rexercises.html and, if you download it, you can compute along with this guide as a self-help tutorial.

This chapter (1) gives some of the background of the package, directions for installing the program and a test run to ensure all is working. This is followed in Chapter 2 by a guide to the way R works using functions and assigning the results to an object. The program is more advanced than menu-based software which cannot handle modern statistical modelling.
The ways to source scripts from a file and to sink output to a file are explained in this chapter. Chapter 3 discusses the objects in an R program (data frames, variables, etc.) and Chapter 4 discusses how R organises its functions and objects. Chapter 5 introduces the use of common functions, Chapter 6 discusses different data types (with examples), Chapter 7 shows how to extract components of objects and Chapter 8 uses examples to illustrate the basics of plotting data. The citation for R is given at [1], resources are located at [2] and you can link with r-help at [3].
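As a small preview of the ideas listed above, the following sketch calls a function, assigns the result to an object, sources a script from a file and sinks output to a file. The file names are temporary files and the objects x and m are invented for illustration; they are not taken from the unit exercises.

```r
# Write a two-line script to a temporary file (a stand-in for your own script file)
script_file <- tempfile(fileext = ".R")
writeLines(c("x <- c(2, 4, 6, 8)", "m <- mean(x)"), script_file)

# source() runs the commands in the file; the objects x and m now exist
source(script_file)

# sink() diverts printed output to a file until sink() is called again
out_file <- tempfile(fileext = ".txt")
sink(out_file)
print(m)
sink()

# Read back what was written to the output file
readLines(out_file)
```

In your own work you would replace the temporary files with a script and an output file kept in your working folder.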

1.1

History of R

In the mid 1980s, statistical software named S was developed at AT&T in New Jersey. It was written to handle statistical modelling and designed to be extendable without modifications. Although it has expanded manyfold with extra functions and capabilities, it remains in the same form after 2 decades. S morphed into S-PLUS and became a commercial package.

In 1994, Ross Ihaka and Robert Gentleman at Auckland University wrote the first version of an S-like software package, drawing on ideas from the interpreted computer language Scheme, and named it R, continuing in the Computer Science tradition of single-letter names, e.g. C, S. They made their software freely available and this gesture captured the spirit of other software developers (Luke Tierney had developed Lisp-Stat, Martin Maechler had written Emacs Speaks Statistics), who then joined forces. R continues to grow and is now supported by leading statisticians and computer scientists world-wide. It is open source software and is freely available.

Whilst R appears similar to S or S-PLUS, it is different. Nevertheless, the book by Venables and Ripley ([4]) is an excellent reference and Dalgaard's book ([5]) is a specialist introduction to R. The web site [2] has other detailed guides that have more depth than these notes.

1.2

Why use R?

• It is free, see [2], and leading developers of statistical software are writing functions for this package. Thus competency in R means you can stay up to date with statistics.
• It covers statistical applications from the simplest to the complex and would allow you to complete all your statistical training using R. Also, it treats different topics in a consistent way, so that the programming you learn for, say, linear models is done the same way for non-linear models. This consistency is convenient but also gives an understanding of statistical modelling.
• R has a powerful suite of functions that allow you to use modern statistical methods. Modern statistics has simplified many problems through the use of graphics and computer intensive ideas.
• It has the biggest concentration of statisticians worldwide, so you have access to the best and most efficient methods.
• The apparent simplicity of software such as MINITAB is superficial, as it is limited in the analyses that it can handle.
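To illustrate the consistency mentioned in the list above, the following sketch fits a linear model with lm() and a non-linear model with nls() using the same formula-and-data style, then queries both fits with the same extractor functions. The simulated data, the object names and the starting values are assumptions made purely for illustration.

```r
set.seed(1)                                       # make the simulated data reproducible
d <- data.frame(x = 1:20)
d$y <- 2 * exp(0.1 * d$x) + rnorm(20, sd = 0.1)   # invented data, illustration only

# Linear model: a straight line in x
fit_lin <- lm(y ~ x, data = d)

# Non-linear model: an exponential curve; nls() also needs starting values
fit_nl <- nls(y ~ a * exp(b * x), data = d, start = list(a = 1.5, b = 0.12))

# The same extractor functions work on both fitted objects
coef(fit_lin)
coef(fit_nl)
summary(fit_nl)
```

Both calls take a model formula and a data frame, and both return an object that is examined with functions such as coef() and summary().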

1.3

R Resources

There are versions for Linux, Mac and Microsoft Windows. You can obtain them in one of three ways:
(i) from the web site [2], http://mirror.aarnet.edu.au/pub/CRAN;
(ii) by logging on to the UNE Maths, Stats & Comp Sc computer called turing and pointing the browser at file:///projects/CRAN/index.html; or
(iii) by buying a CD from the School of Mathematics, Statistics and Computer Science at UNE for $10. The CD contains all the software used in the school, including LaTeX, SciLab and Ghostview.

It can take about 1 hour of internet time to download at home and about 5 minutes to install. If you purchase the CD, installation is slightly different from that below because the version is older (but not noticeably). The program is downloaded from one of the above sources to a file and then installed. The following sections explain what to do for Windows or Linux.

1.4

R in the Windows Operating System

In the following sections, the tt font, such as in rw1080.exe, indicates what is in the computer or what has to be typed into the computer. Designate 2 areas on the C: drive of your computer where you will store files.
1. Where you store the R software: C:\Program Files\RHOME
2. Where you store your program files for analysing data: C:\My Documents\Rwork
The software stays separate from your working files.
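Once R is installed, the separation between the software folder and the working folder can be inspected from within R. The sketch below is only an illustration: a temporary folder stands in for the Rwork folder so that the commands run on any machine, and the object names work_dir and old_dir are invented here.

```r
# On Windows, write paths in R with forward slashes, e.g. "C:/My Documents/Rwork",
# because a single backslash inside an R string starts an escape sequence.
# Here a temporary folder stands in for Rwork so the sketch runs on any machine.
work_dir <- file.path(tempdir(), "Rwork")
dir.create(work_dir, showWarnings = FALSE)

old_dir <- setwd(work_dir)   # setwd() switches folder and returns the previous one
getwd()                      # the folder R now reads from and writes to
R.home()                     # the folder where the R software itself is installed

setwd(old_dir)               # return to the original working folder
```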

1.4.1

The RHOME folder on your computer

The first step in installation is to create a folder on your C: drive where the program is to be saved. Then you copy the R software from one of (i) the CRAN web site, (ii) the projects web page on turing or (iii) the CD ROM.
1. With Windows Explorer, click on C:\Program Files.
2. Use the File menu in the Windows Explorer task bar to make a new folder named RHOME. This folder now has the full path C:\Program Files\RHOME and the R program will be stored here.
3. Now follow the copying instructions in either section 1.4.2 or 1.4.3.

1.4.2

Copying from the CRAN or turing website

1. Use your internet browser (e.g. Internet Explorer, Netscape or Mozilla) to point to http://mirror.aarnet.edu.au/pub/CRAN, or to file:///projects/CRAN/index.html if logged on to turing. Under the heading Precompiled Binary Distributions, choose the link Windows. The next heading is R for Windows; choose the link base.
2. Next choose rw1071.exe (the version number may change with new releases, e.g. rw1072.exe or rw1080.exe). Download this to the folder C:\Program Files\RHOME on your PC. When downloading is complete, close or minimize the internet browser.
3. Go to section 1.4.4 to install the program.

1.4.3

Copying from CD

1. Start Windows Explorer and choose the CD drive, say D:.
2. Change directory to the CD drive (Mscs ms(D:)) and choose the directory (folder) 'R'.
3. In this directory 'R', move to the windows directory, and then choose the base directory.
4. In the base directory, there is a SetupR.exe icon. The tree diagram of Windows Explorer resembles Figure 1.1. Copy SetupR.exe by drag-and-drop to C:\Program Files\RHOME.
5. Go to section 1.4.4 to install the program.

Figure 1.1: Files on the CD ROM with Windows Explorer

1.4.4

Installation

The file SetupR.exe in earlier versions is equivalent to rw1071.exe. Note that other information is available in the downloadable files README.rw1071 and CHANGES.
1. In Windows Explorer, open C:\Program Files\RHOME to see the icon rw1071.exe (or SetupR.exe). Double click on this icon to install.
2. Follow the instructions in the wizard dialog boxes.
3. Upon completion, a blue R icon will appear on the desktop.

1.4.5

A test run with R in Windows

Purely interactive

Double click the R icon on the Desktop and the R Console will open. Wait while the program loads. At the R prompt (>) (in the R console window), type : x
