Jun 30, 2014 - kinds of financial time series data in R. Base R has limited functionality for handling general time seri
Working with Financial Time Series
) [1] "1970-01-01"
> as.Date("January 1, 1970", format="%B %d, %Y")
[1] "1970-01-01"
> as.Date("01JAN70", format="%d%b%y")
[1] "1970-01-01"
2 3
Spector (2004) gives an excellent overview of the chron, Date, and POSIXt classes in R. Some might say “ripped off” from.
Notice that the output format is always of the form “YYYY-mm-dd” regardless of the input format. To change the displayed output format of a date, use the format() function
> format(my.date, "%b %d, %Y")
[1] "Jan 01, 1970"
Some date formats provide insufficient information to be unambiguously represented as a Date object. For example,
> as.Date("Jan 1970", format="%b %Y")
[1] NA
Table 2 below gives the standard date format codes.
Code  Value                              Example
%d    Day of the month (decimal number)  23
%m    Month (decimal number)             11
%b    Month (abbreviated)                Jan
%B    Month (full name)                  January
%y    Year (2 digit)                     90
%Y    Year (4 digit)                     1990
Table 2. Format codes for dates
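The NA returned for "Jan 1970" arises because the string carries no day of the month. The following is a minimal sketch of one workaround, assuming it is acceptable to treat such strings as the first day of the month (an assumption for illustration, not a rule from the text). The names monthStr and firstOfMonth are hypothetical, and %b assumes English month abbreviations in the current locale.

```r
# Hypothetical input: month-year strings with no day component
monthStr <- c("Jan 1970", "Feb 1970")

# Parsing directly fails because the day is missing
as.Date(monthStr, format = "%b %Y")

# Assumption: treat each string as the first day of its month
firstOfMonth <- as.Date(paste("01", monthStr), format = "%d %b %Y")
firstOfMonth

# Format codes from Table 2 applied on output
format(firstOfMonth, "%d/%m/%y")
```

The same format codes serve both directions: as.Date() uses them to read a string and format() uses them to write one.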
Recall, dates are internally recorded as the (integer) number of days since 1970‐01‐01. As a result, you can also create a Date object from integer , length.out=61)
The seq() function can also be used to determine the date that is a specified number of days, weeks, months or years from a given date. For example, to find the date that is 5 months away from today’s date use
> Sys.Date()
[1] "2014-01-10"
> seq(from=Sys.Date(), by="5 months", length.out=2)[2]
[1] "2014-06-10"
While the above is a clever solution, it is not very intuitive. The lubridate package, described later on, provides a much easier solution.
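As a small base R sketch of the two ideas above (the internal day count and the seq() trick), the code below uses a fixed start date instead of Sys.Date() so that the result is reproducible. The variable names are hypothetical.

```r
# Fixed start date (the text uses Sys.Date(), which changes every day)
startDate <- as.Date("2014-01-10")

# Internal representation: number of days since 1970-01-01
dayCount <- as.numeric(startDate)

# Rebuild the same Date from its day count
rebuilt <- as.Date(dayCount, origin = "1970-01-01")

# Date five months ahead, using the seq() idiom from the text
fiveMonthsOn <- seq(from = startDate, by = "5 months", length.out = 2)[2]
fiveMonthsOn

# Subtracting Dates gives a difference in days
daysBetween <- as.numeric(fiveMonthsOn - startDate)
daysBetween
```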
Plotting Date Objects
Given a , freq=TRUE,
+ main="Distribution of Dates by Month",
+ col="slateblue1", xlab="",
+ format="%b %Y", las=2)
The resulting histogram is shown in Figure 1.
Figure 1 Histogram of Date Objects
The POSIXt classes (base R)
The POSIXt classes in R are derived from the POSIX system. There are two POSIXt sub‐classes available in R: POSIXct and POSIXlt. The POSIXct class represents date‐time values as the signed number of seconds (which includes fractional seconds) since midnight GMT (UTC – universal time, coordinated) 1970‐01‐01. This is analogous to the Date class with the addition of times during the day. The POSIXlt class represents date‐time values as a named list with elements for the second (sec), minute (min), hour (hour), day of the month (mday), month (mon), year (year), day of the week (wday), day of the year (yday), and daylight saving time flag (isdst), respectively.
Creating POSIXct Objects
You can create POSIXct objects from a character string representation of a date‐time using the as.POSIXct() function. The default format of the date‐time is “YYYY-mm-dd hh:mm:ss” or “YYYY/mm/dd hh:mm:ss” with the hour, minute and second information being optional.
> myDateTimeStr = "2013-12-19 10:17:07"
> myPOSIXct = as.POSIXct(myDateTimeStr)
> myPOSIXct
[1] "2013-12-19 10:17:07 PST"
> class(myPOSIXct)
[1] "POSIXct" "POSIXt"
> as.numeric(myPOSIXct)
[1] 1.387e+09
If no time zone specification is given in the optional argument tz, then the default value tz="" specifies the local system time zone as given by the Sys.timezone() function
> Sys.timezone()
[1] "PST"
The time zone specification is an attribute of the POSIXct object
> attributes(myPOSIXct)
$class
[1] "POSIXct" "POSIXt"
$tzone
[1] ""
Use the optional format argument if the date‐time string is not in the default format
> myDateTimeStr1 = "19-12-2003 10:17:07"
> myPOSIXct1 = as.POSIXct(myDateTimeStr1, format="%d-%m-%Y %H:%M:%S")
> myPOSIXct1
[1] "2003-12-19 10:17:07 PST"
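The examples above depend on the local system time zone. As a hedged sketch, the same string can be parsed with an explicit tz argument so that the result is the same on any machine. The choice of "UTC" and the name myPOSIXctUTC are assumptions for illustration.

```r
# Same non-default date-time string as in the text
myDateTimeStr1 <- "19-12-2003 10:17:07"

# Assumption: interpret the string as UTC rather than local time
myPOSIXctUTC <- as.POSIXct(myDateTimeStr1,
                           format = "%d-%m-%Y %H:%M:%S",
                           tz = "UTC")

# With an explicit tz, the tzone attribute is no longer the empty string
attr(myPOSIXctUTC, "tzone")

# Display in the default layout
format(myPOSIXctUTC, "%Y-%m-%d %H:%M:%S")
```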
The most common set of format codes for representing character dates under the POSIX standard is listed in Table xxx. These codes, and others, are explained in the help file for the function strptime().
Code  Description                                  Example
%a    Abbreviated weekday                          Mon
%A    Full weekday                                 Monday
%b    Abbreviated month                            Jan
%B    Full month                                   January
%c    Locale specific date and time
%d    Decimal day of month                         01
%H    Decimal hours (24)                           16
%I    Decimal hours (12)
%j    Decimal day of year                          234
%m    Decimal month
%M    Decimal minute                               12
%p    AM/PM indicator
%S    Decimal second                               35
%U    Decimal week of year (starting on Sunday)    08
%w    Decimal weekday                              1
%W    Decimal week of year (starting on Monday)    07
%x    Locale specific date
%X    Locale specific time
%y    2‐digit year                                 91
%Y    4‐digit year                                 1991
%z    Signed offset from UTC in hours and minutes
%Z    Abbreviated time‐zone name                   PST
Because POSIXct objects have an internal representation as the number of seconds from some origin date‐time, you can also create them from numeric )
> myPOSIXct2
[1] "1969-12-31 16:00:00 PST"
> as.numeric(myPOSIXct2)
[1] 0
Because PST (Pacific Standard Time) is 8 hours earlier than GMT/UTC, the date‐time is displayed as 1969‐12‐31 16:00:00 PST and not 1970‐01‐01 UTC. Although the numeric representation is still 0 (because POSIXct objects are defined as the number of seconds from 1970‐01‐01 UTC), the time zone specification affects how the date‐time is displayed and how numeric calculations with POSIXct objects are evaluated. For example, consider what happens if I add 8 hours to myPOSIXct2
> myPOSIXct3 = myPOSIXct2 + 8*60*60
> myPOSIXct3
[1] "1970-01-01 PST"
> as.numeric(myPOSIXct3)
[1] 28800
In many situations it is best to define date‐times in GMT (UTC) to avoid time zone complications when manipulating date‐times
> myPOSIXct4 = as.POSIXct(0, origin="1970-01-01", tz="UTC")
> myPOSIXct4
[1] "1970-01-01 UTC"
> as.numeric(myPOSIXct4)
[1] 0
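To make the point about display versus internal value concrete, here is a minimal sketch in which the same instant is created twice with different time zones. The zone name "America/Los_Angeles" is an assumption standing in for the PST zone shown in the text, and the variable names are hypothetical.

```r
# The epoch instant, defined explicitly in UTC
epochUTC <- as.POSIXct(0, origin = "1970-01-01", tz = "UTC")

# The same instant written as local clock time on the US west coast
# (assumption: "America/Los_Angeles" stands in for PST in the text)
epochLA <- as.POSIXct("1969-12-31 16:00:00", tz = "America/Los_Angeles")

# Both objects store the same number of seconds
as.numeric(epochUTC)
as.numeric(epochLA)

# Comparison works on the stored seconds, not on the displayed text
sameInstant <- epochUTC == epochLA

# Only the display changes with the time zone
shownInLA <- format(epochUTC, format = "%Y-%m-%d %H:%M:%S",
                    tz = "America/Los_Angeles")
shownInLA
```

Because comparisons and differences use the stored seconds, mixing zones is safe for arithmetic. It is the printed clock time that can mislead.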
You can use Sys.setenv(TZ="UTC") to set the system time zone to GMT (UTC) so that it becomes the default time zone when calling as.POSIXct().
You can also create a POSIXct object directly from numeric )
[1] "Dec 19, 2013"
This provides a handy way of extracting any component of a POSIXct object. For example, to extract the full month name, time zone abbreviation, numeric year value, and numeric second value, use
> format(myPOSIXct, format="%B")
[1] "December"
> format(myPOSIXct, format="%Z")
[1] "PST"
> as.numeric(format(myPOSIXct, format="%Y"))
[1] 2013
> as.numeric(format(myPOSIXct, format="%S"))
[1] 7
As with Date objects, you can also use the weekdays(), months(), quarters() and julian() functions on POSIXct objects. As explained in the next sub‐section, another way to extract components from a POSIXct object is to convert it to a POSIXlt object and then extract the desired list component.
4 You can also use the related ISOdate() function, which sets hour=12, min=0, sec=0, and tz="GMT" by default.
The format() function also allows you to see date‐times in different time zones
> myPOSIXct4
[1] "1970-01-01 UTC"
> format(myPOSIXct4, tz="")
[1] "1969-12-31 16:00:00"
> format(myPOSIXct4, tz="EST")
[1] "1969-12-31 19:00:00"
Creating POSIXlt Objects
You can create POSIXlt objects using the as.POSIXlt() or strptime() functions (the strptime() function is a C level function)
> myDateTimeStr
[1] "2013-12-19 10:17:07"
> myPOSIXlt = as.POSIXlt(myDateTimeStr)
> myPOSIXlt
[1] "2013-12-19 10:17:07"
> class(myPOSIXlt)
[1] "POSIXlt" "POSIXt"
If the input date‐time string is not in the default format, use the optional format argument together with the appropriate format codes from Table 2
> myDateTimeStr1 = "19-12-2003 10:17:07"
> myPOSIXlt1 = as.POSIXlt(myDateTimeStr1, format="%d-%m-%Y %H:%M:%S")
Although POSIXlt objects are lists with named components, the component names are annoyingly hidden.
> names(myPOSIXlt)
NULL
To see them use the unclass() function
> names(unclass(myPOSIXlt))
[1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"
[9] "isdst"
You can extract any of the above list components
> myPOSIXlt$sec
[1] 7
> myPOSIXlt$hour
[1] 10
> myPOSIXlt$mday
[1] 19
> myPOSIXlt$mon
[1] 11
> myPOSIXlt$year
[1] 113
> myPOSIXlt$wday
[1] 4
> myPOSIXlt$yday
[1] 352
> myPOSIXlt$isdst
[1] 0
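The extracted values above are easy to misread: in a POSIXlt object mon counts from 0, year counts from 1900, wday counts from 0 starting on Sunday, and yday counts from 0. The sketch below converts them to ordinary calendar values. The explicit tz="UTC" and the variable names are assumptions made so that the example does not depend on the local system.

```r
# Same date-time as in the text, with an explicit UTC zone
lt <- as.POSIXlt("2013-12-19 10:17:07", tz = "UTC")

# mon is 0-based and year counts from 1900
calendarMonth <- lt$mon + 1
calendarYear  <- lt$year + 1900

# wday is 0-based starting on Sunday
dayNames <- c("Sunday", "Monday", "Tuesday", "Wednesday",
              "Thursday", "Friday", "Saturday")
weekdayName <- dayNames[lt$wday + 1]

# yday is 0-based, so add 1 for the usual day-of-year count
dayOfYear <- lt$yday + 1

c(calendarYear, calendarMonth, dayOfYear)
weekdayName
```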
Converting POSIXct Objects to POSIXlt Objects and Vice‐Versa
You can convert a POSIXct object to a POSIXlt object and vice‐versa using the as.POSIXlt() and as.POSIXct() functions, respectively
> myPOSIXct
[1] "2013-12-19 10:17:07 PST"
> class(myPOSIXct)
[1] "POSIXct" "POSIXt"
> myPOSIXlt = as.POSIXlt(myPOSIXct)
> class(myPOSIXlt)
[1] "POSIXlt" "POSIXt"
One reason for converting a POSIXct object to a POSIXlt object is to extract certain components of the date‐time. For example, to get the numeric value for the seconds of myPOSIXct use
> as.POSIXlt(myPOSIXct)$sec
[1] 7
Converting POSIXt Objects to Date Objects and Vice‐Versa
Coercing to Date removes within day time information as well as time zone information.
Coercing a Date to POSIXt imposes a time zone.
You can convert a POSIXt object to a Date object using the as.Date() function
> myPOSIXct
[1] "2013-12-19 10:17:07 PST"
> myDate = as.Date(myPOSIXct)
> myDate
[1] "2013-12-19"
> class(myDate)
[1] "Date"
Doing so removes the within day time and time zone information. Similarly, you can convert a Date object to a POSIXt object using the as.POSIXct() or as.POSIXlt() functions. The Date is interpreted as midnight UTC, so in the PST time zone it is displayed as 16:00 on the previous day
> myPOSIXct = as.POSIXct(myDate)
> myPOSIXct
[1] "2013-12-18 16:00:00 PST"
> class(myPOSIXct)
[1] "POSIXct" "POSIXt"
To set specific time zones, you must first convert the Date object to a POSIXlt object and then to a POSIXct object
> myPOSIXct = as.POSIXct(myDate, tz="GMT")
> myPOSIXct
[1] "2013-12-18 16:00:00 PST"
> myPOSIXlt = as.POSIXlt(myDate, tz="GMT")
> myPOSIXlt
[1] "2013-12-19 UTC"
> myPOSIXct = as.POSIXct(myPOSIXlt)
> myPOSIXct
[1] "2013-12-19 UTC"
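The two‐step conversion can be checked numerically. The sketch below follows the Date to POSIXlt to POSIXct route from the text and compares it with a second route through the character representation, which is an alternative of my own and not part of the text. Variable names are hypothetical.

```r
myDate <- as.Date("2013-12-19")

# Route 1 (from the text): Date -> POSIXlt in GMT -> POSIXct
viaLt <- as.POSIXct(as.POSIXlt(myDate, tz = "GMT"))

# Route 2 (an alternative, not from the text): Date -> character -> POSIXct
viaChar <- as.POSIXct(format(myDate), tz = "UTC")

# Both give midnight UTC on the same day
format(viaLt, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# The POSIXct seconds are exactly 86400 times the Date day count
as.numeric(viaLt) == 86400 * as.numeric(myDate)
```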
POSIXt Objects and Ultra High Frequency )
> head(dateSeq5sec)
[1] "2013-12-23 09:30:00 PST" "2013-12-23 09:30:05 PST"
[3] "2013-12-23 09:30:10 PST" "2013-12-23 09:30:15 PST"
[5] "2013-12-23 09:30:20 PST" "2013-12-23 09:30:25 PST"
> tail(dateSeq5sec)
[1] "2013-12-23 15:59:35 PST" "2013-12-23 15:59:40 PST"
[3] "2013-12-23 15:59:45 PST" "2013-12-23 15:59:50 PST"
[5] "2013-12-23 15:59:55 PST" "2013-12-23 16:00:00 PST"
> length(dateSeq5sec)
[1] 4681
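A sequence like the one printed above can be built with seq() on POSIXct objects. This is a sketch under stated assumptions: the start and end times are read off the head() and tail() output, UTC replaces the local PST zone so the result is reproducible, and the call is my reconstruction rather than the original command.

```r
# Assumption: UTC is used instead of the local PST zone shown in the text
startTime <- as.POSIXct("2013-12-23 09:30:00", tz = "UTC")
endTime   <- as.POSIXct("2013-12-23 16:00:00", tz = "UTC")

# One time stamp every 5 seconds from 09:30:00 to 16:00:00
dateSeq5sec <- seq(from = startTime, to = endTime, by = "5 sec")

head(dateSeq5sec, 3)

# 6.5 hours * 3600 seconds / 5 seconds, plus 1 for the starting stamp
length(dateSeq5sec)
```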
The yearmon class (Package zoo)
Use the yearmon class to represent regularly spaced monthly dates. This class is particularly useful for representing date information associated with monthly economic and financial time series.
The yearqtr class (Package zoo)
Use the yearqtr class to represent regularly spaced quarterly dates. This class is useful for representing date information associated with quarterly economic time series.
Working with Dates and Times Using the lubridate Package
The functions in the lubridate package (available on CRAN), created by Garrett Grolemund and Hadley Wickham, make working with dates and times in R a little easier.5 The functions in lubridate help users (1) identify and parse date‐time )
[1] "2013-12-19 10:17:07 PST"
The above functions also work with numeric inputs. For example,
> ymd(20131219)
[1] "2013-12-19 UTC"
The current date‐time can be captured with now(), and the current date with today().
Setting and Extracting Information
Table xx lists the lubridate functions for extracting and setting information from a date‐time object (Date or POSIXt object).
Date Component   Extractor Function
Year             year()
Month            month()
Week             week()
Day of year      yday()
Day of month     mday()
Day of week      wday()
Hour             hour()
Minute           minute()
Second           second()
Time zone        tz()
For example,
> myDateTime = ymd_hms("2013 Dec 19 10:17:07")
> myDateTime
[1] "2013-12-19 10:17:07 UTC"
> year(myDateTime)
[1] 2013
> month(myDateTime)
[1] 12
> week(myDateTime)
[1] 51
> yday(myDateTime)
[1] 353
> mday(myDateTime)
[1] 19
> wday(myDateTime)
[1] 5
> hour(myDateTime)
[1] 10
> minute(myDateTime)
[1] 17
> second(myDateTime)
[1] 7
> tz(myDateTime)
[1] "UTC"
> wday(myDateTime, label=TRUE)
[1] Thurs
Levels: Sun < Mon < Tues < Wed < Thurs < Fri < Sat
> month(myDateTime, label=TRUE)
[1] Dec
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < ... < Dec
The extractor functions can also be used to set elements of a date‐time to particular values
> mday(myDateTime) = 20
> myDateTime
[1] "2013-12-20 10:17:07 UTC"
You can modify multiple components of a date‐time object using the update() function
> update(myDateTime, year=2014, month=1,
+ day=1, hour=5, min=0, sec=0)
[1] "2014-01-01 05:00:00 UTC"
Performing Calculations with Date‐Times and Timespans
Handling Time Zones and Daylight Savings Time
The timeDate class (Packages SplusTimeDate and timeDate)
To be completed.
Time Series Objects in R
Representing Regularly Spaced , lwd=2, ylab="Adjusted close",
+ main="Monthly closing price of SBUX")
which produces the plot in Figure 2. To plot a subset of the ,col="blue", lwd=2,
+ main="Monthly closing price of SBUX")
[Plot: "Monthly closing price of SBUX"; x-axis: Time; y-axis: Adjusted close]
Figure 2 Plot created with plot.ts()
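Since the SBUX data are not reproduced here, the following sketch uses simulated prices to show how a monthly ts object can be built, subset, and turned into returns. Everything in it is an assumption for illustration: the simulated series, the names fake.ts, fake1994.ts and fakeRet.ts, and the use of window() for subsetting.

```r
# Hypothetical prices standing in for the SBUX data
set.seed(123)
fakePrices <- 10 * exp(cumsum(rnorm(60, mean = 0.01, sd = 0.05)))

# Monthly ts object starting in March 1993
fake.ts <- ts(fakePrices, start = c(1993, 3), frequency = 12)

# Subset to calendar year 1994 with window()
fake1994.ts <- window(fake.ts, start = c(1994, 1), end = c(1994, 12))

# Continuously compounded monthly returns: differences of log prices
fakeRet.ts <- diff(log(fake.ts))

start(fake1994.ts)
length(fakeRet.ts)
```

A subset created this way keeps its time attributes, so passing it to plot() labels the time axis with the correct dates.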
For ts objects with multiple columns (mts objects), two types of plots can be created. The first type, illustrated in Figure 3, puts each series in a separate panel
> plot(sbuxmsft.ts)
[Plot: "sbuxmsft.ts"; panels: sbux.ts, msft.ts; x-axis: Time]
Figure 3 Multiple time series plot
The second type, shown in Figure 4, puts all series on the same plot
> plot(sbuxmsft.ts, plot.type="single",
+ main="Monthly closing prices on SBUX and MSFT",
+ ylab="Adjusted close price",
+ col=c("blue", "red"), lty=1:2)
> legend(1995, 45, legend=c("SBUX","MSFT"), col=c("blue", "red"),
+ lty=1:2)
[Plot: "Monthly closing prices on SBUX and MSFT"; legend: SBUX, MSFT; x-axis: Time; y-axis: Adjusted close price]
Figure 4 Multiple time series plot
Manipulating ts objects and computing returns
Some common manipulations of time series )
> head(td2)
[1] "1993-03-31" "1993-04-01" "1993-05-03" "1993-06-01" "1993-07-01"
[6] "1993-08-02"
Now that we have a time index, we can create the zoo object by combining the time index with numeric , lty=1, lwd=2, ylim=c(0,50))
> lines(msft.z, col="red", lty=2, lwd=2)
> legend(x="topleft", legend=c("SBUX","MSFT"), col=c("blue","red"), lty=1:2)
> # plot multiple series at once
> plot(sbuxmsft.z, plot.type="single", col=c("blue","red"), lty=1:2,
+ lwd=2)
> legend(x="topleft", legend=c("SBUX","MSFT"), col=c("blue","red"),
+ lty=1:2)
[Plot: sbuxmsft.z; legend: SBUX, MSFT; x-axis: Index]
Manipulating zoo objects
To be completed
There are several useful functions for manipulating zoo objects
Converting a ts object to a zoo object
To be completed.
Importing , sep=",", header=T)
> # convert index to yearmon
> index(sbux.z2) = as.yearmon(index(sbux.z2))
> head(sbux.z2)
Mar 1993 Apr 1993 May 1993 Jun 1993 Jul 1993 Aug 1993
    1.19     1.21     1.50     1.53     1.48     1.52
Representing General Time Series with xts Objects
Importing , start="1993-03-01",
+ end="2008-03-01", quote="AdjClose",
+ provider="yahoo", origin="1970-01-01",
+ compression="d", ret)
trying URL 'http://chart.yahoo.com/table.csv?s=sbux&a=2&b=01&c=1993&d=2&e=01&f=2008&g=d&q=q&y=0&z=sbux&x=.csv'
Content type 'text/csv' length unknown
opened URL
downloaded 179 Kb
time series ends   2008-02-29
The optional argument origin="1970-01-01" sets the origin date for the internal numeric representation of the date index, and the argument compression="d" indicates that daily )
Trying to create an xy‐plot with the dates on the x‐axis creates an error:
> plot(sbux.df$Date, sbux.df$Adj.Close, type="l")
Adding Dates as rownames to a data.frame object
To be completed
1. Create data.frame with all numeric data and with character dates as rownames
2. Certain advantages
   a. Allows easy conversion to zoo and xts objects
   b. Can subset on character dates
   c. Plotting time series data does become a bit easier
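The outline above can be sketched in base R as follows. The prices, dates, and the names px.df, px.mat and dateIndex are all hypothetical, chosen only to illustrate a data.frame with numeric columns and character dates as rownames.

```r
# Hypothetical prices; the character dates become the rownames
px.df <- data.frame(SBUX = c(1.19, 1.21, 1.50),
                    MSFT = c(2.43, 2.25, 2.44),
                    row.names = c("1993-03-31", "1993-04-30", "1993-05-31"))

# Subset a row by its character date
aprilRow <- px.df["1993-04-30", ]
aprilRow

# All columns are numeric, so conversion to a matrix is direct
px.mat <- as.matrix(px.df)

# Recover a Date index from the rownames when one is needed,
# for example as the x variable in a plot
dateIndex <- as.Date(rownames(px.df))
class(dateIndex)
```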
Importing Excel Data into R
Core R does not have functions for reading data directly from Excel spreadsheets. Two packages, RODBC and xlsReadWrite, have functions that can be used to read data directly from Excel files. The xlsReadWrite package is easier to use but, unfortunately, it has been removed from CRAN due to GPL licensing issues8.
xlsReadWrite
The package xlsReadWrite contains the function read.xls() for reading data from Excel spreadsheets, and the function write.xls() for writing data to Excel spreadsheets.
RODBC
The package RODBC contains functions for communicating with ODBC databases, and Excel can be treated as a database.
8 I have posted the xlsReadWrite package on the class R Hints page.