<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>statalgo &#187; Time Series</title>
	<atom:link href="http://www.statalgo.com/category/time-series/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.statalgo.com</link>
	<description>Computational Statistics, Machine Learning, et al.</description>
	<lastBuildDate>Sat, 19 Nov 2011 17:34:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Time Series in R</title>
		<link>http://www.statalgo.com/2010/05/08/time-series-in-r/</link>
		<comments>http://www.statalgo.com/2010/05/08/time-series-in-r/#comments</comments>
		<pubDate>Sat, 08 May 2010 20:25:45 +0000</pubDate>
		<dc:creator>Shane</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Time Series]]></category>
		<category><![CDATA[fts]]></category>
		<category><![CDATA[its]]></category>
		<category><![CDATA[timeSeries]]></category>
		<category><![CDATA[ts]]></category>
		<category><![CDATA[xts]]></category>
		<category><![CDATA[zoo]]></category>

		<guid isPermaLink="false">http://www.statalgo.com/?p=480</guid>
		<description><![CDATA[There are many time series packages in R, so someone coming from a commercial application (e.g. Matlab or S-Plus) can experience a learning curve (and some amount of frustration) trying to learn the best toolkit. R comes with one object called ts() which is useful for regularly spaced time series, such as daily, monthly, or [...]]]></description>
			<content:encoded><![CDATA[<p>There are many time series packages in R, so someone coming from a commercial application (e.g. Matlab or S-Plus) can experience a learning curve (and some amount of frustration) trying to learn the best toolkit.</p>
<p>Base R comes with one time series class, created with <code>ts()</code>, which is useful for regularly spaced time series, such as daily, monthly, or yearly data (see <code>help(ts)</code> for more details).  See <a href="http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf">"Time Series Analysis with R"</a> for an example of how to work with it.</p>
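<p>As a minimal sketch of what that looks like (the values below are made up for illustration), a <code>ts</code> object is defined by a start point and a frequency rather than by explicit timestamps, which is why it only suits regularly spaced data:</p>
<pre>
    # Monthly series starting January 2009; the numbers are arbitrary
    sales <- ts(c(12, 15, 14, 18, 21, 19, 23, 25, 22, 27, 30, 28),
                start = c(2009, 1), frequency = 12)

    # Observations per year, as declared above
    frequency(sales)

    # Extract June through August 2009 by (year, period) pairs
    summer <- window(sales, start = c(2009, 6), end = c(2009, 8))
    summer
</pre>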
<p>The <code>ts</code> class is frequently insufficient for our purposes, for example when the data are irregularly spaced. As such, I will primarily use the <a href="http://cran.r-project.org/web/packages/zoo/index.html"><strong>zoo</strong></a> and <a href="http://cran.r-project.org/web/packages/xts/index.html"><strong>xts</strong></a> packages on this blog.  The other options are timeSeries (which is part of <strong><a href="https://www.rmetrics.org/">Rmetrics</a></strong>), its, and fts (from Whit Armstrong).  I will touch on some of the differences along the way.  You can find more about <a href="http://cran.r-project.org/web/views/TimeSeries.html">the time series packages in the CRAN Task View</a>.</p>
<p><a href="http://cran.r-project.org/web/packages/zoo/index.html"><strong>zoo</strong></a> was originally created by Achim Zeileis in 2005, with many subsequent contributions from Gabor Grothendieck; the name stands for "Zeileis's ordered observations".  One of the nice things about zoo is that it is an S3 class in R, and it works with most of the standard R functions you would use on a matrix (such as <code>summary</code>, <code>cbind</code>, <code>merge</code>, and <code>aggregate</code>).  Hence it has a gentle learning curve, and the authors put a lot of thought into making it just work as expected.</p>
<p>Here's a quick example creating a dummy multivariate time series, getting a summary of the output, and plotting it:</p>
<p><code>&gt; library(zoo)<br />
&gt; x1 &lt;- zoo(matrix(rnorm(22), nrow = 11), as.Date("2008-08-01") + 0:10)<br />
&gt; colnames(x1) &lt;- c("A", "B")<br />
&gt; summary(x1)<br />
     Index                  A                 B<br />
 Min.   :2008-08-01   Min.   :-1.6231   Min.   :-1.3363<br />
 1st Qu.:2008-08-03   1st Qu.:-0.9867   1st Qu.:-0.7071<br />
 Median :2008-08-06   Median :-0.5078   Median :-0.5753<br />
 Mean   :2008-08-06   Mean   :-0.1310   Mean   :-0.1270<br />
 3rd Qu.:2008-08-08   3rd Qu.: 0.6633   3rd Qu.: 0.6533<br />
 Max.   :2008-08-11   Max.   : 1.8866   Max.   : 1.0704<br />
&gt; plot(x1)</code></p>
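<p>The other standard functions mentioned above behave similarly. Here is a small sketch of <code>merge</code> and <code>aggregate</code> on two toy series (the series <code>a</code> and <code>b</code> and their values are invented for illustration):</p>
<pre>
    library(zoo)

    # Two toy series on overlapping but not identical dates
    a <- zoo(1:5, as.Date("2008-08-01") + 0:4)
    b <- zoo(c(10, 20, 30), as.Date("2008-08-04") + 0:2)

    # merge() aligns on the union of the dates and fills the gaps with NA
    ab <- merge(a, b)

    # all = FALSE keeps only the dates present in both series
    both <- merge(a, b, all = FALSE)

    # aggregate() groups by a function of the index; here, by calendar month
    monthly <- aggregate(a, as.yearmon, sum)
</pre>
<p>Because the alignment is driven by the index, there is no need to pad or trim the series by hand before combining them.</p>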
<p>Read <a href="http://cran.r-project.org/web/packages/zoo/vignettes/zoo.pdf">the zoo vignette</a> for more details.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.statalgo.com/2010/05/08/time-series-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mosaic time series in R</title>
		<link>http://www.statalgo.com/2010/01/20/mosaic-time-series-in-r/</link>
		<comments>http://www.statalgo.com/2010/01/20/mosaic-time-series-in-r/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 21:00:03 +0000</pubDate>
		<dc:creator>Shane</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Time Series]]></category>

		<guid isPermaLink="false">http://www.statalgo.com/?p=278</guid>
		<description><![CDATA[I really like this chart as featured on flowingdata.com (from www.weathersealed.com).  Here's my brief attempt to recreate it. It looks to me like a multivariate time plot where the area above the lines is filled. My only thought is to use a mosaic chart (as in this post on the Learning R blog), but this was the [...]]]></description>
			<content:encoded><![CDATA[<p>I really like this chart <a href="http://flowingdata.com/2010/01/19/crayola-crayon-colors-multiply-like-rabits/">as featured on flowingdata.com</a> (from <a href="http://www.weathersealed.com/2010/01/15/color-me-a-dinosaur/">www.weathersealed.com</a>).  Here's my brief attempt to recreate it.</p>
<p><img src="http://www.weathersealed.com/wp-content/uploads/2010/01/crayons_big2.png" alt="" width="500" /> <span id="more-278"></span></p>
<p>It looks to me like a multivariate time plot where the area above the lines is filled. My only thought was to use a mosaic chart (<a href="http://learnr.wordpress.com/2009/03/29/ggplot2_marimekko_mosaic_chart/">as in this post on the Learning R blog</a>), and this was the best I could do with a little bit of effort.  I think that using <code>geom_ribbon</code> would be better, but I couldn't get the colors to work.</p>
<p><img src="http://www.statalgo.com/wp-content/uploads/2010/01/mosaic.png" alt="" width="500" /></p>
<p>Here's the code.  Is there an easier way to do this?  How can I make the axes more like the original?  What about the white lines between boxes and the gradual change between years?  The sort order is also different.</p>
<pre>
    library(XML)
    library(plyr)
    library(ggplot2)
    theurl <- "http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors"
    tables <- readHTMLTable(theurl)
    n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
    crayola <- tables[[which.max(n.rows)]]
    x <- crayola[,c("Hex Code", "Issued", "Retired")]
    colnames(x) <- c("color", "issued", "retired")
    for (i in 1:ncol(x)) x[, i] <- type.convert(as.character(x[, i]))
    x[is.na(x[,"retired"]), "retired"] <- 2010
    x$color <- as.character(x$color)

    years <- min(x$issued):max(x$retired, na.rm=TRUE)
    x2 <- na.omit(ldply(years, function(yr, x) {
      idx <- x$issued <= yr & x$retired >= yr
      x2 <- data.frame(year=yr, color=x[idx,"color"], size=(1/length(which(idx))))
      x2 <- x2[order(x2$color, decreasing=TRUE),]
      x2[,"xmin"] <- rep(0, nrow(x2))
      x2[,"xmax"] <- rep(1, nrow(x2))
      x2[-1,"xmin"] <- cumsum(x2$size[-1])
      x2[-nrow(x2),"xmax"] <- cumsum(x2$size[-nrow(x2)])
      x2
    }, x=x))

    p <- ggplot(x2, aes(xmin = year, xmax = year+1, ymin = xmin, ymax = xmax, fill=color))
    # hide the legend and grid lines; theme()/element_blank() replace the older opts()/theme_line()
    p <- p + theme_bw() + theme(legend.position = "none", panel.grid.major = element_blank(),
                panel.grid.minor = element_blank())
    p.rect <- p + geom_rect() + scale_fill_identity()
    p.rect
</pre>
<p><strong>Further improvements</strong></p>
<p>Well, the R community never ceases to amaze.  I posted this and within hours a vastly improved version was created by <a href="http://learnr.wordpress.com/2010/01/21/ggplot2-crayola-crayon-colours/">the Learning R blog</a> (with some help from Baptiste on the color sorting).  All the code is posted on that site.  A suggestion was also made by Tobias to smooth the image <a href="http://cran.r-project.org/web/packages/Cairo/index.html">with Cairo</a>.  Great work!</p>
<p><img src="http://learnr.files.wordpress.com/2010/01/crayola_colours-017.png" alt="" width="400" /></p>
<p>One crucial difference in his version (besides the vastly cleaner code) is his use of <code>geom_area</code> instead of the <code>geom_rect</code> I used in mine.  That also allows you to set a white border above the image.</p>
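<p>To illustrate the idea only (this is not the code from the Learning R post, and the data frame below is entirely made up), a stacked area chart with <code>geom_area</code> might be sketched like this, using <code>colour = "white"</code> for the border:</p>
<pre>
    library(ggplot2)

    # Hypothetical data: a count per hex color and year (invented numbers)
    df <- data.frame(
      year  = rep(c(1990, 2000, 2010), times = 3),
      shade = rep(c("#FF0000", "#00AA00", "#0000FF"), each = 3),
      count = c(2, 3, 4, 1, 2, 4, 1, 1, 2)
    )

    # position = "fill" rescales each year to sum to 1, like the mosaic above
    p <- ggplot(df, aes(x = year, y = count, fill = shade)) +
      geom_area(position = "fill", colour = "white") +
      scale_fill_identity() +          # use the hex codes themselves as fill colors
      theme_bw() +
      theme(legend.position = "none")
    p
</pre>
<p>Compared with drawing one rectangle per color and year, the areas are interpolated between years, which gives the gradual change I was asking about earlier.</p>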
<p>I would go so far as to say that (with the exception of things like better fonts and other touch ups) this R version is actually better than the original because it is more accurate.  As I said previously, there were no color changes early in the timeline, despite that implication in the original chart.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.statalgo.com/2010/01/20/mosaic-time-series-in-r/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

