<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Open Data Group &#187; dummies package</title>
	<atom:link href="http://opendatagroup.com/tag/dummies-package/feed/" rel="self" type="application/rss+xml" />
	<link>http://opendatagroup.com</link>
	<description>Open Data Group&#039;s Home Page and Blog</description>
	<lastBuildDate>Sat, 04 Sep 2010 00:51:55 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>R: The Dummies Package</title>
		<link>http://opendatagroup.com/2009/09/30/r-the-dummies-package/</link>
		<comments>http://opendatagroup.com/2009/09/30/r-the-dummies-package/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 18:20:47 +0000</pubDate>
		<dc:creator>Christopher Brown</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[dummies package]]></category>
		<category><![CDATA[dummy variables]]></category>

		<guid isPermaLink="false">http://blog.opendatagroup.com/?p=117</guid>
		<description><![CDATA[R-2.9.2 was released in August.   While R can be considered stable and battle-ready, it is also far from stagnation.  It is humbling to see such an intelligent and vibrant community helping CRAN grow faster than ever.   Every day I see a new package or read a new comment on R-Help [...]]]></description>
			<content:encoded><![CDATA[<p>R-2.9.2 was released in August.  While R can be considered stable and battle-ready, it is also far from stagnant.  It is humbling to see such an intelligent and vibrant community helping CRAN grow faster than ever.  Every day I see a new package or read a new comment on R-Help that gives me pause to think.</p>
<p>As much as I like R, on occasion I find myself lost in some dark corner.  Sometimes I find light.  Sometimes I am gnashing teeth and wringing hands.  Frustrated.  In a recent foray, I found myself trying to do something that I thought exceedingly trivial: expanding character and factor vectors to dummy variables.  There must be some function, but what?  Trying ?dummy didn&#8217;t turn up anything.  Surely someone else must have encountered this and provided a package.  I went to the Internet and, sure enough, the <a href="http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:create_indicator">R-wiki</a> was there to save me.  And looking even harder, I found some who had trodden this path before me in the R-Help archives.  It turns out, it&#8217;s simple.  Expanding a variable as a dummy variable can be done like so:</p>
<p><code><br />
x &lt;- c(2, 2, 5, 3, 6, 5, NA)<br />
xf &lt;- factor(x, levels = 2:6)<br />
model.matrix( ~ xf - 1)<br />
</code></p>
<p>Two problems.  The first is that without an external source (Google), I would never have stumbled upon what I wanted.  (Thanks, Google!)  I understand it now, but for what I wanted to do, I would never have thought, &#8220;oh, model.matrix.&#8221;</p>
<p>The second problem is the arcane syntax, <code>wtf &lt;- ~ xf - 1</code>.  It took me some time to figure out what was going on: the formula has no response, and the <code>- 1</code> removes the intercept so that every level gets its own column.  I get it now, but why not just <code>dummy(var)</code>?  That is what I want to do.</p>
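<p>To see what the <code>- 1</code> is doing, here is a small sketch that compares the column names with and without it, using the same <code>x</code> as in the snippet above.  The names <code>no_int</code> and <code>with_int</code> are mine, for illustration only.</p>

```r
x  <- c(2, 2, 5, 3, 6, 5, NA)
xf <- factor(x, levels = 2:6)

# Without the intercept, every level of xf gets its own indicator column.
no_int <- model.matrix(~ xf - 1)

# With the intercept, the first level (2) is absorbed into "(Intercept)".
with_int <- model.matrix(~ xf)

colnames(no_int)    # "xf2" "xf3" "xf4" "xf5" "xf6"
colnames(with_int)  # "(Intercept)" "xf3" "xf4" "xf5" "xf6"

# The NA row is dropped by the default na.action, and the unused
# level 4 is kept as a column of zeros.
dim(no_int)         # 6 5
```

<p>Note that the result has six rows, not seven, and a column for the unused level 4.  Both are worth knowing about before binding the result back onto the original data.</p>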
<p>The solution on the wiki wasn&#8217;t quite what I was looking for.  For instance, you can&#8217;t expand several variables at once by saying:</p>
<p><code>model.matrix( ~ xf1 + xf2 + xf3 - 1)</code></p>
<p>It turns out that only one variable gets fully expanded this way: the first factor gets a column for every level, while each of the others loses a level to the contrast coding.  Well, this is not good.  I know that you could solve this with some sapply&#8217;s and some tests, but next time I might forget how to do it.  So with a couple of spare hours, I decided that the next guy wouldn&#8217;t have to think about it.  He could just use my <a href="http://cran.r-project.org/web/packages/dummies">dummies package</a>.</p>
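<p>For the record, the do-it-yourself route looks roughly like this.  It is a base-R sketch, not what the package does internally; the data frame <code>df</code> and its columns <code>f1</code> and <code>f2</code> are made up for the example.</p>

```r
# Made-up example data: two factors.
df <- data.frame(
  f1 = factor(c("a", "b", "c", "a")),
  f2 = factor(c("x", "y", "x", "y"))
)

# In a single formula, only the first factor is fully expanded;
# f2 loses its first level ("x") to the contrast coding.
one_call <- model.matrix(~ f1 + f2 - 1, data = df)
colnames(one_call)   # "f1a" "f1b" "f1c" "f2y"

# Workaround: expand each factor on its own, then bind the pieces.
pieces <- lapply(names(df), function(nm) {
  model.matrix(reformulate(nm, intercept = FALSE), data = df)
})
dummies <- do.call(cbind, pieces)
colnames(dummies)    # "f1a" "f1b" "f1c" "f2x" "f2y"
```

<p>It works, but it is exactly the kind of thing I would have to look up again in six months.</p>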
<p>Like the R-wiki solution, the dummies package provides a nice interface for encoding a single variable.  You can pass a variable -or- a variable name with a data frame.  These are equivalent:</p>
<p><code><br />
dummy( df$var )<br />
dummy( "var", df )<br />
</code></p>
<p>Moreover, you can choose the style of the dummy names, whether to include unused factor levels, whether to have verbose output, etc.</p>
<p>But more than the R-wiki solution, dummy.data.frame does something similar for data frames.  You can specify which columns to expand by name or class and whether to return non-expanded columns.</p>
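<p>To make the idea concrete without leaning on the package itself, here is a base-R sketch of expanding data frame columns by class.  The helper <code>expand_dummies</code> and its <code>classes</code> argument are hypothetical names of my own, not the package&#8217;s API, and the example data are invented.</p>

```r
# Hypothetical helper (not part of the dummies package): expand every column
# whose class is in `classes` into 0/1 indicators, and keep the rest as-is.
expand_dummies <- function(data, classes = c("factor", "character")) {
  pieces <- lapply(names(data), function(nm) {
    col <- data[[nm]]
    if (!inherits(col, classes)) {
      return(data[nm])                      # non-expanded column, unchanged
    }
    f <- factor(col)
    # One column per level: 1L where the value matches the level, else 0L.
    m <- outer(as.character(f), levels(f), "==") * 1L
    colnames(m) <- paste0(nm, levels(f))
    as.data.frame(m)
  })
  do.call(cbind, pieces)
}

# Invented example data.
df  <- data.frame(id = 1:3,
                  colour = c("red", "blue", "red"),
                  stringsAsFactors = FALSE)
res <- expand_dummies(df)
names(res)   # "id" "colourblue" "colourred"
```

<p>Selecting by class rather than by name means new character columns are picked up without touching the call, which is the convenience I was after.</p>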
<p>The package dummies-1.04 is available on CRAN.  Comments and questions are always appreciated.</p>
]]></content:encoded>
			<wfw:commentRss>http://opendatagroup.com/2009/09/30/r-the-dummies-package/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
