<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>John Ramey &#187; R</title>
	<atom:link href="http://www.johnramey.net/category/r/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.johnramey.net</link>
	<description>Don&#039;t think...compute.</description>
	<lastBuildDate>Wed, 07 Jul 2010 20:34:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Premature Optimization Is the Root of All Evil</title>
		<link>http://www.johnramey.net/2009/05/28/premature-optimization-is-the-root-of-all-evil/</link>
		<comments>http://www.johnramey.net/2009/05/28/premature-optimization-is-the-root-of-all-evil/#comments</comments>
		<pubDate>Fri, 29 May 2009 02:25:51 +0000</pubDate>
		<dc:creator>johnramey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[productivity]]></category>

		<guid isPermaLink="false">http://www.johnramey.net/?p=214</guid>
		<description><![CDATA[&#8230;so says the famous quote, my current mantra.  This leads to my primary mission (or, perhaps more appropriately, my Prime Directive): to code and get it all out. Brainstorm in code, and do it faster. I have this extremely bad habit of writing 10 lines of code before spending [...]]]></description>
			<content:encoded><![CDATA[<p>&#8230;so says the <a href="http://en.wikipedia.org/wiki/Optimization_(computer_science)">famous quote</a>, my current mantra.  This leads to my primary mission (or, perhaps more appropriately, my Prime Directive): to code and get it all out. Brainstorm in code, and do it faster.</p>
<p>I have this extremely bad habit of writing 10 lines of code before spending the rest of the day learning how to improve those lines in R.  Yes, I learn a lot through the process, but graduate students are responsible for more than just learning. Some people have no interest in what I have learned, only in my results.</p>
<p>This idea is nothing new to me, but this post is an attempt to turn over a new leaf. It was ridiculous for me to waste the previous hour trying to optimize TWO lines of code! I cannot and will not do this anymore. Those two lines of code were not important enough to waste that much time.  Before my bedtime, my whiteboard may be plastered with repetitions of &#8220;I will refactor my code later and not now.&#8221;  Unless the code is being published, procrastinated optimization is far better than procrastinated results.</p>
<p>Am I alone in this endeavor?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johnramey.net/2009/05/28/premature-optimization-is-the-root-of-all-evil/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Little Bit About Data Frames in R</title>
		<link>http://www.johnramey.net/2009/03/14/a-little-bit-about-data-frames-in-r/</link>
		<comments>http://www.johnramey.net/2009/03/14/a-little-bit-about-data-frames-in-r/#comments</comments>
		<pubDate>Sat, 14 Mar 2009 19:49:01 +0000</pubDate>
		<dc:creator>johnramey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[data frames]]></category>

		<guid isPermaLink="false">http://www.johnramey.net/?p=155</guid>
		<description><![CDATA[Data frames in R are much like DataSets in SAS, SPSS, .NET, etc. Really, they are just spreadsheets that feel like matrices. We can use them to hold numerical data alongside any metadata or characteristics associated with the numbers, though numeric columns are not required. From the R Documentation, &#8220;a data frame [...]]]></description>
			<content:encoded><![CDATA[<p>Data frames in R are much like DataSets in SAS, SPSS, .NET, etc.  Really, they are just spreadsheets that feel like matrices. We can use them to hold numerical data alongside any metadata or characteristics associated with the numbers, though numeric columns are not required.  From the R Documentation, &#8220;a data frame is a list of variables of the same length with unique row names&#8221;, and also it is &#8220;a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on)&#8221;.</p>
<p>Let&#8217;s take a look at an example.  First, we generate a 3 x 3 identity matrix and assign it to the variable <strong>mat</strong>.</p>
<p>[code]<br />
mat = diag( 3 )<br />
[/code]</p>
<p>By typing <strong>mat</strong>, we can see the output.</p>
<pre><strong>     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
</strong></pre>
<p>Next, we are going to convert this matrix to a data frame called <strong>mat_dataframe</strong> and output it.</p>
<p>[code]<br />
mat_dataframe = data.frame( mat )<br />
mat_dataframe<br />
[/code]</p>
<pre><strong>  X1 X2 X3
1  1  0  0
2  0  1  0
3  0  0  1
</strong></pre>
<p>Notice that the column names are <strong>X1</strong>, <strong>X2</strong>, and <strong>X3</strong> and that the row names are <strong>1</strong>, <strong>2</strong>, and <strong>3</strong>. Say we want to add more columns and rows to our data frame. Let&#8217;s start by appending a row to <strong>mat_dataframe</strong>.  We do this with <strong>rbind</strong>.</p>
<p>[code]<br />
mat_dataframe = rbind(mat_dataframe, c(2,2,2))<br />
[/code]</p>
<p>We have appended a vector of twos as a new row at the bottom of the data frame.  Here&#8217;s what <strong>mat_dataframe</strong> looks like so far.</p>
<pre><strong>  X1 X2 X3
1  1  0  0
2  0  1  0
3  0  0  1
4  2  2  2</strong></pre>
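<p>As a quick sanity check (a sketch of my own, not part of the original walkthrough), the steps so far can be replayed under a hypothetical name, <strong>check_df</strong>, and the shape and names inspected with the base R functions <strong>dim</strong>, <strong>colnames</strong>, and <strong>rownames</strong>.</p>
<p>[code]<br />
# check_df is a hypothetical name used only for this illustration<br />
check_df = data.frame( diag( 3 ) )<br />
check_df = rbind( check_df, c( 2, 2, 2 ) )<br />
dim( check_df )       # number of rows, then number of columns<br />
colnames( check_df )  # the automatically generated column names<br />
rownames( check_df )  # the automatically generated row names<br />
[/code]</p>
<p>This should report 4 rows and 3 columns, with the same column and row names shown in the output above.</p>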
<p>Now, let&#8217;s append two columns to <strong>mat_dataframe</strong> using two different methods. The first line below creates a new data frame from the original and appends a column called <strong>City</strong> with <strong>Dallas</strong> as the entry for each row. The second takes this data frame and uses <strong>cbind</strong> to add another column called <strong>Color</strong> with entries <strong>blue</strong> and <strong>green</strong>.</p>
<p>[code]<br />
mat_dataframe = data.frame( mat_dataframe, City="Dallas" )<br />
mat_dataframe = cbind( mat_dataframe, Color=c( "blue", "green" ) )<br />
[/code]</p>
<p>Now, the <strong>mat_dataframe</strong> looks like this.</p>
<pre><strong>  X1 X2 X3   City Color
1  1  0  0 Dallas  blue
2  0  1  0 Dallas green
3  0  0  1 Dallas  blue
4  2  2  2 Dallas green</strong></pre>
<p>Notice that only two colors were supplied for four rows, so R repeated (recycled) <strong>blue</strong> and <strong>green</strong> to fill the column. The same thing happened with <strong>City</strong>, where the single value <strong>Dallas</strong> was repeated for every row. This recycling comes with a gotcha, though.  What happens if we specify three values? Let&#8217;s try this with a new column called <strong>Country</strong>.</p>
<p>[code]<br />
mat_dataframe = data.frame( mat_dataframe, Country=c( "USA", "Canada", "Mexico" ) )<br />
[/code]</p>
<p>We get the following error&#8230;</p>
<p><strong>Error in data.frame(mat_dataframe, Country = c(&quot;USA&quot;, &quot;Canada&quot;, &quot;Mexico&quot;)) : arguments imply differing number of rows: 4, 3</strong></p>
<p>A rule of thumb: make sure the number of values being assigned divides evenly into the number of rows (or columns) of the data frame.  If our data frame had 6 rows (or 9 or 12 or &#8230;), the above code would have worked.</p>
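<p>To illustrate the rule of thumb, here is a minimal sketch of my own using two hypothetical data frames, <strong>six_rows</strong> and <strong>four_rows</strong>. Three values recycle cleanly across six rows. When the lengths do not line up, one option (an assumption about what you want, since it silently cuts the last repetition short) is to spell out the recycling yourself with <strong>rep</strong> and its <strong>length.out</strong> argument.</p>
<p>[code]<br />
# six_rows and four_rows are hypothetical data frames, not from the example above<br />
six_rows = data.frame( Id = 1:6 )<br />
# 3 divides 6, so the three values are recycled exactly twice<br />
six_rows = data.frame( six_rows, Country = c( "USA", "Canada", "Mexico" ) )<br />
# 3 does not divide 4, so recycle explicitly up to the number of rows<br />
four_rows = data.frame( Id = 1:4 )<br />
four_rows$Country = rep( c( "USA", "Canada", "Mexico" ), length.out = nrow( four_rows ) )<br />
[/code]</p>
<p>In the second case the fourth row gets <strong>USA</strong> again, which may or may not be what you intended, so it is worth checking the result before moving on.</p>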
<p>Our data frame is essentially a matrix with a couple of attached column vectors containing strings. This may not seem very useful at first, but it is a wonderful data structure that makes some statistical methods, among other things, easier to use. Soon, I will post a basic ANOVA example using data frames.</p>
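<p>For anyone who wants to confirm the matrix-plus-string-columns picture, the sketch below (my own, using the hypothetical name <strong>typed_df</strong>) rebuilds the final data frame and asks R for the class of each column with <strong>sapply</strong>. The three <strong>X</strong> columns should come back as numeric. Whether <strong>City</strong> and <strong>Color</strong> are reported as factor or character depends on your R version and its <strong>stringsAsFactors</strong> default, so I make no claim about those here.</p>
<p>[code]<br />
# typed_df is a hypothetical name; the steps mirror the ones in this post<br />
typed_df = data.frame( diag( 3 ) )<br />
typed_df = rbind( typed_df, c( 2, 2, 2 ) )<br />
typed_df = data.frame( typed_df, City = "Dallas" )<br />
typed_df = cbind( typed_df, Color = c( "blue", "green" ) )<br />
# One class name per column<br />
sapply( typed_df, class )<br />
[/code]</p>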
]]></content:encoded>
			<wfw:commentRss>http://www.johnramey.net/2009/03/14/a-little-bit-about-data-frames-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
