Posted on Mar 14, 2009

A Little Bit About Data Frames in R

Data frames in R are much like DataSets in SAS, SPSS, .NET, etc. Really, they are just spreadsheets that feel like matrices. We can use them to look at numerical data along with any metadata or characteristics associated with the numbers, though numbers are not required. From the R Documentation, “a data frame is a list of variables of the same length with unique row names”, and also it is “a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on)”.
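To make the two quoted descriptions concrete, here is a minimal sketch. The data frame people and its values are made up purely for illustration; they are not part of the example that follows.

```r
# Hypothetical example data; the names and values are invented for illustration.
people = data.frame(
  name   = c( "Ann", "Bob", "Cara" ),   # character column
  age    = c( 31, 45, 27 ),             # numeric column
  member = c( TRUE, FALSE, TRUE ),      # logical column
  stringsAsFactors = FALSE              # keep strings as character vectors
)

str( people )             # one line per column, showing each column's type
is.list( people )         # a data frame is a list of equal-length columns
sapply( people, class )   # the column classes differ, unlike in a matrix
dim( people )             # rows and columns, just as for a matrix
```

Each column is an ordinary vector, so a single data frame can hold numbers next to the labels that describe them.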

Let’s take a look at an example. First, we start with generating a 3 x 3 identity matrix and assigning the matrix to the variable, mat.

[code]
mat = diag( 3 )
[/code]

By typing mat, we can see the output.

     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

Next, we are going to convert this matrix to a data frame called mat_dataframe and output it.

[code]
mat_dataframe = data.frame( mat )
mat_dataframe
[/code]

  X1 X2 X3
1  1  0  0
2  0  1  0
3  0  0  1
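If you would rather inspect or change these default names, something like the following should work. This is only a sketch; the replacement names a, b, and c are arbitrary choices for illustration.

```r
mat = diag( 3 )
mat_dataframe = data.frame( mat )

colnames( mat_dataframe )   # "X1" "X2" "X3"
rownames( mat_dataframe )   # "1" "2" "3"

# Work on a copy so the original data frame keeps its default names.
# The names a, b, c are arbitrary and chosen only for illustration.
renamed = mat_dataframe
colnames( renamed ) = c( "a", "b", "c" )
renamed
```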

Notice that the column names are X1, X2, and X3 and that the row names are 1, 2, and 3. Say we want to add more columns and rows to our data frame. Let’s start by appending a row to mat_dataframe. We do this with rbind.

[code]
mat_dataframe = rbind(mat_dataframe, c(2,2,2))
[/code]

We have added a vector of twos to the next row of the data frame.  Here’s what mat_dataframe looks like so far.

  X1 X2 X3
1  1  0  0
2  0  1  0
3  0  0  1
4  2  2  2
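A variation on the same step, sketched under the assumption that the data frame still has its default column names X1, X2, and X3: the new row can also be built as a one-row data frame, which makes it explicit which value belongs to which column. The name new_row is just an illustrative choice.

```r
mat_dataframe = data.frame( diag( 3 ) )

# The new row as a one-row data frame with the same column names.
new_row = data.frame( X1 = 2, X2 = 2, X3 = 2 )
mat_dataframe = rbind( mat_dataframe, new_row )

nrow( mat_dataframe )   # 4
mat_dataframe
```

This form tends to be the safer one once the columns hold different types, since each value is tied to a column by name.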

Now, let’s try appending two columns to mat_dataframe using two different methods. The first line creates a new data frame from the original data frame and appends a column called “City” with “Dallas” as the entry for each row. The second takes this data frame and adds another column called “Color” with the entries “blue” and “green”.

[code]
mat_dataframe = data.frame( mat_dataframe, City="Dallas" )
mat_dataframe = cbind( mat_dataframe, Color=c( "blue", "green" ) )
[/code]

Now, the mat_dataframe looks like this.

  X1 X2 X3   City Color
1  1  0  0 Dallas  blue
2  0  1  0 Dallas green
3  0  0  1 Dallas  blue
4  2  2  2 Dallas green
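The repetition in the Color column can be spelled out with rep(). The sketch below rebuilds the four-row data frame from scratch so that it runs on its own; the variable name colors is an illustrative choice.

```r
mat_dataframe = rbind( data.frame( diag( 3 ) ), c( 2, 2, 2 ) )

# rep() makes the recycling explicit: repeat the two values until
# there is one entry per row of the data frame.
colors = rep( c( "blue", "green" ), length.out = nrow( mat_dataframe ) )
colors   # "blue" "green" "blue" "green"

mat_dataframe = cbind( mat_dataframe, Color = colors )
mat_dataframe
```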

Notice that the two colors were recycled: once blue and green had both been used, they were repeated to fill the remaining rows. Before we move on, let me mention a gotcha when adding columns. For the City column, I supplied a single value, Dallas, which was repeated for every row, but for the Color column, I supplied 2 different colors. What happens if we specify three values? Let’s try this with a new column called Country.

[code]
mat_dataframe = data.frame( mat_dataframe, Country=c( "USA", "Canada", "Mexico" ) )
[/code]

We get the following error…

Error in data.frame(mat_dataframe, Country = c("USA", "Canada", "Mexico")) : arguments imply differing number of rows: 4, 3

A rule of thumb: make sure the number of values being assigned divides evenly into the number of rows (or columns) of the data frame. If our data frame had 6 rows (or 9 or 12 or … ), we could have used the above code.
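One way to check the rule of thumb before adding a column is sketched below. The helper fits_evenly is a hypothetical name made up for this example, not something built into R, and the data frame is rebuilt from scratch so the snippet runs on its own.

```r
mat_dataframe = rbind( data.frame( diag( 3 ) ), c( 2, 2, 2 ) )
countries = c( "USA", "Canada", "Mexico" )

# fits_evenly is a hypothetical helper, not part of R. It is TRUE when
# the values fit into the rows a whole number of times.
fits_evenly = function( df, values ) nrow( df ) %% length( values ) == 0

fits_evenly( mat_dataframe, countries )              # FALSE: 4 rows, 3 values
fits_evenly( mat_dataframe, c( "USA", "Canada" ) )   # TRUE: 4 rows, 2 values

# With six rows, the same three values are accepted and recycled twice.
six_rows = data.frame( diag( 6 ) )
six_rows = data.frame( six_rows, Country = countries )
six_rows$Country
```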

Our data frame is essentially a matrix with a couple of attached column vectors containing strings. This may not seem very useful at first, but it is a wonderful data structure that makes some statistical methods, among other things, easier to use. Soon, I will post a basic ANOVA example using data frames.

Posted on Mar 9, 2009

New Blog Theme

Though I loved my old theme featuring one of my favorite books, The Hobbit, I needed more display options. The previous theme only showed my different posts and ignored categories among many other things.

My new theme is as shown…let me know if you like it.

It doesn’t handle my LaTeX plugin very well. I’ll have to work on that later.

Posted on Mar 2, 2009

My Poor Toe

Saturday night, the lights were out, and I was feeling my way around the bed as I do every night. Something had changed just enough to throw off my plans. Our bedside stool was not in its normal position and was impeding my path; since I could not see it, rather than going around it, I attempted to plow through it.
In the process, I injured my toe — how much I do not know. I hate going to the doctor, and as much as my wife has encouraged me to do so, I have yet to do it.
Look at it, and tell me if I should go.
[Photos: img_0307, img_0308, img_0309, img_0310]