R Heads-up and Tips for Beginners

These points may help if you are just starting out with R, for example while taking the Coursera Data Science course.

R vectors are 1-indexed

That means the first element of a vector is at index 1, not 0 as you may expect from other languages.
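A minimal sketch of what 1-based indexing looks like in practice. The vector name letters_vec is made up for this illustration.

```r
# A small character vector to illustrate 1-based indexing
letters_vec <- c("a", "b", "c")

# The first element lives at index 1
first <- letters_vec[1]

# The last element lives at index length(x), not length(x) - 1
last <- letters_vec[length(letters_vec)]

# Index 0 does not raise an error; it returns an empty vector
zero <- letters_vec[0]

print(first)
print(last)
print(length(zero))
```

Note that indexing with 0 fails silently rather than loudly, which can hide off-by-one mistakes carried over from 0-indexed languages.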

Create empty dataframe and add data to it

I'm not sure this is the optimal way to handle data frames, but it gets the job done. (Using rbind changes the column names.)

# create an empty data frame that will hold two numeric columns
# named "id" and "age"
my_df <- data.frame(id=numeric(0),age=numeric(0))

# now add one row to the dataframe
my_df[nrow(my_df)+1,] <- c(id=10,age=25)

# add another one
my_df[nrow(my_df)+1,] <- c(id=15,age=35)
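As a possible alternative, here is a sketch that appends rows with rbind. It assumes each new row is wrapped in a one-row data frame that uses the same column names; under that assumption the names should be preserved. This is one option, not necessarily the best one for large data.

```r
# Start from the same empty data frame as above
my_df <- data.frame(id = numeric(0), age = numeric(0))

# Append rows by binding one-row data frames with matching column names
my_df <- rbind(my_df, data.frame(id = 10, age = 25))
my_df <- rbind(my_df, data.frame(id = 15, age = 35))

print(names(my_df))
print(nrow(my_df))
```

Growing a data frame row by row copies it each time, so for many rows it is usually better to collect the values first and build the data frame once.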

Generate sequences

Use seq() to generate a sequence of numbers, for example to run a command a given number of times.

# this prints numbers from 1 to 20
number <- 20
for(i in seq(number)){
    print(i)
}
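One thing to watch for: if the count can be zero, seq() counts down from 1 to 0 instead of producing an empty sequence. The sketch below compares it with seq_len(), which returns an empty vector in that case. The zero count here is a hypothetical edge case chosen for illustration.

```r
# Hypothetical case: the number of repetitions happens to be zero
number <- 0

# seq(0) counts down from 1 to 0, so this loop body runs twice
runs_seq <- 0
for (i in seq(number)) {
    runs_seq <- runs_seq + 1
}

# seq_len(0) is an empty vector, so this loop body never runs
runs_seq_len <- 0
for (i in seq_len(number)) {
    runs_seq_len <- runs_seq_len + 1
}

print(runs_seq)
print(runs_seq_len)
```

For loops whose count comes from data, such as nrow() of a possibly empty data frame, seq_len() is the safer choice.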

Use cat() to print text to the console:

cat("foo, bar baz")
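cat() also accepts several arguments and joins them with a separator. Here is a small sketch; the variables id and age are made up for the example.

```r
# Hypothetical values to print
id <- 10
age <- 25

# Arguments are joined with a single space by default;
# cat() does not add a newline, so "\n" is passed explicitly
cat("id:", id, "age:", age, "\n")

# The sep argument changes the separator between arguments
cat("foo", "bar", "baz", sep = ", ")
cat("\n")
```

Unlike print(), cat() writes the text without quotes or the leading [1] index.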

Sort dataframe by column value

If you have a data frame (in a variable called data) with two columns, "name" and "age", you can sort it by the age column like this:

sorted_data <- data[with(data,order(age)),]

If you want to sort your data by age from highest to lowest, put a minus sign in front of the column name (this works for numeric columns):

sorted_data <- data[with(data,order(-age)),]

More info: Dirk Eddelbuettel on sorting in R
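The same idea extends to more than one sort key, and to character columns where the minus sign does not apply. Here is a sketch using a small made-up data frame:

```r
# Made-up sample data for illustration
data <- data.frame(
    name = c("Carol", "Alice", "Bob", "Dave"),
    age = c(35, 25, 35, 25),
    stringsAsFactors = FALSE
)

# Sort by age from highest to lowest, breaking ties by name (A to Z)
sorted_data <- data[order(-data$age, data$name), ]

# A minus sign does not work on character columns;
# decreasing = TRUE reverses the order instead
by_name_desc <- data[order(data$name, decreasing = TRUE), ]

print(sorted_data)
print(by_name_desc)
```

order() returns the row positions in sorted order, which is why it is used inside the row index of the data frame.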

Change column names in dataframe

Use the colnames() function to replace unwieldy column names in a data frame with names that are easier to work with:

> names(my_data_frame)
[1] "Provider.Number" "Hospital.Name"   "Address.1"
> colnames(my_data_frame) <- c("number","name","address")
> names(my_data_frame)
[1] "number"  "name"    "address"
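To rename only one column and leave the rest untouched, you can index into names() by the old name. This is a sketch using a made-up data frame with the same column names as above:

```r
# Made-up data frame with the original, unwieldy column names
my_data_frame <- data.frame(
    Provider.Number = c(1, 2),
    Hospital.Name = c("A", "B"),
    Address.1 = c("x", "y")
)

# Replace only the column whose current name is "Hospital.Name"
names(my_data_frame)[names(my_data_frame) == "Hospital.Name"] <- "name"

print(names(my_data_frame))
```

Selecting the column by its old name rather than by position keeps the code correct even if the column order changes.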