Useful functions in R

I have listed some useful functions below:


The with( ) function evaluates an expression using the columns of a dataset as variables. It is similar to DATA= in SAS.

# with(data, expression)
# example applying a t-test to a data frame mydata 
with(mydata, t.test(y ~ group))
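
As a small self-contained sketch of the same idea, the snippet below builds a toy mydata (the values are made up purely for illustration) and checks that with( ) gives the same result as passing data= to t.test( ) directly.

```r
# Toy data frame; the numbers are invented for illustration only
mydata <- data.frame(
  y     = c(5.1, 4.9, 5.6, 5.8, 6.2, 6.0, 6.4, 6.9),
  group = factor(rep(c("a", "b"), each = 4))
)

# y and group are looked up inside mydata, so no mydata$ prefix is needed
res <- with(mydata, t.test(y ~ group))

# Equivalent call using the data argument of t.test()
same <- t.test(y ~ group, data = mydata)

# with() works for any expression, not just model fitting
overall_mean <- with(mydata, mean(y))
```

with( ) is handy when the function you are calling has no data argument of its own.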

Please look at other examples here and here.


The by( ) function applies a function to each level of a factor or factors. It is similar to BY processing in SAS.

# by(data, factorlist, function)
# example: obtain variable means separately for
# each level of byvar in data frame mydata
# (numeric columns only; mean() on a whole data frame returns NA)
by(mydata, mydata$byvar, function(x) colMeans(x[sapply(x, is.numeric)]))
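
Here is a minimal runnable sketch of by( ) on a toy data frame (the column names x and y and all values are assumptions made up for the example). Each group's rows are passed to colMeans( ) as a smaller data frame.

```r
# Toy data: two groups, two numeric columns (values invented for illustration)
mydata <- data.frame(
  byvar = factor(c("a", "a", "b", "b")),
  x     = c(1, 3, 10, 20),
  y     = c(2, 4, 6, 8)
)

# Split the numeric columns by byvar and compute column means per group
means <- by(mydata[, c("x", "y")], mydata$byvar, colMeans)

# Results can be pulled out by group name
means_a <- means[["a"]]
means_b <- means[["b"]]
```

Printing means shows one block of output per level of byvar.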

Please look here for more details.


do.call( ) calls a function with a list of arguments, while lapply( ) applies a function to each element of a list.

# sum is used here only as an example of the function being called
do.call(sum, list(c(1,2,4,1,2), na.rm = TRUE))
lapply(c(1,2,4,1,2), function(x) x + 1)
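
To make the difference concrete, here is a short sketch (the variable names and the choice of sum and rbind are assumptions for illustration). lapply( ) calls the function once per element, while do.call( ) calls it once with the list spread out as its arguments.

```r
values <- c(1, 2, 4, 1, 2)

# lapply: one call per element, always returns a list
incremented <- lapply(values, function(x) x + 1)

# do.call: a single call, equivalent to sum(values, NA, na.rm = TRUE)
total <- do.call(sum, list(values, NA, na.rm = TRUE))

# A common use of do.call: stack a list of data frames with one rbind call
pieces   <- list(data.frame(id = 1:2), data.frame(id = 3:4))
combined <- do.call(rbind, pieces)
```

The two are often combined: lapply( ) produces a list of results and do.call( ) collapses that list into one object.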

More examples here.


more() is a user-defined function that is helpful in printing out a large object. Taken from here.

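#to print out an object such as data.frame mydf 20 lines at a time, use:
more(mydf)

#where more() is defined as

more <- function(expr, lines=20) {
  out <- capture.output(expr)   # one string per line of printed output
  n <- length(out)
  i <- 1
  while( i <= n ) {
    j <- 0
    # print the next page of up to `lines` lines
    while( j < lines && i <= n ) {
      cat(out[i], "\n")
      j <- j + 1
      i <- i + 1
    }
    if( i <= n ) {
      # Enter shows the next page; q quits, t jumps to the tail,
      # a digit d jumps to roughly d/10 of the way through the output
      rl <- readline()
      if( grepl('^ *q', rl, ignore.case=TRUE) ) i <- n + 1
      if( grepl('^ *t', rl, ignore.case=TRUE) ) i <- n - lines + 1
      if( grepl('^ *[0-9]', rl) ) i <- as.numeric(rl)/10*n + 1
    }
  }
  invisible(out)
}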
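
The paging idea rests on capture.output( ), which returns the printed form of an object as a character vector with one element per line. As a rough non-interactive sketch (the data frame mydf and the page size of 20 are assumptions for illustration), the captured lines can be split into pages like this:

```r
# Hypothetical 50-row data frame; printing it yields a header plus 50 rows
mydf <- data.frame(x = 1:50)

# Capture what print(mydf) would show, one string per printed line
out <- capture.output(print(mydf))

# Group line indices into pages of 20 lines each
lines <- 20
pages <- split(out, ceiling(seq_along(out) / lines))

# Show only the first page
cat(pages[[1]], sep = "\n")
```

An interactive pager then only needs to loop over these pages and wait for input between them.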


options() can be used to increase the limit for max.print in R. More info here.
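
For instance, a minimal sketch of raising the limit and then restoring it (the value 5000 is arbitrary and chosen only for illustration):

```r
# options() returns the previous values, so they can be restored later
old <- options(max.print = 5000)

# Check the limit now in effect
current <- getOption("max.print")

# Restore the earlier setting
options(old)
restored <- getOption("max.print")
```

Saving the old value this way avoids leaving a changed setting behind for the rest of the session.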


To check which columns in the data frame df have missing values:

colnames(df)[colSums(is.na(df)) > 0]
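
A quick worked example on a toy data frame (the columns a, b and c and their values are made up for illustration) shows what each step produces: is.na( ) gives a logical matrix, colSums( ) counts the TRUE values per column, and the comparison keeps the columns with at least one missing value.

```r
# Toy data frame with missing values in columns a and c
df <- data.frame(
  a = c(1, NA, 3),
  b = c("x", "y", "z"),
  c = c(NA, NA, TRUE)
)

# Number of missing values in each column
na_counts <- colSums(is.na(df))

# Names of the columns that contain at least one NA
cols_with_na <- colnames(df)[na_counts > 0]
```

Keeping na_counts around is useful when you also want to know how many values are missing, not just where.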


