Jeromy Anglim's Blog: Psychology and Statistics


Monday, May 10, 2010

Abbreviations of R Commands Explained: 250+ R Abbreviations

The R programming language includes many abbreviations. Abbreviations exist in function names, argument names, and allowed values for arguments. This post expands over 250 R abbreviations, with the aim of making R commands easier to memorise for users who are new to R.

Context

Abbreviations save time when typing and can make for less cumbersome code. However, abbreviations often make it more difficult to remember a command. This is especially true when the user does not know what the abbreviation stands for.

R has been developed by a group of technical experts with backgrounds in Linux and Unix, mathematics, statistics, and statistical computing. As its popularity grows, R is now being used by people with little or none of this background. Abbreviations that are intuitive to the experts are not necessarily intuitive to this broader audience. The R help system does a reasonable job of explaining the abbreviations in R. However, I thought it would be useful to write a post listing some of the common abbreviations along with their expansions. Whereas R sometimes errs on the side of assuming expertise, I thought I'd err on the side of assuming naivety. Thus, the table includes many abbreviations which are probably obvious to most readers.

Table of R Commands

R Command | Abbreviation Expanded | Comments
ls | [L]i[S]t objects | common command in Unix-like operating systems
rm | [R]e[M]ove objects | common command in Unix-like operating systems
str | [STR]ucture of an object
unz | [UNZ]ip
getwd | [GET] [W]orking [D]irectory
dir | [DIR]ectory
sprintf | [S]tring [PRINT] [F]ormatted
c | [C]ombine values
regexpr | [REG]ular [EXPR]ession | Why "regular"? See regular sets, regular language
diag | [DIAG]onal values of a matrix
col | [COL]umn
lapply | [L]ist [APPLY] | Apply function to each element and return a list
sapply | [S]implify [APPLY] | Apply function to each element and attempt to return a vector (i.e., a vector is "simpler" than a list)
mapply | [M]ultivariate [APPLY] | Multivariate version of sapply
tapply | [T]able [APPLY] | Apply function to sets of values as defined by an index
apply | [APPLY] function to the margins (rows or columns) of an array
MARGIN = 1 or 2 in apply | rows [1] come before columns [2] | e.g., a 2 x 3 matrix has 2 rows and 3 columns (note: row count is stated first)
rmvnorm | [R]andom number generator for [M]ulti[V]ariate [NORM]al data
rle | [R]un [L]ength [E]ncoding
ftable | [F]lat [TABLE] | Creates flat contingency tables
xtabs | Cross (i.e., [X]) [TAB]ulation | [X] is the symbol of a cross; [X] is sometimes spoken as "by". Cross-tabulating means to cross one variable with another
xtable | [TABLE] of the object [X]
formatC | [FORMAT] using [C] style formats | i.e., [C] the programming language
Sweave | [S] [WEAVE] | The R programming language is a dialect of S. Weaving involves combining code and documentation
cor | [COR]relation
ancova | [AN]alysis [O]f [COVA]riance
manova | [M]ultivariate [AN]alysis [O]f [VA]riance
aov | [A]nalysis [O]f [V]ariance
TukeyHSD | [TUKEY]'s [H]onestly [S]ignificant [D]ifference
hclust | [H]ierarchical [CLUST]er analysis
cmdscale | [C]lassical metric [M]ulti[D]imensional [SCAL]ing
factanal | [FACT]or [ANAL]ysis
princomp | [PRIN]cipal [COMP]onents analysis
prcomp | [PR]incipal [COMP]onents analysis
lme | [L]inear [M]ixed [E]ffects model
resid | [RESID]uals
ranef | [RAN]dom [EF]fects
anova | [AN]alysis [O]f [VA]riance
fixef | [FIX]ed [EF]fects
vcov | [V]ariance-[COV]ariance matrix
logLik | [LOG] [LIK]elihood
BIC | [B]ayesian [I]nformation [C]riterion
mcmcsamp | [M]arkov [C]hain [M]onte [C]arlo [SAMP]ling
eval | [EVAL]uate an R expression
cat | con[CAT]enate | standard Unix command
apropos | Search documentation for a purpose or on a topic (i.e., [APROPOS]) | Unix command for searching documentation
read.csv | [READ] a file in [C]omma [S]eparated [V]alues format | i.e., in each row of the data commas separate values for each variable
read.fwf | [READ] a file in [F]ixed [W]idth [F]ormat
seq | Generate [SEQ]uence
rep | [REP]licate values of x | perhaps also [REP]eat
dim | [DIM]ension of an object | Typically, number of rows and columns in a matrix
gl | [G]enerate factor [L]evels
rbind | [R]ows [BIND]
cbind | [C]olumns [BIND]
is.na | [IS] [N]ot [A]vailable
nrow | [N]umber of [ROW]s
ncol | [N]umber of [COL]umns
attr | [ATTR]ibute
rev | [REV]erse
diff | [DIFF]erence between x and a lag of x
prod | [PROD]uct
var | [VAR]iance
sd | [S]tandard [D]eviation
cumsum | [CUM]ulative [SUM]
cumprod | [CUM]ulative [PROD]uct
setdiff | [SET] [DIFF]erence
intersect | [INTERSECT]ion
Re | [RE]al part of a number
Im | [IM]aginary part of a number
Mod | [MOD]ulus of a complex number | Not the modulo operation; the remainder of division of one number by another is given by %%
t | [T]ranspose of a vector or matrix
substr | [SUBSTR]ing
strsplit | [STR]ing [SPLIT]
grep | [G]lobal / [R]egular [E]xpression / [P]rint | Etymology based on text editor instructions in programs such as ed
sub | [SUB]stitute identified pattern found in string
gsub | [G]lobal [SUB]stitute identified pattern found in string
pmatch | [P]artial string [MATCH]ing
nchar | [N]umber of [CHAR]acters in a string
ps.options | [P]ost[S]cript [OPTIONS]
win.metafile | [WIN]dows [METAFILE] graphic
dev.off | [DEV]ice [OFF]
dev.cur | [CUR]rent [DEV]ice
dev.set | [SET] the current [DEV]ice
hist | [HIST]ogram
pie | [PIE] chart
coplot | [CO]nditioning [PLOT]
matplot | [PLOT] columns of [MAT]rices
assocplot | [ASSOC]iation [PLOT]
plot.ts | [PLOT] [T]ime [S]eries
qqnorm | [Q]uantile-[Q]uantile plot based on [NORM]al distribution
persp | [PERSP]ective plot
xlim | [LIM]it of the [X] axis
ylim | [LIM]it of the [Y] axis
xlab | [LAB]el for the [X] axis
ylab | [LAB]el for the [Y] axis
main | [MAIN] title for the plot
sub | [SUB] title for the plot
mtext | [M]argin [TEXT]
abline | [LINE] on plot often of the form y = [A] + [B] x
h argument in abline | [H]orizontal line
v argument in abline | [V]ertical line
par | Graphics [PAR]ameter
adj as par | [ADJ]ust text justification
bg as par | [B]ack[G]round colour
bty as par | [B]ox [TY]pe
cex as par | [C]haracter [EX]tension or [EX]pansion of plotting objects
cex.sub as par | [C]haracter [EX]tension or [EX]pansion of [SUB]title
cex.axis as par | [C]haracter [EX]tension or [EX]pansion of [AXIS] annotation
cex.lab as par | [C]haracter [EX]tension or [EX]pansion of X and Y [LAB]els
cex.main as par | [C]haracter [EX]tension or [EX]pansion of [MAIN] title
col as par | Default plotting [COL]our
las as par | [L]abel of [A]xis [S]tyle
lty as par | [L]ine [TY]pe
lwd as par | [L]ine [W]i[D]th
mar as par | [MAR]gin width in lines
mfg as par | Next [G]raph for [M]atrix of [F]igures
mfcol as par | [M]atrix of [F]igures entered [COL]umn-wise
mfrow as par | [M]atrix of [F]igures entered [ROW]-wise
pch as par | [P]lotting [CH]aracter
ps as par | [P]oint [S]ize of text | Point is a printing measurement
pty as par | [P]lot region [TY]pe
tck as par | [T]i[CK] mark length
tcl as par | [T]i[C]k mark [L]ength
xaxs as par | [X] [AX]is [S]tyle
yaxs as par | [Y] [AX]is [S]tyle
xaxt as par | [X] [AX]is [T]ype
yaxt as par | [Y] [AX]is [T]ype
asp as par | [ASP]ect ratio
xlog as par | [X] axis as [LOG]arithm scale
ylog as par | [Y] axis as [LOG]arithm scale
omi as par | [O]uter [M]argin width in [I]nches
mai as par | [MA]rgin width in [I]nches
pin as par | [P]lot size in [IN]ches
xpd as par | Perhaps: [X = Cut] [P]lot? Perhaps D for device
xyplot | [X] [Y] [PLOT] | [X] for horizontal axis; [Y] for vertical axis
bwplot | [B]ox and [W]hisker plot
qq | [Q]uantile-[Q]uantile plot
splom | [S]catter[PLO]t [M]atrix
optim | [OPTIM]isation
lm | [L]inear [M]odel
glm | [G]eneralised [L]inear [M]odel
nls | [N]onlinear [L]east [S]quares parameter estimation
loess | [LO]cally [E]stimated [S]catterplot [S]moothing
prop.test | [TEST] null hypothesis that [PROP]ortions in several groups are the same
rnorm | [R]andom number drawn from [NORM]al distribution
dnorm | [D]ensity of a given quantile in a [NORM]al distribution
pnorm | Distribution function for [NORM]al distribution returning cumulative [P]robability
qnorm | [Q]uantile function based on [NORM]al distribution
rexp | [R]andom number generation from [EXP]onential distribution
rgamma | [R]andom number generation from [GAMMA] distribution
rpois | [R]andom number generation from [POIS]son distribution
rweibull | [R]andom number generation from [WEIBULL] distribution
rcauchy | [R]andom number generation from [CAUCHY] distribution
rbeta | [R]andom number generation from [BETA] distribution
rt | [R]andom number generation from [T] distribution
rf | [R]andom number generation from [F] distribution | F for Ronald [F]isher
rchisq | [R]andom number generation from [CHI] [SQ]uare distribution
rbinom | [R]andom number generation from [BINOM]ial distribution
rgeom | [R]andom number generation from [GEOM]etric distribution
rhyper | [R]andom number generation from [HYPER]geometric distribution
rlogis | [R]andom number generation from [LOGIS]tic distribution
rlnorm | [R]andom number generation from [L]og [NORM]al distribution
rnbinom | [R]andom number generation from [N]egative [BINOM]ial distribution
runif | [R]andom number generation from [UNIF]orm distribution
rwilcox | [R]andom number generation from [WILCOX]on distribution
ggplot in ggplot2 | [G]rammar of [G]raphics [PLOT] | See Leland Wilkinson (1999)
aes in ggplot2 | [AES]thetic mapping
geom_ in ggplot2 | [GEOM]etric object
stat_ in ggplot2 | [STAT]istical summary
coord_ in ggplot2 | [COORD]inate system
qplot in ggplot2 | [Q]uick [PLOT]
x as argument | [X] is common letter for unknown variable in math
FUN as argument | [FUN]ction
pos as argument | [POS]ition
lib.loc in library | [LIB]rary folder [LOC]ation
sep as argument | [SEP]arator character
comment.char in read.table | [COMMENT] [CHAR]acter(s)
I | [I]nhibit [I]nterpretation or [I]nsulate
T value | [T]rue
F value | [F]alse
na.rm as argument | [N]ot [A]vailable [R]e[M]oved
fivenum | [FIVE] [NUM]ber summary
IQR | [I]nter [Q]uartile [R]ange
coef | Model [COEF]ficients
dist | [DIST]ance matrix
df as argument | [D]egrees of [F]reedom
mad | [M]edian [A]bsolute [D]eviation
sink | Divert R output to a connection (i.e., like connecting a pipe to a [SINK])
eol in write.table | [E]nd [O]f [L]ine character(s)
R as software | [R]oss Ihaka and [R]obert Gentleman or [R] is letter before S
CRAN as word | [C]omprehensive [R] [A]rchive [N]etwork | As I understand it: Inspired by CTAN (Comprehensive TeX Archive Network); pronunciation of CRAN rhymes with CTAN (i.e., "See" ran as in Iran; "See tan")
Sexpr | [S] [EXPR]ession
ls.str | Show [STR]ucture of [L]i[S]ted objects
browseEnv | [BROWSE] [ENV]ironment
envir as argument | [ENVIR]onment
q | [Q]uit
cancor | [CAN]onical [COR]relation
ave | [AVE]rage
min | [MIN]imum
max | [MAX]imum
sqrt | [SQ]uare [R]oo[T]
%o% | [O]uter product
& | & is ampersand meaning [AND]
| (vertical bar) | often used to represent OR in computing (http://en.wikipedia.org/wiki/Logical_disjunction)
: | sequence generator; also used in MATLAB
nlevels | [N]umber of [LEVELS] in a factor
det | [DET]erminant of a matrix
crossprod | Matrix [CROSSPROD]uct
gls | [G]eneralised [L]east [S]quares
dwtest in lmtest | [D]urbin-[W]atson [TEST]
sem in sem | [S]tructural [E]quation [M]odel
betareg in betareg | [BETA] [REG]ression
log | Natural [LOG]arithm | Default base is e consistent with most mathematics (http://en.wikipedia.org/wiki/Logarithm#Implicit_bases)
log10 | [LOG]arithm base 10
fft | [F]ast [F]ourier [T]ransform
exp | [EXP]onential function | i.e., e^x
df.residual | [D]egrees of [F]reedom of the [RESIDUAL]
sin | [SIN]e function
cos | [COS]ine function
tan | [TAN]gent function
asin | [A]rc[SIN]e function
acos | [A]rc[COS]ine function
atan | [A]rc[TAN]gent function
deriv | [DERIV]ative
chol | [CHOL]eski decomposition
chol2inv | [CHOL]eski [2=TO] [INV]erse
svd | [S]ingular [V]alue [D]ecomposition
eigen | [EIGEN]value or [EIGEN]vector
lower.tri | [LOWER] [TRI]angle of a matrix
upper.tri | [UPPER] [TRI]angle of a matrix
acf | [A]uto [C]orrelation or [C]ovariance [F]unction
pacf | [P]artial [A]uto [C]orrelation or [C]ovariance [F]unction
ccf | [C]ross [C]orrelation or [C]ovariance [F]unction
Rattle as software | [R] [A]nalytical [T]ool [T]o [L]earn [E]asily | Perhaps, easy like a baby's rattle
StatET as software | Anyone know? Statistics Eclipse?
JGR as software | [J]ava [G]UI for [R] | pronounced "Jaguar" like the cat
ESS as software | [E]macs [S]peaks [S]tatistics
Rcmdr package | [R] [C]o[M]man[D]e[R] GUI
prettyNum | [PRETTY] [NUM]ber
Inf value | [INF]inite
NaN value | [N]ot [A] [N]umber
is.nan | [IS] [N]ot [A] [N]umber
S3 | R is a dialect of [S]; 3 is the version number
S4 | R is a dialect of [S]; 4 is the version number
Rterm as program | [R] [TERM]inal
R CMD as program | I think: [R] [C]o[M]man[D] prompt
repos as option | [REPOS]itory locations
bin folder | [BIN]aries | Common Unix folder for "essential command binaries"
etc folder | [ET] [C]etera | Common Unix folder for "host-specific system-wide configuration files"
src folder | [S]ou[RC]e code | Common Unix folder
doc folder | [DOC]umentation
RGUI program | [R] [G]raphical [U]ser [I]nterface
.site file extension | [SITE] specific file | e.g., Rprofile.site
Hmisc package | Frank [H]arrell's package of [MISC]ellaneous functions
n in debug | [N]ext step
c in debug | [C]ontinue
Q in debug | [Q]uit
MASS package | [M]odern [A]pplied [S]tatistics with [S] | Based on book of same name by Venables and Ripley
plyr package | PL[Y=ie][R] | Double play on words: (1) package manipulates data like pliers manipulate materials; (2) last letter is R as in the program
aaply | input [A]rray, output [A]rray, using [PLY]r package
daply | input [D]ata frame, output [A]rray, using [PLY]r package
laply | input [L]ist, output [A]rray, using [PLY]r package
adply | input [A]rray, output [D]ata frame, using [PLY]r package
alply | input [A]rray, output [L]ist, using [PLY]r package
a_ply | input [A]rray, output discarded (i.e., _ is blank), using [PLY]r package
RODBC package | [R] [O]pen [D]ata[B]ase [C]onnectivity
psych package | [PSYCH]ology related functions
zelig package | "Zelig is named after a Woody Allen movie about a man who had the strange ability to become the physical and psychological reflection of anyone he met and thus to fit perfectly in any situation." - http://gking.harvard.edu/zelig/
strucchange package | [STRUC]tural [CHANGE]
relaimpo package | [RELA]tive [IMPO]rtance
car package | [C]ompanion to [A]pplied [R]egression | Named after book by John Fox
OpenMx package | [OPEN] Source [M]atri[X] algebra interpreter | Need confirmation that [Mx] means matrix
df in write.foreign | [D]ata [F]rame
GNU S word | [G]NU's [N]ot [U]nix [S]
R FAQ word | R [F]requently [A]sked [Q]uestions
DVI format | [D]e[V]ice [I]ndependent file format
devel word | [DEVEL]opment | as in code under development
GPL word | [G]eneral [P]ublic [L]icense
utils package | [UTIL]itie[S]
mle | [M]aximum [L]ikelihood [E]stimation
rpart package | [R]ecursive [PART]itioning
sna package | [S]ocial [N]etwork [A]nalysis
ergm package | [E]xponential [R]andom [G]raph [M]odels
rbugs package | [R] interface to program [B]ayesian inference [U]sing [G]ibbs [S]ampling
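
A few of the naming patterns in the table are easier to remember once they are seen side by side. The following is a minimal sketch using only base R functions; the object names (scores, m, and so on) are made up for illustration and are not part of the original table.

```r
# The *apply family: the prefix hints at the shape of the result.
scores <- list(a = c(1, 2, 3), b = c(4, 5, 6))

as_list   <- lapply(scores, mean)  # [L]ist apply: always returns a list
as_vector <- sapply(scores, mean)  # [S]implify apply: a named numeric vector here

# apply() works over the margins of a matrix: MARGIN = 1 is rows, 2 is columns.
m <- matrix(1:6, nrow = 2, ncol = 3)  # 2 rows are stated first, then 3 columns
row_totals <- apply(m, 1, sum)
col_totals <- apply(m, 2, sum)

# Distribution functions share a stem (norm) and vary the first letter.
density_at_zero <- dnorm(0)    # [D]ensity
prob_below_zero <- pnorm(0)    # cumulative [P]robability
median_quantile <- qnorm(0.5)  # [Q]uantile
set.seed(1)                    # fixed seed so the random draws are repeatable
draws <- rnorm(5)              # [R]andom draws

# sprintf() returns the formatted string rather than printing it.
label <- sprintf("mean = %.2f", as_vector[["a"]])
```

The same first-letter pattern (d, p, q, r) should carry over to the other distribution stems listed in the table, such as exp, pois, and binom.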

Concluding Comments

I thank Tom Short for his R Reference Card, which provided some inspiration for a starting list of R commands. Feel free to reproduce or adapt this table elsewhere. For example, perhaps it could be included in an R Wiki with additional entries. If you spot an error in the table, let me know in the comments of this post.

I might expand the table in the future. At the moment, it's mainly function names with not many arguments or values of arguments. I also haven't put much time into grouping and ordering the functions.


14 comments:

  1. very good idea!

    Is 'ggplot' really from graphics? I think it's the main function of ggplot2....

  2. Wonderful post - thank you!

    Tal

  3. @Anonymous
    Yes, ggplot is the main function from the package ggplot2. And the "gg" stands for Grammar of Graphics. This is because Hadley Wickham designed ggplot2 to be theoretically grounded in the Grammar of Graphics. Leland Wilkinson wrote the book on Grammar of Graphics.

  4. You list df as "degrees of freedom". That's true when it is used as an argument, but it is also a function, "density of the F distribution".

  5. sprintf is another UNIX one. Like the C library function of the same name; it's a printf to a string instead of STDOUT.

  6. this is so discouraging to me, *nix abbreviations make no sense

  7. Thanks SeanW and Murdoch: made the changes and added a few more.
    @Anonymous(12/5/2010): the aim of the post was to highlight the logic that went into their creation.

  8. Hey, can anyone help me find out which companies are using the R programming language?

  9. @Raghu: This probably isn't the best forum for finding out.
    A few options:
    1. Join Twitter and post a question with the #rstats tag
    2. If you phrase your question right (i.e., do a little internet research; say why you want to know), you could post it on R-help or Stack Overflow with the r tag

  10. Nice post!
    Can you please tell me more about using the cat command to append output from an R script or terminal to an existing text file?

  11. Is ggplot the same as R?
    If not, then I am a first-year student studying statistics and need a link or an ebook with an introduction to R. Please help!!

  12. @Anonymous Welcome to R!

    Google is your friend. Just search for "introduction to R".

    My own post is:
    http://jeromyanglim.blogspot.com/2009/06/learning-r-for-researchers-in.html

    ggplot2 is a graphics package that extends the functionality of R.

  13. Thank you so much for this post. It just drives me insane when I don't know where an acronym or abbreviation comes from. I think this is more than just R being written by people who have more background in computers and statistics. I think that math-oriented people simply don't care where words or acronyms come from. I think there are some people whose brains are just fundamentally oriented toward words and sounds and not toward numbers. Word people, as I call them in my own mind, just have a need to know where a word comes from or, in the case of a word that has a prior meaning, why that word is being used in a particular idiosyncratic way within a particular context. I remember in my graduate statistics class asking the very math-oriented Mark Hanson why the 'product moment matrix' was called that, i.e., why the word 'moment' was used in that way. I can still remember the look he had on his face. I could tell he just thought that was the most irrelevant and pointless question he could imagine. His answer was naturally along the lines of 'because that is what it is called.'

    By the way, does anyone here know why it is called the product 'moment' matrix?

  14. Distributions have moments.
    http://en.wikipedia.org/wiki/Moment_(mathematics)

    The first moment is the mean; the second central moment is the variance.

    Multivariate distributions have covariances, which are moments of that distribution.
    Covariances are based on products of the two variables after centering.

    Thus, a covariance matrix is a matrix of moments.
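
The "product of centred variables" description can be checked numerically. The following is only a sketch with made-up data (x and y are hypothetical vectors); it assumes the usual sample covariance, which divides by n - 1 as R's cov() does.

```r
# Hypothetical data; any two numeric vectors of equal length would do.
x <- c(2, 4, 6, 8)
y <- c(1, 3, 2, 6)

# Centre each variable, multiply element-wise, and average the products.
# cov() divides by n - 1 rather than n, so the same divisor is used here.
n <- length(x)
product_moment <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Compare the hand-computed value with the built-in function.
matches <- isTRUE(all.equal(product_moment, cov(x, y)))
```

With these numbers the centred products are 6, 0, -1 and 9, so the sum is 14 and the covariance is 14 / 3.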
