Preface

What this book is about

This book explores the ever-expanding universe of R. It covers a wide range of topics, including:

  • The historical development of the R language, the R environment, and the installation of R (Ch 1)

  • The creation of R objects and their fundamental characteristics (Ch 2)

  • R data storage entities, and the import and export of user data files (Ch 3)

  • Data management approaches using base R (Ch 4) and the tidyverse (Ch 5)

  • R approaches to graphics, including base plotting methods (Ch 6) and the ggplot2 package (Ch 7)

  • R functions (Ch 8), including loops and the creation of user-defined classes and generic methods

  • Calling other languages (e.g., C, Fortran, Python) and software environments to and from R (Ch 9)

  • Building user-designed R packages (Ch 10)

  • Interactive R interfaces and web applications, including approaches from the packages tcltk, plotly, and shiny (Ch 11)

  • The fundamental ways that R interacts with your computer (Ch 12)

While this book covers a lot of ground, clearly many other topics could be considered. Topics explored here include those I have found to be particularly useful or interesting during my 20+ years of using R as a biologist and statistician. Notably, descriptions given here often serve as mere starting points for further exploration, and the reader is directed to additional resources when necessary.

One goal in writing this book was to facilitate recognition of R as an important computer language by considering it within a historical and applied context. While ignored in many phylogenies of computer languages (e.g., Boutin et al. 2002), R has clearly undergone evolutionary changes from its progenitor languages (e.g., S, Lisp, Scheme) to its current status. Further, links can be established to and from R with respect to other popular languages (e.g., C, C++, Matlab, Perl, Python, SQL) and software (e.g., Excel, SPSS, SAS), to the benefit of R users.

Individuals from the natural sciences, particularly biologists, are likely to find this book more useful than individuals from other backgrounds. This is because coding examples and applications in the book are generally biological. Non-biologists may find, however, that the examples readily extend to other settings.

What this book is not about

Notably, although statistics is the primary focus and purpose of R, the primary focus of this book is not statistics. Instead, I focus on the R language, and the computational characteristics, capabilities, and extensions of the R environment. I take this approach because: 1) coverage of non-statistical topics is challenging in and of itself, and 2) the responsible introduction of statistical algorithms from any program or language (including R) should be accompanied by detailed information concerning the statistical procedures. Many pedagogic resources exist for the statistical application of R. These include: Aho (2014) (the pedagogic statistical companion to this book), Venables and Ripley (2002), Faraway (2004, 2016), Crawley (2012), and Fox and Weisberg (2019), among others.

It should be noted that a number of other R pedagogic texts have emphasized R's underlying mechanisms and extensions, while excluding explicit statistical considerations. For instance, Wickham (2019) admirably emphasizes foundational programming ideas in R, but does not thoroughly consider some important programming extensions, including powerful syntheses with Python and Tcl. It should also be emphasized that while this text does not focus on inferential statistical methods, it does emphasize methods for handling, summarizing, and displaying empirical data.

Conventions

This document has been created with Windows users of R in mind. In the vast majority of cases, however, instructions and examples will extend to other platforms. Where this is not true, I note steps to address the inconsistencies.

Several conventions are followed throughout the text. R package names and important terms are italicized. R functions and code are written in Courier font. For instance, This is code. Functions and code are often written into chunks whose contents are readily copied to a clipboard using an icon located at the top right of the chunk (bookdown version of text only). For instance:

print("hello world")

The results of an evaluated chunk are often printed immediately below. For instance:

[1] "hello world"

References

Aho, Ken A. 2014. Foundational and Applied Statistics for Biologists Using R. CRC Press.
Boutin, P, B Hailpern, T Proebsting, and G Wiederhold. 2002. “Mother Tongues - Tracing the Roots of Computer Languages Through the Ages.” Wired.
Crawley, Michael J. 2012. The R Book. John Wiley & Sons.
Faraway, Julian J. 2004. Linear Models with R. Chapman & Hall/CRC.
———. 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. CRC Press.
Fox, John, and Sanford Weisberg. 2019. An R Companion to Applied Regression. Third. Thousand Oaks CA: Sage. https://socialsciences.mcmaster.ca/jfox/Books/Companion/.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.
Wickham, Hadley. 2019. Advanced R. CRC Press.