Introduction to the tidyverse


Marie-Hélène Burle

The tidyverse is a set of packages that attempts to make R more consistent and more similar to programming languages developed by computer scientists rather than statisticians.

You can think of it as a more modern version of R.

Base R or tidyverse?

“Base R” refers to the use of the standard R library. The expression is often used in contrast to the tidyverse.

There are many things that you can do with either base R or the tidyverse. Because the syntaxes are quite different, it almost feels like using two different languages, and people tend to favour one or the other.
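As a rough illustration of how different the two syntaxes can look, here is the same summary computed both ways. This is only a sketch: it assumes that the dplyr package is installed and that R is version 4.1 or later (for the native pipe `|>`), and it uses the built-in mtcars dataset, which is not part of the original text.

```r
library(dplyr)

# Base R: mean fuel consumption (mpg) per number of cylinders (cyl),
# keeping only the cars with a weight (wt, in 1000 lbs) above 2.5
heavy <- mtcars[mtcars$wt > 2.5, ]
base_result <- aggregate(mpg ~ cyl, data = heavy, FUN = mean)

# Tidyverse: the same computation written as a pipeline of verbs
tidy_result <- mtcars |>
  filter(wt > 2.5) |>
  group_by(cyl) |>
  summarise(mpg = mean(mpg))

base_result
tidy_result
```

Both approaches give the same numbers. The base R version returns a data frame and the tidyverse version returns a tibble.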

Which one you should use is really up to you.

Base R                                          | Tidyverse
------------------------------------------------|--------------------------------------------------
Preferred by old-schoolers                      | Increasingly becoming the norm with newer R users
More stable                                     | More consistent syntax and behaviour
Doesn’t require installing and loading packages | More and more resources and documentation available

In truth, even though the tidyverse has many detractors amongst long-time R users, it is increasingly becoming the norm.

A glimpse of the tidyverse

The best introduction to the tidyverse is probably the book R for Data Science by Hadley Wickham and Garrett Grolemund.

Posit (the company behind the tidyverse, formerly known as RStudio Inc.) developed a series of useful cheatsheets. Below are links to the ones you are most likely to use as you get started with R.

Data import

The first thing you often need to do is to import your data into R. This is done with readr.
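A minimal sketch of what this can look like, assuming readr (version 2.0 or later) is installed. The file and its content are made up for the example: a small CSV file is first written to a temporary location so that the code is self-contained.

```r
library(readr)

# Create a small CSV file in a temporary location (hypothetical data)
path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = 1:3, species = c("cat", "dog", "owl")), path)

# Read it back: read_csv() returns a tibble and guesses the column types
animals <- read_csv(path, show_col_types = FALSE)
animals
```

With your own data, you would replace `path` with the path to your file.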

Data transformation

You then often need to transform your data into the right format. This is done with the packages dplyr and tidyr.
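The sketch below gives an idea of how the two packages fit together, assuming both are installed and R is version 4.1 or later (for the native pipe `|>`). The `counts` table and its column names are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide-format data: one row per site, one column per year
counts <- tibble(
  site = c("A", "B"),
  `2021` = c(10, 4),
  `2022` = c(12, 7)
)

# tidyr: reshape to long format (one row per site and year)
long <- counts |>
  pivot_longer(cols = -site, names_to = "year", values_to = "n")

# dplyr: convert a column, keep some of the rows, and sort them
result <- long |>
  mutate(year = as.integer(year)) |>
  filter(n > 5) |>
  arrange(desc(n))

result
```

Roughly speaking, tidyr changes the shape of a table, while dplyr works on its rows and columns.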


Visualization

Visualization in the tidyverse is done with the ggplot2 package, which we will explore in the next section.

Working with factors

The package forcats offers the tidyverse approach to working with factors.
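For instance, reordering the levels of a factor is a common task. The following is a small sketch, assuming forcats is installed. The `answers` vector is made up for the example.

```r
library(forcats)

# Hypothetical survey answers stored as a factor
answers <- factor(c("low", "high", "medium", "low", "low", "high"))

# By default, the levels are in alphabetical order
levels(answers)

# Reorder the levels by decreasing frequency
by_freq <- fct_infreq(answers)
levels(by_freq)

# Set a meaningful order by hand
by_hand <- fct_relevel(answers, "low", "medium", "high")
levels(by_hand)
```

The order of the levels matters, for example, for the order in which categories appear in plots.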

Working with strings

stringr is for strings.
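Here are a few of its functions in action, as a short sketch that assumes stringr is installed. The `species` vector is made up for the example.

```r
library(stringr)

# Hypothetical character vector with stray whitespace
species <- c("  Ursus arctos", "Canis lupus ", "Ursus maritimus")

# Remove leading and trailing whitespace
clean <- str_trim(species)

# Which names start with "Ursus"?
is_bear <- str_detect(clean, "^Ursus")

# Replace the first space in each name with an underscore
with_underscore <- str_replace(clean, " ", "_")

# Convert to upper case
upper <- str_to_upper(clean)
```

The stringr functions start with `str_` and take the string as their first argument, which makes them easy to find and to combine.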

Working with dates

lubridate will help you deal with dates.
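As a quick sketch (assuming lubridate is installed, and with arbitrary example dates), parsing dates and doing simple arithmetic on them can look like this:

```r
library(lubridate)

# Parse text written in year-month-day order
d <- ymd("2024-01-31")

# Parse text written in day-month-year order
d2 <- dmy("31/01/2024")

# Extract components
year(d)
month(d)
wday(d, label = TRUE)

# Date arithmetic: add one day
next_day <- d + days(1)
next_day
```

The parsing functions are named after the order of the components in the text: `ymd()`, `dmy()`, `mdy()`, and so on.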

Functional programming

Finally, purrr is the tidyverse equivalent to the apply functions in base R: a way to apply a function to each element of a list or vector.
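To give a rough idea, the sketch below applies the same function to each element of a list, first with base R and then with purrr. It assumes purrr is installed. The `values` list is made up for the example.

```r
library(purrr)

# Hypothetical list of numeric vectors
values <- list(a = 1:3, b = c(2, 4, 6), c = 10)

# Base R: sapply() simplifies the result when it can
base_means <- sapply(values, mean)

# purrr: map_dbl() returns a double vector or fails with an error
purrr_means <- map_dbl(values, mean)

# map_chr() does the same for character output, here with an anonymous function
sizes <- map_chr(values, function(x) paste(length(x), "value(s)"))

purrr_means
sizes
```

One difference is that the type of the output is stated in the name of the purrr function (`map_dbl()`, `map_chr()`, etc.), whereas `sapply()` decides how to simplify its result depending on the data.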