Chapter 11 Homework 2
11.1 Overview
11.2 Setup
Create a new R Markdown file with the title “Homework 2” and with you as the author (hint: this information should be in your frontmatter at the top of the file).
For this assignment, you’ll need to have the tidyverse collection of R packages installed.
If you haven’t already, go ahead and install them by running install.packages("tidyverse")
.
In your R markdown file, create a section heading for each of the following parts of your homework:
- Part A. Visualizing data with ggplot2
- Part B. Loading data in R
11.3 Part A. Loading data in R
Find a publicly available dataset that is represented as a .csv file. Your chosen dataset should not be trivial (# rows * # columns should be at least 250 cells), and you may not choose the same dataset that I demonstrate in this week’s lecture material. When choosing a dataset, keep in mind that you need to creating at least three different types of plots using your chosen dataset. You will upload the chosen data set as a .csv file along with your homework submission.
Tell me about your data set. Where did you get it? Who/what created it?
Load your dataset into a data frame (or a tibble, your choice)
Print the number of rows and columns (using R code)
Print the column names (using R)
If your dataset is not tidy, reformat it (using R code) to be tidy.
Use the
mutate
function (from the dplyr package) to add a column to your data frame. It’s okay if the new column isn’t useful.
11.4 Part B. Visualizing data with ggplot2
Using ggplot2, create three different types of plots (use at least three different geoms) using your chosen data set from Part A. Be creative! Feel free to draw inspiration from the R Graph Gallery.
On each of your plots, explicitly label your axes (e.g., using the labs
function).
On at least one of your plots, use either the color or fill aesthetic.