Chapter 28 Homework 6
28.1 Overview
In this assignment, you will apply K-means clustering to a dataset and interpret the results.
Use this file as the template for your homework, adding new code chunks as needed.
28.1.1 Objectives
- Apply K-means clustering using R
- Interpret the resulting clusters
- Apply a different clustering algorithm of your choice to the dataset and compare its results to the K-means clustering results
28.2 Setup
Load all of the packages you need in the code chunk below (install any that you do not already have). Feel free to load any extra packages that you want to use for your homework.
library(tidyverse)
library(cluster)
Set the random number seed to 1:
# Set the random number seed:
set.seed(1)
28.3 Part A. Running K-means
28.3.1 1. Load the dataset
Download the dataset provided on Blackboard (data.csv) and load it as a data frame. There are no missing values, so you don’t need to remove any rows. Normally, you would standardize each attribute you plan to cluster with, but for this assignment, don’t do any scaling or standardizing before clustering.
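As a rough sketch of the loading step, the example below writes a small made-up CSV to a temporary file and reads it back with base R's read.csv. The column names and values are hypothetical stand-ins; the real data.csv may be laid out differently, and for the assignment you would pass the path to data.csv instead.

```r
# Hypothetical stand-in for data.csv: the real column names and values may differ.
demo_path <- tempfile(fileext = ".csv")
demo <- data.frame(
  age            = c(23, 35, 47, 61),
  media          = c(120, 80, 60, 40),
  food           = c(300, 450, 500, 380),
  transportation = c(150, 220, 180, 90),
  housing        = c(900, 1400, 1600, 1100),
  pets           = c(0, 50, 120, 30)
)
write.csv(demo, demo_path, row.names = FALSE)

# Load the file as a data frame; for the assignment, point this at data.csv.
spending <- read.csv(demo_path)

# Inspect the column types and confirm that nothing is missing.
str(spending)
anyNA(spending)
```

Since the tidyverse is already loaded in the setup chunk, readr's read_csv would work here as well and returns a tibble rather than a plain data frame.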
# Your code here.
These data represent people: their age and how they spend their money (media, food, transportation, housing, and pets). In the questions below, you will use the K-means clustering algorithm to cluster these data.
28.3.5 5. In your own words, describe each cluster.
Look at the centroids for each cluster. Based on the centroids, describe each of the five clusters. Are there any obvious details about the people in each cluster that you can see based on the cluster centroids?
- Cluster 1:
- Cluster 2:
- Cluster 3:
- Cluster 4:
- Cluster 5:
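One possible way to read centroids off a fitted model is sketched below, using base R's kmeans on small synthetic data. The toy data frame, its two columns, and the choice of three clusters are assumptions made only for illustration; they are not the assignment's dataset or its expected answer.

```r
set.seed(1)

# Hypothetical stand-in data: two unscaled numeric columns forming three loose groups.
toy <- data.frame(
  age     = c(rnorm(20, 25, 2),   rnorm(20, 45, 2),    rnorm(20, 65, 2)),
  housing = c(rnorm(20, 800, 50), rnorm(20, 1500, 50), rnorm(20, 1100, 50))
)

# Fit K-means: centers is the number of clusters, nstart the number of random restarts.
fit <- kmeans(toy, centers = 3, nstart = 10)

# One row per cluster, one column per attribute:
# each entry is the mean of that attribute among the cluster's members.
round(fit$centers, 1)

# Number of observations assigned to each cluster.
fit$size
```

Each row of fit$centers can be read as the "average person" in that cluster, which is what the descriptions above should be based on. Cluster numbers are arbitrary labels, so describe clusters by their centroid values rather than by their number. Because the columns are not scaled, attributes with larger numeric ranges will dominate the distance calculation and therefore the grouping.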