Lecture 11
College of Idaho
CSCI 2025 - Winter 2026
readr::parse_double() converts strings to numbers, assuming each string contains only a number.
readr::parse_number() is more flexible: it can extract the number from a string that also contains other text.

Let’s work with the parks dataset.
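To make the difference between the two parsers concrete, here is a minimal sketch. The input strings are made up for illustration and do not come from the parks dataset.

```r
library(readr)

# Hypothetical example strings (assumptions, not values from the parks dataset).
clean <- c("1.5", "42", "1e3")
messy <- c("$1,000", "45%", "about 3.2 acres")

# parse_double() expects each string to be nothing but a number.
clean_dbl <- parse_double(clean)

# On strings with extra text it fails and returns NA (with a warning,
# silenced here so the script runs quietly).
messy_dbl <- suppressWarnings(parse_double(messy))

# parse_number() ignores the surrounding text and keeps the first number it finds.
messy_num <- parse_number(messy)

print(clean_dbl)
print(messy_dbl)
print(messy_num)
```

In practice, parse_number() is the usual choice for columns that carry currency symbols, percent signs, or units.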
count()
dplyr::count() is a quick way to count the number of rows for each unique value of a variable.
It is equivalent to group_by() followed by summarize(n = n()).

Which cities appear most frequently in the parks dataset?
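The equivalence between count() and the group_by() plus summarize() pattern can be sketched on a small table. The tibble `toy` and its values are hypothetical stand-ins, not the real parks data.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration only).
toy <- tibble(
  city = c("Boise", "Boise", "Boise", "Nampa", "Nampa", "Caldwell")
)

# One call: count rows per city, most frequent first.
by_count <- toy |> count(city, sort = TRUE)

# The longer spelling of the same idea.
by_summary <- toy |>
  group_by(city) |>
  summarize(n = n()) |>
  arrange(desc(n))

print(by_count)
print(by_summary)
```

Both results have one row per city and a column n holding the row count.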
Use the wt argument of count() to sum up a variable instead of just counting rows.

Let’s create a new metric: the best city for parks since this was created.
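A weighted count can be sketched as follows. The column name `points` and all values are assumptions made up for this example; the parks dataset may use different names.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration only).
toy <- tibble(
  city   = c("Boise", "Boise", "Nampa"),
  points = c(10, 20, 5)
)

# Plain count(): how many rows does each city have?
rows_per_city <- toy |> count(city)

# Weighted count(): sum the points column within each city instead.
points_per_city <- toy |> count(city, wt = points)

print(rows_per_city)
print(points_per_city)
```

In both cases the result column is called n; with wt it holds a sum rather than a row count.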
Let’s do some practice!
Basic arithmetic operators: +, -, *, /, ^.
%% (remainder) and %/% (integer division) are useful for modular arithmetic.
In the flights dataset, create new columns for the departure hour and minute.

Logarithms (log(), log2(), log10()) are useful for data that spans multiple orders of magnitude.

round(x): rounds to the nearest integer; an optional second argument specifies the number of digits.
floor(x): always rounds down.
ceiling(x): always rounds up.

cut() divides a numeric vector into a set of discrete bins (a factor).
Specify the breaks for the bins.

Compute the average rank of each city, rounding to 2 decimal places.
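The numeric transformations above can be tried on small inputs. This sketch assumes departure times are stored as integers in HHMM form (for example, 517 for 5:17), and the tibble `toy_flights` is a hypothetical stand-in for the flights dataset.

```r
library(dplyr)

# Hypothetical departure times in HHMM form (an assumption for illustration).
toy_flights <- tibble(dep_time = c(517, 1545, 2359))

# Integer division by 100 gives the hour; the remainder gives the minute.
toy_flights <- toy_flights |>
  mutate(
    dep_hour   = dep_time %/% 100,
    dep_minute = dep_time %% 100
  )
print(toy_flights)

# Logarithms compress values that span several orders of magnitude.
magnitudes <- log10(c(1, 100, 10000))

# Rounding helpers.
rounded <- round(3.14159, 2)   # keep 2 decimal places
down    <- floor(-1.5)         # toward negative infinity
up      <- ceiling(-1.5)       # toward positive infinity

# cut() bins numbers into a factor; breaks are the bin edges,
# and labels (optional) name each bin.
sizes <- cut(
  c(1, 5, 12, 30),
  breaks = c(0, 10, 20, 40),
  labels = c("low", "mid", "high")
)
print(sizes)
```

By default each bin made by cut() excludes its left edge and includes its right edge, so a value of 10 would fall in the "low" bin here.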
dplyr::min_rank() gives ranks, handling ties by giving them the same rank.
Use desc() to rank from highest to lowest.
row_number() is similar but gives each row a unique rank.

dplyr::lag() gets the previous value in a vector.
dplyr::lead() gets the next value.

mean(): the average value. Can be sensitive to outliers.
median(): the middle value. More robust to outliers.
sd(): standard deviation, measures how spread out the data is around the mean.
IQR(): interquartile range (Q3 - Q1), measures the spread of the middle 50% of the data.
min() and max(): the minimum and maximum values.
quantile(x, p): finds the value that is greater than a proportion p of the data, where p is between 0 and 1.
quantile(x, 0.25) is the 25th percentile (Q1).
median(x) is a shortcut for quantile(x, 0.5).

Parse numbers from strings with parse_number().
Bin numeric data with cut().
count() is a powerful tool for quick exploration.
Use ranks (min_rank) and offsets (lag, lead) for more complex analysis.
Summarize center (mean, median), spread (sd, IQR), and position (min, max, quantile).
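As a recap of the ranking, offset, and summary functions, here is a minimal sketch on two small vectors. The values are made up for illustration.

```r
library(dplyr)

# Hypothetical values with one tie (an assumption for illustration).
x <- c(10, 20, 20, 40)

# Ranks: ties share the smallest rank; desc() reverses the order.
rank_low_first  <- min_rank(x)
rank_high_first <- min_rank(desc(x))
unique_rank     <- row_number(x)   # ties broken by position

# Offsets: shift the vector by one position, padding with NA.
previous_value <- dplyr::lag(x)
next_value     <- dplyr::lead(x)

# Center: a single outlier pulls the mean far more than the median.
y <- c(1, 2, 3, 4, 100)
center <- c(mean = mean(y), median = median(y))

# Spread and position.
spread <- c(sd = sd(x), IQR = IQR(x))
q1     <- quantile(x, 0.25)
q2     <- quantile(x, 0.5)

print(rank_low_first)
print(rank_high_first)
print(previous_value)
print(center)
print(spread)
```

Writing dplyr::lag() with the package prefix avoids confusion with stats::lag(), which has the same name but different behavior.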