Strings

Lecture 12

Dr. Eric Friedlander

College of Idaho
CSCI 2025 - Winter 2026

Introduction

Text data, or “strings”, are very common.
The stringr package, part of the tidyverse, provides a modern and consistent interface for working with strings.
All functions in stringr start with str_.

str_c() combines multiple vectors into a single character vector.
str_glue(), stringr's wrapper around the glue package, is great for embedding R code inside a string.
Wrap your R code in {}.
Let’s create a full_name and greeting column using both functions.
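
The code for this demo is not included in these notes, so here is a minimal sketch of what it might look like. The people tibble and its first and last columns are hypothetical stand-ins, not data from the lecture.

```r
library(tidyverse)  # loads stringr, dplyr, and tibble

# Hypothetical data for illustration only; not from the lecture.
people <- tibble(
  first = c("Ada", "Alan"),
  last  = c("Lovelace", "Turing")
)

people <- people |>
  mutate(
    # str_c() pastes the vectors together element by element
    full_name = str_c(first, " ", last),
    # str_glue() evaluates the R code inside {} for each row
    greeting = str_glue("Hello, {full_name}!")
  )

people
```

Because mutate() creates columns in order, greeting can refer to the full_name column defined just above it.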

tidyr::separate_wider_delim() splits a column into multiple new columns based on a delimiter.
You must provide names for the new columns.

library(tidyverse)

df <- tibble(x = c("a_b_1", "c_d_2", "e_f_3"))
df |> separate_wider_delim(
  x,
  delim = "_",
  names = c("first", "second", "third")
)

# A tibble: 3 × 3
  first second third
  <chr> <chr>  <chr>
1 a     b      1    
2 c     d      2    
3 e     f      3

tidyr::separate_longer_delim() splits a column into multiple rows based on a delimiter.

df <- tibble(x = 1:2, y = c("a,b", "c,d,e"))
df |> separate_longer_delim(y, delim = ",")

# A tibble: 5 × 2
      x y    
  <int> <chr>
1     1 a    
2     1 b    
3     2 c    
4     2 d    
5     2 e

Let’s extract the AR codes from the Naturalization data!

str_length(c("a", "R for data science", NA))

[1]  1 18 NA

x <- c("Apple", "Banana", "Pear")
str_sub(x, 1, 3)

[1] "App" "Ban" "Pea"

str_sub(x, -2, -1)

[1] "le" "na" "ar"

str_to_lower("I am shouting.")

[1] "i am shouting."

str_to_title("a tale of two cities")

[1] "A Tale Of Two Cities"

str_trim() removes whitespace from the start and end of a string.
str_squish() also removes whitespace from the start and end, and reduces any internal whitespace to a single space.

text <- "  this   has  a lot of   whitespace   "
str_trim(text)

[1] "this   has  a lot of   whitespace"

str_squish(text)

[1] "this has a lot of whitespace"

The real power of stringr comes from pattern matching with regular expressions.
We’ll cover these in the next lecture!
Functions include:
- str_detect(): find if a pattern exists.
- str_count(): count the number of matches.
- str_replace(): replace matches with a new string.
- str_extract(): pull out the matching text.
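
As a quick preview, the four functions above can be tried with plain literal patterns, before any regular expression syntax is introduced. This is a minimal sketch; the fruit vector is a hypothetical example, not data from the lecture.

```r
library(stringr)

# Hypothetical example vector; not from the lecture.
fruit <- c("apple", "banana", "pear")

# Does each string contain "an"?
has_an <- str_detect(fruit, "an")           # FALSE TRUE FALSE

# How many times does "a" occur in each string?
n_a <- str_count(fruit, "a")                # 1 3 1

# Replace the first "a" in each string with "A"
swapped <- str_replace(fruit, "a", "A")     # "Apple" "bAnana" "peAr"

# Pull out the first "an" from each string (NA when there is no match)
found <- str_extract(fruit, "an")           # NA "an" NA
```

Note that str_replace() and str_extract() act on the first match only; stringr also provides str_replace_all() and str_extract_all() for every match.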

Let’s practice working with strings!

The stringr package provides a consistent set of tools for working with strings.
Create strings with str_c() and str_glue().
Extract data with tidyr::separate_*() functions.
Manipulate letters with str_length(), str_sub(), str_to_*(), and str_trim()/str_squish().
The next step is to master pattern matching with regular expressions.

Read Chapter 14: Strings from r4ds.
Open the Recitation Gem and say “Provide me practice problems for Chapter 14” or work through some of the exercises in the text.
Move on to Lecture 13.