Given a regular expression with capturing groups, extract() turns each group into a new column. If the groups don't match, or the input is NA, the output will be NA.

Usage

extract(data, col, into, regex = "([[:alnum:]]+)", remove = TRUE,
  convert = FALSE, ...)



Arguments

data
A data frame.

col
Bare column name.

into
Names of new variables to create, as a character vector.

regex
A regular expression used to extract the desired values.

remove
If TRUE, remove the input column from the output data frame.

convert
If TRUE, will run type.convert with as.is = TRUE on new columns. This is useful if the component columns are integer, numeric or logical.

...
Other arguments passed on to regexec to control how the regular expression is processed.
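
The effect of remove and convert is easiest to see side by side. The following is a minimal sketch, assuming tidyr is installed; the data frame, the id column, and the names prefix and num are hypothetical and chosen only for illustration.

```r
library(tidyr)

# Hypothetical data: identifiers of the form "<letters><digits>".
# "zz" has no digits, so the regex does not match it.
df <- data.frame(id = c("ab12", "cd345", NA, "zz"), stringsAsFactors = FALSE)

out <- extract(
  df, id,
  into = c("prefix", "num"),
  regex = "([[:alpha:]]+)([[:digit:]]+)",
  remove = FALSE,  # keep the original id column
  convert = TRUE   # let type.convert() turn the digit strings into integers
)

str(out)
```

With convert = FALSE the num column would stay character; rows where the regex does not match, or where the input is NA, are NA in both new columns either way.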

See also

extract_ for a version that uses regular evaluation and is suitable for programming with.


Examples

library(tidyr)
library(dplyr)
df <- data.frame(x = c(NA, "a-b", "a-d", "b-c", "d-e"))
df %>% extract(x, "A")
#>      A
#> 1 <NA>
#> 2    a
#> 3    a
#> 4    b
#> 5    d
df %>% extract(x, c("A", "B"), "([[:alnum:]]+)-([[:alnum:]]+)")
#>      A    B
#> 1 <NA> <NA>
#> 2    a    b
#> 3    a    d
#> 4    b    c
#> 5    d    e
# If no match, NA:
df %>% extract(x, c("A", "B"), "([a-d]+)-([a-d]+)")
#>      A    B
#> 1 <NA> <NA>
#> 2    a    b
#> 3    a    d
#> 4    b    c
#> 5 <NA> <NA>
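
The NA behaviour in the last example can be reproduced with base R alone. The sketch below is not how extract() is implemented; extract_groups is a hypothetical helper that only illustrates the matching rule (one output column per capturing group, NA when the input is NA or the pattern does not match), using regexec and regmatches.

```r
# Hypothetical helper, not part of tidyr: returns a character matrix with
# one row per input element and one column per capturing group.
extract_groups <- function(x, regex, n_groups) {
  out <- matrix(NA_character_, nrow = length(x), ncol = n_groups)
  ok <- which(!is.na(x))  # NA inputs keep their NA rows
  # For each element: full match followed by the groups, or character(0)
  m <- regmatches(x[ok], regexec(regex, x[ok]))
  for (i in seq_along(ok)) {
    if (length(m[[i]]) == n_groups + 1L) {
      out[ok[i], ] <- m[[i]][-1L]  # drop the full match, keep the groups
    }
  }
  out
}

x <- c(NA, "a-b", "a-d", "b-c", "d-e")
res <- extract_groups(x, "([a-d]+)-([a-d]+)", n_groups = 2)
res
```

Here "d-e" yields NA in both columns because "e" falls outside the [a-d] character class, so the pattern as a whole fails to match.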