StringR – operations on strings

This time we’ll show you some helpful functions from the stringr library that work on strings. Let’s load the library first:

library(stringr)

Let’s create a sample vector with company names:

n = c("ibm","asus","acer","microsoft","lenovo","msi","dell")

We can use the function str_detect to check whether a particular phrase or letter occurs in each element of the vector:

str_detect(n,"a")

[1] FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE
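By default the pattern is treated as a regular expression, so anchors such as ^ work as well. Here is a minimal sketch of that; the variable names are only illustrative, and the negate argument assumes stringr 1.4.0 or later:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# "^m" is a regular expression: TRUE for names that start with "m"
starts_with_m <- str_detect(n, "^m")

# negate = TRUE flips the result: TRUE for names WITHOUT an "a"
# (assumption: stringr >= 1.4.0, where this argument is available)
no_a <- str_detect(n, "a", negate = TRUE)

print(starts_with_m)
print(no_a)
```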

str_which, in turn, returns the indices of the elements that contain that phrase:

str_which(n,"a")
[1] 2 3
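Since str_which returns positions, the result can be used directly to index the vector. A short sketch of that idea (idx and via_detect are illustrative names, not part of stringr):

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Positions of the elements that contain "a"
idx <- str_which(n, "a")

# The positions can index the original vector
matched <- n[idx]

# The same positions obtained with base which() on the logical result
via_detect <- which(str_detect(n, "a"))

print(idx)
print(matched)
```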

If we want to count how many times a particular phrase occurs in the elements, we use str_count:

str_count(n,"l")
[1] 0 0 0 0 1 0 2
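str_count also accepts regular expressions, so we can count a whole class of characters at once. A small sketch, assuming we want the number of vowels in each name:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# "[aeiou]" matches any single lowercase vowel
vowels <- str_count(n, "[aeiou]")

# Total number of "l" characters across the whole vector
total_l <- sum(str_count(n, "l"))

print(vowels)
print(total_l)
```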

Text fragments can be extracted by position with str_sub:

str_sub(n,start = 1,end = 2)
[1] "ib" "as" "ac" "mi" "le" "ms" "de"
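Negative positions in str_sub count from the end of the string, and the function also has an assignment form. A minimal sketch of both (the variable names are illustrative):

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# -2 and -1 are the second-to-last and last characters
last_two <- str_sub(n, start = -2, end = -1)

# Assignment form: overwrite the first character in place
x <- "dell"
str_sub(x, 1, 1) <- "D"

print(last_two)
print(x)
```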

We select the elements that contain the phrase using str_subset:

str_subset(n,"a")
[1] "asus" "acer"
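str_subset gives the same result as indexing the vector with the logical output of str_detect. A quick sketch to confirm this, using the pattern "s" as an arbitrary example:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Keep only the names that contain "s"
with_s <- str_subset(n, "s")

# The longer way: logical indexing with str_detect
with_s_manual <- n[str_detect(n, "s")]

print(with_s)
```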

And we extract the length of each element with str_length:

str_length(n)
[1] 3 4 4 9 6 3 4
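The lengths are handy for filtering. A small sketch, assuming we want only the names longer than four characters; it also shows that a missing value keeps its NA length:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Keep names with more than 4 characters
long_names <- n[str_length(n) > 4]

# A missing value has a missing length
lens <- str_length(c("dell", NA))

print(long_names)
print(lens)
```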

Unnecessary spaces at the beginning and end of the string are removed with str_trim:

str_trim(" dell   ")
[1] "dell"
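str_trim can also work on one side only, and the related str_squish additionally collapses repeated spaces inside the string. A minimal sketch with made-up input strings:

```r
library(stringr)

# Trim only the leading whitespace; trailing spaces are kept
left_only <- str_trim(" dell   ", side = "left")

# str_squish trims both ends and reduces inner runs of spaces to one
squished <- str_squish("  dell    technologies ")

print(left_only)
print(squished)
```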

We replace a phrase in each element with another phrase using str_replace, which changes only the first match in each element:

str_replace(n,"a","A")
[1] "ibm"       "Asus"      "Acer"      "microsoft" "lenovo"    "msi"       "dell"     
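To replace every occurrence rather than just the first one, stringr provides str_replace_all. A short sketch of the difference, using "o" and "0" as arbitrary example values:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Only the first "o" in the string is replaced
first_only <- str_replace("lenovo", "o", "0")

# Every "o" in every element is replaced
all_matches <- str_replace_all(n, "o", "0")

print(first_only)
print(all_matches)
```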

We convert strings to uppercase with str_to_upper:

str_to_upper(n)
[1] "IBM"       "ASUS"      "ACER"      "MICROSOFT" "LENOVO"    "MSI"       "DELL"  

To capitalize the first letter of each word (title case), we use str_to_title:

str_to_title(n)
[1] "Ibm"       "Asus"      "Acer"      "Microsoft" "Lenovo"    "Msi"       "Dell" 
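The difference between the case functions is easier to see on a string with more than one word. A minimal sketch with a made-up input; str_to_sentence assumes stringr 1.4.0 or later:

```r
library(stringr)

x <- "dell technologies"

# Title case: every word starts with a capital letter
title_case <- str_to_title(x)

# Sentence case: only the first word is capitalized
# (assumption: stringr >= 1.4.0, where str_to_sentence is available)
sentence_case <- str_to_sentence(x)

# Lowercase is the counterpart of str_to_upper
lower_case <- str_to_lower("IBM")

print(title_case)
print(sentence_case)
print(lower_case)
```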

We combine two string vectors element by element with str_c:

str_c(n,str_to_upper(n))
[1] "ibmIBM"             "asusASUS"           "acerACER"           "microsoftMICROSOFT" "lenovoLENOVO"       "msiMSI"             "dellDELL"    
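str_c takes a sep argument that is placed between the joined pieces, and a single string is recycled across a longer vector. A short sketch of both; the "-" separator and the "brand: " prefix are arbitrary example values:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# sep is inserted between the two pieces of each element
with_sep <- str_c(n, str_to_upper(n), sep = "-")

# A length-one string is recycled for every element of n
prefixed <- str_c("brand: ", n)

print(with_sep)
print(prefixed)
```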

We convert a vector of strings into a single string with the collapse argument of str_c:

str_c(n,collapse = ';')
[1] "ibm;asus;acer;microsoft;lenovo;msi;dell"
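The reverse operation is available as str_split, which returns a list with one character vector per input string. A minimal round-trip sketch (joined and back are illustrative names):

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Collapse the vector into one string
joined <- str_c(n, collapse = ";")

# Split it again; [[1]] takes the vector for the first (only) input string
back <- str_split(joined, ";")[[1]]

print(joined)
print(back)
```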

We sort the strings with str_sort:

str_sort(n)
[1] "acer"      "asus"      "dell"      "ibm"       "lenovo"    "microsoft" "msi"  
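str_sort can also sort in descending order, and its numeric argument compares embedded numbers by value instead of character by character. A small sketch; the model names are made up, and numeric = TRUE assumes a reasonably recent stringr version:

```r
library(stringr)

n <- c("ibm", "asus", "acer", "microsoft", "lenovo", "msi", "dell")

# Reverse alphabetical order
desc <- str_sort(n, decreasing = TRUE)

# Hypothetical model names that contain numbers
models <- c("model10", "model2", "model1")

# Plain sort compares character by character, so "model10" precedes "model2"
plain <- str_sort(models)

# numeric = TRUE sorts the digits as numbers
natural <- str_sort(models, numeric = TRUE)

print(desc)
print(plain)
print(natural)
```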

Of course, you can find more functions in the stringr documentation, and we encourage you to explore it.
