How to use the sub() and gsub() functions in R

Introduction

In R, the sub() and gsub() functions replace a matched string or pattern in a character vector, such as a column of a data frame, with a replacement string. These functions are handy for applying the same modification across a large set of data.

In this article, you will learn how to use the sub() and gsub() functions in R.

Requirements

To complete this tutorial, you will need:

  • R installed locally or on a server.

The syntax of the sub() and gsub() functions

The basic syntax for sub() is:

sub(pattern, replacement, x)

The basic syntax for gsub() is:

gsub(pattern, replacement, x)

Both sub() and gsub() take a pattern, a replacement, and a character vector (for example, a column of a data frame).

  • pattern: The string or pattern that you want to replace.
  • replacement: The string that replaces each match of the pattern.
  • x: The character vector, or data frame column, in which to make the replacement.

The pattern can also be a regular expression (regex).

Now that you know the syntax, you can move on to the examples.
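As a minimal sketch of a regex pattern, the example below replaces runs of digits. The input string and the variable names are made up for illustration and are not part of the tutorial's data. It also shows the fixed = TRUE argument, which makes both functions treat the pattern as a literal string instead of a regex.

```r
# hypothetical input, chosen only for illustration
rooms <- "room 101, floor 3"

# '[0-9]+' matches one or more digits; sub() replaces only the first run
first_only <- sub('[0-9]+', 'N', rooms)

# gsub() replaces every run of digits
all_runs <- gsub('[0-9]+', 'N', rooms)

# in a regex, '.' matches any character, so every character is replaced
dots_regex <- gsub('.', '-', 'a.b.c')

# fixed = TRUE treats '.' as a literal dot
dots_literal <- gsub('.', '-', 'a.b.c', fixed = TRUE)

print(first_only)    # "room N, floor 3"
print(all_runs)      # "room N, floor N"
print(dots_regex)    # "-----"
print(dots_literal)  # "a-b-c"
```

Using fixed = TRUE is a reasonable default whenever the text to replace contains characters such as '.', '(' or '*' that have a special meaning in a regex.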

The sub() function in R

In R, the sub() function replaces a matched pattern in a character vector with the replacement string.

The limitation of sub() is that it replaces only the first match in each element.

Using the sub() function

In this example, you will use sub() to replace a string pattern with a different string.

# the input vector 
df<-"R is an open-source programming language widely used for data analysis and statistical computing."

# the replacement
sub('R','The R language',df)

Executing this command produces the following result.

Output

"The R language is an open-source programming language widely used for data analysis and statistical computing."

The sub() function replaces the occurrence of ‘R’ in the vector with the string ‘The R language’.

In this example, the pattern matched only once. Consider what happens when the pattern matches several times.

# the input vector
df<-"In this tutorial, we will install R and show how to add packages from the official Comprehensive R Archive Network (CRAN)."

# the replacement
sub('R','The R language',df)

Executing this command produces the following result.

Output

"In this tutorial, we will install The R language and show how to add packages from the official Comprehensive R Archive Network (CRAN)."

In this case, the sub() function has replaced the first occurrence of ‘R’ with ‘The R language’. The later occurrences in the string (in ‘Comprehensive R Archive Network’ and ‘CRAN’) are left unchanged.

Using the sub() function with a data frame

The sub() function can also be applied to the columns of a data frame.

# creating a data frame
df<-data.frame(Creature=c('Starfish','Blue Crab','Bluefin Tuna','Blue Shark','Blue Whale'),Population=c(5,6,4,2,2))

# data frame
df

This produces the following data frame.

      Creature Population
1     Starfish          5
2    Blue Crab          6
3 Bluefin Tuna          4
4   Blue Shark          2
5   Blue Whale          2

Now try to replace ‘Blue’ with ‘Green’ by passing the whole data frame to sub().

# substituting the values
sub('Blue','Green',df)

Executing this command produces the following result.

Output

"c(\"Starfish\", \"Green Crab\", \"Bluefin Tuna\", \"Blue Shark\", \"Blue Whale\")" "c(5, 6, 4, 2, 2)"

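This is rarely what you want: sub() coerces the data frame to a character vector with one string per column, so only the first ‘Blue’ in the Creature string is replaced. Instead, pass the specific column in which you want to replace ‘Blue’ with ‘Green’.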

# substituting the values
sub('Blue','Green',df$Creature)

Executing this command produces the following result.

Output

"Starfish" "Green Crab" "Greenfin Tuna" "Green Shark" "Green Whale"

The first occurrence of ‘Blue’ in each element has been replaced with ‘Green’. Because no element contains ‘Blue’ more than once, every occurrence in the column has been replaced.
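Note that sub() returns a new character vector and does not modify the data frame itself. A minimal sketch of storing the result back in the column, assuming you want to overwrite the original values, might look like this (stringsAsFactors = FALSE is added here only so the column is a character vector on older R versions):

```r
# same data frame as above
df <- data.frame(
  Creature = c('Starfish', 'Blue Crab', 'Bluefin Tuna', 'Blue Shark', 'Blue Whale'),
  Population = c(5, 6, 4, 2, 2),
  stringsAsFactors = FALSE
)

# assign the result back to the column to keep the change
df$Creature <- sub('Blue', 'Green', df$Creature)

# the Population column is untouched
print(df)
```

If you would rather keep the original values, assign the result to a new column instead of overwriting Creature.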

The gsub() function in R

In R, the gsub() function replaces every match of the pattern in a character vector with the replacement string.

Unlike sub(), gsub() performs a global substitution: it replaces all occurrences in each element rather than only the first.
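The difference is easiest to see on an element that contains the pattern more than once. The following sketch uses a made-up vector (the name creatures and its values are for illustration only):

```r
# hypothetical vector; the first element contains 'Blue' twice
creatures <- c('Blue Blue Whale', 'Blue Crab', 'Starfish')

# sub() replaces only the first match in each element
sub_result <- sub('Blue', 'Green', creatures)

# gsub() replaces every match in each element
gsub_result <- gsub('Blue', 'Green', creatures)

print(sub_result)   # "Green Blue Whale"  "Green Crab" "Starfish"
print(gsub_result)  # "Green Green Whale" "Green Crab" "Starfish"
```

Elements without a match, such as 'Starfish', are returned unchanged by both functions.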

1. Using the gsub() function

In this example, you will use gsub() to replace a string pattern.

# the input vector
df<-"In this tutorial, we will install R and show how to add packages from the official Comprehensive R Archive Network (CRAN)."

This string contains several occurrences of ‘R’.

# substituting the values using gsub()
gsub('R','The R language',df)

Executing this command produces the following result.

Output

"In this tutorial, we will install The R language and show how to add packages from the official Comprehensive The R language Archive Network (CThe R languageAN)."

Every occurrence of ‘R’ has been replaced, including the ones in ‘Comprehensive R Archive Network’ and ‘CRAN’. The gsub() function matches the pattern anywhere in the string, not only whole words, which is why the ‘R’ inside ‘CRAN’ was replaced as well.
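If the goal is to replace only the standalone word ‘R’ and leave ‘CRAN’ alone, one option is to wrap the pattern in word-boundary anchors. The sketch below is an illustration rather than part of the original example; it uses the \\b anchor together with perl = TRUE, and the variable names are hypothetical.

```r
# same sentence as in the example above
txt <- "In this tutorial, we will install R and show how to add packages from the official Comprehensive R Archive Network (CRAN)."

# '\\bR\\b' matches 'R' only when it stands alone as a word
whole_word <- gsub('\\bR\\b', 'The R language', txt, perl = TRUE)

print(whole_word)
```

With this pattern the two standalone occurrences of ‘R’ are replaced, while ‘CRAN’ is left as it is.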

2. Using the gsub() function with data frames

The gsub() function can also be applied to the columns of a data frame.

# creating a data frame
df<-data.frame(Creature=c('Starfish','Blue Crab','Bluefin Tuna','Blue Shark','Blue Whale'),Population=c(5,6,4,2,2))

Suppose you want to prefix every value in the Creature column with ‘Deep Sea ’.

# substituting the values
gsub('.*^','Deep Sea ',df$Creature)

Executing this command produces the following result.

Output

"Deep Sea Starfish" "Deep Sea Blue Crab" "Deep Sea Bluefin Tuna" "Deep Sea Blue Shark" "Deep Sea Blue Whale"

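In this example, the ^ in the regex .*^ anchors the match to the start of the string, so the replacement text is inserted before the first character. The leading .* can only match an empty string in front of the anchor, so the pattern behaves like a plain ^.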
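As a simpler alternative, the same prefix can be added with the ^ anchor on its own, or without a regex at all by using paste0(). This is a sketch with a shortened, hypothetical vector rather than the full data frame column:

```r
# shortened example vector, for illustration only
creatures <- c('Starfish', 'Blue Crab', 'Bluefin Tuna')

# '^' matches the start of each string, so the prefix is inserted there
with_anchor <- sub('^', 'Deep Sea ', creatures)

# paste0() concatenates the prefix and each element without any regex
with_paste <- paste0('Deep Sea ', creatures)

print(with_anchor)
print(with_paste)
```

Both calls give the same result here; paste0() is arguably clearer when the only goal is to add a fixed prefix.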

Conclusion

In this article, you have learned how to use the sub() and gsub() functions in R. These functions replace a specific string or pattern within a character vector or a data frame column. The sub() function replaces only the first occurrence in each element, while the gsub() function replaces all occurrences.

Keep expanding your knowledge by learning about the application of the replace() function in R.

 
