C++11 regex tutorial part 2
Posted on October 20, 2011 by Paul
The code for this tutorial is on GitHub: https://github.com/sol-prog/regex_tutorial.
In the first part of this tutorial we used regex_match to verify user input. The regex_match algorithm in C++11 returns true only if the entire target string matches the regular expression, e.g. [[:digit:]]+ will match "123" and "456", but not "123e-05" or "456.22".
If you need a partial match of a given string, you can use regex_search, which returns true if the pattern matches any substring of the target string. You can also retrieve the details of the match found by regex_search in an smatch object.
A practical application of regex_search could be in removing the leading empty spaces from all lines of an input file. Suppose that you have a badly formatted text file which has a lot of empty spaces on each line:
We could use the power of regular expressions to clean all the extra leading spaces and empty lines out of this mess. Suppose that the above text was placed in a file named "spaces.txt". We could write a small program that will clean this file:
The above code can be improved by removing the hard-coded file names and using argc and argv to retrieve the file names from the command line. You should also add some error checking: what happens, for example, if the input file is not present? My purpose here is to illustrate the use of regex_search with a simple example; feel free to improve the code from GitHub.
You could compile the above code with:
regex_04.cpp can also be compiled with Visual Studio 2010. Unfortunately, at the time of this writing, gcc-4.6.1 doesn't have a complete implementation of the regex functionality.
If you run regex_04, the result is a file with no leading spaces per line and without empty lines.
Another useful algorithm from C++11 regex is regex_replace, which replaces all occurrences of a given pattern with a formatting string. Suppose we have a line of text made of words and numbers in no particular order. We could use regex_replace to extract the numbers or the words from this line:
Let's see the above code in action:
Similar expressions can be constructed for testing any kind of user input. If you want to learn more about regular expressions, the most authoritative source in the field is the book Mastering Regular Expressions by Jeffrey E.F. Friedl.
If you are interested in learning more about the new C++11 syntax, I would recommend reading Professional C++ (2nd edition) by M. Gregoire, N. A. Solter, S. J. Kleper,
or, if you are a C++ beginner, you could read C++ Primer (5th Edition) by S. B. Lippman, J. Lajoie, B. E. Moo.