October 2, 2024

C# and regular expressions (regex)

Pretty much every modern programming language has some kind of support for regular expressions, either built into the language or provided by a library class. In C# it is done via the "Regex" class.

Some people like to say there are two kinds of regular expressions, but this is nonsense and a gross oversimplification. One reference claims there are 418 "regex flavors" and the count is probably growing every day. The good news is that the flavors are mostly the same and differ only in details. Even where they differ more than that, the basic concepts remain the same.

First of all, you will need this near the top of your program:

using System.Text.RegularExpressions;

Matching

The game here is to test strings using a regex and see if they match. One way is like this:
if ( Regex.IsMatch ( line, @"^[^0-9]" ) )
	Console.WriteLine ( "The string matches." );
Another way is like this:
string pattern = @"^[^0-9]";
Regex rx = new Regex(pattern);
if ( rx.IsMatch ( line ) )
	Console.WriteLine ( "The string matches." );
The first way uses a static method of the class, with no need to create a Regex object.
The second way does create a Regex object.
The first is short and compact; the second may be more efficient if the same pattern is used to test many lines, because the pattern gets parsed ("compiled") once, when the Regex object is created. (The static methods do keep a small cache of recently used patterns, so the difference is often modest.)
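
To make the second way concrete, here is a small sketch that builds one Regex object and reuses it to test several lines. The class name MatchDemo, the helper StartsWithNonDigit, and the sample strings are all made up for illustration. The pattern ^[^0-9] matches any string whose first character is something other than a digit.

```csharp
using System;
using System.Text.RegularExpressions;

namespace HogHeaven
{
    public static class MatchDemo
    {
        // Built once and reused for every line we test.
        private static readonly Regex rx = new Regex(@"^[^0-9]");

        // Hypothetical helper: true if the line starts with a non-digit character.
        public static bool StartsWithNonDigit(string line)
        {
            return rx.IsMatch(line);
        }

        public static void Main(string[] args)
        {
            // Made-up sample input.
            string[] lines = { "hello", "42 is a number", "", "x9" };
            foreach (string line in lines)
            {
                string verdict = StartsWithNonDigit(line) ? "matches" : "does not match";
                Console.WriteLine("\"{0}\": {1}", line, verdict);
            }
        }
    }
}
```

Here "hello" and "x9" match, while "42 is a number" does not. The empty string does not match either, since the pattern needs at least one character to look at.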

Substituting

This is the second main use of a regex. We match against a pattern, and when we find a match we replace the matched pattern with something else. Here the method from the Regex class we want is "Replace". Note that it is common to use the "StringBuilder" class along with Regex when doing these kinds of things, but it is not my purpose to talk about that here.
string original = "I have an ugly dog";
string result = Regex.Replace ( original, "dog", "cat" );
Here the regex is nothing fancy, just the word "dog" -- which gets replaced with the word "cat".
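
One thing the single "dog" above does not show is that Replace substitutes every match in the string, not just the first one. Here is a minimal sketch of that; the class name ReplaceDemo, the helper DogsToCats, and the sample sentences are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

namespace HogHeaven
{
    public static class ReplaceDemo
    {
        // Hypothetical helper wrapping the static Regex.Replace call.
        public static string DogsToCats(string text)
        {
            return Regex.Replace(text, "dog", "cat");
        }

        public static void Main(string[] args)
        {
            // Both occurrences of "dog" are replaced.
            Console.WriteLine(DogsToCats("My dog chased your dog"));
            // A string with no match comes back unchanged.
            Console.WriteLine(DogsToCats("I have a parrot"));
        }
    }
}
```

This prints "My cat chased your cat" followed by "I have a parrot".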

An example

Note that I haven't said a thing here about what a regex is or can be. I have just shown the methods for using them in C#.
using System;
using System.Text.RegularExpressions;

namespace HogHeaven
{
    public class Program
    {
        public static void Main(string[] args)
        {
            string original = "I have an ugly dog";
            Regex rx = new Regex ( "dog" );
            string result = rx.Replace ( original, "cat" );
            Console.WriteLine ( result );
        }
    }
}
This yields the output "I have an ugly cat".

Another example

Here I am taking strings from the command line and feeding them to the Regex "Replace" method. So you could compile the following program as "rego" and then give the command:
rego "he is ugly" he Maynard
And you would get "Maynard is ugly". Or you could use a fancier regex such as:
rego "Sally is ugly" "u.*" nice
And get: "Sally is nice"

It is worth a peek at the following to see how command line arguments are handled. Note that C# does not place the program name as the first argument the way C does, so args[0] is the first real argument.

using System;
using System.Text.RegularExpressions;

namespace HogHeaven
{
    public class Program
    {
        public static void Main(string[] args)
        {
            if ( args.Length != 3 ) {
                Console.WriteLine ( "Usage: rego string regex repl" );
                return;
            }

            string original = args[0];
            Regex rx = new Regex ( args[1] );
            string result = rx.Replace ( original, args[2] );
            Console.WriteLine ( result );
        }
    }
}
This little program gives you a way to experiment with C# regexes however you like.
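
When experimenting, sooner or later you will type a pattern that is not a valid regex, such as a lone "(". The Regex constructor throws an ArgumentException in that case, and the program above would die with a stack trace. Here is a sketch of the same program with that case handled. The class name SafeRego, the TryReplace helper, and the wording of the error message are my own additions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace HogHeaven
{
    public static class SafeRego
    {
        // Hypothetical helper: returns false (and sets error) if the pattern is invalid.
        public static bool TryReplace(string original, string pattern, string repl,
                                      out string result, out string error)
        {
            try
            {
                Regex rx = new Regex(pattern);
                result = rx.Replace(original, repl);
                error = null;
                return true;
            }
            catch (ArgumentException e)
            {
                // Invalid patterns are reported as ArgumentException (or a subclass of it).
                result = null;
                error = e.Message;
                return false;
            }
        }

        public static void Main(string[] args)
        {
            if (args.Length != 3)
            {
                Console.WriteLine("Usage: rego string regex repl");
                return;
            }

            string result;
            string error;
            if (TryReplace(args[0], args[1], args[2], out result, out error))
                Console.WriteLine(result);
            else
                Console.WriteLine("Bad regex: " + error);
        }
    }
}
```

With this version, a bad pattern gets you a one-line complaint instead of a crash, and valid patterns behave exactly as before.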


Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org