Regex examples

Costas

Use Regex.Replace to create a new list of findings

C#:
//test input
<td><b><a title="www.bmw2002faq.com" href="javascript:onBang('02faq')">BMW 2002 FAQ</a></b> <b>!<a href="javascript:onBang('02faq')">02faq</a></b> </td><td><b><a title="find.5ch.net" href="javascript:onBang('2channel')">5channel</a></b> <b>!<a href="javascript:onBang('2channel')">2channel</a></b> </td>


string pattern = "<td><b><a title=\"(.*?)\" href=\"(.*?)\">(.*?)</a></b>(.*?)</td>";
//Regex.Replace returns the new string, so keep the result
string result = Regex.Replace(txtInput.Text, pattern, "$3 - URL : $1\r\n", RegexOptions.Multiline | RegexOptions.IgnoreCase);

/*
outputs :
BMW 2002 FAQ - URL : www.bmw2002faq.com
5channel - URL : find.5ch.net
*/

//this does not do the job, because (.*) is greedy
<td><b><a title=\"(.*)\" href=\"(.*)\">(.*)</a></b>(.*)</td>

//use (.*?) instead, it is non-greedy (lazy)
<td><b><a title=\"(.*?)\" href=\"(.*?)\">(.*?)</a></b>(.*?)</td>

//explained
https://stackoverflow.com/a/3075150
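
To see the greedy / non-greedy difference in isolation, here is a minimal sketch. The class name and the sample string are made up for illustration; only Regex.Match and Groups come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical sample: two bold fragments on the same line.
    public const string Input = "<b>one</b> and <b>two</b>";

    // (.*) grabs as much as it can, so it runs up to the LAST </b>.
    public static string FirstGreedy()
    {
        return Regex.Match(Input, "<b>(.*)</b>").Groups[1].Value;
    }

    // (.*?) grabs as little as it can, so it stops at the FIRST </b>.
    public static string FirstLazy()
    {
        return Regex.Match(Input, "<b>(.*?)</b>").Groups[1].Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstGreedy()); // one</b> and <b>two
        Console.WriteLine(FirstLazy());   // one
    }
}
```

The same thing happens with the table cells above: with (.*) the first group swallows everything up to the last closing tag on the line.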


Use Regex.Replace with MatchEvaluator
C#:
//input : ([OrderDatetime] > '5/11/2020 12:00:00 am') AND ([PaymentAmount] = 3) AND ([DeliverDatetime] > '6/12/2020 11:00:00 am') AND ([Status] = 1)
//.       matches any single character (except a line break) -- more: medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285
//.{50}   matches exactly 50 characters -- ^.{5,50}$ a line with 5 to 50 chars -- ^.{50,}$ a line with at least 50 chars
//the trailing ... consumes the 3 characters " am", so they are dropped from the result
string pattern = @"(\d{1,2}/\d{1,2}/\d{4} \d{2}:\d{2}:\d{2})...";
var regex = new Regex(pattern);

string o = regex.Replace(textBox1.Text, delegate(Match match)
{
  DateTime k = DateTime.ParseExact(match.Groups[1].Value, "d/M/yyyy hh:mm:ss", CultureInfo.InvariantCulture);
 
  return k.ToString("yyyy-MM-dd");
});

//outputs : ([OrderDatetime] > '2020-11-05') AND ([PaymentAmount] = 3) AND ([DeliverDatetime] > '2020-12-06') AND ([Status] = 1)


Use Regex.Replace to highlight keywords
C#:
//read a file with keywords, one per line
keywordsList = File.ReadAllLines(fl).ToList();
//use \b to match whole words only
keywords = @"\b(" + string.Join("|", keywordsList) + @")\b";
//resulting regex, e.g.  "\b(fantasies|occasionally)\b"
//every keyword found is replaced by itself ($1) wrapped in bold tags, which highlights the word in the HTML
htmlBody = Regex.Replace(htmlBody, keywords, "<b>$1</b>", RegexOptions.IgnoreCase);
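
One thing to watch with the keyword list: a keyword that contains regex symbols (a dot, a plus, brackets) changes the meaning of the pattern. A possible way around it is to pass each keyword through Regex.Escape first. The sketch below assumes that approach; the class and method names are hypothetical and the keywords are passed in instead of being read from a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordHighlighter
{
    // Wraps every whole-word keyword hit in bold tags, case-insensitively.
    public static string Highlight(string htmlBody, IEnumerable<string> keywordsList)
    {
        // Regex.Escape turns "asp.net" into "asp\.net", so the dot only matches a real dot.
        string keywords = @"\b(" + string.Join("|", keywordsList.Select(k => Regex.Escape(k))) + @")\b";
        return Regex.Replace(htmlBody, keywords, "<b>$1</b>", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        var list = new List<string> { "asp.net" };
        // Without escaping, "aspXnet" would be highlighted too.
        Console.WriteLine(Highlight("Uses ASP.NET and aspXnet", list));
        // Uses <b>ASP.NET</b> and aspXnet
    }
}
```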


Validate with Regex.IsMatch

C#:
private bool IsCharactersValid(string s)
{
    //Accepts alphanumeric characters in Greek and English, lowercase and uppercase, spaces and the following symbols: /:_().,+-
    return Regex.IsMatch(s, "^[A-ZΑ-Ω0-9 ΆΈΉΗΊΪΌΎΫΥΏ_/:().,+-]+$", RegexOptions.IgnoreCase);
}

private bool IsTelephoneValid(string s)
{
    //Form: exactly 3 digits, a dash, then 1 to 15 digits. Example 210-3288000
    return Regex.IsMatch(s, @"^\d{3}-\d{1,15}$");
}

private bool IsMobileValid(string s)
{
    //Form: a plus sign and exactly 2 digits, a dash, then 1 to 15 digits. Example +30-6972222222
    return Regex.IsMatch(s, @"^\+\d{2}-\d{1,15}$");
}

private bool isValidEmail(string s)
{ // src - https://docs.microsoft.com/en-us/dotnet/standard/base-types/how-to-verify-that-strings-are-in-valid-email-format
    return Regex.IsMatch(s, @"^[^@\s]+@[^@\s]+\.[^@\s]+$", RegexOptions.IgnoreCase);
}


OR operator

C#:
//input
210-6042612 210-7042612 210-8042612
5210-6042612
+32-5646545645
210-6042612

string pattern = @"(?m)^(\d{3}|\+\d{2})\-\d{1,15}";

Regex reg = new Regex(pattern);
MatchCollection matches = reg.Matches(txtInput.Text);

//Matches never returns null, an empty collection simply skips the loop
foreach (Match item in matches)
{
    Console.WriteLine(item.Groups[0].Value);
}

/*
outputs :
210-6042612
+32-5646545645
210-6042612
*/

ref - https://stackoverflow.com/a/14741118

(?m) = multiline modifier, ^ matches at the start of each line
(\d{3}|\+\d{2}) =
    \d{3} = exactly 3 digits
    | = OR
    \+\d{2} = the symbol + followed by 2 digits
\- = a literal dash
\d{1,15} = 1 to 15 digits




Case insensitive

JavaScript:
//src - https://stackoverflow.com/a/2440009
Regex:
    a(?i)bc
Matches:
    a       # match the character 'a'
    (?i)    # enable case insensitive matching
    b       # match the character 'b' or 'B'
    c       # match the character 'c' or 'C'

Regex:
    a(?i)b(?-i)c
Matches:
    a        # match the character 'a'
    (?i)     # enable case insensitive matching
    b        # match the character 'b' or 'B'
    (?-i)    # disable case insensitive matching
    c        # match the character 'c'
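
A quick way to check the two patterns above from C#. This is only a sketch; the class name is made up, and the patterns are wrapped in ^ and $ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineCaseDemo
{
    // 'a' is case sensitive, 'b' and 'c' are not.
    public static bool AfterSwitchOn(string input)
    {
        return Regex.IsMatch(input, "^a(?i)bc$");
    }

    // 'a' is case sensitive, 'b' is not, 'c' is case sensitive again.
    public static bool SwitchOnThenOff(string input)
    {
        return Regex.IsMatch(input, "^a(?i)b(?-i)c$");
    }

    public static void Main()
    {
        Console.WriteLine(AfterSwitchOn("aBC"));   // True
        Console.WriteLine(AfterSwitchOn("ABC"));   // False, the leading 'a' comes before (?i)
        Console.WriteLine(SwitchOnThenOff("aBc")); // True
        Console.WriteLine(SwitchOnThenOff("aBC")); // False, (?-i) makes 'c' case sensitive again
    }
}
```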



Not operator

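Outside brackets, ^ (caret) means the match must start at the beginning of the string (or of each line in multiline mode).
As the first character inside [ and ], ^ (caret) is the not operator: [^\s] matches any character that is not white space.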

C#:
//input :
//'sdf-d3f', 'dfdf ', 'dfd fd' , ' dfdfd' , ' dfdfdf', 'df', 'sf9f'
string pattern = @"'[^\s]*'";

//matches the quoted groups that do NOT contain a space: the first, the second to last and the last.

//this one matches the quoted groups that have a space inside the quotes.
string patternWithSpace = @"'\s*\w*\s+\w*\s*'";

\w matches a word character (a-z, A-Z, 0-9 and underscore; in .NET also other Unicode letters and digits)
\W matches any character that is not a word character
\s matches any white space (space, tab, line break)
\S matches any character that is not white space
\d matches a digit (0-9; in .NET also other Unicode digits)
\D matches any character that is not a digit
Using * will catch 0 or more of the preceding element
Using + will catch 1 or more of the preceding element
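
The two quote patterns can be tried against the sample input like this. A small sketch, with a hypothetical class name; it just collects the matches of each pattern into a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedGroups
{
    public const string Input = "'sdf-d3f', 'dfdf ', 'dfd fd' , ' dfdfd' , ' dfdfdf', 'df', 'sf9f'";

    // [^\s]* = zero or more characters that are NOT white space
    public static List<string> WithoutSpace()
    {
        return Regex.Matches(Input, @"'[^\s]*'").Cast<Match>().Select(m => m.Value).ToList();
    }

    // \s+ in the middle forces at least one white space inside the quotes
    public static List<string> WithSpace()
    {
        return Regex.Matches(Input, @"'\s*\w*\s+\w*\s*'").Cast<Match>().Select(m => m.Value).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", WithoutSpace())); // 'sdf-d3f' | 'df' | 'sf9f'
        Console.WriteLine(string.Join(" | ", WithSpace()));    // 'dfdf ' | 'dfd fd' | ' dfdfd' | ' dfdfdf'
    }
}
```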


Using Lookaround and OR operator
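C#:
string val = Regex.Replace(txtHost.Text, @"https|http|(?!\.|-)\W+", "");

/*
https OR http OR a run of non-word characters (\W+) that does NOT start with a dot or a dash (?!\.|-)
will be replaced with nothing. (\. escapes the dot)
note: the lookahead (?!...) only checks the first character of the run, so a dot or dash that comes right after another symbol (e.g. "/.") is still removed.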
ref - https://www.c-sharpcorner.com/article/regular-expressions-in-net/
nice explained - https://stackoverflow.com/a/2973495
*/
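
Here is the same replace as a self-contained sketch, next to a variant that uses a negated character class instead of the lookahead. The class name, the method names and the sample hosts are made up for the example, and the variant is only one possible alternative, not necessarily a better fit for every input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HostCleaner
{
    // The pattern from the post: lookahead + \W+
    public static string CleanWithLookahead(string host)
    {
        return Regex.Replace(host, @"https|http|(?!\.|-)\W+", "");
    }

    // Variant: [^\w.-]+ never consumes a dot or a dash, wherever it appears in the run.
    public static string CleanWithNegatedClass(string host)
    {
        return Regex.Replace(host, @"https|http|[^\w.-]+", "");
    }

    public static void Main()
    {
        Console.WriteLine(CleanWithLookahead("https://www.my-site.com/"));    // www.my-site.com
        Console.WriteLine(CleanWithNegatedClass("https://www.my-site.com/")); // www.my-site.com

        // The two differ when a dot follows another symbol:
        Console.WriteLine(CleanWithLookahead("host/.name"));    // hostname
        Console.WriteLine(CleanWithNegatedClass("host/.name")); // host.name
    }
}
```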


MUST READ :
MSDN - Regular Expression Options
MSDN - Quick Reference


JavaScript:
^[a-zA-Z]+$

Matches any alphabetic string (consisting of the letters a-z, large or small) of size 1 or longer.
The ^ and $ characters mean "start" and "end" of the line, so they only match full lines consisting of one "word".
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]

src - https://stackoverflow.com/a/34292286
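
A short sketch of that pattern in C# (class and method names are hypothetical). One .NET detail worth knowing: $ also matches just before a final line break, so a string with a trailing "\n" still passes; \A and \z anchor to the real start and end of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LettersOnly
{
    // The pattern from above: one or more letters a-z / A-Z and nothing else.
    public static bool IsLetters(string s)
    {
        return Regex.IsMatch(s, @"^[a-zA-Z]+$");
    }

    // Stricter variant: \z only matches at the very end of the string.
    public static bool IsLettersStrict(string s)
    {
        return Regex.IsMatch(s, @"\A[a-zA-Z]+\z");
    }

    public static void Main()
    {
        Console.WriteLine(IsLetters("Hello"));             // True
        Console.WriteLine(IsLetters("Hello World"));       // False, the space is not a letter
        Console.WriteLine(IsLetters(""));                  // False, + needs at least one letter
        Console.WriteLine(IsLetters("Hello\n"));           // True, $ matches before the final line break
        Console.WriteLine(IsLettersStrict("Hello\n"));     // False
    }
}
```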


+ = once or more
    A+ = One or more As, as many as possible (greedy), giving up characters if the engine needs to backtrack (docile)
    A+? = One or more As, as few as needed to allow the overall pattern to match (lazy)
    A++ = One or more As, as many as possible (greedy), not giving up characters if the engine tries to backtrack (possessive)
* = zero times or more
    A* = Zero or more As, as many as possible (greedy), giving up characters if the engine needs to backtrack (docile)
    A*? = Zero or more As, as few as needed to allow the overall pattern to match (lazy)
    A*+ = Zero or more As, as many as possible (greedy), not giving up characters if the engine tries to backtrack (possessive)
? = zero times or once
    A? = Zero or one A, one if possible (greedy), giving up the character if the engine needs to backtrack (docile)
    A?? = Zero or one A, zero if that still allows the overall pattern to match (lazy)
    A?+ = Zero or one A, one if possible (greedy), not giving up the character if the engine tries to backtrack (possessive)
{x,y} = x times at least, y times at most
    A{2,9} = Two to nine As, as many as possible (greedy), giving up characters if the engine needs to backtrack (docile)
    A{2,9}? = Two to nine As, as few as needed to allow the overall pattern to match (lazy)
    A{2,9}+ = Two to nine As, as many as possible (greedy), not giving up characters if the engine tries to backtrack (possessive)
    A{2,} = Two or more As, greedy and docile as above.
    A{2,}? = Two or more As, lazy as above.
    A{2,}+ = Two or more As, possessive as above.
    A{5} = Exactly five As. Fixed repetition: neither greedy nor lazy.
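
A note for C#: as far as I know the .NET engine does not accept the possessive forms (A++, A*+, A?+) from this cheat sheet. The usual .NET substitute is an atomic group, (?>A+), which also refuses to give characters back. The sketch below compares greedy, lazy and atomic on a made-up input; class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Greedy: takes as many As as possible.
    public static string Greedy(string s)
    {
        return Regex.Match(s, "A+").Value;
    }

    // Lazy: takes as few As as needed.
    public static string Lazy(string s)
    {
        return Regex.Match(s, "A+?").Value;
    }

    // Docile: A+ first takes every A, then gives one back so the last A can match.
    public static bool Docile(string s)
    {
        return Regex.IsMatch(s, "^A+A$");
    }

    // Atomic group: (?>A+) takes every A and never gives one back, so the last A has nothing left.
    public static bool Atomic(string s)
    {
        return Regex.IsMatch(s, "^(?>A+)A$");
    }

    public static void Main()
    {
        Console.WriteLine(Greedy("AAA")); // AAA
        Console.WriteLine(Lazy("AAA"));   // A
        Console.WriteLine(Docile("AAA")); // True
        Console.WriteLine(Atomic("AAA")); // False
    }
}
```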

src - Quantifier Cheat Sheet

Regex Storm online tester

Regex Javascript evaluator, explains the steps

Expresso - regular expression development tool

Visual Guide to Regular Expression




named blocks
C#:
/*
https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#unicode-category-or-unicode-block-p

https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions

https://stackoverflow.com/a/2502653
*/

public static void Main()
{
  string pattern = @"\b(\p{IsGreek}+(\s)?)+\p{Pd}\s(\p{IsBasicLatin}+(\s)?)+";
  string input = "Κατα Μαθθαίον - The Gospel of Matthew";

  Console.WriteLine(Regex.IsMatch(input, pattern));        // Displays True.
}

extract values
C#:
string patternLine = @"^(.*?) - - (.*?) ""(.*?)"" (.*?) (.*?)$";

Regex regex = new Regex(patternLine);
Match match = regex.Match(@"167.94.138.60 - - [30/Sep/2024:07:30:49 +0800] ""GET / HTTP/1.1"" 200 20748");

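if (match.Success)
{
    //Groups[0] is the whole match, so start from 1 to print only the captured values
    for (int i = 1; i < match.Groups.Count; i++)
    {
        Console.WriteLine(match.Groups[i].Value);
    }
}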

/* this outputs
167.94.138.60
[30/Sep/2024:07:30:49 +0800]
GET / HTTP/1.1
200
20748
*/

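//now using named groups, (?<name>...)
string patternNamed = @"^(?<ip>.*?) - - (?<dt>.*?) ""(?<request>.*?)"" (?<status>.*?) (?<size>.*?)$";
Match namedMatch = new Regex(patternNamed).Match(@"167.94.138.60 - - [30/Sep/2024:07:30:49 +0800] ""GET / HTTP/1.1"" 200 20748");

//we can access the values by name
if (namedMatch.Success)
{
    Console.WriteLine(namedMatch.Groups["ip"].Value);
    Console.WriteLine(namedMatch.Groups["dt"].Value);
    Console.WriteLine(namedMatch.Groups["request"].Value);
    Console.WriteLine(namedMatch.Groups["status"].Value);
    Console.WriteLine(namedMatch.Groups["size"].Value);
}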
