This VBScript program uses the RegExp object to test a specified regular
expression pattern on a specified string. The program requires two
parameters. The first parameter is the regular expression pattern being
tested. The second is a string that the pattern will be applied to. The
program outputs the number of matches found in the string. For each match
the program outputs the part of the string that matched the pattern, the
position in the string where the match begins, and the length of the match.
For example, the regular expression pattern "[a-zA-Z]\w*" can be used to find
valid names (or tokens) in a VBScript statement. You could test how this works
with the following command at a command prompt:
cscript //nologo RegExpMatch.vbs "[a-zA-Z]\w*" "objList.Add objDC.cn, True"
The output that results will be:
Pattern: "[a-zA-Z]\w*"
String: "objList.Add objDC.cn, True"
Number of matches: 5
Match: objList at: 1 (7)
Match: Add at: 9 (3)
Match: objDC at: 13 (5)
Match: cn at: 19 (2)
Match: True at: 23 (4)
Note that both parameters passed to the program are strings, so they must be
enclosed in quotes if they have any embedded spaces. Any quote characters in
a quoted string must be doubled.
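The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual RegExpMatch.vbs listing: the variable names, the usage message, and the case-sensitive setting are assumptions. The one-based position is produced by adding 1 to the zero-based FirstIndex property, which is consistent with the sample output shown earlier.

```vbscript
' Hypothetical sketch of a pattern tester; not the original program.
Option Explicit

Dim objRegExp, colMatches, objMatch

' Both parameters are required: the pattern, then the string to search.
If WScript.Arguments.Count <> 2 Then
    WScript.Echo "Usage: cscript //nologo RegExpMatch.vbs <pattern> <string>"
    WScript.Quit 1
End If

Set objRegExp = New RegExp
objRegExp.Pattern = WScript.Arguments(0)
objRegExp.Global = True          ' find every match, not just the first
objRegExp.IgnoreCase = False     ' assumption: matching is case sensitive

Set colMatches = objRegExp.Execute(WScript.Arguments(1))

WScript.Echo "Pattern: """ & WScript.Arguments(0) & """"
WScript.Echo "String: """ & WScript.Arguments(1) & """"
WScript.Echo "Number of matches: " & colMatches.Count

For Each objMatch In colMatches
    ' FirstIndex is zero-based, so add 1 to report a one-based position.
    WScript.Echo "Match: " & objMatch.Value & " at: " _
        & (objMatch.FirstIndex + 1) & " (" & objMatch.Length & ")"
Next
```

Setting Global to True matters here: with the default of False, Execute returns at most one match, so the count would never exceed 1.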
RegExpMatch.txt (view or download the program)
This program is based on a script in the book "Windows 2000 Windows Script Host",
by Tim Hill, Macmillan Technical Publishing, Copyright 1999. The program is
called RegExp.vbs (Listing 7.1) on page 197. To help in constructing
regular expression patterns I have collected the following examples:
Pattern | Description |
. | any character |
.. | any two characters |
^abc | string starts with "abc" |
red$ | string ends with "red" |
[0-9] | a digit |
[0-9]* | 0 or more digits |
[0-9]+ | 1 or more digits |
[0-9]? | 0 or 1 digit |
[0-9a-zA-Z] | alphanumeric character |
[^0-9] | not a digit |
[a-zA-Z][a-zA-Z_0-9]* | valid variable name |
a{3} | "aaa" |
a{3,} | at least three "a" characters |
a{3,5} | from 3 to 5 "a" characters |
[0-9]{4} | 4 digits |
\d | any digit, same as [0-9] |
\D | any non-digit |
\w | any word character, same as [0-9a-zA-Z_] |
\W | any non-word character |
\b | word boundary |
\B | non-word boundary |
\s | whitespace character |
\S | non-whitespace character |
\f | form-feed character |
\n | new-line character |
\r | carriage-return character |
\v | vertical tab character |
\onn | octal code nn |
\xnn | hexadecimal code nn |
\5 | back-reference: the same text matched by the 5th parenthesized sub expression |
(..)\1 | any 2 characters repeated twice, like "abab" |
ab|ac | "ab" or "ac" |
^ab|ac$ | "ab" at start or "ac" at end of string |
^(ab|ac)$ | either string "ab" or "ac" |
[^=]+=.* | command line argument of the form name=value |
([a-zA-Z]\w*)=(.*) | command line argument, capturing the name and the value |
(\s*)([a-zA-Z]\w*)(\s*)=(.*) | command line arguments with whitespace |
(\s*)/([a-zA-Z]\w*)((=|:)(.*)|) | command line switch |
^(.*?)\1+$ | string made up entirely of one repeated substring, like "abcabc" |
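Two of the patterns in the table rely on parenthesized sub expressions, which may be easier to follow in a short script. The sketch below is illustrative only: the test strings ("abab", "abcd", "user=jsmith") are made up, and it assumes VBScript 5.5 or later, where the Match object exposes a SubMatches collection.

```vbscript
' Illustrative sketch; the sample strings are hypothetical.
Option Explicit

Dim objRegExp, colMatches, objMatch

Set objRegExp = New RegExp

' Back-reference: (..) captures two characters and \1 requires
' the same two characters to appear again immediately afterward.
objRegExp.Pattern = "(..)\1"
WScript.Echo "abab: " & CStr(objRegExp.Test("abab"))   ' Test returns True
WScript.Echo "abcd: " & CStr(objRegExp.Test("abcd"))   ' Test returns False

' Captured groups: split a name=value argument into its two parts.
objRegExp.Pattern = "([a-zA-Z]\w*)=(.*)"
Set colMatches = objRegExp.Execute("user=jsmith")
If colMatches.Count > 0 Then
    Set objMatch = colMatches(0)
    ' SubMatches is zero-based: item 0 is the first parenthesized group.
    WScript.Echo "Name: " & objMatch.SubMatches(0)
    WScript.Echo "Value: " & objMatch.SubMatches(1)
End If
```

Run with cscript //nologo, this should report the name as user and the value as jsmith.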