
Tips on Understanding Microsoft Regular Expressions



Joseph Parlas, CCSI, CCVP, CCNP, CCNA, A+, MCSE

 

Abstract

This white paper focuses on the regular expression process and the syntax used by the Microsoft OCS Expert to create a dial plan and normalization rules that will be properly interpreted and executed.

Introduction

Microsoft uses regular expressions mainly to normalize dialed numbers to the E.164 format, so that users can keep dialing numbers in the pattern they are accustomed to, and to define routes that send calls to an external gateway for PSTN connectivity. Regular expressions are also used for Address Book translation, where numbers in users’ contact databases have to be converted to the E.164 format.

This white paper focuses on the regular expression process and the syntax used by the Microsoft OCS (Office Communications Server) Expert to create a dial plan and normalization rules that will be properly interpreted and executed.

We will also introduce tool sets that can be used right on your XP or Vista computer to test regular expression constructs without disturbing the corporate production environment.

Let’s start by looking at the basic constructs of the regular expression itself, homing in on some of the basic symbols used. These examples are from a PDF document that can be downloaded from http://www.addedbytes.com/.

The first building block symbols are ^ and $.

    ^ means the start of the string or “must start here”
    $ means the end of the string or “must end here”
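The effect of the two anchors can be tried out with a short sketch. Python's re module is used here only as a convenient stand-in for testing; OCS evaluates its rules with Microsoft's own regular expression engine, so treat this as an illustration of the anchors rather than of OCS itself. The function names and the sample text 911 are assumptions made for the example.

```python
import re

# "911" is sample text only; both patterns are illustrative.
ANCHORED = re.compile(r"^911$")   # the whole string must be 911
UNANCHORED = re.compile(r"911")   # 911 may appear anywhere in the string


def matches_exactly(dialed: str) -> bool:
    """True only when the dialed string starts and ends with 911."""
    return ANCHORED.search(dialed) is not None


def matches_anywhere(dialed: str) -> bool:
    """True when 911 occurs somewhere inside the dialed string."""
    return UNANCHORED.search(dialed) is not None


if __name__ == "__main__":
    for dialed in ("911", "19115"):
        print(dialed, matches_exactly(dialed), matches_anywhere(dialed))
```

Without the anchors, a longer number that merely contains 911 would also match, which is rarely what a dial plan rule intends.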

The starting point of your regular expression should be ^$; then add the rest of the constructs between them.
These symbols are also referred to as anchors, and represent the start and end of whatever you are looking for.

The next series of symbols to consider, which are part of groups or ranges, are listed below.

( ) The parentheses represent a set or a group reference, where ( defines the beginning of a group declaration and ) defines the end of a group declaration. An example of grouping is (\d{3}). For now don’t worry about the \d{3} within the group; that will be discussed below. Just understand that we create the grouping of our expressions using the () symbols.

After a group is declared, it is referenced in the replacement as $1, where 1 refers to the first group in the pattern. We will use this in an example to better understand the relationship.
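The relationship between a group and its $1 reference can be sketched as follows. This is a minimal illustration in Python, not the OCS implementation: the helper apply_rule, the four-digit extension pattern, and the prefix +1425555 are all hypothetical. Python's own replacement syntax uses \1 rather than $1, so the helper substitutes each $N by hand to keep the notation used in this paper.

```python
import re
from typing import Optional


def apply_rule(pattern: str, translation: str, dialed: str) -> Optional[str]:
    """Apply a pattern/translation pair to a dialed string.

    Each $N in the translation is replaced with the text captured by
    group N of the pattern. Returns None when the pattern does not match.
    """
    match = re.search(pattern, dialed)
    if match is None:
        return None
    # Replace every $N with the corresponding captured group.
    return re.sub(r"\$(\d+)",
                  lambda ref: match.group(int(ref.group(1))),
                  translation)


if __name__ == "__main__":
    # Hypothetical rule: a four-digit extension becomes a full E.164 number.
    print(apply_rule(r"^(\d{4})$", "+1425555$1", "1234"))
    print(apply_rule(r"^(\d{4})$", "+1425555$1", "12345"))
```

The first call prints the extension appended to the prefix; the second prints None because five digits do not satisfy the pattern.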

[ ] Brackets represent a range of characters you are looking for. A range specification matches exactly one character. As an example, the North American numbering service codes are 211, 311, 411, 511, 611, 711, 811, and 911. All of these variations can be represented with the range specification [2-9]11, where the first digit falls in the range 2-9.
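The service code range can be checked with a small sketch. Python is used as a test stand-in, and the function name is_service_code is an assumption for illustration; the anchors are added so that only a three-digit string matches.

```python
import re

# [2-9] matches exactly one character in the range 2 through 9,
# followed by the literal digits 11.
SERVICE_CODE = re.compile(r"^[2-9]11$")


def is_service_code(dialed: str) -> bool:
    """True for 211, 311, ... 911 and nothing else."""
    return SERVICE_CODE.search(dialed) is not None


if __name__ == "__main__":
    for dialed in ("411", "911", "111", "4111"):
        print(dialed, is_service_code(dialed))
```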

\ The backslash is the escape character. It removes the special meaning of the symbol that follows, so that the symbol is matched as a literal character instead of being used as a regular expression operator. For instance, + means 1 or more in regular expression language; however, I need to match + as part of an E.164 number. To prevent it from being treated as a quantifier, we add a ‘\’ before it, as in this example: ‘^\+404’. We are looking for +404 as a literal number and do not want + to act as the one-or-more quantifier.
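A short sketch of escaping, again using Python as a stand-in for testing. The function name starts_with_plus_404 is hypothetical. Python's re.escape is shown as one way to produce the escaped form automatically; it is a Python convenience and not something the paper attributes to OCS.

```python
import re

# The backslash makes + a literal plus sign rather than a quantifier.
PLUS_404 = re.compile(r"^\+404")


def starts_with_plus_404(number: str) -> bool:
    """True when the number begins with the literal text +404."""
    return PLUS_404.search(number) is not None


if __name__ == "__main__":
    print(starts_with_plus_404("+4045551234"))
    print(starts_with_plus_404("4045551234"))
    # re.escape adds the backslash for us when building a pattern from text.
    print(re.escape("+404"))
```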

\d This is a character class that matches a single digit, regardless of its actual value. For instance, if I am looking for 3 digits in the range of 0-9, I could use ‘\d\d\d’, since each \d matches one digit. This becomes awkward if we want to find, for example, a combination of 9 digits. The alternative is to follow \d with a count in curly braces, {N}, where N is an integer giving the number of digits you are trying to match. So, to match any combination of 3 digits in [0-9], I could write the pattern two ways. The first way, described earlier, is ‘\d\d\d’, but an easier approach is to use ‘\d{3}’.
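The two spellings can be compared directly. This sketch uses Python as a test stand-in, and the function names are assumptions. Note that the digit class is written with a backslash, \d. In Python, \d can also match non-ASCII digit characters, so the re.ASCII flag is passed here to restrict it to 0-9.

```python
import re

# Both patterns require exactly three digits between the anchors.
THREE_LONG = re.compile(r"^\d\d\d$", re.ASCII)
THREE_SHORT = re.compile(r"^\d{3}$", re.ASCII)


def is_three_digits_long_form(text: str) -> bool:
    return THREE_LONG.search(text) is not None


def is_three_digits_short_form(text: str) -> bool:
    return THREE_SHORT.search(text) is not None


if __name__ == "__main__":
    for text in ("404", "40", "4a4", "4045"):
        print(text,
              is_three_digits_long_form(text),
              is_three_digits_short_form(text))
```

Both functions should agree on every input, since {3} is only shorthand for repeating \d three times.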

Now let’s look at other symbols that are used routinely in OCS.

Quantifiers

* - means 0 or more of the preceding element (for example, \d* is 0 or more digits)
+ - means 1 or more of the preceding element
? - means 0 or 1 of the preceding element
{6} - means exactly 6 of the preceding element
{3,} - means 3 or more of the preceding element
{5,9} - means between 5 and 9 of the preceding element (5, 6, 7, 8, or 9)

Note that the quantifiers with {} are typically used after the \d character class.
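The quantifiers listed above can be exercised one at a time. This is a sketch in Python, used as a test stand-in; the dictionary keys and the helper name matches are hypothetical labels for the example. Each quantifier applies to the element immediately before it, here the \d class or the literal 9.

```python
import re

# One illustrative pattern per quantifier; the keys are just labels.
PATTERNS = {
    "star": r"^\d*$",           # 0 or more digits
    "plus": r"^\d+$",           # 1 or more digits
    "optional": r"^9?\d{4}$",   # optional leading 9, then exactly 4 digits
    "exact": r"^\d{6}$",        # exactly 6 digits
    "at_least": r"^\d{3,}$",    # 3 or more digits
    "between": r"^\d{5,9}$",    # 5 to 9 digits
}


def matches(label: str, text: str) -> bool:
    """True when text satisfies the pattern stored under label."""
    return re.search(PATTERNS[label], text, re.ASCII) is not None


if __name__ == "__main__":
    for label, text in (("star", ""), ("plus", ""), ("optional", "91234"),
                        ("exact", "12345"), ("at_least", "1234"),
                        ("between", "1234567890")):
        print(label, repr(text), matches(label, text))
```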

Assertions

?= - look ahead, written (?=...)
?! - negative look ahead, written (?!...)
?() - if-then condition
?()| - if-then-else condition
?# - comment, written (?#...)
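The assertion constructs can be sketched as below. Python is used as a test stand-in; its re module accepts look ahead as (?=...), negative look ahead as (?!...), comments as (?#...), and the conditional as (?(1)yes|no), where 1 is a group number. All four patterns, the dictionary keys, and the helper name check are assumptions invented for this example, not rules taken from OCS.

```python
import re

ASSERTIONS = {
    # Look ahead: the number must begin with +1, then + and 11 digits.
    "lookahead": r"^(?=\+1)\+\d{11}$",
    # Negative look ahead: 10 or more digits that do not begin with 011.
    "negative": r"^(?!011)\d{10,}$",
    # Comment: ignored by the engine, so this is simply four digits.
    "comment": r"^\d{4}(?#four-digit extension)$",
    # If-then-else: if a leading + was captured, require 1 and 10 digits;
    # otherwise require exactly 10 digits.
    "conditional": r"^(\+)?(?(1)1\d{10}|\d{10})$",
}


def check(label: str, text: str) -> bool:
    """True when text satisfies the pattern stored under label."""
    return re.search(ASSERTIONS[label], text, re.ASCII) is not None


if __name__ == "__main__":
    for label, text in (("lookahead", "+14045551234"),
                        ("negative", "0114412345678"),
                        ("comment", "1234"),
                        ("conditional", "+4045551234")):
        print(label, text, check(label, text))
```

A look ahead checks what follows without consuming it, which is why the lookahead pattern still has to match the + itself after the assertion succeeds.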



Copyright © 2012 Global Knowledge (S.A.E). Registered in Egypt with company no. 26800.