A regular expression can easily check whether a user entered something that looks like a valid phone number. In this case, backreferences to the captured values are used in the replacement text so we can easily reformat the phone number as needed.
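As a minimal sketch of that validate-then-reformat idea, the Python snippet below uses three capturing groups and refers back to them in the replacement text. The pattern, the output format, and the names `NANP_RE` and `reformat_phone` are assumptions chosen for illustration, not the recipe's exact regex.

```python
import re

# Assumed pattern: three capturing groups for area code, exchange, and line number.
# Parentheses around the area code and the separators (-, ., space) are optional.
NANP_RE = re.compile(r"^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$")


def reformat_phone(text):
    """Return the number as (123) 456-7890, or None if the input does not match."""
    if NANP_RE.match(text) is None:
        return None
    # \1, \2, and \3 are backreferences to the three captured groups.
    return NANP_RE.sub(r"(\1) \2-\3", text)


print(reformat_phone("123.456.7890"))
print(reformat_phone("12-3456-7890"))
```

Because the groups hold only the digits, the same match could be rewritten into any other layout by changing the replacement string alone.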
The first word boundary is relevant only when matching a number without parentheses, since the word boundary always matches between the opening parenthesis and the first digit of a phone number.
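The effect of the word boundaries can be sketched in Python as follows. The pattern is an assumed variant for finding numbers inside longer text, with `\b` placed after the optional opening parenthesis and at the end; `FIND_RE` and `find_phones` are hypothetical names.

```python
import re

# Assumed search pattern: the first \b sits after the optional "(" so that it
# always succeeds when a parenthesis is present, and otherwise prevents a match
# from starting in the middle of a longer run of word characters.
FIND_RE = re.compile(r"\(?\b([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})\b")


def find_phones(text):
    """Return every substring of text that looks like a phone number."""
    return [m.group(0) for m in FIND_RE.finditer(text)]


print(find_phones("Call (123) 456-7890 or 123.456.7890."))
print(find_phones("Order id 91234567890"))
```

In the second call nothing is found, because no ten-digit window inside the eleven-digit run has a word boundary on both sides.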
You can allow an optional leading “1” for the country code (which covers the North American Numbering Plan region) by adding an optional group at the start of the regex.
When a question mark follows an unescaped left parenthesis like this, it’s not a quantifier, but instead helps to identify the type of grouping.
Standard capturing groups require the regular expression engine to keep track of backreferences, so it’s more efficient to use noncapturing groups whenever the text matched by a group does not need to be referenced later.
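A short Python sketch of a noncapturing group in this role follows. The pattern is an assumption (an optional `(?:...)` prefix that accepts "1" or "+1" plus one separator), and `WITH_CODE_RE` and `parse_phone` are hypothetical names. The point it illustrates is that the prefix group does not shift the numbering of the three capturing groups.

```python
import re

# Assumed pattern: (?:...) groups the optional country-code prefix without
# capturing it, so groups 1-3 still hold area code, exchange, and line number.
WITH_CODE_RE = re.compile(
    r"^(?:\+?1[-. ]?)?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$"
)


def parse_phone(text):
    """Return (area, exchange, line) as strings, or None if there is no match."""
    m = WITH_CODE_RE.match(text)
    return m.groups() if m else None


print(parse_phone("1-123-456-7890"))
print(parse_phone("+1 (123) 456-7890"))
print(WITH_CODE_RE.groups)  # number of capturing groups in the pattern
```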
Note that although this recipe claims to handle North American phone numbers, it’s actually designed to work with North American Numbering Plan (NANP) numbers.
The NANP is the telephone numbering plan for the countries that share the country code “1”.
This is a textbook example of where we need a backslash to escape a special character so the regular expression treats it as literal input.
As we’ve repeatedly seen, parentheses are special characters in regular expressions, but in this case we want to allow a user to enter parentheses and have our regex recognize them.
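To make the escaping concrete, here is a small Python sketch contrasting escaped and unescaped parentheses. The two patterns and their variable names are illustrative assumptions; `re.escape` is the standard-library helper that adds the backslash for you.

```python
import re

# \( and \) match literal parentheses; the inner ( ) is an ordinary capturing group.
literal_parens = re.compile(r"\(([0-9]{3})\)")
# Without the backslashes, the parentheses only group and match no characters themselves.
grouping_only = re.compile(r"([0-9]{3})")

m = literal_parens.search("(123) 456-7890")
print(m.group(0), m.group(1))

print(grouping_only.search("(123) 456-7890").group(0))

# re.escape produces the escaped form of a special character.
print(re.escape("("))
```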
See Recipe 3.5 for help implementing this regular expression with other programming languages.
This regular expression follows the international phone number notation specified by the Extensible Provisioning Protocol (EPP).
The rules and conventions used to print international phone numbers vary significantly around the world, so it’s hard to provide meaningful validation for an international phone number unless you adopt a strict format.
EPP is a relatively recent protocol (finalized in 2004), designed for communication between domain name registries and registrars.
In the regex, a quantifier repeats the preceding group between 6 and 14 times.
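The following Python sketch shows what adopting a strict format might look like. Both patterns are assumptions rather than the recipe's exact regexes: `INTL_RE` accepts a plus sign followed by a digit-plus-optional-space group repeated 6 to 14 times and one final digit, and `EPP_RE` assumes an EPP-style layout of plus sign, country code of one to three digits, a dot, the number, and an optional extension introduced by "x".

```python
import re

# Assumed general pattern: "+" then (digit, optional space) repeated 6-14 times,
# then a final digit, giving 7 to 15 digits in total.
INTL_RE = re.compile(r"^\+(?:[0-9] ?){6,14}[0-9]$")

# Assumed EPP-style pattern: +CCC.NNNN... with an optional "x" extension.
EPP_RE = re.compile(r"^\+[0-9]{1,3}\.[0-9]{4,14}(?:x.+)?$")


def is_international(text):
    """True if text fits the assumed general international layout."""
    return INTL_RE.match(text) is not None


def is_epp(text):
    """True if text fits the assumed EPP-style layout."""
    return EPP_RE.match(text) is not None


print(is_international("+44 20 7946 0958"))
print(is_epp("+1.4155550123x42"))
```

The stricter EPP-style pattern rejects spaces entirely, which is what makes validation tractable: there is only one accepted spelling of any given number.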
Note that the first word boundary token appears after the optional, opening parenthesis.