Regular expression reference
In auto-redaction, you can supply a single word or a regular expression (regex) that Epiq Discovery matches and automatically redacts in the selected document images. A regex provides a pattern to match instead of a literal word. For example, the following regex finds email addresses from a specific company (name@epiqglobal.com).
Example: \w+@epiqglobal\.com
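As a rough illustration outside Epiq Discovery, the company email pattern can be tried in Python's `re` module. This assumes the auto-redaction engine treats `\w`, `+`, and `.` the way Python does, which this reference does not confirm. The sample text is invented. The sketch compares an unescaped period, which matches any character, with an escaped period, which matches only a literal period.

```python
import re

# Hypothetical sample text; every address here is invented for illustration.
text = "Contact jsmith@epiqglobal.com or jdoe@example.com, not admin@epiqglobalXcom."

# Unescaped period before "com": "." matches any single character.
loose = re.compile(r"\w+@epiqglobal.com")
# Escaped period before "com": "\." matches only a literal period.
strict = re.compile(r"\w+@epiqglobal\.com")

loose_hits = loose.findall(text)
strict_hits = strict.findall(text)

print(loose_hits)   # includes the look-alike "admin@epiqglobalXcom"
print(strict_hits)  # only the genuine company address
```

Escaping the period keeps the pattern from redacting look-alike strings that are not addresses at the company domain.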
To find an email address that begins with firstname.lastname, use the following example. The pattern specifies: a first name of one or more word characters (letters, numbers, or underscores), a literal period (.), a last name of one or more word characters, a literal at symbol (@), one or more word characters, another literal period (.), and exactly three word characters at the end.
Example: \w+\.\w+@\w+\.\w{3}
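The firstname.lastname pattern can be checked piece by piece with a short Python sketch. The addresses are invented, the helper name `is_full_match` is hypothetical, and Python's `re` module stands in for the Epiq Discovery engine, whose exact behavior may differ. `fullmatch` is used here only to test a whole string against the pattern; redaction itself would search within longer text.

```python
import re

# The firstname.lastname pattern, written as a Python raw string.
pattern = re.compile(r"\w+\.\w+@\w+\.\w{3}")

# Hypothetical addresses, invented for illustration.
samples = [
    "jane.doe@epiqglobal.com",  # first.last with a three-character ending
    "jdoe@epiqglobal.com",      # no period before the @
    "jane.doe@example.io",      # ending has only two characters
]

def is_full_match(address: str) -> bool:
    """Return True when the whole address fits the pattern."""
    return pattern.fullmatch(address) is not None

results = {address: is_full_match(address) for address in samples}
for address, ok in results.items():
    print(address, ok)
```

Only the first address fits: the second lacks the period between first and last name, and the third ends in two characters rather than three.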
Use the regex elements in the following table to construct patterns for auto-redaction.
| String | Matches text that contains | Example | Possible results |
|---|---|---|---|
| a literal word | the supplied single word. Use alone or combine with other elements. | Private (HR) department | Private HR department |
| Metacharacter | Matches text that contains | Example | Possible results |
| . | any single character: a letter, number, or symbol. | a.z<br>loc....n | a_z<br>location |
| \d | a single digit from 0 to 9. | \d<br>\d{5}(-\d{4})? | 7<br>66213 (ZIP codes) |
| \D | a character that is not a digit. | \D\D\D<br>\D | AbC<br>% |
| \w | a number, letter, or underscore. | invoice \w-\w\w\w \w\w\w\w | invoice A-5_1 D234 |
| \W | a character that is not a letter, number, or underscore, such as a symbol. | \W<br>\W\W\W\W | $<br>*-+= |
| \s | a white space character, like tab, space, or carriage return. | a\sb\sc | a b c |
| \S | a character that is not a white space character (tab, space, or carriage return). | \S\S\S | you |
| \ and a symbol | a literal Regular Expression reserved character, such as: \ . { } + ( ) * ? [ ] ^ $ \|. Precede the reserved character with a backslash as the escape character. | a\.c<br>\.\*\? | a.c<br>.*? |
| Quantifier | Matches text that contains | Example | Possible results |
| * | the preceding character, 0 or more times. | a*b*c*<br>misspell* | aaacccc<br>misspel or misspelll |
| + | the preceding character, 1 or more times. | at+orney<br>\d+\.\d\d | attorney<br>10.00 (a number with two decimal places) |
| ? | the preceding character, either 0 or 1 times. | plurals?<br>honou?r | plurals or plural<br>honor or honour |
| {n} | the preceding character or group exactly the specified number (n) of times. | a{3}<br>\d{5} | aaa<br>66213 |
| {n,} | the preceding character or group the specified number (n) of times or more. | A{3,} | AAAAAA |
| {n,m} | the preceding character or group the specified number (n) of times, but not more than the maximum (m) times. | \d{2,4} | 19, 198, or 1984 |
| OR | Matches text that contains | Example | Possible results |
| \| | the text on either side of the pipe symbol, which behaves like an OR operator. | 22\|33<br>trade(off\|in) | 22 or 33<br>tradeoff or tradein |
| Group/Choice | Matches text that contains | Example | Possible results |
| [ ... ] | any one of the characters or numbers listed in the brackets. | [ABC]<br>[123] | A, B, or C<br>1, 2, or 3 |
| [n-x] | one character or number anywhere in the supplied range. The brackets match a single character unless a quantifier follows. | [a-z]<br>\+[0-9]{11} | b<br>+14528281111 |
| [^n] | any character or number other than those listed. | [^a]<br>[^a-y] | b<br>z |
| ( ) | all of the supplied characters or numbers, in order, as a group. | a(dmit)<br>..(465) | admit<br>br465 |
| Boundary | Matches non-printable characters | Example | Possible results |
| ^ | at the beginning of the extracted text when used outside of square brackets. | ^abc | abc (at the start) |
| $ | at the end of the extracted text. | end$ | end (at the end) |
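
Before running an auto-redaction, it can help to sanity-check a pattern against sample text. The following is a minimal sketch that replays several Example and Possible results pairs from the table in Python's `re` module. It assumes Python's regex behavior is close to that of Epiq Discovery, which this reference does not state, and the helper name `unmatched` is hypothetical.

```python
import re

# (pattern, sample) pairs taken from the Example and Possible results columns.
cases = [
    (r"a.z", "a_z"),
    (r"loc....n", "location"),
    (r"\D\D\D", "AbC"),
    (r"invoice \w-\w\w\w \w\w\w\w", "invoice A-5_1 D234"),
    (r"a\sb\sc", "a b c"),
    (r"a\.c", "a.c"),
    (r"at+orney", "attorney"),
    (r"\d+\.\d\d", "10.00"),
    (r"honou?r", "honor"),
    (r"honou?r", "honour"),
    (r"\d{2,4}", "1984"),
    (r"trade(off|in)", "tradein"),
    (r"\+[0-9]{11}", "+14528281111"),
    (r"[^a-y]", "z"),
    (r"..(465)", "br465"),
]

def unmatched(pairs):
    """Return the pairs whose sample is not fully matched by its pattern."""
    return [(p, s) for p, s in pairs if re.fullmatch(p, s) is None]

failures = unmatched(cases)
print("all listed examples match" if not failures else failures)

# Boundary anchors match positions rather than characters,
# so they are checked with search() against longer text.
starts_with_abc = re.search(r"^abc", "abc def") is not None
abc_not_at_start = re.search(r"^abc", "x abc") is None
ends_with_end = re.search(r"end$", "the end") is not None
end_not_at_end = re.search(r"end$", "end of") is None
print(starts_with_abc, abc_not_at_start, ends_with_end, end_not_at_end)
```

Any pair returned by `unmatched` points to a pattern that would not redact the intended text, at least under Python's rules.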