WORD Operator

Overview of the WORD Operator

Use the WORD operator to search for a specific data pattern. A location is returned as a match if it contains the data pattern.

GLASS Studio WORD Component

Defining a WORD

Using the WORD operator, you can define a word, a sentence, or any string of characters as the data pattern to search for.

GLASS Studio Define WORD Component

The generic GLASS syntax for WORD is:

WORD '<data pattern>'

For example, search for a specific:

  • Word which may appear in a bank account statement (e.g. Account), or
    WORD 'Account'
    
  • String of words that can be found on an employee ID card (e.g. Employee ID), or
    WORD 'Employee ID'
    
  • Number that represents the issuer identification number (IIN) of a payment card issuing network (e.g. 62), or
    WORD '62'
    
  • Set of characters that can be found in the machine-readable zone (MRZ) of machine-readable passports (e.g. <<<<<).
    WORD '<<<<<'
    

Matching a WORD

Matches are not limited by traditional word boundaries (e.g. whitespace, new lines) and can occur anywhere in a string or data stream, unless pattern boundaries are defined.

WORD Example 1

You write a simple GLASS expression to search for the term ID:

WORD 'ID'

As no pattern boundaries are defined, all the following lines will be returned as match locations by the GLASS pattern matching engine:

1 Employee ID: 1234567890
2 IDENTITY CARD
3 AMOUNT PAID $ 150.33

The equivalent configuration in GLASS Studio Visual Builder mode is:

  • Data type component: WORD
    • Data pattern to search for: ID
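The unbounded matching in Example 1 can be pictured as a plain, case-sensitive substring test. The Python sketch below is only an illustration of that behavior, not the GLASS pattern matching engine; the function name word_match is hypothetical.

```python
def word_match(pattern: str, text: str) -> bool:
    """Case-sensitive substring test with no word boundaries applied.

    Illustrative stand-in for WORD '<data pattern>'; not the GLASS engine.
    """
    return pattern in text


lines = [
    "Employee ID: 1234567890",
    "IDENTITY CARD",
    "AMOUNT PAID $ 150.33",
]

# 'ID' appears inside "IDENTITY" and "PAID", so every line is a match.
matches = [line for line in lines if word_match("ID", line)]
print(matches)
```

Because no boundaries are checked, the pattern is found even when it is embedded in a longer word. This is the behavior that the BOUND rule is meant to restrict.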

Case Sensitivity of a WORD

The NOCASE (No Case) option determines whether the GLASS pattern matching engine treats uppercase and lowercase characters as distinct (case-sensitive) or equal (case-insensitive) characters.

GLASS Studio WORD Component Case Sensitivity option

By default, GLASS patterns are case sensitive. This means uppercase and lowercase characters are distinct characters. For example, lowercase a and uppercase A are treated as different characters by the GLASS pattern matching engine.

The GLASS syntax to define a case-sensitive WORD is:

WORD '<data pattern>'

If the NOCASE (No Case) option is specified or checked in GLASS Studio Visual Builder mode, uppercase and lowercase characters are equal. For example, lowercase a and uppercase A are treated as the same character by the GLASS pattern matching engine.

The GLASS syntax to define a case-insensitive WORD is:

WORD NOCASE '<data pattern>'

WORD Example 2

WORD NOCASE 'ID'

As the NOCASE option is specified, all the following lines will be returned as match locations by the GLASS pattern matching engine:

1 Employee ID: 1234567890
2 Identity Card
3 Amount Paid $ 150.33

The equivalent configuration in GLASS Studio Visual Builder mode is:

  • Data type component: WORD
    • Data pattern to search for: ID
    • No Case (NOCASE) option: Checked
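As a rough illustration of the difference between Example 1 and Example 2, the Python sketch below compares a case-sensitive and a case-insensitive substring test on the same lines. It is an emulation under the assumption that NOCASE behaves like simple case folding; the function names are hypothetical and this is not the GLASS engine.

```python
def word_match(pattern: str, text: str) -> bool:
    """Stand-in for WORD '<data pattern>' (case-sensitive)."""
    return pattern in text


def word_match_nocase(pattern: str, text: str) -> bool:
    """Stand-in for WORD NOCASE '<data pattern>' (assumes simple case folding)."""
    return pattern.casefold() in text.casefold()


lines = [
    "Employee ID: 1234567890",
    "Identity Card",
    "Amount Paid $ 150.33",
]

case_sensitive = [line for line in lines if word_match("ID", line)]
case_insensitive = [line for line in lines if word_match_nocase("ID", line)]

print(case_sensitive)    # only the line containing uppercase "ID"
print(case_insensitive)  # all three lines
```

Without NOCASE, only the first line matches, because "Identity" and "Paid" contain "Id" and "id" but not "ID".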

Decomposing a WORD

If the DECOMPOSE (Decompose) option is specified or checked, the GLASS pattern matching engine matches the search pattern both in its original form and normalized (ASCII) form, where applicable.

This is useful when searching for data patterns that contain language-specific accents or diacritics (e.g. Zürich).

GLASS Studio WORD Component Decompose option

The GLASS syntax to decompose a WORD is:

WORD DECOMPOSE '<data pattern>'

WORD Example 3

Your organization receives a subject access request from an individual named José María. You create a search pattern to find all data storage locations across your organization that contain personal information pertaining to José María.

WORD NOCASE DECOMPOSE 'José María'

Using the custom GLASS expression above, all the following lines will be returned as match locations by the GLASS pattern matching engine:

1 José María
2 josé maría
3 Jose Maria

The equivalent configuration in GLASS Studio Visual Builder mode is:

  • Data type component: WORD
    • Data pattern to search for: José María
    • No Case (NOCASE) option: Checked
    • Decompose (DECOMPOSE) option: Checked

If the DECOMPOSE option is not used in Example 3, Line 3 will not be returned as a match location by the GLASS pattern matching engine.
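The effect of DECOMPOSE in Example 3 can be sketched in Python with Unicode normalization. This assumes that the "normalized (ASCII) form" corresponds to decomposing accented characters and dropping the combining marks; how the GLASS engine actually normalizes is not described here, so treat the helper names and the approach as illustrative only.

```python
import unicodedata


def to_ascii_form(s: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def word_match(pattern: str, text: str,
               nocase: bool = False, decompose: bool = False) -> bool:
    """Hypothetical emulation of WORD [NOCASE] [DECOMPOSE] '<data pattern>'."""
    # Normalize both sides to composed form so equal-looking text compares equal.
    pattern = unicodedata.normalize("NFC", pattern)
    text = unicodedata.normalize("NFC", text)

    # With decompose, search for the original form and the ASCII form.
    forms = [pattern]
    if decompose:
        forms.append(to_ascii_form(pattern))

    if nocase:
        forms = [form.lower() for form in forms]
        text = text.lower()

    return any(form in text for form in forms)


lines = ["José María", "josé maría", "Jose Maria"]

with_decompose = [
    word_match("José María", line, nocase=True, decompose=True) for line in lines
]
without_decompose = [
    word_match("José María", line, nocase=True) for line in lines
]

print(with_decompose)
print(without_decompose)
```

In this sketch, the third line matches only when decompose is enabled, which mirrors the note about Line 3 above.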

Pattern Rules for a WORD

You can define the following pattern rules for the WORD data pattern:

GLASS Studio Define Precision Rules for WORD Component

  1. BOUND

Boundary Rules for a WORD

The BOUND precision rule lets you define the list of possible characters that must be found before (BOUND LEFT), after (BOUND RIGHT), or surrounding (BOUND) a WORD for it to be a match.

See BOUND Rule for more information.

Adding a WORD Component

To search for a specific data pattern using the WORD operator in GLASS Studio Visual Builder Mode:

  1. Add a WORD component to the GLASS Studio project.
  2. Define a data pattern.
  3. (Optional) Select the No Case (NOCASE) option to instruct the GLASS pattern matching engine to treat uppercase and lowercase characters as equal characters. See Case Sensitivity of a WORD for more information.
  4. (Optional) Select the Decompose (DECOMPOSE) option to instruct the GLASS pattern matching engine to match the search pattern both in its original form and normalized (ASCII) form, where applicable. See Decomposing a WORD for more information.
  5. (Optional) Specify additional criteria that must be fulfilled for the search pattern to be returned as a match. See Pattern Rules for a WORD for more information.

Managing the WORD Component

To edit the WORD component:

  1. Left click anywhere in the component to open the base pattern form.
  2. Edit the data pattern, options, or pattern rules, and click Close.

To duplicate the WORD component:

  1. Right click anywhere in the component to open the context menu and click on Copy.
  2. Go to the component which you want to join the duplicated WORD component to.
  3. Right click anywhere in the component to open the context menu and click on:
    • Paste > As Then to connect the duplicated component in series.
    • Paste > As Or to connect the duplicated component in parallel.

See Duplicating a Component for more information.

To remove the WORD component along with the corresponding GLASS code:

  1. Right click anywhere in the component to open the context menu and click on Delete, or
    Left click anywhere in the component to open the base pattern form and click on Delete.
  2. In the dialog box, click Delete to confirm the deletion, or click Cancel to cancel the operation. This action can be undone by clicking on the GLASS Studio main menu button > Undo.

To change from WORD to another base pattern:

  1. Left click anywhere in the component to open the base pattern form.
  2. Click on WORD at the top bar and select a different base pattern.
    GLASS Studio Change WORD Base Pattern