Preset Keywords

Overview of Preset Keywords

The GLASS reference language supports a list of predefined keywords that represent commonly used literal ranges. These preset keywords may be used in place of the literal ranges that they represent anywhere in a GLASS pattern or expression.

For example, use:

  • DIGIT to represent any numeral from 0 to 9, or
  • LETTER to represent any uppercase or lowercase letter from A to Z.

Supported Keywords

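The following tables list all preset keywords defined in the GLASS language. The first table gives the equivalent literal range for each simple keyword. The second table breaks each composite keyword down into its component ranges.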

Dropdown Option | GLASS Keyword | Equivalent Literal Range | Description
Alnum | ALNUM | "a-zA-Z0-9" | Matches any ASCII alphanumeric character.
Byte | BYTE | "\x00-\xFF" or "^" | Matches any byte. When used with RANGE UNICODE, the GLASS engine will attempt to match an entire Unicode Scalar value if possible (in a greedy manner), and a single octet otherwise.
Digit | DIGIT | "0-9" | Matches any ASCII numeral.
Horizontal Space | HSPACE | " \t" | Matches any horizontal whitespace character.
Letter | LETTER | "a-zA-Z" | Matches any ASCII letter.
Line | LINE | "\r\n" | Matches any new line or carriage return character.
Non-Alnum | NONALNUM | "^a-zA-Z0-9" | Matches any character or octet that is not an ASCII alphanumeric character.
Non-Alpha | NONALPHA | "^a-zA-Z" | Matches any character or octet that is not an ASCII letter.
Non-Digit | NONDIGIT | "^0-9" | Matches any character or octet that is not an ASCII numeral.
Space | SPACE | " \t\n\v\f\r" | Matches any whitespace character.
Vertical Space | VSPACE | "\n\v\f\r" | Matches any vertical whitespace character.
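
To make the equivalences in the table concrete, the following Python sketch maps most of the simple keywords to regular-expression character classes built from the literal ranges listed above. It is an illustration only: Python's re module is not the GLASS engine, BYTE is left out because its behavior depends on RANGE UNICODE, and the names PRESET_CLASSES and matches_keyword are hypothetical.

```python
import re

# Hypothetical mapping from GLASS preset keywords to Python regex character
# classes, transcribed from the literal ranges in the table above.
# This emulates the ranges only; it is not how the GLASS engine is implemented.
PRESET_CLASSES = {
    "ALNUM": r"[a-zA-Z0-9]",
    "DIGIT": r"[0-9]",
    "HSPACE": r"[ \t]",
    "LETTER": r"[a-zA-Z]",
    "LINE": r"[\r\n]",
    "NONALNUM": r"[^a-zA-Z0-9]",
    "NONALPHA": r"[^a-zA-Z]",
    "NONDIGIT": r"[^0-9]",
    "SPACE": r"[ \t\n\v\f\r]",
    "VSPACE": r"[\n\v\f\r]",
}


def matches_keyword(keyword: str, char: str) -> bool:
    """Return True if the single character falls inside the keyword's range."""
    return re.fullmatch(PRESET_CLASSES[keyword], char) is not None
```

For instance, matches_keyword("DIGIT", "7") is true, while matches_keyword("HSPACE", "\n") is false because a line feed is vertical whitespace.
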
Dropdown Option GLASS Keyword Breakdown Description
Graphic GRAPHIC
  • ALNUM
    "a-zA-Z0-9"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
Matches any ASCII character that is not whitespace or a control character.
Printable PRINTABLE
  • ALNUM
    "a-zA-Z0-9"
  • SPACE
    " \t\n\v\f\r"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
  • Byte range
    "0x80-0xF7"
Matches any printable ASCII character (including horizontal and vertical whitespace), as well as bytes in the range 0x80-0xF7, which can cover extended ASCII and code page characters.
Printable ASCII PRINTABLEASCII
  • ALNUM
    "a-zA-Z0-9"
  • SPACE
    " \t\n\v\f\r"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
Matches any printable ASCII character (including horizontal and vertical whitespace).
Printable Non-Alnum PRINTABLENONALNUM
  • SPACE
    " \t\n\v\f\r"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
Matches any printable ASCII character (including horizontal and vertical whitespace), excluding alphanumeric characters.
Printable Non-Alpha PRINTABLENONALPHA
  • DIGIT
    "0-9"
  • SPACE
    " \t\n\v\f\r"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
Matches any printable ASCII character (including horizontal and vertical whitespace), excluding letters.
Sameline SAMELINE
  • ALNUM
    "a-zA-Z0-9"
  • HSPACE
    " \t"
  • Carriage return
    "\r"
  • Graphic non-alnum
    "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~-"
Matches any printable ASCII character, including horizontal whitespace and the carriage return, but excluding all other vertical whitespace.
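
The breakdowns above can be checked by modeling each composite keyword as a union of character sets. The Python sketch below is an illustration, not part of GLASS: all set names are hypothetical, string.punctuation is assumed to stand in for the "Graphic non-alnum" literal (it contains the same 32 ASCII punctuation characters), and PRINTABLE is omitted because its extra 0x80-0xF7 range concerns raw bytes rather than characters.

```python
import string

# Component ranges, transcribed from the tables above.
ALNUM = set(string.ascii_letters + string.digits)
DIGIT = set(string.digits)
HSPACE = set(" \t")
SPACE = set(" \t\n\v\f\r")
# Assumption: string.punctuation stands in for the "Graphic non-alnum" literal.
GRAPHIC_NON_ALNUM = set(string.punctuation)

# Composite keywords expressed as unions of their components.
GRAPHIC = ALNUM | GRAPHIC_NON_ALNUM
PRINTABLEASCII = ALNUM | SPACE | GRAPHIC_NON_ALNUM
PRINTABLENONALNUM = SPACE | GRAPHIC_NON_ALNUM
PRINTABLENONALPHA = DIGIT | SPACE | GRAPHIC_NON_ALNUM
SAMELINE = ALNUM | HSPACE | {"\r"} | GRAPHIC_NON_ALNUM

# Characters that PRINTABLEASCII allows but SAMELINE does not.
EXCLUDED_FROM_SAMELINE = PRINTABLEASCII - SAMELINE
```

Under these assumptions, EXCLUDED_FROM_SAMELINE contains only the line feed, vertical tab, and form feed characters, which reflects that SAMELINE keeps the carriage return while dropping the remaining vertical whitespace.
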

Preset Keyword Example 1

Preset range keywords can be used as the search pattern in base patterns. For example, the GLASS code to search for any uppercase or lowercase letter from A to Z can be written as either Line 1 or Line 2 below:

1 RANGE 'a-zA-Z'
2 RANGE LETTER

See RANGE Component for more information.
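
The substitution in Example 1 can be pictured as a simple lookup: a RANGE argument is either a quoted literal range or a keyword that stands for one. The Python sketch below models only that idea. The PRESETS table and the resolve_range helper are hypothetical names, and the sketch makes no claim about how the GLASS engine parses or resolves its arguments.

```python
# Hypothetical lookup table for a few preset keywords and their literal ranges.
PRESETS = {
    "ALNUM": "a-zA-Z0-9",
    "DIGIT": "0-9",
    "LETTER": "a-zA-Z",
}


def resolve_range(argument: str) -> str:
    """Return the literal range for a quoted literal or a preset keyword."""
    is_quoted = (
        len(argument) >= 2 and argument.startswith("'") and argument.endswith("'")
    )
    if is_quoted:
        # A quoted literal range is used as written, minus the quotes.
        return argument[1:-1]
    # Otherwise the argument is treated as a preset keyword.
    return PRESETS[argument]


# Both forms from Example 1 resolve to the same literal range.
same_range = resolve_range("'a-zA-Z'") == resolve_range("LETTER")
```
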

Preset Keyword Example 2

Preset range keywords can be used to define specific pattern boundaries around a search pattern. For example, the GLASS code to search for a customer ID number (WW301231018313) that is bounded only by non-alphanumeric characters can be written as Line 1 or Line 2 below:

1 WORD 'WW301231018313' BOUND '^a-zA-Z0-9'
2 WORD 'WW301231018313' BOUND NONALNUM

Given the WORD configuration above and the sample data below, the GLASS pattern matching engine returns Line 1 and Line 2 of the data as match locations:

1 Customer ID: WW301231018313
2 John Doe|WW301231018313|+65 9876 5432|john.doe@example.com
3 AB1234WW301231018313DE5678

However, Line 3 is not returned as a match location. Although it contains data that resembles the customer ID number, it fails the BOUND pattern rule because the ID is surrounded by alphanumeric characters (4 and D) on both sides.
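
The outcome described above can be reproduced with a rough Python emulation that uses regular-expression lookarounds in place of BOUND NONALNUM. This is a sketch rather than GLASS itself: it assumes that the start or end of a line also satisfies the bound (consistent with Line 1 matching even though the ID ends the line), and the names CUSTOMER_ID, BOUNDED, SAMPLE_LINES, and matching_line_numbers are hypothetical.

```python
import re

CUSTOMER_ID = "WW301231018313"

# Emulate BOUND NONALNUM: the ID must not be directly preceded or followed by
# an ASCII alphanumeric character. Assumption: the start or end of a line
# also counts as a valid boundary.
BOUNDED = re.compile(
    r"(?<![a-zA-Z0-9])" + re.escape(CUSTOMER_ID) + r"(?![a-zA-Z0-9])"
)

SAMPLE_LINES = [
    "Customer ID: WW301231018313",
    "John Doe|WW301231018313|+65 9876 5432|john.doe@example.com",
    "AB1234WW301231018313DE5678",
]


def matching_line_numbers(lines):
    """Return the 1-based numbers of the lines that contain a bounded match."""
    return [n for n, line in enumerate(lines, start=1) if BOUNDED.search(line)]


print(matching_line_numbers(SAMPLE_LINES))  # prints [1, 2]
```

Line 3 is rejected because the lookbehind sees the digit 4 immediately before the ID, which mirrors the failed BOUND rule.
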

See WORD Component and BOUND Rule for more information.