To effectively handle subject access requests (SARs), you are tasked to define
a reusable custom GLASS data type that can identify all
locations across your organization's data storage platforms that contain data
pertaining to a specific individual, Clark Kent.
The first name and last name may be separated by a tab (\t), whitespace, comma (,), or pipe (|) character, and should be matched across line breaks.
In Part 1, we use the MAP operator to declare separate namespaces for the subject's first name and last name.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
See MAP Namespace for more information.
In Part 2, we define the base expression that searches for the first name followed by (THEN) the last name of the subject.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
GROUP 'FIRST_NAME' THEN \
GROUP 'LAST_NAME'
See GROUP, and THEN and OR, for more information.
In Part 3, we add on the GLASS expression for the range of possible characters that join the subject's first name and last name.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
GROUP 'FIRST_NAME' THEN \
RANGE ' ,|\t\r\n\f\v' TIMES 1-5 THEN \
GROUP 'LAST_NAME'
Using the TIMES option for the RANGE operator will allow the following lines to be returned as matches:
1 | Clark, Kent |
2 | Clark | Kent |
3 | clark kent |
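For readers who think in regular expressions, the joining rule in Part 3 can be approximated in Python. This is only an analogy and not a description of how the GLASS engine works: it assumes that RANGE with TIMES 1-5 behaves like a character class repeated one to five times, and that MAP NOCASE behaves like case-insensitive matching. The names SEPARATORS, NAME_PATTERN and part3_results are made up for this sketch.

```python
import re

# Assumption: RANGE ' ,|\t\r\n\f\v' TIMES 1-5 acts like a character class
# repeated between one and five times.
SEPARATORS = r"[ ,|\t\r\n\f\v]{1,5}"

# Assumption: MAP NOCASE corresponds to case-insensitive matching.
NAME_PATTERN = re.compile("clark" + SEPARATORS + "kent", re.IGNORECASE)

# The three sample lines from Part 3, plus one with no separator at all.
samples = ["Clark, Kent", "Clark | Kent", "clark kent", "ClarkKent"]

# True where the first name is followed by 1-5 separators and the last name.
part3_results = [bool(NAME_PATTERN.search(s)) for s in samples]
print(part3_results)
```

In this sketch the last sample fails because at least one separator character is required between the two names.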
In Part 4, we address Requirement #4 by adding boundary rules for the anchor pattern. Using the BOUND operator, we can reduce the number of potential false positive matches by only reporting first name, last name matches if they are surrounded by non-alphanumeric (NONALNUM) characters.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
(GROUP 'FIRST_NAME' BOUND LEFT NONALNUM) THEN \
(RANGE ' ,|\t\r\n\f\v' TIMES 1-5) THEN \
(GROUP 'LAST_NAME' BOUND RIGHT NONALNUM)
The parentheses () in the expression do not change the logic or precedence of operations. They are added for readability.
See BOUND and Preset Keywords for more information.
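The effect of the boundary rules can be illustrated with a rough Python equivalent. This is a sketch under stated assumptions rather than the GLASS implementation: BOUND LEFT NONALNUM and BOUND RIGHT NONALNUM are approximated with lookarounds that reject an adjacent ASCII letter or digit, and these lookarounds also accept the very start or end of the text, which may or may not mirror GLASS behavior. BOUNDED and has_bounded_match are hypothetical names.

```python
import re

# Same separator assumption as the RANGE ... TIMES 1-5 expression.
SEPARATORS = r"[ ,|\t\r\n\f\v]{1,5}"

# Assumption: the BOUND rules are modelled as "no letter or digit directly
# before the first name" and "no letter or digit directly after the last name".
BOUNDED = re.compile(
    r"(?<![A-Za-z0-9])clark" + SEPARATORS + r"kent(?![A-Za-z0-9])",
    re.IGNORECASE,
)


def has_bounded_match(text: str) -> bool:
    """Return True if the text contains a bounded first name, last name pair."""
    return BOUNDED.search(text) is not None


print(has_bounded_match("Name: Clark Kent."))  # name surrounded by punctuation
print(has_bounded_match("Clark Kentucky"))     # last name runs into more letters
print(has_bounded_match("McClark Kent"))       # first name preceded by letters
```

Without the boundary checks, the second and third inputs would be reported as matches, which is the kind of false positive the BOUND operator is meant to reduce.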
In Part 5, we define an ALIAS for the expression in Part 4.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
# SAR search - first name followed by last name
ALIAS 'SAR_NAME_ENGLISH_SEQ_1' \
(GROUP 'FIRST_NAME' BOUND LEFT NONALNUM) THEN \
(RANGE ' ,|\t\r\n\f\v' TIMES 1-5) THEN \
(GROUP 'LAST_NAME' BOUND RIGHT NONALNUM)
REFER 'SAR_NAME_ENGLISH_SEQ_1'
See ALIAS and REFER for more information.
In Part 6, we define an ALIAS similar to the one in Part 5, but for the last name, first name sequence instead.
We join both ALIAS expressions using the OR operator so that either sequence will match.
MAP NOCASE 'FIRST_NAME' 'clark'
MAP NOCASE 'LAST_NAME' 'kent'
# SAR search - first name followed by last name
ALIAS 'SAR_NAME_ENGLISH_SEQ_1' \
(GROUP 'FIRST_NAME' BOUND LEFT NONALNUM) THEN \
(RANGE ' ,|\t\r\n\f\v' TIMES 1-5) THEN \
(GROUP 'LAST_NAME' BOUND RIGHT NONALNUM)
# SAR search - last name followed by first name
ALIAS 'SAR_NAME_ENGLISH_SEQ_2' \
(GROUP 'LAST_NAME' BOUND LEFT NONALNUM) THEN \
(RANGE ' ,|\t\r\n\f\v' TIMES 1-5) THEN \
(GROUP 'FIRST_NAME' BOUND RIGHT NONALNUM)
REFER 'SAR_NAME_ENGLISH_SEQ_1' OR REFER 'SAR_NAME_ENGLISH_SEQ_2'
The FIRST_NAME and LAST_NAME namespaces can be easily modified without impacting the rest of the GLASS pattern.
1 | Clark, Kent |
2 | kent|clark |
3 | <filler text> Kent |
4 | Clark |
5 | |
Based on the GLASS expression in Part 6, line 1 and line 2 will be returned as match locations by the GLASS pattern matching engine.
Line 3 and line 4 will be returned as a single match (Kent\nClark), as the expression in Part 3 accounts for last name and first name matches across line breaks.
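The expected results can be checked against a rough Python analogy of the Part 6 pattern. This is a sketch, not the GLASS engine: the two ALIAS expressions are modelled as two regular expressions joined by alternation, the BOUND rules as lookarounds that also accept the start or end of the text, and the sample string is a reading of the four sample lines described above. The helper sequence and the names SEQ_1, SEQ_2 and SAR_NAME are hypothetical.

```python
import re

# Stand-ins for the MAP namespaces; change these to search for another subject.
FIRST_NAME = "clark"
LAST_NAME = "kent"

# Assumed equivalent of RANGE ' ,|\t\r\n\f\v' TIMES 1-5.
SEPARATORS = r"[ ,|\t\r\n\f\v]{1,5}"


def sequence(left: str, right: str) -> str:
    """Build one name order with assumed NONALNUM boundaries on both sides."""
    return (
        r"(?<![A-Za-z0-9])" + re.escape(left)
        + SEPARATORS
        + re.escape(right) + r"(?![A-Za-z0-9])"
    )


SEQ_1 = sequence(FIRST_NAME, LAST_NAME)   # first name followed by last name
SEQ_2 = sequence(LAST_NAME, FIRST_NAME)   # last name followed by first name

# Either sequence may match, mirroring REFER ... OR REFER ...
SAR_NAME = re.compile(f"(?:{SEQ_1})|(?:{SEQ_2})", re.IGNORECASE)

sample = "Clark, Kent\nkent|clark\n<filler text> Kent\nClark\n"
matches = [m.group(0) for m in SAR_NAME.finditer(sample)]
print(matches)
```

Under these assumptions the sketch finds three matches: one on line 1, one on line 2, and one spanning lines 3 and 4, because the newline is part of the separator set.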