Problem
Enterprise Vault (EV) Compliance Accelerator (CA) introduced the NOTWITHIN Search operator in version 14.3. This operator lets Searches find items in which the first specified term (on the left of NOTWITHIN) appears outside the context defined by the phrase on the right of NOTWITHIN.
Note the following for the terms used in the NOTWITHIN operator:
- The terms are not case sensitive.
- The terms comprising the phrase on the right of NOTWITHIN do not need to be enclosed in double-quotes.
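The intended behaviour of the operator can be illustrated with a minimal Python sketch. This is not how CA or EV Indexing implements NOTWITHIN; the function names are hypothetical, and whole-word matching of the left-hand term is an assumption made only for this illustration.

```python
import re

def occurrences_outside(content: str, term: str, phrase: str) -> list:
    """Return start offsets of `term` that are not inside an occurrence of `phrase`.

    Hypothetical helper: matching is case-insensitive (as noted above) and
    whole-word for the term (an assumption of this sketch).
    """
    flags = re.IGNORECASE
    # Character spans covered by the phrase on the right of NOTWITHIN.
    phrase_spans = [m.span() for m in re.finditer(re.escape(phrase), content, flags)]
    hits = []
    for m in re.finditer(r"\b" + re.escape(term) + r"\b", content, flags):
        inside = any(start <= m.start() and m.end() <= end for start, end in phrase_spans)
        if not inside:
            hits.append(m.start())
    return hits

def matches_notwithin(content: str, term: str, phrase: str) -> bool:
    """True when at least one occurrence of `term` lies outside `phrase`."""
    return bool(occurrences_outside(content, term, phrase))

footer = "This email and any files transmitted with it are internal only"
print(matches_notwithin(footer, "internal", footer))                      # False
print(matches_notwithin("Internal memo. " + footer, "internal", footer))  # True
```

In the first call the only occurrence of the term is inside the phrase, so the item would not be a hit. In the second call the term also appears before the phrase, so the item would be a hit.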
Under certain circumstances, Searches using the NOTWITHIN operator may provide unexpected results and/or behaviour. Here are some example scenarios:
- There are multiple lines that contain the NOTWITHIN operator.
For example, the following lines, listed in the Content field with the Any Of operator, do not return the expected Search hits:
confidential NOTWITHIN Disclaimer: This email and any files transmitted with it are confidential
internal NOTWITHIN This email and any files transmitted with it are internal only
- Unexpected Hit Highlighting is seen in Review.
Error Message
None.
Cause
When the NOTWITHIN operator was used on multiple lines, the Search Criteria was parsed incorrectly when compiled for EV Indexing, resulting in unexpected results and/or behaviour.
The issue of unexpected Hit Highlighting when using multiple lines containing the NOTWITHIN operator was due to incorrect RegEx patterns/logic processing of the left and right sides of the operator.
Solution
The issue was first discovered in CA version 14.5.2, and may be present in other versions up to any fix version(s) listed below.
The issue of using the NOTWITHIN operator on multiple lines is fixed in the following release(s), available in the Download Center at https://www.veritas.com/support:
- Enterprise Vault 15.1.2
The issue of unexpected Hit Highlighting when using multiple lines containing the NOTWITHIN operator is under investigation.
There are currently no plans to address this issue through a patch or hotfix in the current or previous versions of the software. However, it is scheduled to be resolved in the next major product revision. Please note that the product engineering team reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests. Our plans are subject to change, and any actions you take based on this information, or your reliance on it, are at your own risk.
The fix for the issue of unexpected Hit Highlighting was tested for the following in the Subject and/or Content fields:
- Expected Hit Highlighting for NOTWITHIN Search terms.
- Expected Hit Highlighting for NOTWITHIN Hotwords.
- Hotword Statistics and Navigation for NOTWITHIN Hotwords (Surveillance only).
- Terms using lower case and upper case characters.
Known Conditions When Using Multiple Lines Containing NOTWITHIN Operators
- When multiple lines containing the NOTWITHIN operator use the same term on the left side of the operator, Hit Highlighting may behave inconsistently. Specifically, terms used on the left side of the operator may be highlighted in phrases listed on the right side of the operator that should otherwise not be highlighted.
For example:
Search Term 1: text NOTWITHIN this is a system generated text message
Search Term 2: text NOTWITHIN text reminders
Item Content: text reminders, this is a text sent with importance.
In this case, the item would be returned by the Search because the term text occurs outside the NOTWITHIN phrases. However, Hit Highlighting could highlight the term text in the phrase text reminders. This happens because the RegEx pattern for one line overlaps the pattern(s) for the other line(s). Correcting this programmatically would require combining the RegEx patterns for each line into a much longer and more complex pattern, which would require more resources and time to process. This could result in delays and/or timeouts in the item's Preview, and in errors if the combined RegEx pattern exceeds the pre-programmed (non-configurable) 500-character threshold.
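The overlap can be reproduced with a small Python sketch that uses the example above. It is only an illustration under stated assumptions: the pattern shape (match the excluded phrase first, capture the term only outside it) and the names line_pattern and highlight_offsets are hypothetical, not the RegEx that CA actually generates. The 500-character figure is taken from the text above; how CA measures pattern length is not known here.

```python
import re

MAX_PATTERN_LENGTH = 500  # threshold quoted above; its exact measurement is an assumption

def line_pattern(term: str, phrases: list) -> str:
    """Hypothetical pattern: consume any excluded phrase, else capture the term."""
    excluded = "|".join(re.escape(p) for p in phrases)
    return rf"(?:{excluded})|(\b{re.escape(term)}\b)"

def highlight_offsets(content: str, pattern: str) -> set:
    """Start offsets of the term occurrences that the pattern would highlight."""
    return {m.start(1) for m in re.finditer(pattern, content, re.IGNORECASE) if m.group(1)}

content = "text reminders, this is a text sent with importance."
phrase_1 = "this is a system generated text message"
phrase_2 = "text reminders"

# Each line evaluated on its own, then the highlights merged.
per_line = (highlight_offsets(content, line_pattern("text", [phrase_1]))
            | highlight_offsets(content, line_pattern("text", [phrase_2])))

# One combined pattern that excludes both phrases at once.
combined_pattern = line_pattern("text", [phrase_1, phrase_2])
combined = highlight_offsets(content, combined_pattern)

print(sorted(per_line))   # [0, 26]
print(sorted(combined))   # [26]
print(len(combined_pattern) <= MAX_PATTERN_LENGTH)  # True for this short example
```

Evaluated per line, the first line knows nothing about text reminders, so it highlights the occurrence at offset 0, which is inside the second line's phrase. The combined pattern avoids this, but its length grows with every additional line, which is the trade-off described above.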