Searching Documents For Keywords Using PinPoint Auditor's Contextual Keyword System
The ability to search text-based files and documents for user-defined words and phrases has long been a useful weapon in the PinPoint Auditor arsenal. With comprehensive pre-built keyword lists in a number of key categories, PPA has made searching within documents for inappropriate, adult or other policy-violating content an easy process.
However, any system that searches for keyphrases is prone to false triggers, because language is imprecise and words often change meaning depending on the words around them. For example, the word 'naked' can refer to nudity, as in 'naked bodies', or to something benign, as in a 'naked flame'.
Introducing PPA's Powerful Contextual Keywords
Today we are proud to announce the introduction of a new contextual keyword system that takes keyphrase scanning to a new level of accuracy. It allows users to focus on the information they want by compensating for changes in the meaning of keywords caused by the words that surround them.
Accuracy can be vastly improved by using the context keyword system. Users can perform finely targeted searches by adjusting the score of keyphrase 'hits' so that they score higher or lower depending on their context in the surrounding text.
How It All Works
Let's start with an example. Imagine a case where an operator is looking for instances of the word 'naked' to find references to nudity. In the past, every match had to be screened manually, keeping those where nudity was involved and discarding those where the word meant something else. By using context keywords, these false triggers can be greatly reduced.
Now, the user can input the parent keyword 'naked', and then specify context keywords that boost or cut the match score, such as 'bodies' and 'flame'. This narrows the scope of search hits for this parent keyword.
In addition, the user can specify how each context keyword should be positioned relative to the parent keyword, and how compensation should be applied when that combination is detected. The following options can be used:
- The context word must be the word immediately before (or after) the parent keyword
- The context word must be attached to the front (or end) of the parent keyword
- The context word must be within X characters from the parent keyword (before, after or both)
- What rating (increase or decrease) should be applied if the specified combination is detected
This gives fine control over the positioning of context keywords with respect to the parent keyword, and gives a much greater degree of accuracy when it comes to how keywords are treated during scanning.
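To make the options above concrete, here is a minimal sketch in Python of how position-based score adjustment could work. It is not PPA's actual implementation or API: the names `ContextRule`, `rule_matches` and `score_hits`, the position labels, and the base score of 10 are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class ContextRule:
    """Hypothetical rule tying a context keyword to a parent keyword."""
    word: str            # context keyword, e.g. "flame"
    position: str        # "word_before", "word_after", "prefix", "suffix" or "within"
    adjust: int          # rating change applied when the rule matches
    distance: int = 0    # "within" only: maximum characters from the parent keyword
    side: str = "both"   # "within" only: "before", "after" or "both"


def rule_matches(rule, text, start, end):
    """Check one rule against a parent keyword hit located at text[start:end]."""
    before, after = text[:start].lower(), text[end:].lower()
    w = rule.word.lower()
    if rule.position == "word_before":   # the word immediately before the parent
        return re.search(r"\b" + re.escape(w) + r"\W+$", before) is not None
    if rule.position == "word_after":    # the word immediately after the parent
        return re.match(r"\W+" + re.escape(w) + r"\b", after) is not None
    if rule.position == "prefix":        # attached to the front of the parent
        return before.endswith(w)
    if rule.position == "suffix":        # attached to the end of the parent
        return after.startswith(w)
    if rule.position == "within":        # within `distance` characters of the parent
        near_before = before[max(0, len(before) - rule.distance):]
        near_after = after[:rule.distance]
        hit_before = rule.side in ("before", "both") and w in near_before
        hit_after = rule.side in ("after", "both") and w in near_after
        return hit_before or hit_after
    raise ValueError(f"unknown position: {rule.position}")


def score_hits(text, parent, base_score, rules):
    """Return (offset, score) for every case-insensitive hit of the parent keyword."""
    hits = []
    for m in re.finditer(re.escape(parent), text, flags=re.IGNORECASE):
        score = base_score
        for rule in rules:
            if rule_matches(rule, text, m.start(), m.end()):
                score += rule.adjust
        hits.append((m.start(), score))
    return hits


rules = [
    ContextRule("bodies", "word_after", +5),   # boosts 'naked bodies'
    ContextRule("flame", "word_after", -10),   # cuts 'naked flame'
]
sample = "Keep a naked flame away from fuel. The page shows naked bodies."
for offset, score in score_hits(sample, "naked", 10, rules):
    print(offset, score)
```

With these assumed rules, the 'naked flame' hit drops to a score of 0 while the 'naked bodies' hit rises to 15, so a reviewer sorting by score would see the relevant match first.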
Narrowing Overly Broad Searches Iteratively
The new contextual keyphrase system can be used right now to improve the performance of existing keyword configurations. Users can optimize the performance of their keyword searches by adopting an iterative approach as follows:
- Run a keyphrase scan with settings as they are
- Review the results and make a list of the false triggers that occur, noting the primary keyphrase and any context keywords that clarify its meaning
- Prevent future false triggers for that keyphrase by associating the recorded context words with that primary keyphrase
- Repeat
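The review step in this loop can be supported by a small script outside of PPA. The sketch below is purely illustrative: it assumes the false triggers have been copied out by hand as short text snippets, and the helper names `neighbour_words` and `suggest_context_words` are hypothetical. It counts the words that sit next to the primary keyphrase in those snippets, so that the most frequent neighbours can be considered as context keywords.

```python
import re
from collections import Counter


def neighbour_words(text, parent, window=1):
    """Yield the words within `window` words either side of each parent hit."""
    words = re.findall(r"[a-z']+", text.lower())
    for i, word in enumerate(words):
        if word == parent:
            yield from words[max(0, i - window):i]
            yield from words[i + 1:i + 1 + window]


def suggest_context_words(false_triggers, parent, window=1, top=5):
    """Count neighbouring words across snippets that were marked as false triggers."""
    counts = Counter()
    for snippet in false_triggers:
        counts.update(neighbour_words(snippet, parent, window))
    return counts.most_common(top)


# Snippets a reviewer flagged as false triggers for the keyphrase 'naked'.
flagged = ["a naked flame", "the naked flame", "visible to the naked eye"]
print(suggest_context_words(flagged, "naked", top=3))
```

The suggestions are only candidates. Common words such as 'the' will also appear in the counts, so the reviewer still decides which neighbours are meaningful enough to record against the primary keyphrase.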
Great! How Do I Get It?
Existing maintenance customers are invited to download the latest version using the download location provided after purchase. It will work with your existing license. If you don’t have this information, contact us here and we will forward it to you.
If you are a new customer interested in PinPoint Auditor, please contact us for a free trial.