Searching Documents For Keywords Using PinPoint Auditor's Contextual Keyword System

The ability to search text-based files and documents for user-defined words and phrases has long been a useful weapon in the PinPoint Auditor arsenal. With comprehensive pre-built keyword lists in a number of key categories, PPA has made it easy to search within documents for inappropriate, adult or other policy-violating content.

However, any system that searches for keyphrases is prone to false triggers, because language is imprecise and words often have multiple meanings that change with the words around them. For example, the word 'naked' can refer to nudity, as in 'naked bodies', or to something benign, such as a 'naked flame'.

Introducing PPA's Powerful Contextual Keywords

Today we are proud to announce the introduction of a new contextual keyword system that takes keyphrase scanning to a new level of accuracy. It allows users to focus on the information they want by compensating for changes in the meaning of keywords caused by the words that surround them.

PinPoint Auditor's Contextual Keywords Dialog

Accuracy can be vastly improved by using the contextual keyword system. Users can perform finely targeted searches by adjusting the score of keyphrase 'hits' so that they score higher or lower depending on the surrounding text.

How It All Works

Let's start with an example. Imagine a case where an operator is looking for instances of the word 'naked' to find references to nudity. In the past, every match had to be screened manually, keeping those where nudity was involved and discarding those where the word meant something else. By using context keywords, these false triggers can be eliminated.

Now, the user can input the parent keyword 'naked', and then specify context keywords that boost or cut the match score, such as 'bodies' and 'flame'. This narrows the scope of search hits for this parent keyword.
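
To make the idea concrete, here is a minimal sketch in Python of how a parent keyword's score might be boosted or cut by nearby context keywords. It is an illustration only, not PinPoint Auditor's implementation: the base score, the adjustment values, the 20-character window and the function name score_hits are all assumptions made for this example.

```python
import re

# Hypothetical values chosen for illustration; they are not PPA defaults.
BASE_SCORE = 10
CONTEXT_ADJUSTMENTS = {"bodies": +5, "flame": -10}


def score_hits(text, parent="naked", window=20):
    """Return a (position, score) pair for each parent-keyword hit in text."""
    results = []
    lowered = text.lower()
    pattern = r"\b" + re.escape(parent) + r"\b"
    for match in re.finditer(pattern, lowered):
        # Collect the words found within `window` characters of the hit.
        start = max(0, match.start() - window)
        end = min(len(lowered), match.end() + window)
        nearby_words = set(re.findall(r"\w+", lowered[start:end]))

        # Start from the base score and apply each matching adjustment.
        score = BASE_SCORE
        for word, delta in CONTEXT_ADJUSTMENTS.items():
            if word in nearby_words:
                score += delta
        results.append((match.start(), score))
    return results


for sample in ("naked bodies", "a naked flame", "naked"):
    print(sample, score_hits(sample))
```

With these assumed values, a hit next to 'bodies' scores above the base, a hit next to 'flame' drops to zero, and a hit with no context keeps the base score.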

Editing a keyword in PinPoint Auditor

In addition, the user can specify where a context keyword must appear relative to the parent keyword, and what score adjustment should be applied when it is detected. The following options can be used:

  • The context word must be the word immediately before (or after) the parent keyword
  • The context word must be attached to the front (or end) of the parent keyword
  • The context word must be within X characters from the parent keyword (before, after or both)
  • What rating (increase or decrease) should be applied if the specified combination is detected

Editing a context keyword in PinPoint Auditor

This gives fine control over the positioning of context keywords with respect to the parent keyword, and gives a much greater degree of accuracy when it comes to how keywords are treated during scanning.
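
The positioning options listed above can be sketched in Python as a small rule type plus a matching function. This is a hedged illustration under our own assumptions, not the product's code: the names ContextRule, rule_matches and score_hits, the mode labels "adjacent", "attached" and "within", and the base score of 10 are all hypothetical. Context words are assumed to be given in lowercase.

```python
import re
from dataclasses import dataclass


@dataclass
class ContextRule:
    word: str          # context keyword (lowercase), e.g. "flame"
    mode: str          # "adjacent", "attached" or "within"
    side: str          # "before", "after" or "both"
    adjustment: int    # score change applied when the rule matches
    distance: int = 0  # X characters, used only by the "within" mode


def rule_matches(text, start, end, rule):
    """Check one rule against a parent-keyword hit at text[start:end]."""
    before, after = text[:start], text[end:]
    word = re.escape(rule.word)
    if rule.mode == "adjacent":
        # Context word is the whole word directly before or after the hit.
        hit_before = re.search(r"\b" + word + r"\W+$", before) is not None
        hit_after = re.match(r"\W+" + word + r"\b", after) is not None
    elif rule.mode == "attached":
        # Context word is joined to the front or end of the parent keyword.
        hit_before = before.endswith(rule.word)
        hit_after = after.startswith(rule.word)
    elif rule.mode == "within":
        # Context word appears within `distance` characters of the hit.
        hit_before = rule.word in before[max(0, len(before) - rule.distance):]
        hit_after = rule.word in after[:rule.distance]
    else:
        raise ValueError(f"unknown mode: {rule.mode}")
    if rule.side == "before":
        return hit_before
    if rule.side == "after":
        return hit_after
    return hit_before or hit_after


def score_hits(text, parent, rules, base=10):
    """Return the adjusted score of every parent-keyword hit in text."""
    lowered = text.lower()
    scores = []
    for match in re.finditer(re.escape(parent.lower()), lowered):
        score = base
        for rule in rules:
            if rule_matches(lowered, match.start(), match.end(), rule):
                score += rule.adjustment
        scores.append(score)
    return scores


rules = [
    ContextRule("bodies", "adjacent", "after", +5),
    ContextRule("flame", "within", "both", -10, distance=20),
    ContextRule("half", "attached", "before", +3),
]
print(score_hits("A naked, flickering flame", "naked", rules))
```

In this sketch the parent keyword is found by plain substring search so that the "attached" mode can see words joined directly to it. A real scanner would likely treat word boundaries and punctuation more carefully.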

Narrowing Overly Broad Searches Iteratively

The new contextual keyphrase system can be used right now to improve the performance of existing keyword configurations. Users can optimize the performance of their keyword searches by adopting an iterative approach as follows:

  1. Run a keyphrase scan with settings as they are
  2. Review results, and make a list of false triggers that occur, noting the primary keyphrase and any other context keywords that clarify the meaning of the primary keyphrase.
  3. Prevent future false triggers for that keyphrase by associating the recorded context words to that primary keyphrase
  4. Repeat

Repeating this sequence during regular scans will greatly reduce or even eliminate false triggers. This improves the user's productivity, since less time needs to be spent reviewing results.
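
Step 2 of this loop can be partly assisted by a script. The following Python sketch tallies the words that appear near a keyphrase in snippets a reviewer has already marked as false triggers, so that the most frequent ones can be considered as context keywords. The function name suggest_context_words, the 30-character window and the small stop-word list are assumptions made for this illustration; PinPoint Auditor itself is not involved, and the reviewer still decides which words to add.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list, assumed for this example only.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "in", "is"}


def suggest_context_words(false_triggers, parent, window=30, top=5):
    """Count words found near `parent` in reviewed false-trigger snippets."""
    counts = Counter()
    needle = re.escape(parent.lower())
    for snippet in false_triggers:
        lowered = snippet.lower()
        for match in re.finditer(needle, lowered):
            # Take the text on each side of the hit, up to `window` characters.
            start = max(0, match.start() - window)
            left = lowered[start:match.start()]
            right = lowered[match.end():match.end() + window]
            words = re.findall(r"[a-z']+", left + " " + right)
            counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top)


reviewed = [
    "Never leave a naked flame unattended",
    "a naked flame near the curtains",
    "visible to the naked eye",
]
print(suggest_context_words(reviewed, "naked", top=1))
```

For the three sample snippets, 'flame' is the most frequent neighbour, which makes it a natural candidate for a score-reducing context keyword.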

Great! How Do I Get It?

Existing maintenance customers are invited to download the latest version using the download location provided after purchase. It will work with your existing license. If you don’t have this information, contact us here and we will forward it to you.

If you are a new customer interested in PinPoint Auditor, please contact us for a free trial.