Classifying data for external distribution


We currently classify our information based on its level of sensitivity. Anything with a confidential label is caught in our email protection software and requires manual release out of the network to ensure that confidential information isn’t being sent arbitrarily outside of our company.

Our problem is that we send out a large volume of RFPs, pricing schedules, contracts, etc. that we don’t want our competitors to have access to, but that we also do not want to have to manually review and release. How do other companies handle this? Some of the options I’ve explored include:

adding an additional classification for this type of information

forcing users to review the document for confidential information before sending it out (the email software would prompt them)

Any other ideas?


Have you considered creating an authenticated extranet site to host the documents? That would mean that:
- The content in said site would be authoritative
- It would be a single place where both internal personnel and external stakeholders could go to get said authoritative content
- You would be vastly increasing security over the uncontrolled realm of email
- Monitoring and workflows could potentially be created to see how effective you are being with that content (“effective” being defined as whatever KPIs make sense for your organization)
- Related to monitoring, but with a different purpose, you could also institute auditing
Just a thought.


This is an interesting problem to solve because of the assumed volume. In a nutshell, the use of semantic technology and rules-based evaluation is how I might tackle this. Essentially, you build a library of terms, phrases, and synonyms for those terms and phrases, covering the types of sensitive content that might be shared.

An example of this is ‘Pricing Schedule’, ‘Price Schedule’, ‘Schedule A’, ‘Fee Matrix’. They could all mean the same thing, right? So we ingest the document and its unstructured content. We then build one rule to look for “Price Schedule” in the data, and a synonym library for all the variations of “Price Schedule” that can be maintained separately, outside the application, for flexibility. Now when a document is created, all of its unstructured content runs through this rule/workflow/process. You then automatically start adding flags or attributes that might indicate it’s sensitive and route it appropriately.
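To make that a bit less muddy, here is a minimal Python sketch of the rule-plus-synonym-library idea. It is only an illustration, not how any particular product does it: the `SYNONYMS` table, the function names, and the "review"/"send" routing labels are all assumptions made up for the example.

```python
import re

# Hypothetical synonym library. In practice this would live outside the
# application (a file or database) so it can be maintained separately.
SYNONYMS = {
    "price schedule": [
        "pricing schedule",
        "price schedule",
        "schedule a",
        "fee matrix",
    ],
}


def build_rules(synonyms):
    """Compile one case-insensitive pattern per canonical term."""
    rules = {}
    for canonical, variants in synonyms.items():
        # Allow any run of whitespace between the words of a variant.
        alternatives = [
            r"\s+".join(re.escape(word) for word in variant.split())
            for variant in variants
        ]
        rules[canonical] = re.compile(
            r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE
        )
    return rules


def flag_document(text, rules):
    """Return the sorted canonical terms whose variants appear in the text."""
    return sorted(term for term, pattern in rules.items() if pattern.search(text))


RULES = build_rules(SYNONYMS)
flags = flag_document("Please see the attached Fee  Matrix for 2024.", RULES)
# Hypothetical routing: flagged documents go to review, the rest are sent.
route = "review" if flags else "send"
```

One caveat with plain phrase matching: a variant like ‘Schedule A’ will also hit ordinary sentences such as "please schedule a meeting", so short variants probably need case-sensitive matching or extra context before they are trusted.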

Clear as mud?

RhinoDox 844.RHINODOX

Hi Kate,
Is this, by chance, running in SharePoint? I have implemented many large-scale RFI/RFP and procurement solutions on both SharePoint on-premises and Office 365.

If you’d let me know the technology, I’m certain I could point you in the right direction.

Microsoft SharePoint MVP

I shared your question with Mike, our Kofax Sales Engineer, and here is what he had to say – please let me know if you’d like to have a phone conversation with him. He’s excellent with solution advice!

Hi Nicki,

This is an interesting problem, since the documents in question are in electronic form, so classification is a manual process as one checks the documents into a repository (ECM).

The easiest method, in my opinion, is through the document taxonomy. Adding another classification level makes the most sense, but it’s only as effective as the person classifying the document. As long as you have consistent and accurate classification, the problem is solved.

Forcing a user to review the document prior to hitting send, while a good idea, is only as effective as the diligence of the sender. For somebody sending lots of documents per day, I can see that method quickly falling on its sword.

Password-protecting sensitive documents might be an option, but there are many problems with that approach.

Email audit trails are probably the only way to enforce compliance. Identify the culprits that send out sensitive information and act accordingly.
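As a rough illustration of what working from an audit trail could look like, here is a small Python sketch. The record layout (sender, recipient, marking), the sample addresses, and the `INTERNAL_DOMAIN` value are all made up for the example; a real mail system’s audit export will look different.

```python
from collections import Counter

# Hypothetical audit records: (sender, recipient, marking).
AUDIT_LOG = [
    ("ann@corp.example", "buyer@customer.example", "commercial"),
    ("bob@corp.example", "friend@mail.example", "confidential"),
    ("bob@corp.example", "x@rival.example", "confidential"),
    ("ann@corp.example", "team@corp.example", "confidential"),
]
# Hypothetical company domain used to tell internal from external mail.
INTERNAL_DOMAIN = "corp.example"


def external_confidential_sends(log):
    """Count confidential messages each sender sent outside the company."""
    counts = Counter()
    for sender, recipient, marking in log:
        domain = recipient.rsplit("@", 1)[-1].lower()
        if marking == "confidential" and domain != INTERNAL_DOMAIN:
            counts[sender] += 1
    return counts


report = external_confidential_sends(AUDIT_LOG)
```

A report like this only tells you who to talk to after the fact; it does not stop anything from leaving.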


I think the classification of the document would appear relevant. I believe a way to view this is in terms of handling restrictions. These may, or may not, remain the same in any given scenario for the same security marking.

We can consider handling by way of 3 components:
The sender
The security marking
The recipient
You could consider a whitelist of recipient domains, people, roles, etc. that allows a simpler handling pathway for sensitive information. Handling restrictions allow fine-tuning of security classification.
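A minimal sketch of how those three components might combine into a handling decision, in Python. Everything named here is hypothetical: the "commercial" marking stands in for an additional classification level, and the trusted domains, roles, and "send"/"hold" outcomes are placeholders rather than anyone’s actual policy.

```python
from dataclasses import dataclass

# Hypothetical whitelist values, for illustration only.
TRUSTED_DOMAINS = {"customer.example", "partner.example"}
SALES_ROLES = {"sales", "procurement"}


@dataclass(frozen=True)
class Message:
    sender_role: str
    marking: str    # e.g. "public", "commercial", "confidential"
    recipient: str  # full email address


def handling(msg: Message) -> str:
    """Return "send" or "hold" based on sender, marking and recipient."""
    domain = msg.recipient.rsplit("@", 1)[-1].lower()
    if msg.marking == "public":
        return "send"
    if msg.marking == "commercial":
        # Simpler pathway: an approved role sending to an approved domain.
        if msg.sender_role in SALES_ROLES and domain in TRUSTED_DOMAINS:
            return "send"
        return "hold"
    # Anything else, including "confidential" and unknown markings, is held.
    return "hold"
```

The same marking gives a different outcome depending on who sends and who receives, which is the point of treating handling restrictions separately from the classification itself. Defaulting unknown markings to "hold" is a design choice made for this sketch.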

