Advanced Data Governance for Office 365

Today, Microsoft announced on the Office blog new capabilities for Advanced Data Governance at the Ignite conference in September 2016. As of April 1, these features have been released into the O365 platform. As with all O365 releases, these are rolling out in waves. These features are only available inside Office 365 at this time.

There are several features released including intelligent data import, threat protection and information governance. Of significant relevance are the new capabilities around information governance policy. Microsoft is providing the ability to assign retention policy to content across the O365 platform, retaining content in-place in each of the application services.

Retention Policies (overview found here) can be defined for managing content in Exchange email, SharePoint sites, OneDrive for Business and Office 365 Groups. Policies can retain content indefinitely or for a specific time period. This retention is based on creation or modified dates. Upcoming releases will support event-based retention (retention triggers) and custom dates. These policies may be applied uniformly to all content in the targets or, through advanced settings, selectively to content containing specific words or phrases. Alternatively, content matching a certain sensitive information type may be targeted for retention. Content that is under policy is retained for the length of the policy specification. Users may still edit, move or delete the content, and the O365 platform will ensure that a copy is preserved in a special location (depending on the application).
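
To make the date arithmetic concrete, here is a minimal sketch of how a time-based retention period could be computed from a creation or modified date. The `RetentionPolicy` and `Item` classes and their field names are hypothetical and exist only for illustration; they are not part of any Office 365 API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RetentionPolicy:
    """Hypothetical model of a retention policy (not a Microsoft API)."""
    name: str
    retain_days: Optional[int]  # None means "retain indefinitely"
    based_on: str               # "created" or "modified"


@dataclass
class Item:
    """Hypothetical piece of content with the two dates a policy can use."""
    created: date
    modified: date


def retention_end(policy: RetentionPolicy, item: Item) -> Optional[date]:
    """Return the last day the item is retained, or None if indefinite."""
    if policy.retain_days is None:
        return None
    start = item.created if policy.based_on == "created" else item.modified
    return start + timedelta(days=policy.retain_days)


def is_retained(policy: RetentionPolicy, item: Item, today: date) -> bool:
    """True while the item is still inside its retention period."""
    end = retention_end(policy, item)
    return end is None or today <= end
```

Under these assumptions, the same document can have different retention end dates depending on whether the policy counts from creation or from last modification, which is worth checking against your own retention schedule.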

Classification Labels (overview found here) may also be defined for automated or manual application on the content. These classification labels are published in policy groups to different locations within the O365 platform. Once published, the classification label is available to the user for tagging the content. Classification Labels have the option to define retention for the content and/or may declare the item as a record. Classification may also be assigned by default at the library and folder level in SharePoint Online.

These additions to the O365 service add a great deal of flexibility for information governance and retention. At the marketing level, these features may seem to fill all possible requirements for a compliant information governance initiative. It is important to understand specific customer requirements and use cases that may make some of these features less applicable to a given situation.

The ability to manually apply classification labels to both email and document content, providing consistent retention, is a key objective of many organizations. These new labels do not remove the difficult problem of manually tagging the content, regardless of what the classification label technically does. Automatic policy assignment methods are provided; however, the granularity, flexibility and level of insight of policy assignment may not meet all requirements. Managing retained content in-place may be a new concept to many organizations, and it is important to understand how combinations of these new features and existing capabilities can be assembled into a solution to meet your requirements and use cases.

This is clearly an important new set of governance features that all Office 365 users need to be aware of. Enterprise content governance takes advantage of the wide variety of tools available. We look forward to discussions regarding use cases with organizations considering ADG.

Thanks Mike, very interesting!
There will probably be a lot of firms and organizations adopting these tools soon (simply because they are easy for Microsoft customers to apply). So I thought it might be informative to list reasons why the Microsoft Advanced Data Governance should NOT be adopted, or why it falls short in significant ways:

USERS DON’T HAVE ACCESS TO CONTENT IN PRESERVATION HOLD LIBRARIES. I.e. Microsoft ADG does manage retention but does not do much to improve access to information by most end users.

For this reason, I don’t see why MS is recommending that ADG Retention Policies should replace in-place records management in SharePoint: in place RM allows users to view content declared as records, while the retention policies don’t do that.

LABELS CAN’T BE APPLIED VIA AN EXPIRY DATE/TIME STAMP. Labels could be a way of getting around the problem that content in preservation hold libraries can’t be seen by users; labelled content CAN be seen, since it remains where it is. However, there seems to be no way to apply a label based on an expiry date or time stamp; it would be much better if labels could only be applied after a certain elapsed time, thus allowing a document to be reviewed and edited, and then automatically become a record. By contrast, labelling a document as a record takes effect immediately and is all or nothing.

AUTO-APPLIED LABELS ARE VERY LIMITED. Labels for retention can be auto-applied based on predefined ‘sensitive information’ types but many organizations won’t be satisfied with these alone. They can also be auto-applied based on search keywords, but this probably won’t be a reliable method of classifying records, compared to a true functional classification, or even classification according to document/content types.

LIMITATIONS ON # OF POLICIES: you can’t have more than 10 organization-wide policies on a tenant. You can get around this by using non-org-wide policies (i.e. policies that don’t apply to all locations in your O365 tenant), but those policies are themselves limited to no more than 1000 mailboxes and 100 sites. Large organizations will have thousands of sites and tens of thousands of mailboxes.

NOT NECESSARY IF YOU ALREADY HAVE A CAPABLE ORG-WIDE EDRMS SYSTEM.

This seems mainly relevant for O365 users without such a system. I’d be interested in others’ reactions to this (especially re the proposed replacement of in place records management with labels).

——————————
UNICEF
——————————

Excellent counterpoint, Eric!

Lorne Rogers Vice-Chair, ISO Trustworthy Content/Document Management President/Senior Management Consultant Aria Consulting Ltd. http://ariaconsulting.net info@ariaconsulting.net

Eric – This note responds to your very interesting post on the AIIM Community sites, responding to my post. You wrote:

“There will probably be a lot of firms and organizations adopting these tools soon (simply because they are easy for Microsoft customers to apply). So I thought it might be informative to list reasons why the Microsoft Advanced Data Governance should NOT be adopted, or why it falls short in significant ways:”

“USERS DON’T HAVE ACCESS TO CONTENT IN PRESERVATION HOLD LIBRARIES. I.e. Microsoft ADG does manage retention but does not do much to improve access to information by most end users. For this reason, I don’t see why MS is recommending that ADG Retention Policies should replace in-place records management in SharePoint: in place RM allows users to view content declared as records, while the retention policies don’t do that.”

The “record declaration” is a state of retention in the context of ADG Retention Policies. The philosophy of ADG Retention Policies is to eliminate the interruption in user processing of content caused by policy application; hence the use of a preservation hold library. Users are not hindered by retention policy enforcement on any item. Record Declaration can occur as part of a Classification Label assignment. This manifests in the behavior of a “declared in-place record” in a given library.

Microsoft is providing a number of options for the behavior of items under policy to meet diverse client requirements. The behavior of Preservation Hold Libraries was established with the Discovery Hold model introduced in the eDiscovery Center.

“LABELS CAN’T BE APPLIED VIA AN EXPIRY DATE/TIME STAMP. Labels could be a way of getting around the problem that content in preservation hold libraries can’t be seen by users; labelled content CAN be seen, since it remains where it is. However, there seems to be no way to apply a label based on an expiry date or time stamp; it would be much better if labels could only be applied after a certain elapsed time, thus allowing a document to be reviewed and edited, and then automatically become a record. By contrast, labelling a document as a record takes effect immediately and is all or nothing.”

Auto-classification and event-based retention are two methods that Microsoft is providing to enable labels to be applied to content after a certain amount of time. Labels can also be changed or updated via these methods.

Since what we are seeing is the initial release of the ADG functionality, we believe that upcoming enhancements such as application of Machine Learning, auto-classification and more date-based criteria, specifically event-based application of policy, will satisfy the specific use case that you are referring to.

“AUTO-APPLIED LABELS ARE VERY LIMITED. Labels for retention can be auto-applied based on predefined ‘sensitive information’ types but many organizations won’t be satisfied with these alone. They can also be auto-applied based on search keywords, but this probably won’t be a reliable method of classifying records, compared to a true functional classification, or even classification according to document/content types.”

Labels can also be applied through query-based choices.

Many of your points here are valid for many organizations that have deep and broad retention and classification requirements. One should consider such classification mechanisms as Sensitive Information types as extensions to traditional manual classification mechanisms. Why should organizations trust that users will classify a report containing sensitive content correctly when the technology is available to automatically do that for them reliably?
Auto-application of labels based on search keywords or property values is anticipated to be as reliable as any of the auto-classification capabilities available on the market today as third-party add-ins.
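
As a rough illustration of what keyword-based auto-application involves, the sketch below assigns the first label whose keywords appear in a piece of text. The `auto_label` function and the rule format are hypothetical and are not how Office 365 implements auto-apply; the point is only to show the general shape of keyword matching and where it can go wrong.

```python
from typing import Dict, List, Optional


def auto_label(text: str, rules: Dict[str, List[str]]) -> Optional[str]:
    """Return the first label whose keywords occur in the text.

    Matching is a case-insensitive substring test, which is deliberately
    naive: it is an assumption for illustration, not a product behavior.
    """
    lowered = text.lower()
    for label, keywords in rules.items():
        if any(keyword.lower() in lowered for keyword in keywords):
            return label
    return None  # no rule matched, so the item stays unlabelled


# Hypothetical rules: label name -> keywords that trigger it.
example_rules = {
    "Contracts": ["agreement", "contract"],
    "HR": ["payroll"],
}
```

With these example rules, a note mentioning a "contractor" would also receive the "Contracts" label, because the keyword "contract" is a substring of it. This is the kind of false positive that any keyword-driven scheme needs to be tested for before it is relied on for records classification.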

Many organizations today are relying on auto-application methodologies to get to “just good enough” records management. “True Functional Classification”, as you state, results in the highest quality result. But at what cost? Many organizations strive for high quality classification only to see the resulting repository become an abandoned information silo. Placing the burden of functional classification on all content is historically a recipe for failure. Microsoft isn’t creating capabilities that necessarily replace or contradict traditional methodologies for classification, but providing capabilities that can be used as part of an overall strategy for Information Governance and Compliance. Your strategy may be weighted to more complex functional classification mechanisms; however, don’t dismiss the applicability of new methodologies because they don’t fit 100% of your organization’s use cases.

“LIMITATIONS ON # OF POLICIES: you can’t have more than 10 organization-wide policies on a tenant. You can get around this by using non-org-wide policies (i.e. policies that don’t apply to all locations in your O365 tenant), but those policies are themselves limited to no more than 1000 mailboxes and 100 sites. Large organizations will have thousands of sites and tens of thousands of mailboxes.”

Your specification of the published limits is correct; however, let’s look at the specifics of these limits.

10 organization-wide policies within a tenant. An organization-wide policy is one that includes ALL sources within the O365 service. This may seem like a limitation to many organizations; however, is it really reasonable to believe that a large number of policies should be applied across Email, OneDrive, SharePoint sites, and O365 Groups? One perspective is that a policy published across all of these locations would, by definition, be very broad or very complex. The limitation of 10 organization-wide policies should be perceived as an initial limitation imposed to ensure a high-performing feature while Microsoft gathers data on practical applications of these organization-wide policies.

1000 Mailboxes and 100 sites. You may have misinterpreted this limitation, as the documentation seems to be confusing. Our interpretation of the limitation is that for a specific policy, up to 1000 mailboxes and 100 sites may be specified as “included” or “excluded”. We can easily see how this could be interpreted as a limitation; however, the identification of 1000 or more mailboxes or hundreds of sites would create an interface nightmare. Consider the enumeration of 100 users, much less 1000. This would be difficult to manage and would result in user input errors. We believe that with the initial release, Microsoft has set very high limits in areas for which they have limited visibility into the exact usage profile.
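
For planning purposes, the limits discussed above can be written down as a simple check. This is only a sketch under our interpretation of the documentation (per-policy caps on explicitly listed mailboxes and sites, plus a tenant-wide cap on organization-wide policies); the constant and function names are hypothetical, and the numbers are the ones quoted in this thread, which Microsoft may change.

```python
from typing import List

# Limits as quoted in this discussion; treat them as assumptions.
MAX_ORG_WIDE_POLICIES = 10
MAX_MAILBOXES_PER_POLICY = 1000
MAX_SITES_PER_POLICY = 100


def check_policy_scope(mailboxes: List[str], sites: List[str]) -> List[str]:
    """Return a list of problems for one non-org-wide policy's explicit scope."""
    problems = []
    if len(mailboxes) > MAX_MAILBOXES_PER_POLICY:
        problems.append(
            f"{len(mailboxes)} mailboxes listed; limit is {MAX_MAILBOXES_PER_POLICY}"
        )
    if len(sites) > MAX_SITES_PER_POLICY:
        problems.append(f"{len(sites)} sites listed; limit is {MAX_SITES_PER_POLICY}")
    return problems


def org_wide_count_ok(org_wide_policy_count: int) -> bool:
    """True if the tenant stays within the organization-wide policy limit."""
    return org_wide_policy_count <= MAX_ORG_WIDE_POLICIES
```

A check like this is mainly useful when mapping an existing retention schedule onto policies, to see early whether the schedule would need to be consolidated.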

Microsoft balances a number of things when releasing new capabilities. One of the most significant is that they cannot provide unlimited specification of processing requirements, as this could significantly impact the overall service in some way. We have seen limitations in the past and have also seen those limits raised over time as more data is collected by Microsoft and the integrity of the service is maintained.

“NOT NECESSARY IF YOU ALREADY HAVE A CAPABLE ORG-WIDE EDRMS SYSTEM. This seems mainly relevant for O365 users without such a system.”

What we have seen is that there is no organization-wide EDRMS system with the global cloud support that Office 365 provides. Managing content in Office 365 enables an organization to avoid the headaches associated with cross-domain authentication, identity management, latency, and synchronization of content. The core benefit of ADG will be to enable most large organizations who use Office 365 to classify, archive and manage the retention of their content in Office 365 instead of deploying expensive EDRMS solutions to duplicate this functionality. This will be especially true for email. We have replaced multiple on-premises EDRMS solutions from the big three ECM vendors with Office 365.

——————————
Gimmal Group
——————————

Mike, thanks for a very detailed response and the inputs directly from Microsoft. In fact it is exactly the kind of response I was hoping for, and you’ve allayed my concerns in most areas.

I’ll structure my reply by my original points:

USERS DON’T HAVE ACCESS TO CONTENT IN PRESERVATION HOLD LIBRARIES. You shared a vital clarification: record declaration via application of a label can effectively be used as in-place records management.

IF I HAVE UNDERSTOOD CORRECTLY, then this means that lack of end user access to content in the preservation hold library is not necessarily an issue.

LABELS CAN’T BE APPLIED VIA AN EXPIRY DATE/TIME STAMP. You’ve indicated that upcoming enhancements will address this requirement…that’s good to know. (And hopefully it will be soon.)

AUTO-APPLIED LABELS ARE VERY LIMITED. For me the most important points in your response are that end-user manual classification is also limited (in reliability), and that for many organizations ‘just good enough’ auto-classification is actually the best in terms of costs and benefits. I do agree that true functional classification requires rigorous application to yield real benefits, and that this includes risk, such as (as you put it) creating abandoned information silos. It’s refreshing to hear a voice from the industry make this point clearly.

LIMITATIONS ON # OF POLICIES. I take your point that org-wide policies are likely to be complex anyway, so perhaps more than 10 is unneeded. Regarding non-org-wide policies, more clarity is needed as your interpretation is different from mine, and (as you say) the Microsoft documentation is unclear.

Thanks again for taking the time to research and respond.

——————————
UNICEF
——————————
