I am starting to see a convergence of two major trends in the marketplace: information governance and Big Data. I am coining the term “Big Data Governance” to reflect this emerging trend. I define Big Data Governance as the formulation of policy to optimize, secure, and leverage Big Data as an enterprise asset by aligning the objectives of multiple functions.
Here is the framework that I have developed to establish the scope of information governance:
- Master Data Governance
This covers a single view of customers, materials, vendors, employees, and the chart of accounts. Each data domain has specific attributes that need to be fit for purpose. For example, phone number is an important attribute for the customer data domain, because the enterprise needs valid contact information to reach its customers.
- Reference Data Governance
This covers data that is relatively static, such as codes for countries, states or provinces, currencies, industries, and customer segments.
- Big Data Governance
This covers social media (Twitter feeds, blogs, Facebook pages, LinkedIn profiles), cell phone GPS data, sensor data, weather data, etc. Such data tends to be operational in nature and meets the three “V” criteria: volume, velocity, and variety.
Most of my clients are implementing information governance programs today. These programs focus on the governance of master data and, to a lesser extent, reference data. Based on my conversations with them, I expect that clients will increasingly focus on the governance of Big Data over the next 12 to 18 months.
Big Data Governance programs need to focus on issues that are similar to those of other information governance initiatives. For example, these programs need to address the following:
- Information Lifecycle Management – Big Data programs need to ensure that storage costs do not spiral out of control.
- Data Quality – Organizations need to establish what level of data quality is “good enough,” because the high volume and velocity of Big Data make it impractical to cleanse every record.
- Metadata – Big Data Governance needs to create sound metadata to avoid situations such as one in which a company bought the same dataset twice because it was named differently in two repositories.
- Privacy – Enterprises need to be very specific about how they adhere to privacy requirements in activities such as social media analytics.
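To make the metadata point concrete, here is a minimal sketch of how a catalog could notice that two differently named datasets are the same data. It is only an illustration under stated assumptions: the `Catalog` class, the `fingerprint` function, and the dataset and repository names are all hypothetical, and it assumes each dataset is small enough to be read as a list of rows of strings. A real Big Data catalog would more likely sample or hash at the file level.

```python
import hashlib


def fingerprint(rows):
    """Return an order-independent content hash for a dataset.

    `rows` is an iterable of tuples of strings (an assumption made for
    this sketch). Each row is hashed, the row hashes are sorted, and the
    sorted list is hashed again, so row order does not affect the result.
    """
    row_digests = sorted(
        hashlib.sha256("\x1f".join(row).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(row_digests).encode("ascii")).hexdigest()


class Catalog:
    """Hypothetical metadata catalog keyed by content fingerprint."""

    def __init__(self):
        self._by_fingerprint = {}

    def register(self, name, repository, rows):
        """Record a dataset; return the earlier (name, repository) entry
        if the same content was already registered, otherwise None."""
        fp = fingerprint(rows)
        existing = self._by_fingerprint.get(fp)
        if existing is not None:
            return existing
        self._by_fingerprint[fp] = (name, repository)
        return None


if __name__ == "__main__":
    catalog = Catalog()
    weather = [("2011-12-01", "NYC", "41F"), ("2011-12-01", "SFO", "55F")]
    catalog.register("weather_daily", "marketing_repo", weather)
    # Same content under a different name in a second repository.
    duplicate_of = catalog.register("wx_feed", "risk_repo", list(reversed(weather)))
    print(duplicate_of)
```

The design choice here is to identify a dataset by its content instead of its name, since the name is exactly what differed in the double-purchase scenario.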
All things considered, 2012 should be a breakout year for Big Data Governance programs.