Whether an enterprise uses dozens or hundreds of data sources for multi-function analytics, it can run into data governance issues. Bad data governance practices lead to data breaches, lawsuits, and regulatory fines, and no enterprise is immune.
Everyone Fails Data Governance
In 2019, the U.K.’s Information Commissioner’s Office announced its intention to fine Marriott International more than £99 million ($136 million) for violating the General Data Protection Regulation (GDPR), the European Union’s data protection law. The U.K. regulatory action, as well as lawsuits filed in the U.S., stemmed from a 2018 data breach that exposed the data of 339 million of the global hotel chain’s customers. The breach is believed to have originated in Marriott’s Starwood subsidiary, and Marriott might not have done due diligence when merging the newly acquired subsidiary’s data into its own databases.
In 2017, Anthem reported a data breach that exposed the data of thousands of its Medicare members. The medical insurance company wasn’t hacked; its customers’ data was compromised through an employee of a third-party vendor.
In the 2020 O’Reilly Data Quality survey, only 20% of respondents said their organizations publish information about data provenance or data lineage internally. This suggests that the vast majority of enterprises aren’t tracking and auditing data along its flow, a critical requirement for data governance.
Even the COVID-19 pandemic and the acceleration of digital transformation, when data and data insights became two of the main business drivers, haven’t improved the situation. A recent Experian survey found that 55% of business leaders don’t trust their data assets, which hinders their ability to be fully data-driven. Data users in these enterprises don’t know how data is derived and lack confidence in whether it’s the right source to use.
From Bad to Worse
Data analytics and machine learning can become a business and a compliance risk if data security, governance, lineage, metadata management, and automation are not holistically applied across the entire data lifecycle and all environments. If data access policies and lineage aren’t consistent across an organization’s private cloud and public clouds, gaps will exist in audit logs. Inconsistent data access policies may also mean a data practitioner is making decisions on incomplete or out-of-date information.
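To make the idea of inconsistent access policies concrete, here is a minimal Python sketch that compares policies across environments and reports where they disagree. The policy model (a mapping from a dataset and role to a set of allowed actions), the environment names, and the function `find_policy_gaps` are all hypothetical and chosen for illustration; real policy engines store and evaluate policies differently.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical policy model: (dataset, role) -> set of allowed actions.
PolicyKey = Tuple[str, str]
PolicySet = Dict[PolicyKey, FrozenSet[str]]


def find_policy_gaps(
    environments: Dict[str, PolicySet],
) -> Dict[PolicyKey, Dict[str, FrozenSet[str]]]:
    """Return every (dataset, role) whose allowed actions differ between environments.

    A key that is missing from an environment counts as an empty action set there.
    """
    # Collect every (dataset, role) pair mentioned in any environment.
    all_keys = set()
    for policies in environments.values():
        all_keys.update(policies)

    gaps = {}
    for key in sorted(all_keys):
        per_env = {
            env: policies.get(key, frozenset())
            for env, policies in environments.items()
        }
        # More than one distinct action set means the environments disagree.
        if len(set(per_env.values())) > 1:
            gaps[key] = per_env
    return gaps


# Illustrative data only: the analyst role has drifted in the public cloud.
environments = {
    "private_cloud": {
        ("customers", "analyst"): frozenset({"read"}),
        ("customers", "admin"): frozenset({"read", "write"}),
    },
    "public_cloud": {
        ("customers", "analyst"): frozenset({"read", "write"}),
        ("customers", "admin"): frozenset({"read", "write"}),
    },
}

gaps = find_policy_gaps(environments)
for (dataset, role), per_env in gaps.items():
    print(dataset, role, {env: sorted(actions) for env, actions in per_env.items()})
```

A check like this only surfaces drift; deciding which environment holds the intended policy still requires a single authoritative definition.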
And that leads to problems: Marriott didn’t consistently check the data it brought in from different sources, while Anthem failed to safeguard the data it processed through a vendor, and both ended up with lawsuits, regulatory fines, and tarnished reputations.
Data quality and lineage issues also lead to inconsistent insights and, with that, decisions that impact the business’s ability to innovate and differentiate. According to the Experian survey, 36% of business leaders say that poor-quality data damages the reliability of analytics, 32% say that it negatively affects customer experience, and 32% say that it negatively impacts reputation and customer trust.
While most IT and business leaders understand the value and importance of good data governance, it’s a complicated process, encompassing data quality, lineage, security, and much more. It also requires considerable investment, and many enterprise leaders are reluctant to take on that expense and effort. After all, retrofitting good governance is a monumental task. Just as with retrofitting security, the end result might never be as good as it would have been had it been done from the beginning.
Better Data Governance: A Hybrid Solution
Considering the very real danger of data breaches, lawsuits, and damaged reputation, giving up is not an option. One possible solution is to adopt a hybrid cloud strategy.
For instance, 86% of Experian survey respondents are prioritizing moving their data to the cloud in 2022, 91% are committed to improving data quality, and 90% are committed to implementing or improving data governance.
With a hybrid cloud strategy, data security and governance can no longer be managed one system at a time; they have to be independent of the deployed system and infrastructure. Establishing and maintaining consistent data context (security and governance policies) needs to be a fundamental part of your hybrid cloud strategy.
Cloudera SDX does that with a common user interface, regardless of where the data is sourced, migrated, or replicated across your hybrid cloud. It delivers transparent data security and governance policy management as well as enforcement. Administrators set policies once and have them consistently applied everywhere, enabling safe, secure, and compliant end-user access to data and analytics.
What’s more, SDX provides access to the lineage, metadata, and metrics associated with data utilization across environments. The propagation of data classifications — automatically gleaned through profiling — along the lineage ensures data access policies are consistently and demonstrably enforced, even as data is moved or derived.
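The general idea of propagating classifications along lineage can be sketched in a few lines of Python. This is an illustration of the concept only, not SDX’s implementation or API: the lineage graph (a mapping from each dataset to the datasets derived from it), the dataset names, the `PII` tag, and the function `propagate_classifications` are all assumptions made for the example.

```python
from collections import deque
from typing import Dict, List, Set


def propagate_classifications(
    lineage: Dict[str, List[str]],
    classifications: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Push each dataset's classification tags to everything derived from it.

    `lineage` maps a dataset to its direct downstream datasets.
    A dataset is revisited only when it gains new tags, so cycles terminate.
    """
    result = {name: set(tags) for name, tags in classifications.items()}
    queue = deque(classifications)
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            child_tags = result.setdefault(child, set())
            new_tags = result[node] - child_tags
            if new_tags:
                child_tags |= new_tags
                queue.append(child)
    return result


# Illustrative lineage: a report derived from cleaned bookings and public rates.
lineage = {
    "raw_bookings": ["cleaned_bookings"],
    "cleaned_bookings": ["revenue_report"],
    "public_rates": ["revenue_report"],
}
classifications = {"raw_bookings": {"PII"}, "public_rates": set()}

propagated = propagate_classifications(lineage, classifications)
for name in sorted(propagated):
    print(name, sorted(propagated[name]))
```

Because the derived report inherits the tag from its upstream source, an access policy written against the tag rather than against a specific table continues to apply after the data is moved or transformed.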
When faced with a potential compliance nightmare as a result of bad data governance practices, it’s more effective and cost-efficient to keep data clean, safe, and up to date over time than to do nothing. Find out more about unlocking the potential of your data in this whitepaper.