Managing the Dynamic Datacenter

Datacenter Automation



How Data Virtualization Improves Data Quality

Innovative data virtualization approaches save time and money

Data Quality Is Key to Business Success
Poor data quality costs businesses billions of dollars every year.  It leads business leaders to make poor decisions.  It increases customer dissatisfaction and churn, reducing revenues.  It raises the cost of remediation.  And it discourages employees by setting a low performance bar.

Yet, even with these significant issues, enterprises struggle to improve data quality.

Improving Data Quality Is More Difficult Than Ever
Data quality, once the focus of just a few data stewards, has become a business and IT challenge of unprecedented scale.  Not only must business users be more confident than ever in the data they use, but IT must now address data quality everywhere in the enterprise, including:

  • Systems of Record – Transaction and other source systems
  • Consolidated Data Stores – Purpose-built data warehouses, marts, cubes and operational data stores
  • Virtualized Data – Shared views and data services
  • Visualization and Analysis Solutions – Business intelligence, reporting, analytics, etc.

Data Quality Strategy Reaches Far Beyond ETL and the Data Warehouse
Traditionally, data quality efforts have focused on the consolidated data alone, using a number of tools and techniques to do batch "clean up" of the data on its way into the warehouse.  And while the data quality tools market is substantial, at nearly three quarters of a billion dollars annually with double-digit growth forecast over the next five years, these data quality tool investments are proving necessary, but not sufficient, for today's challenges.

Data virtualization can complement the data quality strategies, processes, and tools you use for your systems of record and consolidated data stores, and it directly addresses virtualized data as well as visualization and analysis solutions.

Data Virtualization Improves Quality of Virtualized Data
Data virtualization embeds a number of important data quality improvement mechanisms and techniques that complement and extend data quality tools.

Data virtualization can easily support data validation, standardization, cleansing and enrichment, and more.  These rules are embedded in the view and data service definitions from the start, and at runtime they are automatically invoked.  This means better data quality, not only for virtualized data, but also for the visualization and analytic applications that leverage data virtualization as their real-time data source.
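To make the idea concrete, here is a minimal sketch in Python of what "quality rules embedded in a view, invoked at runtime" can look like.  All names, source data, and rules below are hypothetical illustrations, not the API of any particular data virtualization product, which would typically express such rules in its own view or data-service definition language.

```python
from datetime import date, datetime

# Hypothetical raw source records with inconsistent formats.
RAW_CUSTOMERS = [
    {"id": "001", "phone": "(415) 555-0199", "signup": "2011-03-05"},
    {"id": "002", "phone": "415.555.0123",   "signup": "03/07/2011"},
]

def standardize_phone(raw: str) -> str:
    """Standardization rule: keep digits only, format as NNN-NNN-NNNN."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

def parse_signup(raw: str) -> date:
    """Validation rule: accept known date formats or fail loudly."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized signup date: {raw!r}")

def customer_view():
    """A 'view': the rules run at query time, not in a batch clean-up job."""
    for row in RAW_CUSTOMERS:
        yield {
            "id": row["id"],
            "phone": standardize_phone(row["phone"]),
            "signup": parse_signup(row["signup"]),
        }

for record in customer_view():
    print(record)
```

Because the rules live in the view definition rather than in each consuming application, every report or dashboard that queries `customer_view` receives the same standardized, validated data.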

Data Virtualization Eliminates Many Root Causes of Poor Data Quality
In his white paper, Effecting Data Quality Improvement through Data Virtualization, David Loshin, president of Knowledge Integrity, Inc. and a recognized thought leader and expert consultant in data quality, master data management, and business intelligence, describes how data virtualization helps overcome the four major causes of poor data quality:

  • Structural and semantic inconsistency – Differences in formats, structures, and semantics presumed by downstream data consumers may confuse conclusions drawn from similar analyses.  Data virtualization lets you transform sources so your consumers get normalized data with common semantics, eliminating confusion caused by structural and semantic inconsistencies.
  • Inconsistent validations – Data validation is inconsistently applied at various points in the business processes, with variant impacts downstream.  Data virtualization lets all your applications share the same validated, virtualized data so you get the consistency you need.
  • Replicated functionality – Repeatedly applying the same (or similar) data cleansing and identity resolution applications to data multiple times increases costs but does not ensure consistency. Data virtualization lets you develop and share common data quality rules so you don’t have to reinvent the wheel with each new requirement.  And by centrally controlling these rules, data virtualization lets you avoid building extra governance systems to automate the controls that data virtualization solutions provide automatically.
  • Data entropy – Multiple copies of the same data lead to more data silos in which the quality of the data continues to degrade, especially when levels of service for consistency and synchronization are not defined or not met.  Data virtualization reduces the number of data copies required thereby mitigating data entropy.
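The "replicated functionality" and "inconsistent validations" points above can be sketched as a single, centrally defined rule pipeline that every consumer reads through.  This is an illustrative Python assumption, not a vendor API; real data virtualization platforms centralize rules in their own view and data-service layers.

```python
# Central registry of shared cleansing/validation rules, so each
# consuming application does not re-implement them inconsistently.
SHARED_RULES = []

def rule(fn):
    """Register a quality rule in one governed, central place."""
    SHARED_RULES.append(fn)
    return fn

@rule
def trim_strings(record):
    """Cleansing rule: strip stray whitespace from all string fields."""
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in record.items()}

@rule
def require_id(record):
    """Validation rule: every record must carry a non-empty 'id'."""
    if not record.get("id"):
        raise ValueError("record missing required 'id'")
    return record

def virtualized(source):
    """Every consumer sees data only after the shared rules run."""
    for record in source:
        for apply_rule in SHARED_RULES:
            record = apply_rule(record)
        yield record

# Two different consumers (say, a report and a dashboard) querying
# the same source get identically cleansed records.
crm = [{"id": " 42 ", "name": "  Acme Corp "}]
print(list(virtualized(crm)))
```

Defining the rules once also supports the governance point: there is a single place to review, version, and control them, rather than parallel copies drifting apart in each downstream system.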

Take Advantage of the Data Virtualization Opportunity
By enforcing quality rules directly in the virtual layer and eliminating many of the root causes of poor data quality, data virtualization enables you to meet your data quality goals more effectively, saving both time and money.  Take advantage.  Your enterprise will be glad you did.

More Stories By Robert Eve

Robert Eve is the EVP of Marketing at Composite Software, the data virtualization gold standard, and co-author of Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility. Bob's experience includes executive-level roles at leading enterprise software companies such as Mercury Interactive, PeopleSoft, and Oracle. Bob holds a Master of Science from the Massachusetts Institute of Technology and a Bachelor of Science from the University of California at Berkeley.