The Academy Awards provided a great example of the challenges of data integration.
The business output of the data integration processes in the award ceremony is the announcement of a winner in a specific category. The data should be collected, transformed, validated, checked against rules, quantified, and categorized. The data collected and the processes used to produce a winner in each category need to be documented and understood by everyone involved in delivering the results. An underlying information catalog capability should allow everyone involved to know exactly what information is being delivered. When you don’t have that capability, somebody ends up saying “There’s a mistake, Moonlight, you guys won Best Picture. This is not a joke.”
The steps of data integration
To achieve data integration, the following capabilities need to be in place:
- First, you need a tool to analyze data.
- Then you need tools that can extract, transform, and aggregate data from all sources, whether known or discovered.
- All the data can then be standardized and governed by data quality rules. Those capabilities can run in parallel if required, in your own environment (on-premises), in the cloud, or in hybrid mode.
- Lastly, you need to ensure that each capability shares information (metadata) with the others so that you can trust the information provided by the data integration process.
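The capabilities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular product: the ballot records, the function names, and the `catalog` list that stands in for shared metadata are all hypothetical.

```python
from collections import Counter

# Hypothetical vote records from two sources (all values are made up).
MAIL_BALLOTS = [
    {"category": "Best Picture", "nominee": "Film A"},
    {"category": "best picture ", "nominee": "Film B"},
]
ONLINE_BALLOTS = [
    {"category": "BEST PICTURE", "nominee": "Film B"},
    {"category": "Best Picture", "nominee": ""},  # fails the quality rule
]

catalog = []  # shared metadata: one entry per step, readable by every step


def record(step, rows_in, rows_out):
    """Append a metadata entry describing what a step consumed and produced."""
    catalog.append({"step": step, "rows_in": rows_in, "rows_out": rows_out})


def profile(rows):
    """Analyze: count missing values per field."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if not value.strip():
                missing[field] += 1
    record("profile", len(rows), len(rows))
    return dict(missing)


def standardize(rows):
    """Transform: normalize whitespace and casing of the category field."""
    out = [{**r, "category": r["category"].strip().title()} for r in rows]
    record("standardize", len(rows), len(out))
    return out


def apply_quality_rules(rows):
    """Govern: keep only rows with a non-empty nominee."""
    out = [r for r in rows if r["nominee"].strip()]
    record("quality_rules", len(rows), len(out))
    return out


def aggregate(rows):
    """Aggregate: tally votes per (category, nominee)."""
    tally = Counter((r["category"], r["nominee"]) for r in rows)
    record("aggregate", len(rows), len(tally))
    return tally


def run_pipeline(*sources):
    """Extract rows from every source, then run each capability in order."""
    rows = [row for source in sources for row in source]
    missing = profile(rows)
    tally = aggregate(apply_quality_rules(standardize(rows)))
    return missing, tally
```

Calling `run_pipeline(MAIL_BALLOTS, ONLINE_BALLOTS)` returns the profiling summary and the vote tally, and leaves one `catalog` entry per step. Anyone reading the catalog can see, for instance, that the quality rule dropped a row before the votes were counted.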
Putting the data integration process together
There are a lot of choices for each capability in the data integration process. Every organization can choose a different tool for each step and then do the integration work itself. The difficulty with that approach is that if any one step changes, every other step needs to be reviewed: each change requires a manual impact analysis across the whole process. A robust data integration system has a set of processes that are integrated at the metadata level and a set of tools to manage the overall process, so any change is automatically reflected in the entire process.
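To make the idea of automated impact analysis concrete, here is a minimal sketch that assumes the metadata includes lineage, that is, a record of which steps consume the output of which others. The step names and the `LINEAGE` mapping are hypothetical.

```python
from collections import deque

# Hypothetical lineage metadata: each step maps to the steps that
# consume its output.
LINEAGE = {
    "extract": ["standardize"],
    "standardize": ["quality_rules"],
    "quality_rules": ["aggregate"],
    "aggregate": ["announce_winner"],
    "announce_winner": [],
}


def impacted_steps(lineage, changed):
    """Return every step downstream of `changed`, in breadth-first order."""
    seen = []
    queue = deque(lineage.get(changed, []))
    while queue:
        step = queue.popleft()
        if step not in seen:
            seen.append(step)
            queue.extend(lineage.get(step, []))
    return seen
```

With this lineage, `impacted_steps(LINEAGE, "standardize")` lists the quality rules, the aggregation, and the announcement as the steps to re-check. That is the review that would otherwise be done by hand.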
Providing the right business result at the right time
The Academy’s data integration process starts when it delivers the ballots to the Academy voters. Does it have the right contact information? Is the right information on the ballots? The process continues as the ballots come back from the voters. Did the voters vote in the categories they were eligible for? Was the information captured correctly? Once the information is validated, the Academy produces a business result, which is contained in a red envelope. Handing those envelopes to a famous presenter is still part of the integration process. What could possibly go wrong?
Changing the envelope should not cause the process to fail if you have the underlying metadata capabilities to clearly label the envelopes. Any human error can undermine a well-defined and automated process, but the more we can link data and process together in an automated fashion, the more we can trust the information provided.
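As a rough sketch of what "clearly labeled" could mean in practice, the example below attaches a category label to each envelope and refuses to announce a result when the label does not match the award being presented. The `Envelope` class, the `announce` function, and the "Performer A" value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    category: str  # metadata label describing what the envelope contains
    winner: str


class WrongEnvelopeError(Exception):
    """Raised when an envelope's label does not match the award presented."""


def announce(envelope, expected_category):
    """Return the announcement text, or fail loudly on a label mismatch."""
    if envelope.category != expected_category:
        raise WrongEnvelopeError(
            f"expected {expected_category!r}, got {envelope.category!r}"
        )
    return f"{envelope.category}: {envelope.winner}"
```

Here `announce(Envelope("Best Picture", "Moonlight"), "Best Picture")` succeeds, while passing an envelope labeled "Best Actress" for the Best Picture award raises `WrongEnvelopeError` before anything is read aloud.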
Trusted information is a requirement for every business
The Academy loves headlines, but it would prefer headlines that do not highlight mistakes. Today’s business requirements demand trusted information, both from the data businesses already have and from data sources that did not exist 10 years ago. How can businesses acquire that trusted information and make solid decisions based on it? They need a data integration tool that automates all the capabilities and provides the right information to everyone who relies on it. See how IBM is helping clients achieve these data integration capabilities.