Opening the doors to greater data value with a data catalogue

‘Alexa, what’s the weather going to be like today?’

For many of us, as consumers, it’s common to start our day by asking Alexa, or another smart assistant, for the information we need to keep our everyday lives running smoothly. In fact, we take it for granted that the information we want is readily accessible. But that’s not always the case.

In the workplace, for example, information is not as readily accessible as it is in our personal lives. Why? Because, up until now, the data behind enterprise digital transformation has not been as complete. Yet with the growing demand for data to support growth and problem solving, we’re seeing more organisations focus on identifying and accessing high-quality enterprise data.

Data-fuelled decisions

Data is typically scattered across hundreds or even thousands of cloud and on-premises systems, from legacy transactional databases and spreadsheets to cloud-based marketing systems and data lakes. Adding to the complexity is the influx of newer data sources and applications, such as the internet of things (IoT) and artificial intelligence (AI).

Data is an essential element of digital transformation, so the difficulty employees have in finding the right data when they need it is a major reason why so many organisations fall short in their digital transformation initiatives.

Whether the goal is improving the customer experience, delivering analytical insights for decision-making, or migrating operations to the cloud, success depends on the ability of employees to track down relevant data and understand its quality and provenance. And, with experts forecasting that the amount of enterprise data will double every two years (if not more), this challenge is getting more complex. 

The result is that much of the data that could be valuable in launching ambitious digital transformation efforts is vastly underutilised, if it’s used at all. 

“Most organisations use only a small percentage of the data they have access to — in my experience less than 5% — even though they continue to collect and store terabytes of data,” says Shervin Khodabandeh, partner and managing director at Boston Consulting Group.[1]

Managing data 

Ideally, business and IT users could search for enterprise data as easily as running a Google search, and have access to ratings and reviews of the data from other users to guide them, just as we use Yelp to guide our personal choices.

To achieve this, enterprise information needs to be catalogued and classified in a logical fashion, “democratised” for use by business users, data scientists, application developers, and other stakeholders across the organisation. Nontechnical business analysts would have self-service access via semantic search, similar to the way consumers filter retail products by brand, colour, and other attributes. They would have the context needed to understand and trust the data: where the data is coming from, who uses it, what other data it is related to, and the quality of the data.

That search effort would deliver relevant results no matter where data resides in the enterprise because it’s powered by an intelligent data catalogue, a technology layer that inventories data across the cloud and on-premises systems to make it accessible. AI and machine learning capabilities make data catalogues “intelligent”: they tag data automatically with a high degree of accuracy, analyse data similarity, define lineage and, above all, deliver the speed, scale, automation, and insights that enterprise data management needs in the digital era.
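
To make the idea concrete, here is a minimal sketch in Python of what filtering a catalogue by attributes might look like. It is an illustration only, not how any particular product works: the `CatalogueEntry` fields, the `search` function, and the sample datasets are all hypothetical, and a real catalogue would populate tags, quality scores, and lineage automatically rather than by hand.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One dataset as a catalogue might describe it (all fields are illustrative)."""
    name: str
    location: str                                       # where the data lives
    tags: set[str] = field(default_factory=set)         # business-friendly labels
    quality: float = 0.0                                # assumed score from 0.0 to 1.0
    upstream: list[str] = field(default_factory=list)   # lineage: datasets it is derived from


def search(entries, required_tags=(), min_quality=0.0):
    """Return entries that carry every required tag and meet the quality bar, best first."""
    wanted = set(required_tags)
    hits = [e for e in entries if wanted <= e.tags and e.quality >= min_quality]
    return sorted(hits, key=lambda e: e.quality, reverse=True)


# Hypothetical inventory spanning cloud and on-premises systems.
catalogue = [
    CatalogueEntry("crm_contacts", "cloud CRM", {"customer", "marketing"}, 0.92),
    CatalogueEntry("orders_2017", "on-premises warehouse", {"customer", "sales"}, 0.75,
                   ["crm_contacts"]),
    CatalogueEntry("sensor_feed", "data lake", {"iot"}, 0.60),
]

# Filter by attribute, much as a shopper filters products by brand or colour.
results = search(catalogue, required_tags=["customer"], min_quality=0.8)
for entry in results:
    print(entry.name, entry.location, entry.quality)
```

The point of the sketch is that the analyst asks for "customer data I can trust" and never needs to know which system holds it; the location, quality, and lineage travel with each result.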

“Managing data in today’s world without a data catalogue is ill advised and impractical,” says a report by Eckerson Group, a research and consulting firm.[2] “We’re moving rapidly to an era where communication, collaboration, and crowdsourcing are the mainstays of data management.” 

The data challenge 

If data isn’t consistent, comprehensive, and accurate, digital transformation efforts may fall short of objectives in a wide range of areas, such as: 

Laying the foundations for advanced analytics. Data scientists often spend 80% of their time searching for data, and just 20% on actual AI/ML and modelling. A data catalogue reverses the equation by providing quick data discoverability and access to relevant information. That lets data scientists and business analysts use trusted data to deliver the insights needed for data-driven decision-making.

Developing a 360-degree customer experience. Because customer data exists in so many corners of the enterprise, it’s essential for organisations to have a holistic 360-degree view across all sources if they are to truly understand customers as individuals. By identifying all key sources of customer data, a data catalogue provides the foundation for more personalised engagement and improved customer experience.

Supporting and accelerating smooth cloud data migration. Now that old-time myths about security and cost are effectively debunked, most organisations, even in hold-out industries such as healthcare and government, have embarked on journeys to the cloud. But moving an on-premises data warehouse to a cloud-based alternative, such as Amazon Redshift, Google BigQuery, Microsoft Azure SQL Data Warehouse, or Snowflake, isn’t as simple as flipping a switch. Cataloguing data enables architects to first understand the data landscape, assess data quality, select the right data for migration, understand downstream impacts, and, ultimately, accelerate cloud data warehouse modernisation.

Putting data governance and data privacy at the heart of every decision. Meeting the demands of existing and looming data security and privacy regulations can’t be accomplished if organisations are in the dark about what data they have, where it exists, and how they are permitted to use it. A data catalogue supplies the data discovery capabilities critical to identifying and managing data under governance controls and establishing digital trust with customers, employees, and other critical stakeholders.

There’s no escaping that technology is changing the way we work. And one thing we can be sure of is that catalogue-based, smart data management sits at the centre of a successful digital transformation initiative. We can all be a bit hesitant about change at first, but ultimately it opens up the path to success for your business.

[1] Forbes Insights, “Intelligent Data Catalogs: At the Forefront of Digital Transformation,” April 2018.

[2] Eckerson Group, “The Ultimate Guide to Data Catalogs,” March 2018.

About the Author

Ronen Schwartz is SVP & GM, Data Integration, Data Engineering, and Cloud at Informatica. Informatica is the only proven Enterprise Cloud Data Management leader that accelerates data-driven digital transformation. Today’s intelligent enterprise is fueled by data and that data must be properly managed, governed, protected, and utilized – this is where Informatica drives next-generation innovation with its Intelligent Data Platform™.

Featured image: ©Issaronow