A Bridge Over Troubled Data: Giving Enterprises Access to Advanced Machine Learning

From banks to healthcare, almost every organisation wants to implement advanced AI and machine learning-powered applications that transform efficiency and create new services and business opportunities.

They want more intelligent applications for significant use cases such as real-time fraud prediction, a better customer experience, or faster, more accurate analysis of medical images.

The problem facing most organisations is they store data in different forms and locations, each of which may belong to a business unit or department. Making this data usable by advanced applications is demanding.

Before the advent of the new paradigm, the smart data fabric, the usual approach was to create a data lake or warehouse, taking advantage of the relatively low cost of storage and compute. The organisation would then typically rely on time-consuming ETL (extract, transform, load) processes to normalise the data.

The traditional data lake is slow and ever-more swamp-like

This approach, which is still in widespread use, has had its victories but creates a centralised repository that leaves data difficult to analyse and often fails to provide consistent or fast answers to business questions. It tends to bring the data to the query rather than the query to the data, creating latency and often causing significant and unnecessary duplication.

This makes it very difficult to accommodate new data sources in response to changing business requirements, undermining organisational agility. It also cannot meet today’s demand for clean data fit for new composite applications that are AI-enabled, use machine learning (ML) and integrate with massive, pre-existing datasets.

In truth, almost all organisations still struggle to provide a consistent, accurate, real-time view of their data. The vast majority still keep data in distinct silos, with only perhaps five per cent capable of using data less than an hour old. That will not, for example, enable a significant move from relatively simple fraud detection into prediction, capable of identifying and tracking money laundering activity in hugely complex financial flows.

Organisations make too many decisions using out-of-date information, overwhelmed by the variety of data sources and the complexity of unifying them. Global research earlier this year by InterSystems found almost every participating financial organisation (98%) has data and application silos and significantly more than a third (37%) said their biggest data challenge is the length of time it takes to access that data. Like so many organisations, these financial businesses need the ability to see into their complex, heterogeneous data and to receive fast and consistent answers to their business questions. They need an architecture built around what the business requires, rather than a vast and complicated data warehouse or lake that becomes just another rigid silo.

Such an architecture will enable businesses to use the ML algorithms they know will bring them big advantages. But advanced analytics and AI depend on clean, harmonised data, which is hard to achieve in a repository. That is why the level of innovation in ML models currently outstrips the rate and scale of deployment. The absence of dependable data makes it impossible to embed these models in the operational applications that generate it. In the meantime, the volume and complexity of data continually grow.

Bring the query to the data

Thankfully, the smart data fabric concept removes most of these data troubles, bridging the gap between the data and the application. The fabric focuses on creating a unified approach to access, data management and analytics. It builds a universal semantic layer using data management technologies that stitch together distributed data regardless of its location, leaving it where it resides. A fintech organisation can build an API-enabled orchestration layer, using the smart data fabric approach, giving the business a single source of reference without the necessity to replace any systems or move data to a new, central location.
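The idea of taking the query to the data can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any vendor's product: the two in-memory SQLite databases stand in for departmental silos, and the table names, column names and the `SEMANTIC_LAYER` mapping are all hypothetical. Each silo keeps its own schema and its data stays where it is; only the rows that match the filter travel back to the caller.

```python
import sqlite3


def make_cards_db():
    # Hypothetical silo 1: card transactions, amounts stored in pence.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE card_tx (tx_id TEXT, cust TEXT, amt_pence INTEGER)")
    conn.executemany(
        "INSERT INTO card_tx VALUES (?, ?, ?)",
        [("c1", "alice", 1250), ("c2", "bob", 990000), ("c3", "alice", 4300)],
    )
    return conn


def make_payments_db():
    # Hypothetical silo 2: wire payments, amounts stored in pounds.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wire (ref TEXT, customer_name TEXT, amount_gbp REAL)")
    conn.executemany(
        "INSERT INTO wire VALUES (?, ?, ?)",
        [("w1", "bob", 15000.0), ("w2", "carol", 20.5)],
    )
    return conn


# Assumed semantic layer: one view per source that exposes the same
# unified columns (id, customer, amount_gbp) over the local schema.
SEMANTIC_LAYER = {
    "cards": "SELECT tx_id AS id, cust AS customer, "
             "amt_pence / 100.0 AS amount_gbp FROM card_tx",
    "payments": "SELECT ref AS id, customer_name AS customer, "
                "amount_gbp FROM wire",
}


def query_fabric(sources, min_amount_gbp):
    """Run the same filter inside every source and merge the matches."""
    results = []
    for name, conn in sources.items():
        sql = (
            "SELECT id, customer, amount_gbp "
            f"FROM ({SEMANTIC_LAYER[name]}) WHERE amount_gbp >= ?"
        )
        # The filter executes in the source database; nothing is copied
        # into a central store first.
        for row in conn.execute(sql, (min_amount_gbp,)):
            results.append((name, *row))
    # Largest amounts first, across all sources.
    return sorted(results, key=lambda r: r[3], reverse=True)


sources = {"cards": make_cards_db(), "payments": make_payments_db()}
large_transactions = query_fabric(sources, 5000)
for source, tx_id, customer, amount in large_transactions:
    print(source, tx_id, customer, amount)
```

A real fabric would add security, caching and far richer metadata, but the shape is the same: one definition of the business question, evaluated where each dataset lives.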

Capable of in-flight analytics, more advanced data management technology within the fabric provides insights in real time. It connects all the data, including the information stored in databases, warehouses and lakes, and provides vital, seamless support for end-users and applications.

Business teams can delve deeper into the data, using advanced capabilities such as business intelligence. The organisation can deploy tools using machine learning algorithms that enable next-generation applications.

This is a paradigm shift, bringing the two worlds of legacy and new data together for advanced, ML-powered use cases. It is critical because it enables a single view of data across what may be a complex organisation, such as a financial institution with a great many legacy silos. The technologies that comprise the fabric transform and harmonise data from multiple sources on demand, making it usable for varied business applications.
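Harmonising data on demand can also be shown with a small Python sketch. Everything here is an assumption made for illustration: the record layouts, the field names and the `harmonised` function are hypothetical and do not come from any particular product. A legacy system and a newer service describe the same customer in different shapes, and a thin mapping converts each record to one common form only when a caller asks for it.

```python
from datetime import datetime

# Hypothetical legacy records: string IDs, UK-style dates, formatted amounts.
LEGACY = [
    {"CUST_NO": "0042", "TXN_DT": "17/03/2023", "AMT": "1,250.00"},
    {"CUST_NO": "0007", "TXN_DT": "01/02/2023", "AMT": "10.00"},
]

# Hypothetical newer records: numeric IDs, ISO 8601 timestamps, float amounts.
MODERN = [
    {"customerId": 42, "timestamp": "2023-03-18T09:30:00+00:00", "amount": 99.5},
]


def from_legacy(rec):
    # Map the legacy layout onto the common (customer_id, date, amount) form.
    return {
        "customer_id": int(rec["CUST_NO"]),
        "date": datetime.strptime(rec["TXN_DT"], "%d/%m/%Y").date().isoformat(),
        "amount": float(rec["AMT"].replace(",", "")),
    }


def from_modern(rec):
    # Map the newer layout onto the same common form.
    return {
        "customer_id": int(rec["customerId"]),
        "date": datetime.fromisoformat(rec["timestamp"]).date().isoformat(),
        "amount": float(rec["amount"]),
    }


def harmonised(customer_id):
    """Lazily yield one customer's records from both sources in common form."""
    for rec in LEGACY:
        row = from_legacy(rec)
        if row["customer_id"] == customer_id:
            yield row
    for rec in MODERN:
        row = from_modern(rec)
        if row["customer_id"] == customer_id:
            yield row


single_view = list(harmonised(42))
for row in single_view:
    print(row)
```

Because `harmonised` is a generator, nothing is converted or stored until an application requests it, which is the sense in which the transformation happens on demand rather than in a batch ETL job.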

Organisations need the smart data fabric to build bridges across all their many types of data in different locations and sources, so they achieve seamless, real-time access and can deploy the next generation of AI-powered applications. It is not, in fact, about the technologies, but about execution and how the fabric serves business agility, future-proofing enterprises and bringing revenue-augmenting transformation within their grasp.

About the Author

Saurav Gupta is Sales Engineer at InterSystems. InterSystems is the engine behind the world’s most important applications in healthcare, business and government. Everything we build is designed to drive better decisions, actions, and outcomes for the people who stake their lives and livelihoods on our technology. We are guided by the IRIS principle—that software should be interoperable, reliable, intuitive, and scalable.
