In this Q&A, Matt Thomson, Director EMEA Product Specialists at Databricks, talks about the growing value of data.
He also discusses how public concerns over data sharing can be assuaged – particularly given the rise of open source and open standards.
1. Data has been predicted to become the currency of the future. Do you agree with this claim?
Absolutely. I’ve always been fascinated by data – I’ve had several jobs where I’ve helped clients develop their big data capabilities and move to a much more data-driven mindset. And that’s because I believe in its value, as we all should.
Today, data really underpins everything. It’s the key to digital transformation and to improving business performance – allowing teams to identify trends, opportunities, and problems in business operations. Crucially, it’s no longer the domain of just the data science or tech teams. Now, it’s used across the business, and teams like HR, sales and marketing all look to data in increasingly sophisticated ways to help inform key decisions.
How companies use data can mean the difference between success and failure today, and it is what allows them to remain competitive. It should be every organisation’s goal to become more data-driven.
2. What impact has the pandemic had on changing attitudes to data sharing and AI tools?
The pandemic changed things for almost every business out there. The world sped up with the huge demand for digital, creating pressure for technical teams. This pushed lots of data teams to use AI and ML tools to automate repeatable tasks, freeing up time for data innovation and problem solving to drive business performance. For some, AI and ML tools were a real lifeline – enabling companies to make critical decisions with their data, on the spot, as they pivoted their entire business models.
Crucially, AI also plays a role in allowing organisations to ask questions about future scenarios. The shock of the pandemic really highlighted the importance of this, and of being ready for the next crisis. Over the long run, AI allows companies to make more accurate predictions and forecast potential issues – from staffing shortages and skills deficits in a workforce to supply chain disruptions and failures in key equipment. So its value in a post-COVID world really is huge.
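To make that last point concrete, here is a minimal sketch of the kind of forecasting AI enables, using a simple classifier to flag equipment at risk of failure. Everything in it is illustrative: the sensor features, the synthetic data and the risk threshold are invented for the example, and a real system would be trained on genuine telemetry.

```python
# A minimal sketch of predictive maintenance: flag machines likely to fail
# based on sensor readings. All data here is synthetic and hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical features: temperature, vibration, hours since last service
X = rng.normal(loc=[70.0, 0.3, 400.0], scale=[10.0, 0.1, 150.0], size=(1000, 3))

# Synthetic label: hotter, shakier, longer-unserviced machines fail more often
risk = 0.02 * (X[:, 0] - 70) + 5.0 * (X[:, 1] - 0.3) + 0.002 * (X[:, 2] - 400)
y = (risk + rng.normal(scale=0.2, size=1000) > 0.15).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Probability each held-out machine fails, used to prioritise inspections
fail_prob = model.predict_proba(X_test)[:, 1]
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
print(f"Machines flagged for inspection: {(fail_prob > 0.5).sum()}")
```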
COVID-19 also made a strong case for data sharing tools. With pressure to deliver services quickly even with remote teams, the ability to share information fast – whether with internal teams or external partners – became crucial. Just look at how governments had to operate, sharing data between multiple departments and even internationally to make snap decisions on how to respond to the crisis at every level.
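On the tooling side, one open example of this kind of exchange is Delta Sharing, the open data sharing protocol that Databricks open-sourced. The sketch below shows the consumer side only; the profile file and the share, schema and table names are hypothetical placeholders rather than a real endpoint.

```python
# A minimal sketch of consuming a shared dataset over the open Delta Sharing
# protocol. The profile file and share/schema/table names are hypothetical
# placeholders used purely for illustration.
import delta_sharing

# Credentials file issued by the data provider (hypothetical)
profile = "config.share"

# A shared table is addressed as <profile>#<share>.<schema>.<table>
table_url = profile + "#health_share.reporting.case_counts"

# Load the shared table straight into pandas: no copies, no export pipeline
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```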
3. The public has concerns over access to data. Are they right to have concerns, and what can be done to overcome them? How can companies ensure that the data they use is secure?
I think that, as data breaches become more prevalent, it’s natural that people will have concerns about how their data is stored, used and handled. This is especially true as attackers grow ever more sophisticated. Organisations handling customer data must try to stay one step ahead, ensuring that governance and security are built in at the very core of their data and analytics platform. And individuals should expect organisations to hold themselves accountable to this.
There are also extensive ethical considerations when it comes to AI – avoiding biases against particular groups, for instance. Organisations must stay on top of AI ethics to maintain customer confidence, ensuring that the decisions these models make are fair and representative. I absolutely believe that regulators must play a role here, ensuring that society stays ahead of this very fast-moving field and avoids any unintended consequences.
4. Databricks places a lot of importance on open source and open standards. Why is this so?
Open source tools have become the de facto standard for building and deploying AI and machine learning applications at scale. As such, at Databricks we place huge value on open standards – standards driven by a combination of research, community development and technology companies.
There is, of course, the cost benefit of this approach: open source technology usually comes at little to no cost and has typically been vetted by experts within the ecosystem. This means teams are building on reliable, proven solutions, which reduces risk further down the road. The open source approach also discourages teams from building overly complex solutions in-house, preventing resources from being tied up unnecessarily.
But there’s much more to be said for the open approach. A modern, open and collaborative platform – such as the lakehouse – ensures teams are working from a single source of truth, allowing for faster iteration and improvement. This lets organisations foster a truly data-driven culture, relieving the grind of siloed systems and unreliable data and setting the stage for AI and ML to be brought in.
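As a rough illustration of that single-source-of-truth idea, the sketch below uses the open-source Delta Lake format with PySpark, which underpins the lakehouse pattern. The table path, schema and sample row are assumptions made for the example, not a description of any particular deployment.

```python
# A minimal sketch of a lakehouse "single source of truth" using open-source
# Delta Lake with PySpark. The path and table contents are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("lakehouse-sketch")
    # Standard Delta Lake setup (requires the delta-spark package)
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

path = "/tmp/lakehouse/sales"  # hypothetical table location

# Every team writes to the same ACID-governed table...
df = spark.createDataFrame([("2021-06-01", "EMEA", 1200.0)],
                           ["date", "region", "revenue"])
df.write.format("delta").mode("append").save(path)

# ...and every team reads from it, so BI, data science and ML all see
# one consistent version of the data
sales = spark.read.format("delta").load(path)
sales.groupBy("region").sum("revenue").show()
```

Because every team reads and writes the same governed table, there is no drift between the copy a BI dashboard shows and the copy a data scientist trains a model on.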
Open source technology also gives full transparency into source code and connects data teams to the wider open source community – a near-limitless resource for inspiration, troubleshooting and tech talent, and a great vehicle for peer-to-peer upskilling. Of course, there are challenges – open technologies move fast and have to be carefully managed to ensure security standards are maintained, for instance. But their value far outweighs the drawbacks.
Featured image: ©Tostphoto