Data is hard to do well
Building a data platform, managing data as an asset, and delivering value through analytics, visualisation and democratisation are all part of the nirvana that is the data-driven organisation. They are also really difficult, especially in combination. And they’re big: they take ages, and if you get them wrong, you can be set back months or years of progress. It only takes a quick Google search to see headlines like “85% of big data projects fail” or “Why most big data analytics projects fail”.
From our experience at BJSS delivering a range of data programmes into large companies, we’ve compiled five key recommendations that drive successful outcomes. Implemented well, they bring about repeatable, efficient delivery of data, with reduced risk, on a solid technology foundation designed for users and the business.
Focus on delivering value throughout
Organisations invest time and attention in data because they want to generate value from it. The great potential benefit promised often prompts teams to embark on large, slow initiatives aimed at producing perfect results. Strong foundations are critical, but you don’t have to wait for them to be completed to start realising a return on investment.
We believe you can and should support the undertaking of analytics projects in parallel with an in-development platform or managed data initiative. This is especially important in the early stages of your data journey. The idea that you need to spend many months getting data into a fit state before you can reap the rewards of data science is a fallacy.
Instead, support your analytics teams to deliver alongside the platform build initially, with a roadmap to bring their solutions into the fold as the platform develops. Select early use cases that don’t require sensitive data and that work from data sources you can already access, or that require little effort to onboard.
Steel thread your delivery
A mechanism for ensuring that value is delivered early is steel threading. More commonly seen in agile software development, this approach identifies a minimum effective thread of functionality, data, governance and exploitation, end-to-end, and delivers it into production use first. You build the foundational aspects of the technology, onboard key initial data sources and deliver a use case that generates business value – leaving you with a platform upon which to deliver further use cases iteratively, each time building out parts of another thread.
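To make the shape of a steel thread concrete, here is a minimal sketch in Python. Everything in it is hypothetical – the source, the cleansing rule and the metric are stand-ins – but it shows the essential idea: one source, one transformation and one business-facing output, wired end-to-end before anything else is built.

```python
import csv
import io

# Hypothetical steel thread: one easy-to-access source, the minimum
# transformation the use case needs, and a single metric a business
# user actually asked for. Each stage would later be built out into
# further threads (more sources, richer governance, more use cases).

def ingest(raw_csv: str) -> list[dict]:
    """Onboard a single, non-sensitive data source."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(orders: list[dict]) -> list[dict]:
    """Apply just enough cleansing/governance for this use case."""
    return [o for o in orders if o["status"] == "complete"]

def serve(orders: list[dict]) -> float:
    """Expose one metric that answers a real business question."""
    return sum(float(o["value"]) for o in orders)

raw = "order_id,status,value\n1,complete,120.0\n2,cancelled,80.0\n3,complete,30.0"
revenue = serve(transform(ingest(raw)))
print(revenue)  # revenue from completed orders only
```

The point is not the code itself but the discipline: every stage exists in production from day one, however thin, so value flows while the platform matures around it.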
Think holistically: data as a business service
Successful data journeys combine aspects including technology, governance, privacy, security, quality and exploitation. We see success when these are considered with a human-centred, service-oriented lens. Through this viewpoint, you can start to ask:
· What questions are people in the business trying to answer? Why?
· What kind, quality, volume and frequency of data might people need to help them answer these questions?
· How will people turn data into information, what context do they need, and how will this be provided?
· How will people turn this information into action – how will it improve outcomes?
If you are starting a data project, don’t start with the technology, start with – and speak with – the people. Ask questions, frame the problems you’re trying to solve, and capture user needs before exploring how to solve them.
A tool we find invaluable is the Service Blueprint. This captures the end-to-end journeys of people interacting with the platform, expresses their needs at each significant step, and maps out holistically what needs to be in place to meet those needs – for example, from a people, technology, process, policy and partnerships perspective.
Use serverless data technologies
Technology in the big data space has been genuinely tricky for years. Hadoop, while a turning point for big data, was a massive peak in complexity – one significantly underestimated. Recently, however, there has been a significant reduction in complexity through the use of serverless cloud data services.
The benefits of reduced complexity in data platforms cannot be stressed enough. However much potential value an enterprise data platform holds, it is no use if you can’t easily build upon it or operate it effectively in production – a problem we saw repeatedly with on-prem Hadoop data lakes from vendors such as Cloudera and Hortonworks. We highly recommend serverless capabilities where possible: the benefits of reduced complexity usually outweigh any trade-offs.
Apply engineering rigour with DataOps
Considering that data initiatives are generally large, strategically important programmes, it’s surprising how little engineering rigour tends to be applied to the technical delivery aspects of data.
Delivering effective data programmes is now more akin to software development than ever. Continuous delivery, secure development, automated testing, monitoring, alerting, infrastructure as code and a collaborative culture are all applicable here, and multidisciplinary teams are a must. Tailoring these practices to the delivery of data and analytics will significantly reduce the cycle time from raw data to insight.
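As one small illustration of that rigour, automated data-quality checks can run in a pipeline on every change, exactly as unit tests do for software. The checks below are hypothetical examples (the field names and rules are illustrative, not a prescribed standard), written in plain Python:

```python
import csv
import io

# Hypothetical automated data-quality checks of the kind a DataOps
# pipeline would run on every deployment, rather than by hand.

def check_quality(rows: list[dict], required_fields: list[str]) -> list[str]:
    """Return a human-readable failure message for each problem found."""
    failures = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if not row.get(field):
                failures.append(f"row {i}: missing '{field}'")
        try:
            if float(row.get("amount", 0)) < 0:
                failures.append(f"row {i}: negative amount")
        except ValueError:
            failures.append(f"row {i}: non-numeric amount")
    return failures

# Example batch: one clean record, one with two problems.
raw = "customer_id,amount\nC001,42.50\n,-3"
rows = list(csv.DictReader(io.StringIO(raw)))
failures = check_quality(rows, required_fields=["customer_id"])
print(failures)
```

Wired into continuous delivery, checks like these fail the pipeline before bad data reaches users – the data equivalent of a broken build.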
Adopting the practices described here will help any organisation exploit data faster and with greater confidence. In our experience, those who follow this path find their data capabilities maturing rapidly and contributing clear, recognised advantages across the business.
About the Author
Carl Austin is CTO at BJSS. BJSS is the leading technology and engineering consultancy for business. As the winner of a Queen’s Award for Enterprise, we work with the world’s largest organisations, delivering the IT solutions that millions of people use every day.