The Case for Building Trust in Network Automation

Today’s multi-cloud, enterprise-class networks are incredibly complex, and every network equipment manufacturer and network security device company has its own specific way in which its tools must be monitored, managed, and protected.

Network operations and network security professionals struggle to address these requirements and keep pace with the speed of network updates. The simplest configuration change or even a typo can sometimes have a ripple effect that results in costly network downtime. In just the first month of 2023, a multinational technology and collaboration company, a top blockchain platform, and a large Canadian health organization all experienced publicly reported network outages that disrupted critical services for their customers. Put simply, the more complex the network, the greater the chance of unforeseen issues such as security breaches or human error, and that risk needs to be addressed.

Distrust and skepticism prevail

It’s clear that network automation is becoming increasingly necessary for network operations and network security professionals to ensure business continuity. So, we weren’t surprised to find in our recent survey that nearly every organization is on a path to network automation.

Unfortunately, several factors are impeding their progress, most notably distrust and skepticism, which 80% of all respondents cite as top barriers to increasing their use of network automation. Respondents point to numerous reasons, but the most prevalent include distrust of more automated solutions (38%), previous negative experiences (35%), and skepticism by leadership (33%).

When asked if they completely trust their current approach to automating network changes, only 24% of network operations and security professionals say they do. And only 20% are completely confident in their ability to rapidly restore their network from backup within a few minutes of an outage or misconfiguration.

Among respondents who haven’t invested deeply in automation, 93% say they often address network issues by fixing the immediate problem without addressing the root cause. The fact that 60% of respondents who have invested deeply in automation say the same underscores a lack of confidence and trust in automation.

Lack of automation is costly

Downtime is costly in terms of real dollars. Uptime Institute’s 2022 Outage Analysis report points to network-related issues as the single biggest cause of all IT service downtime incidents, with costs ranging from at least $100,000 in total losses to upwards of $1 million. Nearly 30% of outages last more than 24 hours, up from 8% in 2017.

However, the greater concern is how long it can take to restore service and the nearly incalculable reputational damage as the clock continues to tick. Network teams also feel the pain. Many have targets of five or six nines of availability, which equates to roughly five minutes of downtime or less per year, and their bonuses are often at least partially tied to network reliability. When only 20% say they can restore their network within a few minutes, the pain is real for teams, organizations, and customers. When networks are down, business stops. When business stops, the downstream impact on vendors and customers is incalculable.
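To make the "nines" arithmetic concrete, here is a minimal sketch that converts an availability target into a yearly downtime budget. The function name and the 365-day year are assumptions chosen for illustration, not figures from the survey.

```python
# Minutes in a non-leap year (365 days); an assumption for this illustration.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for N nines of availability."""
    unavailability = 10 ** (-nines)  # e.g. five nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Under these assumptions, five nines allows about 5.26 minutes of downtime per year and six nines allows about 32 seconds, which is why a restore process measured in hours cannot meet either target.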

Confidence-building capabilities to look for

So, what can you do to help your organization increase network automation and improve business continuity? Consider a network automation platform that offers a range of capabilities and supports best practices proven to help teams and leadership gain confidence and trust in network automation.

Look for a solution that enables automated backups of all devices on the network, including discovering and backing up new devices as they are added. Users should be able to schedule and store any number of configuration backups for as long as needed in a central and secure location. We recommend creating backups daily as well as before and after changes, and making sure that the organization can recover from network outages in minutes with single-click recovery. Pre- and post-checks, automated reporting of failed device backups, and automatic retries of unsuccessful backups ensure all devices are covered. They also minimize manual work by quickly ferreting out the devices that truly need human attention.
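The retry-and-report pattern described above can be sketched in a few lines. This is an illustrative outline only, not any vendor's implementation: `run_backups` and the `backup_device` callable are hypothetical names, and the callable is assumed to raise an exception when a backup fails.

```python
from typing import Callable, Dict, Iterable, List


def run_backups(
    devices: Iterable[str],
    backup_device: Callable[[str], None],  # hypothetical: raises on failure
    max_attempts: int = 3,
) -> Dict[str, List[str]]:
    """Back up every device, retrying failures, and report the outcome."""
    report: Dict[str, List[str]] = {"succeeded": [], "failed": []}
    for device in devices:
        for attempt in range(1, max_attempts + 1):
            try:
                backup_device(device)
                report["succeeded"].append(device)
                break  # this device is covered; move on to the next one
            except Exception:
                # Only flag the device for human attention after the last retry.
                if attempt == max_attempts:
                    report["failed"].append(device)
    return report
```

The `failed` list is the short set of devices that needs a person to look at it. A production version would likely add a delay between attempts and record the error for each failure.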

In many cases, outages don’t occur immediately after a configuration change, so getting to the root cause is challenging. The ability to store a long history of backups within an autoscaling, fault-tolerant data store enables teams to access the data they need to go back in time and correlate past configuration changes during root cause analysis. Successful troubleshooting instills confidence that the problem has been fully remediated and service has been fully restored. Automation capabilities build confidence and trust in automation outcomes, clearing a path to improved business continuity and enabling network operations teams to focus on higher-level, proactive projects.
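As a rough illustration of correlating configuration changes across stored backups, the sketch below compares two backup snapshots using Python's standard `difflib` module. The function name, file labels, and sample configuration lines are assumptions made up for this example.

```python
import difflib
from typing import List


def config_diff(old_config: str, new_config: str) -> List[str]:
    """Return unified-diff lines showing what changed between two backups."""
    return list(
        difflib.unified_diff(
            old_config.splitlines(),
            new_config.splitlines(),
            fromfile="backup_before",  # illustrative labels only
            tofile="backup_after",
            lineterm="",
        )
    )


if __name__ == "__main__":
    before = "hostname rtr-1\nntp server 10.0.0.1\n"
    after = "hostname rtr-1\nntp server 10.0.0.2\n"
    for line in config_diff(before, after):
        print(line)
```

Running a comparison like this across a history of backups narrows down which change preceded an outage, even when the outage surfaced days later.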

About the Author

Andrew Kahl is CEO of BackBox. BackBox is a market leader in network automation, security and management solutions. We help companies worldwide automate and streamline complex tasks, ensure network health and performance, achieve business continuity, and do more with fewer resources.

Featured image: ©Yingyaipumi