Network outages have become serious and need smarter mitigation

As the IT industry has become more virtualised with migration to the cloud and increased connectivity, networks have become more complex and prone to outages from internal and external sources.

The cost of such IT downtime remains difficult to quantify, but it is certainly very high. Many commentators rely on a 2014 Gartner figure of $540,000 per hour, while acknowledging that there are significant variations between organisations.

Outages can be triggered by anything from floods, storms and fires to mistakes by employees or construction workers. They can result from the actions of malicious insiders and, of course, from cyber-attacks, which are on the increase. The 2022 Verizon Data Breach Investigations Report reveals that ransomware attacks increased by 13 per cent last year, accelerating an upward five-year trend.

Ransomware attackers enjoy paralysing organisations, using encryption of data assets to extort payment with the added threat of exposure on the web. While the effect of these attacks varies, this year’s IBM Cost of a Data Breach report found the average total cost of a ransomware attack to be $4.5 million, excluding the ransom payment. Tactics used by ransomware gangs to gain access include phishing, the use of stolen credentials and email compromise.

The Verizon report also found that denial-of-service attacks, which compromise the availability of networks and systems, are still very common. Misconfigured cloud storage continues to be one of the causes of cyber events. While the researchers found external attacks more likely to cause data compromises, “internal sources” were behind 18 per cent of incidents. The effect of cloud migration is apparent too: in the IBM Cost of a Data Breach report, 45 per cent of breaches were cloud-based.

What is certain is that outages, whatever their source, are financially and reputationally costly and hugely disruptive. Whenever they surface, an organisation has to act fast, rapidly deploying engineers and technicians at all hours, often over long distances. The focus is on the job of mitigation and remediation, leaving lessons learned for later.

Barriers to effective remote mitigation

It is clear, however, that there are problems with the task of dealing with outages. Organisations lack remote capabilities because the networks they use for remote access are the same as the production networks that are disrupted. What they need is a fully independent management plane so remote mitigation is effective and swift, eliminating the need for time-consuming and costly journeys by engineers.

Traditionally, network ‘oversight’ has been handled centrally from an organisation’s main data centre or NOC (network operations centre). But IT technicians may struggle to get new systems working immediately, let alone monitor them remotely and intervene when required. When the primary network fails or internet connectivity is unstable, the limits of standard remote maintenance are quickly reached.

These “in-band” management approaches administer the network through the LAN, using protocols and tools such as HTTPS via a web browser, Telnet or SSH. Data and control commands travel across the same network route, which makes them equally vulnerable to the same outage. The upshot is that engineers are locked out of the management plane. If an organisation relies on its production network to manage that same network day to day, it will lose control of critical devices when any significant disruption occurs. The organisation may then seek remote access through copper telephone lines and modems, with all their disadvantages and costs.

The effectiveness of out-of-band

Businesses should now be using “out-of-band” approaches to get round these significant difficulties. These provide a realistic alternative path for remediation through serial console servers, also known as terminal servers, which create a separate management plane, often via 4G LTE cellular connectivity. Using this plane, technicians and engineers gain access to all critical network devices remotely. It is not complex, merely requiring a console server at each location, connected to routers, switches and other key hardware. These devices are brought under control without accessing each production IP address. Because this network is separate from the production network, engineers can continually monitor and manage devices without using the data plane.

When an adverse event or disruption occurs, organisations using out-of-band technology, especially Smart Out of Band solutions, have a secure alternative pathway for managing their devices, with 4G LTE failover giving the necessary bandwidth for the duration of the event.
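
The difference between the two approaches can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the path names, addresses and the simulated reachability check are all hypothetical, and a real deployment would probe the links instead of simulating them. The sketch only shows the principle that a management path riding the production LAN fails along with it, while an independent path stays available.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ManagementPath:
    """One route to a device's management interface (all values hypothetical)."""
    name: str        # "in-band" or "out-of-band"
    transport: str   # what the path rides on, e.g. "production LAN" or "4G LTE"
    address: str     # where an engineer would connect


def choose_path(paths: Sequence[ManagementPath],
                is_reachable: Callable[[ManagementPath], bool]) -> Optional[ManagementPath]:
    """Return the first reachable path in preference order, or None if locked out."""
    for path in paths:
        if is_reachable(path):
            return path
    return None


# Hypothetical branch router: SSH over the LAN is preferred, with a console
# server port reached over cellular as the independent alternative.
PATHS = [
    ManagementPath("in-band", "production LAN", "10.0.0.1:22"),
    ManagementPath("out-of-band", "4G LTE", "console-server-1:port-3"),
]


def production_down(path: ManagementPath) -> bool:
    """Simulated probe: anything that rides the production LAN is unreachable."""
    return path.transport != "production LAN"


if __name__ == "__main__":
    chosen = choose_path(PATHS, production_down)
    print(chosen.name if chosen else "locked out")
```

With only the in-band entry in the list, the same outage leaves `choose_path` returning `None`, which is the lock-out described earlier.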

Mitigating ransomware attacks

It is also apparent, however, that in the case of ransomware attacks, network visibility continues to be a problem for many organisations. They need a solution that works in conjunction with console servers and out-of-band approaches to provide full oversight of distributed networks from a central hub. This gives a head start once an incident is revealed. During the event, an engineer using the hub can disable access to affected network equipment to provide isolation, shutting down server access until remediation is confirmed. The organisation can isolate a branch’s WAN and, if network assets are still beyond control, an engineer can power them off via remote PDU control.
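
The containment options above can be read as an escalation ladder. The Python sketch below is one possible reading, not a prescribed procedure: the ordering of the steps, the `Step` names and the `is_contained_after` check are assumptions made for illustration, and in practice each step would be an action taken by an engineer through the central hub.

```python
from enum import Enum
from typing import Callable, List


class Step(Enum):
    """Containment actions named in the text, in an assumed escalation order."""
    DISABLE_DEVICE_ACCESS = "disable access to affected network equipment"
    ISOLATE_BRANCH_WAN = "isolate the branch WAN"
    POWER_OFF_VIA_PDU = "power off through remote PDU control"


ESCALATION = [
    Step.DISABLE_DEVICE_ACCESS,
    Step.ISOLATE_BRANCH_WAN,
    Step.POWER_OFF_VIA_PDU,
]


def contain(is_contained_after: Callable[[Step], bool]) -> List[Step]:
    """Take steps in order and stop as soon as the incident is contained.

    `is_contained_after` is a hypothetical check supplied by the caller; it
    reports whether the affected assets are under control after a given step.
    """
    taken: List[Step] = []
    for step in ESCALATION:
        taken.append(step)
        if is_contained_after(step):
            break
    return taken


if __name__ == "__main__":
    # Simulated incident that is only contained once the branch WAN is isolated.
    for step in contain(lambda s: s is Step.ISOLATE_BRANCH_WAN):
        print(step.value)
```

Stopping at the first step that works keeps the most disruptive action, cutting power, as a last resort.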

Network outages are not new, and in 2021, downtime managed to impact some of the largest enterprises in the world. Even social media giant Meta was impacted globally for almost six hours over two days in October 2021, affecting access to Facebook, Instagram and WhatsApp. While detrimental to the technology firm, it also had ramifications for the people and businesses around the world that rely on its services.

For businesses without the vast IT footprint and resources of a tech giant, there remains a definite imperative to implement technologies that aid mitigation through safe and secure remote access to essential network devices. This is not the sole answer to the problems of unexpected outages, because nobody can prevent acts of God or extreme weather for the time being. Certainly, more dynamic staff cyber training, improved and active incident response practices and investment in cutting-edge cyber-security solutions will reduce the chances of a successful ransomware attack. But a Smart Out of Band approach to network resilience will make a very significant improvement in the speed and effectiveness of mitigation even for the most remote of networks. With the cost of outages now so high, no business should be complacent.

About the Author

Alan Stewart-Brown is VP of EMEA at Opengear, with responsibility for overseeing all Sales, Channel Development, Marketing events and SE activities across the EMEA region. Alan’s primary focus is the development and execution of sales strategies, talent development and channel initiatives that will ensure the accelerated growth of the Opengear business across the region. Alan brings 25 years of sales leadership experience gained across the technology sector, including Wireless LAN, Enterprise Software, BI Analytics and e-Commerce. Before joining Opengear, Alan held Senior Pan-European Sales Management positions at Xirrus, Fiserv, AIM Technology, eColor and Phoenix Technologies. Alan holds a Bachelor of Science degree from Imperial College, London.