Alois Reitbauer, Chief Technology Strategist at Dynatrace, asks how automation and artificial intelligence can help manage the performance of today’s ever more complex digital infrastructure…

The digital ecosystems supporting today’s always-on economy are becoming increasingly complex, creating a major conundrum for those tasked with infrastructure and application performance management.

The shift towards more dynamic technologies, such as microservices, cloud, IoT and software-defined data centres, is providing the agility that organisations need to keep up with their customers and competitors. However, it has also made it impossible for human operators to manage their digital services with traditional, siloed, manual approaches.

There are now thousands of intricate dependencies between the physical and virtual components that make up the applications and infrastructure that support our digital services. Making things even more complicated, you also have to factor in that software-defined data centres are constantly evolving to optimise the service delivery chain, so IT infrastructure is different from one second to the next.

Ultimately, there is now a seemingly infinite number of moving parts within the application landscape and the infrastructure layer that underpins it. It would take a human operator a lifetime to map and fully understand their IT ecosystem at any fixed point in time, and the task becomes impossible once you consider that the environment is changing constantly. Troubleshooting problems becomes akin to hunting in the dark, within a turbulent, constantly shifting universe whose start and end we do not know.

Cost of failure

Failing to face up to the realities of these dynamic IT ecosystems leaves businesses exposed to a far greater risk of costly IT outages. Given the prevalence of e-commerce and the digital economy, the cost of IT service failures is rising as businesses feel more of a direct revenue impact. According to the Ponemon Institute, the average cost of an outage within a data centre had risen 38%, from $505,502 in 2010 to $740,357 by the beginning of last year. Gartner has put its own estimate on the cost of data centre downtime at $5,600 per minute.

The problem is twofold. Not only is there more that can go wrong within the IT delivery chain, but it can also take IT systems managers much longer to trace the root cause of any performance problems that arise within their digital ecosystem. As a result, if businesses do not change the way that they manage IT performance, the frequency of IT outages and the duration of the downtime they cause will skyrocket, along with the resulting costs.

The role of AI

Artificial intelligence (AI) has become crucial to managing performance in modern digital ecosystems. It is the only way to automate the complex number-crunching needed to identify the solution to any problems before end users and customers feel any impact.

AI can instantly analyse the nearly infinite permutations of interdependencies that exist between an organisation’s physical and virtual infrastructure layers. This allows it to instantly trace the source of any performance degradations right back to the root cause, offering the business a solution on how to remediate the issue quickly, regardless of how complex their ecosystem is becoming.
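To make the idea of tracing a degradation back through its dependencies more concrete, here is a minimal Python sketch. It is not AI and it is not how any particular product works; it is a plain graph walk under stated assumptions. The component names, the `DEPENDS_ON` map and the `find_root_causes` function are all hypothetical, and a component is treated as a likely root cause simply when it is unhealthy and nothing it depends on is unhealthy.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "checkout-page": ["cart-service", "payment-service"],
    "cart-service": ["inventory-db"],
    "payment-service": ["payment-gateway", "inventory-db"],
    "inventory-db": ["storage-volume"],
    "payment-gateway": [],
    "storage-volume": [],
}


def find_root_causes(start: str, unhealthy: Set[str]) -> List[str]:
    """Walk the dependencies of `start` and return the unhealthy components
    that have no unhealthy dependency of their own (assumed root causes)."""
    roots: List[str] = []
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # shared dependencies are only examined once
        seen.add(node)
        deps = DEPENDS_ON.get(node, [])
        if node in unhealthy and not any(d in unhealthy for d in deps):
            roots.append(node)
        stack.extend(deps)
    return sorted(roots)


# Illustrative input: a slow checkout page plus everything else flagged as degraded.
unhealthy = {
    "checkout-page",
    "cart-service",
    "payment-service",
    "inventory-db",
    "storage-volume",
}
print(find_root_causes("checkout-page", unhealthy))
```

With six components the walk is trivial. The article's point is that real environments contain thousands of such dependencies that change from one second to the next, which is why the mapping and analysis have to be automated.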

Just as we see with self-driving cars, we will see AI manage software infrastructures autonomously. While it might sound futuristic, it is already on the horizon, perhaps only five years away, much like the wide rollout of self-driving cars.

AI capabilities within digital performance management systems can also be hugely beneficial in supporting collaboration between the development and operations teams in DevOps. The technology can quickly reveal the complete picture behind any IT performance degradation, identifying both the impact on operations and the steps that development teams need to take to remediate the situation. Because it provides the answers that both teams need, it gives them a shared basis for working together to address performance problems.

Taking this a step further, AI also paves the way for a virtual digital assistant that acts a bit like Siri for IT teams. Placing a virtual interaction service between the business and the performance management system that monitors its digital ecosystem lets operations and development teams ask the questions that are most pertinent to them and get answers in the terms that are most meaningful to them.

In light of these opportunities, businesses simply cannot afford to manage infrastructure and application layers as they have done traditionally. AI and automation tools will play a critical role in performance management over the next decade – untangling the new complexities created by our evolving digital infrastructures.