The software meltdown on 19 July, when millions of computer screens across the globe were hit by the "Blue Screen of Death" -- a critical error message displayed by Microsoft Windows when the system encounters a problem it cannot recover from -- disrupted critical systems across numerous industries.
This unprecedented outage was triggered by a faulty update from cybersecurity technology company CrowdStrike (CRWD) and hit government services, emergency operations, transport, payment systems and financial markets worldwide.
But there are ways to reduce the risk of such disasters happening again, according to Symphony CEO Brad Levy.
On this week's episode of Yahoo Finance Future Focus, Levy discussed strategies that could prevent such outages from recurring and how we might build more resilient global software systems.
Levy said the CrowdStrike fiasco exposed a fundamental vulnerability in how modern software systems are structured, noting that while Symphony remained operational during the crisis, it was a stark reminder of how dependent society has become on interconnected systems.
"These incidents have happened before in various industries," he explained, "but this one stood out because of its scale and the critical systems it impacted, from healthcare to banking and travel."
CrowdStrike's faulty update to its Falcon Sensor security software crashed approximately 8.5 million Windows systems, impacting airlines, banks, hospitals, and government services like emergency response systems and websites.
Levy said it was indicative of a broader issue, raising questions over the robustness of current software systems. While it's crucial to build more resilient systems, the real issue lies in how interconnected these systems are. "It's less about any one system and more about how these systems are connected," he said.
Failures in one area can ripple outwards. To mitigate this risk, Levy stressed the importance of containment and compartmentalisation. Although these strategies may add complexity and slow down processes, they are necessary to increase the robustness of critical infrastructure.
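As a rough illustration of what containment could mean for software updates, the sketch below pushes an update to small groups of machines ("rings") one at a time and stops as soon as one group fails too often, so that later groups are never touched. It is only a sketch under stated assumptions: the function name `staged_rollout`, the ring layout, the `apply_update` callback and the 5% failure threshold are all hypothetical, and nothing here describes how CrowdStrike, Microsoft or Symphony actually deploy software.

```python
from typing import Callable, Dict, List


def staged_rollout(
    rings: List[List[str]],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.05,  # hypothetical threshold, chosen for illustration
) -> Dict[str, List[str]]:
    """Apply an update ring by ring, halting once a ring fails too often."""
    updated: List[str] = []
    failed: List[str] = []
    untouched: List[str] = []
    halted = False

    for ring in rings:
        if halted:
            # Containment: later rings never receive the faulty update.
            untouched.extend(ring)
            continue

        ring_failures = 0
        for host in ring:
            if apply_update(host):  # True means the host is healthy after the update
                updated.append(host)
            else:
                failed.append(host)
                ring_failures += 1

        # Stop the rollout if this ring exceeded the allowed failure rate.
        if ring and ring_failures / len(ring) > max_failure_rate:
            halted = True

    return {"updated": updated, "failed": failed, "untouched": untouched}
```

Called with a small first ring of test machines and a faulty update, this would leave only that first ring affected. The trade-off is the one Levy describes: the rollout is slower and more complex, but a failure stays in one compartment.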
Levy said sacrificing some efficiency might be necessary to improve security, a key point of concern for businesses operating in fast-paced environments. "Sometimes slow is a feature. The Navy SEALs often talk about doing things slowly and deliberately, which eventually leads to faster execution over time."
In the world of software, this might mean being more deliberate in how systems are developed, ensuring that security measures are baked in from the start rather than retrofitted after a crisis hits.
Levy said that while interoperability between software platforms is essential, it also presents significant challenges. He cited major players like Apple (AAPL), Microsoft (MSFT), and Google (GOOG), whose software "stacks" operate independently but still need to work together in certain areas, particularly when it comes to cybersecurity.
"It's that working together that's important," Levy said, explaining that while companies may prefer to keep everything self-contained within their own verticals, they need to be open to incorporating external solutions where necessary, particularly in areas like cybersecurity.
AI could accelerate challenges related to software outages
AI could increase the risk of outages, because it adds automation and complexity, but it could also offer predictive solutions.
The key, according to Levy, is to keep AI platforms somewhat independent from other systems, particularly in critical areas like cybersecurity, while still ensuring they are connected enough to offer valuable insights.
Levy also said it is important to avoid single points of failure in software systems, a lesson painfully learned from the CrowdStrike incident. He said that while simplicity in systems might seem appealing, it often limits innovation and creates vulnerabilities.
On the other hand, having too many systems working in tandem can lead to what he called "choice paralysis." The solution, he suggested, lies somewhere in the middle, with companies needing to strike a balance between simplicity and complexity, always with an eye toward evolving threats and technologies.
Levy emphasised the potential of distributed systems and blockchain technology in reducing the risk of major software outages.
Symphony uses a distributed model, drawing parallels to blockchain's decentralised structure. Central to this vision is robust identity management, which Levy sees as crucial for future systems to ensure clear identification of all entities within a network -- whether a person, company, or object like a bond or an application programming interface (API).
"We need what I call the Trident -- the triangle of identity management," he said, referring to the seamless integration of these entities across systems. Levy believes this will be key to building secure and efficient software ecosystems.
"Digital institutional identity, combined with individual and object identity, will underpin access to the digital networks and systems necessary for commerce and communication," he added.