Amazon’s cloud services unit AWS was struggling to recover last night from a widespread outage that knocked out thousands of websites along with some of the world’s most popular apps, including Snapchat and Reddit, and disrupted businesses globally.
The turmoil marked the largest Internet disruption since last year’s CrowdStrike malfunction hobbled technology systems in hospitals, banks and airports, and highlighted the vulnerability of the world’s interconnected technologies.
After more than nine hours of disruptions, some applications were gradually coming back online. But AWS acknowledged that elevated errors were still affecting several AWS services and that it was working on recovering connectivity.
More than 9,300 users had reported problems with AWS as of 1pm ET, according to outage tracking website Downdetector.
That figure is higher than the earlier peak of about 5,800 reports at 3.48am ET.
Lambda, one of AWS’s computing services, was experiencing errors due to issues with an internal subsystem, AWS said in the latest update on its status page. “We are taking steps to recover this internal Lambda system,” it said.
AWS said earlier that the root cause of the outage was an underlying subsystem that monitors the health of its network load balancers, which distribute traffic across several servers to improve performance and capacity.
The issue, AWS said, originated from within the ‘EC2 internal network’.
EC2 refers to Amazon’s ‘Elastic Compute Cloud’ service, which provides on-demand cloud capacity within AWS.
Businesses use EC2 to run the virtual servers they need to develop, launch and host applications, and can scale capacity up or down as required.
While some apps like Reddit and Roblox had largely stabilised, according to Downdetector, others, including Snapchat, PayPal’s Venmo and Duolingo, were showing a resurgence in issues seen earlier in the day.
AWS provides computing power, data storage and other digital services to companies, governments and individuals and is the world’s largest cloud provider, followed by Microsoft’s Azure and Alphabet’s Google Cloud.
Disruptions to its servers can cause outages across websites and platforms that rely on its cloud infrastructure, ranging from food delivery apps to gaming platforms and airline systems.
AWS said on its status page that the outage originated at its US-EAST-1 location in northern Virginia, its oldest and largest site for web services. The site suffered outages in 2021 and 2020.
According to documentation on the AWS website, the US-EAST-1 site is often the default region for many AWS services.
Asked for comment, AWS directed Reuters to its status page. Amazon did not respond to a request for comment.
In Britain, Lloyds Bank, Bank of Scotland and telecom service providers Vodafone and BT were also facing issues, according to Downdetector’s UK website, as was UK tax, payments and customs authority HMRC’s website.
The problem highlights how interconnected everyday digital services have become and their reliance on a small number of global cloud providers, with one glitch wreaking havoc on business and day-to-day life, experts and academics said.
Amazon’s own services, including its shopping website, Prime Video and Alexa, were also hit, although Downdetector last showed a decrease in severity.