A misconfigured router at Swedish infrastructure provider TeliaSonera was responsible for sending vast amounts of European web traffic – including packets from AWS, WhatsApp, Slack, CloudFlare and Reddit – to Asia on Monday 20th June.
Global CDN provider CloudFlare was the first to report on the consequences of the error. It also noted an earlier, temporary but significant incident of packet loss on its backbone network involving Telia, which occurred on 17th June and presaged the larger diversion. The company wrote:
‘Because transit providers are usually reliable, they tend to fix their problems rather quickly. In this case, that did not happen and we had to take our ports down with Telia at 12:30 UTC. Because we are interconnected with most Tier 1 providers, we are able to shift traffic away from one problematic provider and let others, who are performing better, take care of transporting our packets.’
After the second incident, CloudFlare’s chief executive officer Matthew Prince tweeted: ‘Reliability of Telia over last 60 days unacceptable. Deprioritizing them until we are confident they’ve fixed their systemic issues.’
Because TeliaSonera is a Tier 1 network provider, even a brief outage at an infrastructure outfit of its scale has a significant ripple effect. Though the router problem was fixed relatively quickly, the resulting disruption prompted widespread complaints from consumers and businesses.
It is notable how far the damage from the misconfiguration was allowed to spread, with one commenter on the incident noting the lack of any safety routines in downstream infrastructure that might have been expected to catch the erroneous external route.
Amazon Web Services also reported the incident during the downtime, with a representative stating: ‘Between 5:10am and 6:01am PDT an external provider outside our network experienced an issue which impacted internet connectivity between some customer networks and the EU-WEST-1 Region.’