Locked out and totally down: Facebook’s scramble to fix a massive outage

A prolonged, global outage of Facebook’s apps sent the company’s engineers scrambling to fix the issue at one of its data centers in California, according to two people familiar with the situation.

The outage, which began around 11:40AM ET on Monday, brought down all of Facebook’s apps — including Instagram and WhatsApp — globally, affecting billions of users and millions of advertisers. Inside Facebook, the outage also broke nearly all of the internal systems that employees use to communicate and work. As of 6PM ET, it appears that most of the services are back online.

Several employees told The Verge they resorted to communicating through their work-provided Outlook email accounts, since Facebook mainly runs on an internal version of the social network that was not accessible during the outage. While employees could email each other, they couldn’t send or receive emails from external addresses.

Since Facebook requires employees to log in with their work accounts to access tools such as Google Docs and Zoom, those services also weren’t working, leading some employees to use alternative services like Apple’s FaceTime and Discord. Employees who were already authenticated with non-Facebook tools like Google Docs before the outage began still had access.

Facebook has yet to detail the cause of the outage internally or externally, though outside experts say it relates to the company’s networking architecture suddenly going offline. Facebook engineers were sent to one of its main US data centers in California to try to fix the problem, meaning the fix couldn’t be done remotely. “We understand how disruptive this is to everyone,” CTO Mike Schroepfer said in an email to employees that was seen by The Verge.

Further complicating matters, the outage left some employees unable to access company buildings and conference rooms with their badges, according to The New York Times, which first reported that engineers were being dispatched to the data center.

A Facebook spokesperson pointed to a tweet by Schroepfer saying the company was experiencing “networking issues” and that employees “are working as fast as possible to debug and restore” its systems.

Update October 4th, 6:33PM ET: Noted that the outage is ending as Facebook and its other services are coming back online.