Ironically, a Cyber Attack Was Not to Blame for Last Week’s AWS Outage


The Internet might seem like it runs itself, and to a certain extent, it does. At least, that’s the case when everything is programmed correctly and there are no major cyber attacks. But a simple human error can take down an entire website. Less frequently, a small error can take down much of the web. That’s what happened with last week’s Amazon Web Services (AWS) outage, which, surprisingly, had nothing to do with a cyber attack.

Here’s what you need to know about last week’s Internet catastrophe, why it’s relevant to you, and how many errors, not just security breaches, can undermine web security and stability.

Last Week’s AWS Error: What Went Wrong

If you tried to get online last week to visit one of many popular websites, you probably found that the site was running slowly, failing to load entirely, or riddled with errors. The culprit, as it turns out, was a problem with Amazon Web Services (AWS). A large array of websites store their files with AWS’s Simple Storage Service (S3). Information hosted on the cloud with this service was inaccessible for several hours, leading to trouble for thousands of companies.

This leads us to the question: what was behind the error? Not a massive cyber attack. Not a threat from Russia or a distributed denial-of-service (DDoS) attack either. The culprit, it seems, was nothing more than a typo.

Here’s how it happened:

An engineer mistyped a command. Rather than taking a few servers offline, the command took a much larger set of servers offline, including ones that other parts of S3 depend on. It turns out that Amazon hadn’t completely restarted those systems in years, so getting the ill-fated servers back online took longer than Amazon anticipated.
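
To make the lesson concrete, here is a minimal Python sketch of the kind of guardrail that can stop a mistyped command from removing too much capacity at once. It is an illustration only, not Amazon's tooling: the function name, the thresholds, and the server names are all hypothetical.

```python
class CapacityError(Exception):
    """Raised when a removal request would take too much capacity offline."""


def plan_removal(active_servers, requested, max_fraction=0.1, min_remaining=2):
    """Return the servers to remove, or raise if the request looks unsafe.

    active_servers: names of the servers currently in service
    requested: names the operator asked to remove
    max_fraction and min_remaining are illustrative thresholds (assumptions).
    """
    active = set(active_servers)

    # A typo in a server name should stop the command, not be ignored.
    unknown = sorted(s for s in requested if s not in active)
    if unknown:
        raise CapacityError(f"unknown servers: {unknown}")

    to_remove = sorted(set(requested))

    # Cap how much of the fleet a single command may remove.
    limit = int(len(active) * max_fraction)
    if len(to_remove) > limit:
        raise CapacityError(
            f"refusing to remove {len(to_remove)} servers; limit is {limit}"
        )

    # Never let the fleet drop below a minimum size.
    if len(active) - len(to_remove) < min_remaining:
        raise CapacityError("removal would leave too few servers in service")

    return to_remove
```

With a hypothetical fleet of 20 servers, asking to remove two of them passes the check, while a command that accidentally names the whole fleet is rejected before anything goes offline.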

Experts say this could happen to any web hosting service, and that Amazon is still a reliable hosting partner. The problem is that the popularity of AWS means that a disproportionate number of businesses were affected. In other words, many companies use Amazon because it works so well. That means when it fails, so too do they.

AWS is probably more resilient than ever thanks to the lessons of this outage. That doesn’t mean the rest of us have nothing to learn from it. One thing we know for sure is that even giants like Amazon are vulnerable to errors that put user content in jeopardy.

Human Error’s Massive Costs: Why It Could Happen to You

This isn’t the first time a simple human error has had catastrophic effects. Amazon’s brief outages in the past have taken out web giants such as Vine and Instagram. Amazon is far from the only company vulnerable to such a disaster: Google and Microsoft have also gone offline, and Joyent suffered an outage similar to Amazon’s back in 2014.

Human error also extends to cyber attacks. Most cyber attacks are, at their core, attributable to some type of human error. That includes insecure passwords, data leaks, accidental insertion of malicious code, and pure laziness.

Many cyber criminals prey on our tendency toward laziness. With threats everywhere, it’s easy to become desensitized to the very real threats apps and websites face. That’s precisely where vulnerability begins. That piece of code you never checked, the developer you never vetted, and the update your users don’t install… these are all the very omnipresent human errors that put your site or your application in danger. Criminals look for these vulnerabilities and then exploit them. It’s exhausting to stay on top of every potential attack, and they use this to their advantage.

The Role of Human Error in Website and Application Issues

Major web outages and cyber attacks typically send businesses scrambling to fix that specific error. A malicious piece of code will quickly disappear when it instigates a high-profile attack. But this reactive posture can obscure a larger truth: websites and applications are vulnerable to a host of issues. Many of them are due to human error.

Rather than focusing on the specific typo that briefly took down AWS, it’s important to consider the larger scope of human error. Some of the many ways a simple mistake can undermine site security include:

  • Cutting and pasting code found online: Bad actors know that this is common practice, and routinely insert malicious code into free code posted online.
  • Typographical errors in a string of code: The wrong word, the wrong letter, a bad symbol, or an inappropriate break in a string of code can take a site from perfect to perfectly nonfunctional. Occasionally, incorrect code actually causes a site to do something significantly different from what the coder intended. These errors are harder to detect than those that render the site nonfunctional.
  • Testing errors: In 2001, a technician’s errors in testing a development system disrupted trading on NASDAQ. In the same year, Microsoft experienced an outage that lasted almost a full day due to human error while configuring a name resolution system.
  • Errors in error communications themselves: Confusing error messages can make it difficult to ascertain the source of an error, which delays the fix.
  • Errors that play on human cognitive biases: For example, confirmation bias is the tendency to see evidence supporting that which you already believe. Testers may not see errors in their own work as a result, or may be less adept at detecting errors when they think the code is well-designed or attractive.
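
As a small illustration of the typo and error-message points above, consider this Python sketch. The setting names and the `require` helper are hypothetical. The first lookup contains a one-letter transposition that crashes nothing, so the program quietly behaves differently than intended. The helper shows one way to make the same mistake fail loudly, with a message that points at the cause.

```python
config = {"max_retries": 5, "timeout_seconds": 30}

# Typo: "max_retires" instead of "max_retries". Nothing crashes; the
# fallback value is silently used, so the site retries 3 times, not 5.
retries = config.get("max_retires", 3)


def require(settings, key):
    """Look up a setting and fail with a clear message if it is missing."""
    if key not in settings:
        known = ", ".join(sorted(settings))
        raise KeyError(f"unknown setting {key!r}; known settings: {known}")
    return settings[key]
```

Calling `require(config, "max_retires")` raises an error that names the bad key and lists the valid ones, which is far easier to diagnose than a site that simply retries the wrong number of times.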

How You Can Protect Your Site

Human error is inevitable, and sooner or later, it’s likely to affect your site or application. There’s still much you can do to reduce the likelihood that errors will trigger a massive outage or data breach. Try the following:

  • Invest in security testing you trust. Appsolid offers industry-leading security to mobile app developers, including standard-setting binary protection. The right security protection can protect against a range of liabilities, from human mistakes to cyber attacks from criminals.
  • Back up web hosting cross-regionally: This may help protect against outages like the one that plagued AWS. Try replicating S3 objects across geographic regions.
  • Cache content with CloudFront: This can keep cached objects available if S3 or another host goes down.
  • Test and retest code, security settings, and other vital systems.
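
The cross-region advice above only helps if your application knows to look for the second copy. Here is a minimal Python sketch of that fallback logic, assuming you already keep a replica of each object somewhere else. The `primary_region` and `replica_region` functions are stand-ins invented for this example, not real AWS calls; in practice they would wrap whatever storage client you use.

```python
def fetch_with_fallback(key, sources):
    """Try each (name, fetch) pair in order and return the first success.

    sources: list of (name, callable) pairs. Each callable takes a key and
    returns bytes, or raises an exception if that copy is unreachable.
    """
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch(key)
        except Exception as exc:  # real code should catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError(f"all sources failed for {key!r}: " + "; ".join(errors))


# Hypothetical stand-ins for real storage clients.
def primary_region(key):
    raise ConnectionError("primary region unavailable")


def replica_region(key):
    return b"contents of " + key.encode()
```

Calling `fetch_with_fallback("logo.png", [("primary", primary_region), ("replica", replica_region)])` skips the failing primary and returns the object from the replica, along with the name of the source that served it.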

