

Most cloud vendors offer some type of Service Level Agreement (SLA) around availability. Amazon, Google, and Microsoft set their cloud SLAs at 99.9%. The industry generally recognizes this as very reliable uptime. A step above, 99.99%, or “four nines,” is considered excellent uptime.

But four nines uptime is still about 52 minutes of downtime per year. Consider how many people rely on web tools to run their lives and businesses.

So what is it that makes four nines so hard? What are the best practices for high availability engineering? And why is 100% uptime so difficult? These are just a few of the questions we’ll aim to answer here in our guide to high availability.

With great complexity comes great responsibility

Imagine a set of concrete steps in your neighborhood. With some simple maintenance and upkeep, and barring some dramatic change to their environment, those stairs will basically last forever. The historic “uptime” of that set of stairs is excellent. We still use stairs today that were constructed thousands of years ago and have rarely, if at all, been “unavailable.”

We’ve talked about the increasingly interconnected nature of cloud tools and the domino-goes-crashing-down effect that can happen when just one critical service has downtime. Web uptime is more important than ever, and it’s critical that the services we all rely on are up and running as often as possible. High availability refers to a system or component that is operational without interruption for long periods of time. It is measured as a percentage, with 100% indicating a service that experiences zero downtime. Most services fall somewhere between 99% and 100% uptime.
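To make those percentages concrete, here is a minimal sketch that converts an availability target into a yearly downtime budget. The function name `downtime_minutes_per_year` and the non-leap-year constant are assumptions made for illustration. Real SLAs usually measure availability per month or per billing cycle, so treat this as back-of-the-envelope arithmetic only.

```python
# Illustrative helper (not from any vendor's SLA tooling): converts an
# availability percentage into the downtime it allows per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed at this availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    # The unavailable fraction of the year, expressed in minutes.
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


# Print the yearly downtime budget for a few common targets.
for target in (99.0, 99.9, 99.99, 100.0):
    print(f"{target}% -> {downtime_minutes_per_year(target):.1f} minutes/year")
```

Under these assumptions, three nines allows roughly 526 minutes (almost nine hours) of downtime a year, while four nines allows only about 52.6 minutes.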

Here’s a way to build a bridge that never fails: Drain the river and fill it in with concrete. Expensive, ugly, and stupid. This is a really simplified version of the problem web developers face when aiming to build high availability services.
