There are at least two critical things that we didn’t do in the previous section but should have.
Not using Cluster Autoscaler
First of all, we should have created our Kubernetes cluster with Cluster Autoscaler so that it scales up and down automatically as the workload changes. Beyond accommodating increases and decreases in the workload, Cluster Autoscaler would also help when a node goes down: the cluster would figure out that it no longer has enough capacity, and a replacement node would be created. On its own, Cluster Autoscaler would solve, fully or partly, the problems we would have encountered had we continued the previous experiment and kept deleting nodes.
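To make the idea of "not enough capacity" a bit more concrete, here is a minimal sketch of the kind of arithmetic involved. It is a toy model, not the actual Cluster Autoscaler algorithm (which reacts to Pods that cannot be scheduled). The function names, the CPU figures, and the node limits are all assumptions made up for illustration.

```python
import math


def desired_node_count(pod_cpu_requests, node_cpu, min_nodes, max_nodes):
    """Toy estimate of how many nodes are needed (hypothetical helper).

    pod_cpu_requests: CPU requests of all Pods, in millicores.
    node_cpu: allocatable CPU of a single node, in millicores.
    The result is clamped between min_nodes and max_nodes.
    """
    total_requested = sum(pod_cpu_requests)
    needed = math.ceil(total_requested / node_cpu)
    return max(min_nodes, min(max_nodes, needed))


def nodes_to_add(current_nodes, desired_nodes):
    """How many nodes must be created to reach the desired count."""
    return max(0, desired_nodes - current_nodes)


if __name__ == "__main__":
    # Assumed numbers: 30 Pods requesting 500m each, nodes with 2000m of CPU.
    pods = [500] * 30
    desired = desired_node_count(pods, node_cpu=2000, min_nodes=3, max_nodes=9)
    print(desired)                   # 8
    # If chaos removed a node and only 7 are left, one has to be added.
    print(nodes_to_add(7, desired))  # 1
```

The point of the sketch is only that deleting a node does not change how much capacity the workload needs, so an autoscaled cluster ends up adding a node to close the gap.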
Running a zonal cluster
The second issue is that we are running a zonal cluster. If you followed my Gists, your cluster is running in a single zone, which means that it is not fault-tolerant. If that zone (data center) goes down, the whole cluster goes down with it. So, the second change we should have made to our cluster is to make it regional. It should span multiple zones within the same region. It shouldn’t span different regions because that would increase latency unnecessarily. Every cloud provider, at least the big three, has a concept of a region, even though it might be named differently. By region, I mean a group of zones (data centers) that are close enough to each other to keep latency low, while they still operate as entirely separate entities. A failure of one should not affect the others. At least, that’s how it should be in theory.
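The difference between a zonal and a regional cluster can be illustrated with a small simulation. This is only a sketch under stated assumptions: the zone names are hypothetical, and the round-robin placement is a simplification of what a real scheduler does.

```python
def spread_replicas(replicas, zones):
    """Place replicas across zones in round-robin order (simplified)."""
    return [zones[i % len(zones)] for i in range(replicas)]


def surviving_replicas(placement, failed_zone):
    """Count the replicas that are not in the failed zone."""
    return sum(1 for zone in placement if zone != failed_zone)


if __name__ == "__main__":
    # Hypothetical zone names, not tied to any specific provider.
    zonal = spread_replicas(3, ["zone-a"])
    regional = spread_replicas(3, ["zone-a", "zone-b", "zone-c"])

    # Simulate the failure of a single zone (data center).
    print(surviving_replicas(zonal, "zone-a"))     # 0
    print(surviving_replicas(regional, "zone-a"))  # 2
```

With a single zone, one failure takes out every replica. With three zones, the same failure leaves two of the three replicas running, which is the kind of fault tolerance we are after.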