As load on a system grows, it may become necessary to re-architect the system to handle it. There are generally two ways to scale a system that needs more resources:
Vertical scaling: Replace the machine the system currently runs on with a more powerful, and usually more expensive, one. Consider a MySQL server that runs slowly on its current machine. If we move it to a machine with ten times the memory and computing power, the database server will be able to process a much larger number of queries.
Horizontal scaling: Distribute the load across several smaller machines. Vertical scaling has a ceiling, because a single machine can only become so powerful before it runs into physical limits, assuming you don’t hit your company’s budget limit first. Scaling horizontally brings its own complexity, though, especially for stateful services: a database distributed across several machines is not trivial to maintain. Stateless services, by contrast, are much easier to scale horizontally.
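To illustrate why stateless services scale out so easily, here is a minimal sketch of round-robin dispatch, one common way to spread requests across identical machines. The class name, the function name, and the backend names (app-1, app-2, app-3) are hypothetical and chosen only for illustration; a real deployment would normally put a dedicated load balancer in front of the service rather than application code like this.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def pick(self):
        # Because the service is stateless, any backend can serve any
        # request, so no session or data affinity has to be tracked here.
        return next(self._rotation)


def dispatch(balancer, request_count):
    """Simulate routing request_count requests and tally them per backend."""
    return Counter(balancer.pick() for _ in range(request_count))


# Hypothetical backend names, used only for this illustration.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
tally = dispatch(balancer, 9)
print(dict(tally))  # {'app-1': 3, 'app-2': 3, 'app-3': 3}
```

Adding capacity here amounts to adding another name to the list. A stateful service such as a database cannot be treated this way, because each request has to reach the machine that actually holds the relevant data.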
In practice, a hybrid approach is usually chosen: more powerful machines are used, and load is also spread horizontally across them. Note that there is no one-size-fits-all architecture for distributed applications operating at scale, because each application has its own load parameters. Access patterns differ, and so do the response time requirements set by the SLA.
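The trade-off behind the hybrid approach can be sketched with some back-of-the-envelope arithmetic. All figures below (peak load and per-machine throughput) are made-up assumptions for illustration, not measurements, and the function name is hypothetical.

```python
def machines_needed(peak_rps, per_machine_rps):
    """Smallest number of identical machines that can absorb peak_rps."""
    if peak_rps < 0 or per_machine_rps <= 0:
        raise ValueError("peak_rps must be >= 0 and per_machine_rps > 0")
    # Integer ceiling division: round up, since a fraction of a machine
    # cannot be provisioned.
    return -(-peak_rps // per_machine_rps)


# Assumed numbers: 10,000 requests/second at peak, a small machine
# handling 500 requests/second, and a large machine handling 4,000.
PEAK_RPS = 10_000
small_fleet = machines_needed(PEAK_RPS, 500)
large_fleet = machines_needed(PEAK_RPS, 4_000)
print(small_fleet, large_fleet)  # 20 3
```

Under these assumptions, even the larger machines still have to share the load among three of them, which is the hybrid case: scaling up shrinks the fleet but does not remove the need to scale out.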