Horizontal scaling means splitting the load across multiple servers. It adds more machine instances without changing the specifications of the existing ones. By scaling out, you spread processing and traffic across multiple machines.
Horizontal scaling means adding more machines to the resource pool, rather than adding resources to a single machine, which is vertical scaling. Vertical scaling (scaling up) adds more power, such as CPU and RAM, to the existing infrastructure, whereas horizontal scaling (scaling out) adds more servers to your network.
Scaling horizontally gives you scalability and also reliability, because you have more redundancy. It is usually the preferred way to scale in distributed architectures.
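As a rough illustration of why redundancy helps reliability, the sketch below estimates the chance that at least one instance is up. It assumes instances fail independently of each other, which is a simplifying assumption not made in the text and often optimistic in practice; the function name and the example figures are hypothetical.

```python
def combined_availability(instance_availability: float, instances: int) -> float:
    """Probability that at least one of `instances` servers is up.

    Assumes every server has the same availability and that failures
    are independent (a simplifying assumption for illustration).
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("instance_availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # All servers are down at once with probability (1 - a) ** n,
    # so at least one is up with the complementary probability.
    return 1.0 - (1.0 - instance_availability) ** instances
```

Under this model, two instances that are each up 99% of the time give a combined figure of about 99.99%. Real systems rarely reach this, because failures tend to be correlated (shared network, shared deploys).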
When splitting the load across multiple servers, we need to consider whether the services hold state. If your services are stateless, scaling horizontally is easy and is the best practice. We can run several instances of the service behind a load balancer that splits the traffic among the servers.
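To make the load balancer idea concrete, here is a minimal sketch of one common strategy, round-robin, where each new request goes to the next server in a fixed rotation. The class name, method name, and server addresses are hypothetical and chosen only for illustration; a real load balancer would also handle health checks, timeouts, and connection handling.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend addresses in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        # cycle() repeats the list forever: a, b, c, a, b, c, ...
        self._rotation = cycle(self._backends)

    def next_backend(self):
        # Because the service is stateless, any backend can serve
        # any request, so a simple rotation is enough.
        return next(self._rotation)


# Hypothetical addresses, for illustration only.
balancer = RoundRobinBalancer(
    ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
)
assignments = [balancer.next_backend() for _ in range(6)]
```

This only works this simply because no server remembers anything about earlier requests. Adding capacity then amounts to adding another address to the list.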
But if we have state, as with database servers, then we have more to consider, such as the CAP theorem. We will discuss how to scale stateful services in upcoming articles. For now, let's focus on stateless services that scale horizontally.