Scalability generally means that a system can support more users as you add resources. Scaling can be vertical, meaning adding more memory or CPUs to a machine, or horizontal, meaning adding more machines. The following are commandments that I have learned for building scalable systems such as high-traffic sites, travel sites and e-commerce sites:
1. Divide and conquer – Design a loosely coupled, shared-nothing architecture by breaking a large system into a set of smaller shared services. The services should be stateless so that they can be deployed on any number of machines. You can use the Theory of Constraints to find bottlenecks, because these limit how many servers you can usefully add and how many requests you can serve.
2. Use message-oriented middleware (for example, an ESB) to communicate with the services. This avoids the temporal dependencies that are inherent in RPC-style services, where the caller and the callee must both be available at the same time.
3. Resource management – Manage HTTP sessions and avoid creating them for static content, and close all resources, such as database connections, after use. Avoid expensive coordination protocols such as two-phase commit, which can be a scalability killer. Instead, use optimistic locking or compensating transactions.
4. Replicate data – For write-intensive systems, use a master-master scheme to replicate the database, and for read-intensive systems, use a master-slave configuration. Use queues for all database writes so that replication doesn’t block incoming requests, and make sure writes go through your cache system so that reads know about them. For user-facing requests, use a high-availability design, and for background processes, use a high-consistency design (CAP).
5. Partition data (sharding) – Use multiple databases to partition the data. Use a naming service to find which database holds a given piece of data. You can also denormalize data to make it easier to partition.
6. Avoid single points of failure – Identify and remove any single point of failure in hardware, software, network and power supply.
7. Bring processing closer to the data – Instead of transmitting large amounts of data over the network, bring the processing closer to the data. Maybe some kind of mobile agent technology can help here by automatically moving processing units.
8. Design for service failures and crashes – Write your services to be idempotent so that retries can be done safely. Let each service do its own logging and monitoring locally, but have a centralized way to gather and monitor the logs of all services. Keep an eye on performance. Invest in tools for monitoring and managing services.
9. Dynamic resources – Design service frameworks so that resources can be added or removed automatically and clients can discover them automatically. You will need some kind of smart load balancer here. Again, all of this should be connected to a centralized monitoring system.
10. Smart caching – Cache expensive operations and content as much as possible. Precompute your content to reduce load on the application server. Unfortunately, caching on the same machine as the application server reduces the memory available to it, so I suggest caching on separate servers (a caching farm). You can add a version to cache keys and set a timeout so that you don’t have to invalidate the cache explicitly. Also, use multi-level caching and store content on disk so that if a machine crashes it can restart and warm up its cache quickly.
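
To make the optimistic locking suggested in commandment 3 a little more concrete, here is a minimal Python sketch. The `Store` class, its methods and the `seat-12A` key are hypothetical names made up for illustration. The idea is that a writer does not hold a lock while it works; it remembers the version it read, and the write is rejected if someone else has changed the row in the meantime.

```python
class StaleVersionError(Exception):
    """Raised when a row changed after the caller read it."""


class Store:
    """Hypothetical in-memory store; each row carries a version counter."""

    def __init__(self):
        self._rows = {}  # row id -> (version, value)

    def read(self, row_id):
        # Rows that were never written are reported as version 0.
        return self._rows.get(row_id, (0, None))

    def write(self, row_id, expected_version, value):
        current_version, _ = self.read(row_id)
        if current_version != expected_version:
            # Another writer got in first; the caller should re-read and retry.
            raise StaleVersionError(row_id)
        self._rows[row_id] = (current_version + 1, value)


store = Store()
store.write("seat-12A", 0, "alice")
version, owner = store.read("seat-12A")

store.write("seat-12A", version, "bob")        # version still matches
try:
    store.write("seat-12A", version, "carol")  # same version is now stale
    conflict = False
except StaleVersionError:
    conflict = True
```

This sketch is single-threaded. In a real database the check and the write have to happen atomically, for example with an update that only applies when the stored version still matches the one that was read.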
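
The naming service from commandment 5 can be as simple as a function that maps a key to a database. The sketch below is one possible way to do it, assuming hash-based placement; the `ShardDirectory` class and the shard names are hypothetical.

```python
import hashlib


class ShardDirectory:
    """Hypothetical naming service that maps a key to one of several databases."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)

    def shard_for(self, key):
        # Use a stable hash so every process picks the same shard for a key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


directory = ShardDirectory(["users_db_0", "users_db_1", "users_db_2"])
shard = directory.shard_for("user:1001")
```

A plain modulo like this moves most keys when the number of shards changes, so a lookup table or consistent hashing may be a better fit if you expect to add databases often.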
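
Commandment 8 asks for idempotent services so that retries are safe. One common way to get there, sketched below under my own assumptions, is to have the client send a unique key with each request and have the service remember the result for that key. `PaymentService`, `charge` and `call_with_retries` are hypothetical names, and the lost response is simulated.

```python
import uuid


class PaymentService:
    """Hypothetical service that remembers the result for each request key."""

    def __init__(self):
        self._results = {}       # idempotency key -> result
        self.charges = 0         # how many times the work was really done
        self._fail_next = True   # simulate one lost response

    def charge(self, key, amount):
        if key in self._results:
            # A retry of a request we already handled: return the stored result.
            return self._results[key]
        self.charges += 1
        result = {"charged": amount}
        self._results[key] = result
        if self._fail_next:
            self._fail_next = False
            # The work was done, but the reply never reached the client.
            raise ConnectionError("response lost")
        return result


def call_with_retries(func, attempts=3):
    """Call func until it succeeds or the attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as err:
            last_error = err
    raise last_error


service = PaymentService()
key = str(uuid.uuid4())  # the client generates one key per logical request
result = call_with_retries(lambda: service.charge(key, 25))
```

The first call does the work and then fails, the retry finds the stored result, and the customer is charged only once.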
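
The versioned keys and timeouts from commandment 10 might look roughly like the following. This is only a sketch with hypothetical names (`VersionedCache`, `bump`), and it keeps everything in one process, where a real caching farm would sit on separate servers.

```python
import time


class VersionedCache:
    """Hypothetical cache whose keys include a version and whose entries expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}     # (name, version) -> (expires_at, value)
        self._versions = {}  # name -> current version

    def _key(self, name):
        return (name, self._versions.get(name, 0))

    def get(self, name):
        entry = self._store.get(self._key(name))
        if entry is None or entry[0] <= self.clock():
            return None  # missing, expired, or stored under an older version
        return entry[1]

    def put(self, name, value):
        self._store[self._key(name)] = (self.clock() + self.ttl, value)

    def bump(self, name):
        # Changing the version makes old entries unreachable without deleting them.
        self._versions[name] = self._versions.get(name, 0) + 1


cache = VersionedCache(ttl_seconds=60)
cache.put("home", "<html>v1</html>")
before = cache.get("home")
cache.bump("home")
after = cache.get("home")
```

After `bump`, readers simply miss and recompute. The old entry is never removed in this sketch; a real cache would evict it through its timeout or its eviction policy.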