Load distribution
In computing and telecommunications networks, load distribution refers to partitioning a workload M across N resources, each with capacity less than M. It is closely related to multihoming and fault tolerance, and those goals are often pursued together. Load distribution, however, may be done purely for performance and capacity reasons.
It is perfectly normal to find multiple load distribution mechanisms used in the same network service, for different protocols and for different applications. Some general terms apply to all forms.
Resource selection principles
Resource selection may be based on deterministic, statistical estimation, or measurement. The classic deterministic method is round-robin, where each successive unit of work — a packet in routing, a connection at the end-to-end level, a transaction at the application level — is assigned to the next sequential resource in a set of resources. When the last resource is selected for a unit of work, the next unit of work goes to the first resource in the set, and the cycle repeats.
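The round-robin cycle described above can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation; the class and attribute names are invented for the example.

```python
# Minimal round-robin selector: each call to select() returns the next
# resource in order, wrapping back to the first after the last.
class RoundRobin:
    def __init__(self, resources):
        self.resources = list(resources)
        self.next_index = 0  # state: which resource to use next

    def select(self):
        resource = self.resources[self.next_index]
        # advance, wrapping to 0 after the last resource
        self.next_index = (self.next_index + 1) % len(self.resources)
        return resource

rr = RoundRobin(["link0", "link1", "link2"])
picks = [rr.select() for _ in range(4)]
# cycles link0, link1, link2, then back to link0
```

Note the `next_index` attribute: round-robin is inherently stateful, which is the cost discussed later for per-packet use in routers.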
Resource assignment
Resource assignment is the method by which the work is actually sent to the resource. It can be indirect, in which the resource is selected by an association manager during an association phase. Associations certainly include connections, but also connectionless functions such as Domain Name System response, as well as path setup that does not commit resources, as in the Resource Reservation Protocol. Association managers tend to be considered part of the network rather than a server, although they may be a proxy for the server and not visible to the client.
A subset of indirect assignment is done by the application server rather than a network element or directory. Application protocols including the Hypertext Transfer Protocol (HTTP), Telnet, and the File Transfer Protocol can redirect the client to a different address or port than the server originally contacted.
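As a sketch of such application-level redirection, the contacted HTTP server can answer with a 302 response whose Location header names a different host, rotating clients across backends. The host names and helper function here are hypothetical, and a real server would use a full HTTP stack rather than hand-built response strings.

```python
# Sketch: the originally contacted server redirects each new request to
# one of several backend hosts (hypothetical names) in rotation.
import itertools

backends = itertools.cycle(["http://app1.example.com",
                            "http://app2.example.com"])

def redirect_response(path):
    target = next(backends)
    # minimal HTTP/1.1 302 response pointing the client elsewhere
    return ("HTTP/1.1 302 Found\r\n"
            f"Location: {target}{path}\r\n"
            "\r\n")

print(redirect_response("/index.html"))
```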
In direct resource assignment, the selection is made "on the fly": packet addresses are rewritten in transit.
Packet and frame level
At the most basic level, frame-level bridges and packet-level routers can only make decisions about sending traffic over N directly connected transmission link resources.
While early routing researchers expected that the routing system could be made dynamically load-aware, this has not worked well in practice: it adds considerable overhead without accurately reflecting current load. For example, the now obsolete Cisco Interior Gateway Routing Protocol (IGRP) had the capability of using the utilization ratio of a link as a factor in its link cost computation. A 100% utilized 10 Mbps link would be less attractive than a 2% utilized 64 Kbps link. In practice, however, adjustment based on link utilization led to oscillation: successive packets would avoid the more heavily utilized link until it became lightly used, at which point it again became the preferred outgoing link.
Link utilization also could select a locally lightly loaded link whose next hop was extremely congested. No global load tracking method was devised that struck a reasonable balance between the overhead of tracking and the constant resetting of routes.
What has been effective is to use routing to find paths, and then assign end-to-end paths at the entry point to the network.
Round-robin
In the first data networks, a simplifying assumption could be made that all links had the same speed, and round-robin assignment was used. Assume N=3. The first packet would go to link 0, the second to link 1, the third to link 2, and the fourth to link (last + 1) modulo N, or back to zero. This worked acceptably with slow links, where computing time and memory were much more available than bandwidth. As bandwidth increased, round robin became less attractive, because, for each destination in the forwarding information base, state had to be maintained: the last link used, which had to be updated for every unit of data sent.
A fundamental Internet design concept is that routers, and by extension bridges, are stateless with respect to individual forwarding decisions. In reality, they retain a certain amount of state in their forwarding tables, but these are in the control plane, not the forwarding plane, and are updated at a much slower rate than that of data forwarding.
Per-destination
The next approach was to recognize each new destination address, and, when it was first seen, associate it to the next available link, and always send traffic to that destination over that link. With a large number of destinations, such that workload per link averages to roughly the same amount, this can work. With a small number of destinations, as, for example, on a set of links purely internal to an enterprise, "pinhole congestion" may occur, where one link is assigned to a destination receiving far more traffic than any other.
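The per-destination approach can be sketched as follows. The table and link names are illustrative, not from any specific router; the point is that only the first packet to a destination consults the round-robin counter, and the binding then persists, which is exactly what produces pinhole congestion for a hot destination.

```python
# Per-destination assignment sketch: the first packet to a new destination
# binds it to the next link in rotation; all later packets to that
# destination reuse the same link.
links = ["link0", "link1", "link2"]
binding = {}      # destination address -> chosen link
next_link = 0     # round-robin counter, consulted only for new destinations

def link_for(destination):
    global next_link
    if destination not in binding:
        binding[destination] = links[next_link]
        next_link = (next_link + 1) % len(links)
    return binding[destination]

# a busy destination keeps its single link: the "pinhole"
assert link_for("10.0.0.1") == link_for("10.0.0.1")
```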
Flow-based
Pinhole congestion was much more easily avoided by hashing the combination of source and destination addresses (i.e., the flow) to a link. Where per-destination would bind all the traffic for the Web or DNS server to a single link, source-destination hashing spreads the load from different clients onto different links.
Some flow hashing techniques also consider payload information, such as the protocol type field in an IPv4 header.
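A flow hash of this kind can be sketched as below. The choice of hash function and of fields is illustrative (real implementations vary, and many also include port numbers); what matters is that the mapping is deterministic, so every packet of a flow takes the same link, while different flows to the same server can land on different links.

```python
# Flow-hash sketch: hash the (source, destination, protocol) tuple of a
# packet to pick one of n_links outgoing links.
import zlib

def flow_link(src, dst, proto, n_links):
    key = f"{src}|{dst}|{proto}".encode()
    # CRC-32 stands in for whatever hash a real device uses
    return zlib.crc32(key) % n_links

# the same flow always maps to the same link
a = flow_link("192.0.2.1", "198.51.100.7", 6, 3)
assert a == flow_link("192.0.2.1", "198.51.100.7", 6, 3)
```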
End-to-end
Several methods are used at association time, usually directing flows to particular paths or servers. When dealing with network resources, connection admission control may be used: the data-network analogue of the telephone network's "all circuits busy" signal, indicating that all resources are in use.
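Connection admission control reduces to refusing new associations once all N resources are committed. The class below is a minimal sketch of that idea under assumed names; real admission control would also weigh requested bandwidth or quality-of-service parameters, not just a count.

```python
# Admission-control sketch: accept associations until capacity is
# reached, then refuse ("all circuits busy") until one is released.
class AdmissionControl:
    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def admit(self):
        if self.in_use >= self.capacity:
            return False          # all resources busy: refuse
        self.in_use += 1
        return True

    def release(self):
        self.in_use -= 1

ac = AdmissionControl(capacity=2)
results = [ac.admit(), ac.admit(), ac.admit()]  # third is refused
```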