Computer Networks Notes 6: Summary of Protocol Design


1. Basic Issues

1.1 Which problem are we trying to solve?

End-to-End communication

The most basic problem to be solved by a network is generally end-to-end communication of packets of data. To achieve such communication, solutions are typically needed on at least the physical and data-link layer, and on the network layer if multi-hop communication is necessary.

To transmit any data at all, it is necessary to have physical-layer communication: a way to send the bits, usually using voltages or currents on copper wires, light through fibers or air, or radio through air ("ether"), though many other means are possible, with RFC 1149 providing an extreme (and impractical) example.

Although some data communication can proceed in the absence of frames, most data communication occurs in terms of groups of bytes, variously called frames, cells, packets, datagrams, PDUs, etc. The data link layer typically addresses issues of framing as well as Medium Access Control (MAC) and possibly collision or congestion avoidance and recovery.

If the end-to-end transfer requires more than a single hop, then a network layer protocol is usually needed to provide data forwarding functions, including routing and, if supported, multicast and resource allocation. The network layer can also provide congestion control, a special form of resource allocation.

Security may be added at any of these levels. This includes hardware security (spread spectrum was originally invented to make it hard to detect a transmitting radio) and various forms of cryptography to provide encryption and authentication. In general, issues with security include concealing information from attackers (encryption), preventing attackers from delivering invalid data that cannot be distinguished from valid data (authentication), and managing all the secrets that are a typical part of any security system (key management and key distribution). Careful thinking about threats, defenses, and scenarios is a necessary (but often not sufficient) component of a good security design.

A frequently overlooked requirement for data communication is that it is almost always necessary to have two-way communication. If this is not supported at the physical layer, a two-way network can be built from one-way links by the network layer, although this would require modifying many current protocols. Existing routing protocols are particularly sensitive in requiring two-directional link-layer communication, but the Ethernet collision detection also requires it (as well as limits on the time between the transmission of a bit and its reception). In contrast, radio-based protocols and particularly Aloha make heavy use of alternating one-way communications.

Reliable End-to-End communication

There are two general strategies to achieve reliability: forward error correction, in which additional data is transmitted to try to correct for errors, and acknowledgement-based error correction, in which additional data is transmitted in the reverse direction to give feedback about the success of prior transmissions.

There are many techniques for forward error correction, from the very simple (every two packets send a third that is the XOR of the preceding two), to more complex, e.g. sending m check packets every n data packets, and computing the bits of each check packet based on some mathematical code. Note that both these examples require each packet to be the same size. This is easy to achieve by padding, but padding is inefficient if packet sizes vary a lot. Finally, forward error correction can be adaptive: the sender may choose to send more check information if the underlying network has been dropping more packets, a choice that requires feedback of some kind from the network or from the receiver.
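
The simple XOR scheme above can be sketched as follows. This is only an illustration under stated assumptions: the function names (xor_bytes, pad, encode_pair, recover) are invented here, packets are padded with zero bytes, and at most one packet of each group of three is assumed to be lost. A real protocol would also have to carry the original packet lengths so that padding can be stripped.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must be the same size (pad the shorter one)")
    return bytes(x ^ y for x, y in zip(a, b))


def pad(packet: bytes, size: int) -> bytes:
    """Pad a packet with zero bytes up to the given size (illustrative choice)."""
    return packet.ljust(size, b"\x00")


def encode_pair(p1: bytes, p2: bytes):
    """Return the two (padded) data packets plus one XOR check packet."""
    size = max(len(p1), len(p2))
    p1, p2 = pad(p1, size), pad(p2, size)
    return p1, p2, xor_bytes(p1, p2)


def recover(p1, p2, check):
    """Rebuild a single missing data packet (passed as None) from the other two."""
    if p1 is None and p2 is None:
        raise ValueError("cannot recover two lost packets from one check packet")
    if p1 is None:
        p1 = xor_bytes(p2, check)
    elif p2 is None:
        p2 = xor_bytes(p1, check)
    return p1, p2


if __name__ == "__main__":
    a, b, c = encode_pair(b"hello", b"hi")
    print(recover(None, b, c))  # first packet lost, rebuilt from b and c
```

The cost is visible in the sketch: one extra packet for every two data packets, and padding of the shorter packet to the size of the longer one.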

Acknowledgement-based error correction returns feedback to the sender from the receiver about packets that have and have not been correctly received. In general this is both a more adaptive technique than forward error correction, and also potentially a slower technique, since retransmission can take substantial time on some connections. In addition, it is potentially wasteful because a packet may need to be retransmitted if its ack is lost, while the packet was actually delivered. Cumulative acks do get around this issue, but do not allow selective acknowledgement and retransmission.
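
A small sketch may help show both sides of cumulative acks. The classes below (CumulativeReceiver, Sender) are hypothetical and model only sequence numbers, not timers or retransmission; the ack value is assumed to mean "the next sequence number the receiver expects".

```python
class CumulativeReceiver:
    """Tracks in-order delivery and answers every packet with a cumulative ack."""

    def __init__(self):
        self.next_expected = 0
        self.out_of_order = set()

    def receive(self, seq: int) -> int:
        if seq >= self.next_expected:
            self.out_of_order.add(seq)
        # Advance past every packet that is now contiguous.
        while self.next_expected in self.out_of_order:
            self.out_of_order.remove(self.next_expected)
            self.next_expected += 1
        return self.next_expected  # cumulative ack


class Sender:
    """Remembers which sequence numbers are still unacknowledged."""

    def __init__(self):
        self.unacked = set()

    def send(self, seq: int) -> None:
        self.unacked.add(seq)

    def on_ack(self, ack: int) -> None:
        # A cumulative ack confirms everything below it.
        self.unacked = {s for s in self.unacked if s >= ack}


if __name__ == "__main__":
    # Case 1: acks for packets 0 and 1 are lost, only the last ack arrives.
    rx, tx = CumulativeReceiver(), Sender()
    for seq in (0, 1, 2):
        tx.send(seq)
    acks = [rx.receive(seq) for seq in (0, 1, 2)]
    tx.on_ack(acks[-1])
    print(tx.unacked)  # nothing needs retransmission

    # Case 2: packet 1 is lost; packet 2 arrives but cannot be acknowledged.
    rx, tx = CumulativeReceiver(), Sender()
    for seq in (0, 1, 2):
        tx.send(seq)
    last_ack = [rx.receive(seq) for seq in (0, 2)][-1]
    tx.on_ack(last_ack)
    print(sorted(tx.unacked))  # sender still considers 1 and 2 outstanding
```

In the first case a single later ack makes up for the lost ones; in the second the sender cannot tell that packet 2 was delivered, which is the limitation that selective acknowledgement addresses.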

An often sought-after goal of network design is to provide a network that is sufficiently reliable that no end-to-end reliability is needed. While this is theoretically possible, we do not live in a theoretical world, and the practical implications are generally unfavorable to such a design. The paper "End-to-End Arguments in System Design" by Saltzer, Reed, and Clark lays out the reasoning carefully, and is a must-read on this subject.

Real-time and constant-bit-rate communication

Because acknowledgement-based reliable transmission, most notably as implemented in TCP where it is combined with congestion control, can impose almost arbitrary delays, and because forward error correction has generally not been sufficiently adaptive in the past, designers of real-time protocols have typically chosen to implement unreliable real-time communication. That is, any protocol that wishes to provide real-time communication must accept the possibility of packet loss. While this seems to be true in the current (2004) standards world and on the current Internet, there seems to be no fundamental reason why this should be so.

Real-time communication is generally concerned with either the absolute delay of packets, or the maximum delay variation among packets. The former is interesting for interactive communications, especially interactive voice (including VoIP) and video. The latter is necessary if the receiver is a playback device and must know how much to buffer to provide an uninterrupted playback.
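
The two quantities can be made concrete with a short calculation over a packet trace. This is a sketch with invented names and made-up timestamps (in milliseconds). It assumes sender and receiver clocks are comparable and that the whole trace is known in advance, which a real receiver would have to estimate instead.

```python
def playout_requirements(send_ms, arrival_ms):
    """Return (worst delay, delay variation, playout schedule) for a trace."""
    delays = [a - s for s, a in zip(send_ms, arrival_ms)]
    worst = max(delays)
    variation = worst - min(delays)
    # Playing every packet at send time + worst delay means none arrives late.
    playout_times = [s + worst for s in send_ms]
    return worst, variation, playout_times


if __name__ == "__main__":
    send = [0, 20, 40, 60]        # one packet every 20 ms (hypothetical)
    arrive = [50, 75, 85, 130]    # hypothetical arrival times
    worst, variation, schedule = playout_requirements(send, arrive)
    print(worst, variation, schedule)
```

The worst delay is what an interactive user perceives, while the variation is the amount of delay the playback buffer has to absorb.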

Connectionless and Connection-Oriented communication

Connection-oriented communication stores state at the end systems, and perhaps at the intermediate nodes, in order to make communication more efficient or to enable functions (such as reliability) that are harder without connections. Stored state must be managed, which usually requires additional communications. A special case of stored state is caching, in which the establishment of the state may be managed explicitly, but the removal of the state follows a timeout.

As a very general statement, connections that transfer lots of data are advantageous -- because the work to set them up is small compared to the amount of data transferred -- and connections that carry little data are not -- in the extreme, a single packet might transfer the desired information, without the overhead of setting up a connection. Along the same lines, it may be advantageous to set up state if the data rate for a connection is high, but this may not be worthwhile if the data rate is very low, though it must be said that with current prices and availability of memory, in the vast majority of cases state storage is essentially free.

These tradeoffs are very application-dependent: for example, on the Internet, the amount of data transferred per connection decreased markedly with the introduction of the World-Wide Web, in which a request often uses a connection to transfer just a few hundred bytes. Because application behavior can substantially affect the efficiency of the choice between connection-oriented and connectionless, knowing the intended applications makes it much easier to choose rationally.

The costs of setting up a connection include:

If the connection exchanges a lot of data, the first two costs are usually negligible compared to the cost of transferring the data. If the connection is long-lived, the third cost may increase to the point of becoming significant. Caching of connection state (i.e., removing connection state after a timeout) can reduce the second and third costs, and may reduce the first cost if there is no need for a confirmation before sending data, but may increase the first cost if the connection state must be refreshed frequently. Caching is pervasive in modern protocols, particularly as a way to add the benefits of state without requiring that end-systems implement the complexity of state management, while also providing a sort of automatic reliability, as in routing protocols and ARP caches.
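
State that is removed by timeout can be sketched in a few lines. The class below is a hypothetical illustration loosely modeled on an ARP-style cache; the name SoftStateCache, the 60-second lifetime, and the explicit "now" argument (used instead of a real clock so the example is deterministic) are all assumptions made for this sketch.

```python
class SoftStateCache:
    """Entries disappear on their own unless they are refreshed in time."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._entries = {}  # key -> (value, expiry time)

    def put(self, key, value, now: float) -> None:
        """Install or refresh an entry; refreshing restarts its timeout."""
        self._entries[key] = (value, now + self.lifetime)

    def get(self, key, now: float):
        """Return the cached value, or None if it is absent or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if now >= expires:
            del self._entries[key]  # no explicit teardown message is needed
            return None
        return value


if __name__ == "__main__":
    cache = SoftStateCache(lifetime=60)
    cache.put("10.0.0.7", "00:11:22:33:44:55", now=0)
    print(cache.get("10.0.0.7", now=30))   # still valid
    print(cache.get("10.0.0.7", now=61))   # expired, must be learned again
```

Stale entries (for example after a host changes its address) are corrected automatically once they expire, which is the "automatic reliability" mentioned above, at the price of periodic refresh traffic.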

1.2 What mechanisms do we use?

Caching is but one mechanism used throughout network design.

The most fundamental mechanism used within modern networks is Time-Division Multiplexing, or TDM, the idea that the same physical layer can be used to transmit data for several different senders as long as only one set of data is transmitted at a time. At the physical layer, this is known as Time-Division Multiple Access, or TDMA, usually in the form of reserved slots accessed on a cyclical basis. Unlike old-fashioned telephone networks, TDM maximizes usage of what is typically the most expensive component in the system, the wires, fibers, and ether, by allowing a number of computers to share a single physical layer. Note that TDM is in use even on point-to-point links between any two routers, since the packets carried are for many different senders.

TDM works well with both constant-rate and bursty traffic. For bursty traffic, however, packets that temporarily exceed the capacity of a link must be either discarded or, if possible, stored somewhere. The fundamental mechanism for doing this in computer networks is to use queues, either FIFO queues (most common) or, if the traffic can be differentiated into different priorities, priority queues. All real queues used in networks have a finite amount of space, and in addition, queues increase the delay of packets, so disproportionately increasing the size of queues is generally a bad design decision. Instead, protocols must be designed to gracefully adapt to queue overflows and the resulting packet losses, perhaps (as is done in TCP) interpreting such packet losses as a sign of congestion and therefore reducing sending rates.
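
A finite FIFO queue that discards arrivals when full (often called "drop-tail") might look like the sketch below. The class name, the capacity of 4 packets, and the burst of 6 packets are illustrative assumptions, not values taken from any particular router.

```python
from collections import deque


class DropTailQueue:
    """Finite FIFO queue: packets arriving at a full queue are discarded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        if len(self.packets) >= self.capacity:
            self.dropped += 1       # queue overflow: the packet is lost
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None


if __name__ == "__main__":
    queue = DropTailQueue(capacity=4)
    for packet in range(6):          # a burst larger than the queue
        queue.enqueue(packet)
    sent = [queue.dequeue() for _ in range(4)]
    print(sent, "dropped:", queue.dropped)
```

A larger capacity would have saved the two lost packets, but every queued packet also waits behind all the packets ahead of it, which is the delay cost described above.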

Queuing disciplines vary, and queuing theory has produced interesting theoretical results with wide practical applicability. As one particular example, if there are several servers, it is more efficient to have a single queue, with the entity (packet, person) at the head of the queue going to the next available server, than to have one queue per server. The intuitive reason for this is that if one server is very slow (perhaps because its transactions are complex), and the entities in its queue cannot move to other queues, they will have to wait a long time before being served, whereas with a single queue, the average wait is about the same for all and the worst-case wait is unlikely to be long. Interested readers are referred to the many books and web pages on the subject.
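
The intuition can be checked with a tiny deterministic example rather than with queuing theory. The sketch below makes strong simplifying assumptions: all jobs are present at time 0, service times are known, and in the per-server case jobs are assigned to queues in round-robin order and never switch queues. The function names and the service times are invented for illustration.

```python
import heapq


def waits_single_queue(service_times, n_servers):
    """One shared FIFO queue: the head job goes to the next free server."""
    free_at = [0] * n_servers          # time at which each server becomes free
    heapq.heapify(free_at)
    waits = []
    for service in service_times:
        start = heapq.heappop(free_at)  # earliest available server
        waits.append(start)             # jobs arrive at time 0, so wait == start
        heapq.heappush(free_at, start + service)
    return waits


def waits_per_server_queues(service_times, n_servers):
    """One queue per server, jobs assigned round-robin and never moved."""
    free_at = [0] * n_servers
    waits = []
    for i, service in enumerate(service_times):
        server = i % n_servers
        waits.append(free_at[server])
        free_at[server] += service
    return waits


if __name__ == "__main__":
    jobs = [10, 1, 1, 1, 1, 1]          # one slow transaction, five quick ones
    for name, fn in (("single queue", waits_single_queue),
                     ("queue per server", waits_per_server_queues)):
        waits = fn(jobs, 2)
        print(name, "total wait:", sum(waits), "worst wait:", max(waits))
```

With these numbers the jobs stuck behind the slow transaction wait 10 and 11 time units in the per-server arrangement, while with a single queue the second server absorbs all the quick jobs and nobody waits more than 4.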

An essential tool of good network design is the measurement of the system. This is usually done manually on a new system, but is often converted to an automatic measurement and response system as time goes on. Some examples include congestion measurement and control on the Internet, and adaptive protocols for multimedia transmission that have been and continue to be proposed.

A crucial principle of automatic measurement and response is that the response should not cause instability. For example, if every sender responds to congestion at time A by slowing down, the network at time B > A will have no congestion. If everyone at time C > B responds by sending even more data than was sent at time A (because the network is now uncongested), the network will clearly be congested again. This is an oscillation, and is an unstable situation in which reaction to the network state causes substantial problems. Stability issues are often resolved by limiting the rate of change of crucial parameters, for example for congestion, by limiting how fast the sending rate can increase. However, even slowing down rates of change may not always solve stability problems. Control theory and systems theory provide large bodies of theoretical knowledge related to these issues, and again the interested reader is encouraged to consult the many books and web pages on the subject.
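
The oscillation can be reproduced with a toy model. Everything in this sketch is an assumption chosen for illustration: ten identical senders that react in lock step, a link capacity of 100 units, halving on congestion, and two hypothetical increase policies (tripling the rate versus adding 0.5 per step). It is not a model of any real congestion control algorithm.

```python
def simulate(increase, steps=60, capacity=100.0, rate=10.0, n_senders=10):
    """Return the offered load at each step for synchronized senders."""
    history = []
    for _ in range(steps):
        load = rate * n_senders
        history.append(load)
        if load > capacity:
            rate = rate / 2          # everyone slows down on congestion
        else:
            rate = increase(rate)    # everyone speeds up when uncongested
    return history


def aggressive(rate):
    """Send much more than before as soon as congestion disappears."""
    return rate * 3


def limited(rate):
    """Limit how fast the sending rate can grow."""
    return rate + 0.5


if __name__ == "__main__":
    for policy in (aggressive, limited):
        loads = simulate(policy)
        print(policy.__name__, "peak load:", max(loads), "lowest load:", min(loads))
```

Both policies still oscillate, but the limited increase keeps the overshoot to a few percent of capacity, whereas the aggressive policy repeatedly offers two to three times what the link can carry.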

1.3 Policy vs. Mechanism

In general in computer science we say that the greatest benefit comes from the programmer providing a powerful mechanism, and allowing users to set the specific policy that applies to the mechanism. A particularly good example of this is the World-Wide Web, a mechanism for making information available to anyone on the Internet and for making it easy to link other "pages" of information. There is nothing in the World-Wide Web itself that restricts the kinds of information provided; this decision is made by the individual web sites. In addition, users are generally free to select which information they access (though some web sites try hard to take away this freedom). Likewise, the Internet protocols were designed to carry arbitrary data, the TCP and UDP port mechanisms allow for arbitrary new applications, the DNS system allows arbitrary names (though only using a restricted set of characters, which causes issues for non-English speakers but makes it easier to enter domain names on almost any keyboard), IP is designed to run on top of arbitrary lower protocols, Ethernet is designed to carry arbitrary higher-level protocols, and so on.

Providing a more general mechanism instead of a specific combination of mechanism and policy has a number of consequences. The first is that the provider of the mechanism actually gives power away to the user of the mechanism. For some commercial (and even political) entities, this may not seem very desirable. The second consequence is that users may apply this mechanism in ways not foreseen by the original designer of the mechanism. This often has the consequence that the mechanism may be very widely adopted and applied to solve many different specific problems. This is a desirable consequence for many users (though some would prefer that only the specific original problem be solved) and is often considered to bring many social benefits.

Many of these issues are particularly highlighted with open-source software, where the creator of the software releases to the user of the software the power to view and modify the program. In essence, the creation of open-source software is the creation of a very powerful mechanism, one that may be turned into other mechanisms as well, with very little in the way of policy restrictions on the use of this mechanism. Again, producers of software are not always as enthusiastic about open-source software as consumers are.

For a more complex example, imagine trying to design a QoS system for use on a network with limited bandwidth and a single queue to handle multiple connections. In this situation, a designer/implementer might try to design a QoS system with only a few classes of service, e.g. voice, video, and arbitrary data, or might try to design a QoS system to support arbitrary combinations of bandwidth, delay, and jitter. The latter is more powerful and can be applied to many unforeseen situations, but might be a lot harder to implement, and even when implemented, might be less efficient (in the sense of being unable to completely utilize the bandwidth of a link) than the former. So which is better? Clearly there is no single answer in this case.

2. Performance

2.1 Overhead

Every protocol adds some overhead in order to accomplish some objectives. Overhead generally identifies the "cost" of running a protocol, and is the penalty that must be paid, usually in time or utilization, for the data transfer to have the properties we wish it to have, e.g. end-to-end transmission for IP, reliable transfer for TCP, and so on. The overhead we are most commonly concerned with in networking can be described as header overhead or protocol overhead, though strictly speaking it includes all the signaling required to deliver the data and perform the protocol functions.

In addition, a typical protocol will require a certain amount of computational overhead (the most obvious example being the TCP checksum computation) on the sending and/or the receiving computer, and may have other forms of overhead. For example, TCP typically sends one ack every two full-size data segments received, rather than for every segment. This reduces the number of ack packets sent on the network, but may delay the transmission at the sender if the window is small enough or if for any other reason the sender is waiting for the ack packet before sending additional data.

Because overhead is the cost that we pay for the features we want from the network, it is common to want to minimize overhead. When minimizing overhead, there may be some drawbacks to each of the optimizations we consider. For example, if we wanted to reduce the size of the TCP header, we could eliminate the urgent pointer. This has the benefit of having to send 16 fewer bits with every packet, but has several drawbacks as well. In short, the expected benefits must be matched against the foreseeable costs, and it is often the case that it is not worthwhile trying to optimize a mature and widely deployed protocol. This is especially true as long as computer power and networking speeds keep increasing as quickly as they have in the past decades.

Nonetheless, it is an interesting exercise (which is left for the reader) to compute the numerical value of the overhead for specific situations. The reader is warned that a 10% overhead is often interpreted in one of two slightly different ways: either that the overhead is 10% of the total, or that the overhead is 10% of the payload (and thus about 9.09% of the total). This is more significant with larger overheads, so that a 50% overhead that is 1/2 the payload is very different from (and much better than) a 50% overhead that is 1/2 the total (and thus 100% of the payload). I am not aware of a standard way of describing overhead, though some textbooks prefer one or the other version.

One specific example will be worked out -- sending a single byte to a peer using TCP/IP over Ethernet. TCP and IP headers each require 20 bytes (for a total of 40), and the Ethernet header is 14 bytes, but we must also send a preamble (8 bytes), a CRC (4 bytes), and 5 bytes of padding (to bring the 41 bytes of headers and data up to the 46-byte minimum Ethernet payload). In addition, the Ethernet protocol requires an inter-frame gap of 96 bit times, equivalent to 12 bytes, so the total overhead is 83 bytes. By the two measures above, this is either a 98.8% overhead or an 8300% overhead, and either way it is rather inefficient. But, with the considerations above, and perhaps also considering what small percentage of our bottleneck links is taken up by one-byte TCP payloads on Ethernet, even this glaring inefficiency is probably not worth fixing.
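
The arithmetic of the worked example can be written as a small calculator, which also makes it easy to try other payload sizes. This is a sketch: the constant and function names are invented here, the header sizes assume no TCP or IP options, and the 46-byte minimum Ethernet payload is the standard value implied by the 5 bytes of padding in the example.

```python
TCP_HEADER = 20        # bytes, no options
IP_HEADER = 20         # bytes, no options
ETH_HEADER = 14
ETH_PREAMBLE = 8
ETH_CRC = 4
ETH_MIN_PAYLOAD = 46   # Ethernet pads shorter payloads up to this size
INTER_FRAME_GAP = 12   # 96 bit times expressed in bytes


def overhead_bytes(payload: int) -> int:
    """Bytes on the wire beyond the application payload, for one TCP segment."""
    ip_datagram = payload + TCP_HEADER + IP_HEADER
    padding = max(0, ETH_MIN_PAYLOAD - ip_datagram)
    return (TCP_HEADER + IP_HEADER + ETH_HEADER + ETH_PREAMBLE
            + ETH_CRC + padding + INTER_FRAME_GAP)


def overhead_fractions(payload: int):
    """Return overhead as a fraction of the total and as a fraction of the payload."""
    overhead = overhead_bytes(payload)
    return overhead / (overhead + payload), overhead / payload


if __name__ == "__main__":
    of_total, of_payload = overhead_fractions(1)
    print(overhead_bytes(1), f"{of_total:.1%}", f"{of_payload:.0%}")
```

For comparison, a 1460-byte payload (assumed here as a typical full-size segment on Ethernet) needs no padding, so the same function gives 78 bytes of overhead, roughly 5% of the total.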

2.2 Load Balancing

A fundamental way of improving performance of computer systems is to add parallelism. This only improves the performance of those computations that can be performed in parallel, but fortunately, most of the communications for which we are interested in improving performance involve lots of data. More specifically, no matter how interested we may be in improving the delay of sending a single bit, most of us believe that we will be unable to send information faster than the speed of light, and so we do not usually focus our attention on doing so. The remaining problems often provide abundant grounds for turning our attention to parallelism.

In computer networks we think of parallelism as load balancing -- distributing (balancing) traffic (load) across multiple parallel networks connecting the same source and destination. In IP, for example, different packets may follow different routes to a destination. As a result, they may be delivered to the destination out of order (since there is no guarantee that the different routes have identical delay). If the packets are alternated among the several routes in proportion to how much bandwidth is available on those routes, this will maximize the overall bandwidth available between this pair of source and destination. Multiple routes are desirable for reasons of reliability as well as performance -- when one route is temporarily unavailable, the other can carry all the traffic that both were carrying, though the overall bandwidth will not be as high. It is for these reasons that the entire IP protocol and IP routing are designed to support multiple equivalent routes to a destination when such routes are available.

Load balancing can be used not only between two end-systems, but also between two intermediate nodes or between two routers. It may be easier to add a new connection between two routers, and to configure the two routers to load balance the traffic between them, than to upgrade both routers (and the link between them) to a faster technology.

When multiple routes are available, it is usually tempting to try to direct more traffic along the route that currently has more bandwidth available, or for which queues are shorter. This is quite feasible if only local information is used -- for example, a router can easily figure out which one of its outgoing queues has less data to send. This is much harder to do, however, when trying to select a route based on information received from a different part of the network. In this case, the problem is that the information received is delayed, and this delay is often sufficient to make the information obsolete. This can lead to the instability mentioned above. This instability is always a possibility where the information is used to make predictions that may not come true. For example, if a load-balancing router that can put the data for a destination X into any one of three queues always puts the data into the emptiest queue, this may not work well if only one of the queues can be used to reach destination Y, and suddenly a lot of data to be sent to Y is received at the router. In that case, round-robin assignment of data to queues may work better, even though it will be suboptimal during those times when no data is received for Y.

As another example of load balancing, ATM requires that cells for a connection be delivered strictly in order. That means that a single ATM connection cannot be carried by multiple parallel links (unless these links were to provide a mechanism for maintaining cell order). Instead, ATM load balancing can be done on a per-connection basis, in that different connections between the same source and destination can be carried across parallel links.

The math of load balancing is usually pretty simple -- with n equal links, each with bandwidth B and delay d, the overall delay is usually still close to d, and the combined bandwidth is usually close to n * B. These are not exact formulas because additional delays may be caused by additional software overhead, because the delay variation is likely to increase with the additional links, and because the load may not be perfectly balanced among the different links, leading to slightly less overall bandwidth. For fast links, the software overhead is likely to dominate.
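
Distributing packets in proportion to link bandwidth, and the resulting combined bandwidth, can be illustrated with an idealized sketch. The assumptions are significant: equal-size packets, integer bandwidths in packets per time unit, no software overhead, and no delay differences between links. The function names and the greedy assignment rule are choices made for this example, not a description of how any particular router balances load.

```python
from fractions import Fraction


def assign_packets(n_packets, bandwidths):
    """Give each packet to the link that would finish it soonest.

    Returns the number of packets assigned to each link.
    """
    sent = [0] * len(bandwidths)
    for _ in range(n_packets):
        # Time at which each link would finish one more packet (exact arithmetic).
        link = min(range(len(bandwidths)),
                   key=lambda i: Fraction(sent[i] + 1, bandwidths[i]))
        sent[link] += 1
    return sent


def effective_bandwidth(n_packets, bandwidths):
    """Packets delivered per time unit once the slowest-finishing link is done."""
    sent = assign_packets(n_packets, bandwidths)
    finish = max(Fraction(s, b) for s, b in zip(sent, bandwidths))
    return Fraction(n_packets) / finish


if __name__ == "__main__":
    print(assign_packets(30, [10, 10, 10]))      # n equal links share equally
    print(effective_bandwidth(30, [10, 10, 10]))  # close to n * B in this ideal case
    print(assign_packets(40, [30, 10]))          # unequal links, proportional split
```

In this idealized setting the combined bandwidth is exactly the sum of the link bandwidths; the real-world effects listed above (software overhead, imperfect balance) are what push the achieved figure below n * B.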

3. Access and Security

TBD.

A lot to write here... encryption and authentication. Physical security ("air gaps", but also tempest). Attacks. Purposes of attacks: how valuable is your information? Pranks ("hacks"). Who can make use of it? More about power and control. Firewalls and firewall rules. Secure shell, IPsec.

4. Overall System Reliability and Performance

TBD. What do we care about? Uptime, performance, error rates (elaborate considerably). Different users have different requirements, may be willing to pay different amounts to get them. Since the benefits of the network (especially the economies of scale) depend on providing the same basic infrastructure to all, how do you achieve a sane medium?