Congestion Control
- TCP Congestion Control
- Congestion Avoidance
- TCP Vegas
- Virtual Clock
TCP Reno Congestion Control
- determine capacity of network
- use ACK as signal that a packet has left the network (self-clocking)
- maintain a nearly constant (within a factor of 2) number of
bytes in the network
- adjust to variations in available capacity
- Additive Increase/Multiplicative Decrease
- Slow Start
- Fast Retransmit, Fast Recovery
Congestion Control Mechanism
- TCP flow control uses window size sent by the receiver:
window = w[packet] - (lastSent - lastAcked)
- congestion control adds congestion window cw
to the flow control mechanism:
w = min(w[packet], cw)
window = w - (lastSent - lastAcked)
- when cw is small, only a few unacked bytes (packets) can be in the network,
  and TCP sends slowly.
- when cw is large (and if w[packet] is large), TCP can send quickly.
- adaptive (see the sketch after this list):
- increase cw when we get an ack
- reduce cw when we get a timeout
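A minimal sketch of the window computation above, assuming byte-based bookkeeping;
the names (w_packet, cw, last_sent, last_acked) are illustrative and not taken from
any real TCP implementation:

    def effective_window(w_packet, cw, last_sent, last_acked):
        """Bytes the sender may still put on the wire.

        w_packet: window advertised by the receiver (flow control)
        cw:       congestion window maintained by the sender
        """
        w = min(w_packet, cw)                 # the tighter of the two limits
        return max(w - (last_sent - last_acked), 0)

    # e.g. 64 KB advertised, 8 KB congestion window, 4 KB already unacked
    # -> the sender may put 4 KB more into the network
    assert effective_window(65536, 8192, last_sent=4096, last_acked=0) == 4096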
Additive Increase/Multiplicative Decrease
- increase cw when we get an ack:
- on each ack, cw = cw + MSS * ( MSS / cw)
- i.e. add 1 MSS whenever one cw worth of data has been acked
- additive increase: add one MSS every RTT
- reduce cw when we get a timeout:
- cw = max(cw / 2, MSS)
- dependent on accurate timeout
- multiplicative decrease: divide by two every packet drop
- additive increase/multiplicative decrease gives a sawtooth pattern
  (sketched in code after this list)
- required for stable congestion control
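A sketch of the AIMD rules above; MSS is an assumed segment size and the byte-based
arithmetic is simplified:

    MSS = 1460                      # assumed maximum segment size in bytes

    def on_ack(cw):
        # additive increase: each ack adds MSS*MSS/cw, so one full cw worth of
        # acked data (roughly one RTT) grows cw by about one MSS
        return cw + MSS * (MSS / cw)

    def on_timeout(cw):
        # multiplicative decrease: halve cw, but never go below one MSS
        return max(cw / 2, MSS)

    cw = 10 * MSS
    for _ in range(10):             # one window's worth of acks
        cw = on_ack(cw)
    print(cw / MSS)                 # close to 11: about one MSS gained per RTT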
Slow Start
- additive increase/multiplicative decrease works well when
operating near capacity, but too slow at beginning
- "slow" start is slower than original mechanism (send an
entire window at the beginning), but faster than additive increase
- use at connection start or when transmitting after a timeout
- use a new variable, threshold (see the sketch after this list)
- on timeout, threshold = cw / 2, cw = MSS
- on ack,
- if cw > threshold, i = ( MSS * MSS) / cw
- otherwise, i = MSS
- cw = min(cw + i, MAX)
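A sketch combining the slow-start and threshold rules above; MAX stands in for
whatever caps the window (e.g. the advertised window) and the names are illustrative:

    MSS = 1460
    MAX = 64 * 1024                         # assumed cap, e.g. the advertised window

    def on_timeout(state):
        # remember half the old window and restart from one segment
        state['threshold'] = state['cw'] / 2
        state['cw'] = MSS

    def on_ack(state):
        if state['cw'] > state['threshold']:
            i = (MSS * MSS) / state['cw']   # additive increase past the threshold
        else:
            i = MSS                         # slow start: cw doubles every RTT
        state['cw'] = min(state['cw'] + i, MAX)

    state = {'cw': MSS, 'threshold': 32 * 1024}
    for _ in range(20):                     # a few RTTs worth of acks
        on_ack(state)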
Slow Start: Properties
- multiplicative (and thus unstable) increase until a packet is lost
- up to half of the window at the time of the loss (w_0) may be lost here
- set threshold = w_0 / 2, cw = MSS
- w_0 / 2 is our estimate of the network capacity
- now use slow start up to threshold
- thereafter, use additive increase/multiplicative decrease to
adapt to network capacity
- alternative methods to estimate network capacity, e.g. packet-pair
Fast Retransmit, Fast Recovery
- Fast Retransmit
- TCP's coarse-grained timeout is inefficient: it waits too long (around 0.5 s)
  before retransmitting
- receiver gets out-of-order packets, sends ack for expected packet
- sender sees these as duplicate acks
- after 3 duplicate acks, the sender retransmits the first unacked packet
- Fast Recovery is used when retransmitting because of duplicate acks
  (see the sketch after this list):
- use acks still in the pipe for self-clocking
- slow start not needed
- set cw = cw / 2 (not cw = 1 MSS, as a timeout would)
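A sketch of the duplicate-ack logic above; retransmit() and the ack bookkeeping are
placeholders, not real socket calls:

    DUP_ACK_LIMIT = 3

    class RenoSender:
        def __init__(self, mss=1460, cw=10 * 1460):
            self.mss = mss
            self.cw = cw
            self.last_ack = 0
            self.dup_acks = 0

        def on_ack(self, ack_no):
            if ack_no == self.last_ack:
                # receiver is still missing the same segment
                self.dup_acks += 1
                if self.dup_acks == DUP_ACK_LIMIT:
                    self.retransmit(ack_no)                # fast retransmit
                    self.cw = max(self.cw / 2, self.mss)   # fast recovery: halve, no slow start
            else:
                self.last_ack = ack_no
                self.dup_acks = 0

        def retransmit(self, seq_no):
            # placeholder: resend the first unacked segment
            print(f"retransmitting segment starting at {seq_no}")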
Source-Based Congestion Detection
- TCP Reno detects congestion after it happens and packets
are dropped
- As congestion builds up in the network, we have
- increased RTT
- increased spacing between successive packets
- less throughput than expected for a given window size (TCP Vegas)
- less throughput increase for an increase in the window
- we could use any of these measures to detect congestion
before packets are dropped
Comparison of Expected and Actual Throughput
TCP Vegas:
- measure amount of data being buffered in the network
- too little data means we cannot measure increases in capacity
- too much data means we are creating congestion
TCP Vegas
- Base RTT: RTT_0 = min_i RTT_i
- with no congestion we expect throughput: T_e = cw / RTT_0
- measure actual throughput: T = bytes / RTT
- RTT_0 <= RTT and bytes <= cw, so
T <= T_e
- delta = T_e - T >= 0
- define thresholds Min = 1 and Max = 3 (packets per RTT)
- if delta < Min, use additive increase
- if Min <= delta <= Max, leave cw unchanged
- if Max < delta, use additive decrease
On timeout, the decrease is multiplicative (a sketch of the per-RTT adjustment follows).
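A sketch of the Vegas adjustment built from the definitions above; expressing the
Min/Max thresholds as one and three MSS per base RTT is an assumption about units:

    MSS = 1460

    def vegas_adjust(cw, bytes_acked, rtt, base_rtt):
        """One Vegas adjustment per RTT, following the rules above."""
        expected = cw / base_rtt            # T_e = cw / RTT_0
        actual = bytes_acked / rtt          # T = bytes / RTT
        delta = expected - actual           # >= 0 since RTT >= RTT_0 and bytes <= cw
        min_thresh = 1 * MSS / base_rtt     # Min: ~1 packet per RTT
        max_thresh = 3 * MSS / base_rtt     # Max: ~3 packets per RTT
        if delta < min_thresh:
            return cw + MSS                 # additive increase
        if delta > max_thresh:
            return cw - MSS                 # additive decrease
        return cw                           # between Min and Max: leave cw alone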
Congestion Control
- active and experimental area
- real evaluation is difficult
- high potential for catastrophe
- hard to convince people to use new algorithms
Virtual Clock
- Fair: each flow gets its allocated bandwidth
- Flows that exceed their allocation can use unallocated bandwidth
- Work conserving: the link will never be idle while there are
packets to send.
- algorithm (sketched in code below): for each flow i,
- Define average rate AR_i (B/s or pkt/s)
- Tick_i = 1/AR_i
- When we receive n bytes (pkts) for flow i,
  advance Clock_i += Tick_i * n and stamp them with the new Clock_i
- Queue is sorted by increasing timestamp
- If queue is full, discard packet with largest timestamp.
- See fig. 8.17 and 8.18
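One possible realization of the stamping and sorted queue described above, using a
Python heap as the sorted queue; the flow table and drop handling are illustrative:

    import heapq

    class VirtualClock:
        def __init__(self, capacity):
            self.capacity = capacity        # max packets queued
            self.queue = []                 # (timestamp, seq, packet) min-heap
            self.clock = {}                 # Clock_i per flow
            self.tick = {}                  # Tick_i = 1 / AR_i per flow
            self.seq = 0                    # tie-breaker for equal timestamps

        def add_flow(self, flow_id, avg_rate):
            self.clock[flow_id] = 0.0
            self.tick[flow_id] = 1.0 / avg_rate     # Tick_i = 1 / AR_i

        def enqueue(self, flow_id, packet, size=1):
            # advance the flow's virtual clock and stamp the packet with it
            self.clock[flow_id] += self.tick[flow_id] * size
            heapq.heappush(self.queue, (self.clock[flow_id], self.seq, packet))
            self.seq += 1
            if len(self.queue) > self.capacity:
                # queue full: discard the packet with the largest timestamp
                self.queue.remove(max(self.queue))
                heapq.heapify(self.queue)

        def dequeue(self):
            # send in order of increasing timestamp; work conserving
            return heapq.heappop(self.queue)[2] if self.queue else None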
Virtual Clock Flow Meter
Virtual clock has the problem that a source could stay idle and then send all its
packets at the end; since its Clock_i has not advanced, those packets would carry
small timestamps and take priority over all other packets.
- Define average interval AI_i (burst duration, in sec) for flow i
- Define average interval rate AIR_i = AR_i * AI_i
(bytes or packets)
- Every AIR_i data, compare Clock_i with Time:
- if Clock_i > Time, the source is sending faster than AR_i; send
  it a message to slow down.
- if Clock_i < Time, the source is sending slower than AR_i; set
  Clock_i = Time.
- Instead of stamping packets with Clock_i, stamp them with
  max(Clock_i, Time) (see the sketch below).
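A sketch of the flow-meter check and the modified stamping; notify_slow_down and the
wall-clock argument time_now are placeholders, not part of the original description:

    def meter_check(clock_i, time_now, notify_slow_down):
        """Run after every AIR_i worth of data on flow i (AIR_i = AR_i * AI_i)."""
        if clock_i > time_now:
            notify_slow_down()              # source is sending faster than AR_i
            return clock_i
        return time_now                     # slower than AR_i: pull Clock_i up to Time

    def stamp(clock_i, tick_i, time_now, size=1):
        # advance Clock_i as before, but stamp with max(Clock_i, Time) so an
        # idle flow cannot bank old (small) timestamps and jump the queue
        clock_i += tick_i * size
        return clock_i, max(clock_i, time_now)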