ATP Implementation and Performance

(A technical report for MS final project)

By Zhenyu Yang

Adviser: Edoardo S. Biagioni

 

 

Overview

 

This paper describes a project to implement a new network protocol, ATP (ATM Transport Protocol), that accommodates ATM network traffic, and to evaluate its performance. The project ran from April to December 1999. All coding and testing were carried out at the Advanced Network Computing Lab (ANCL), University of Hawaii at Manoa.

 

Brief Introduction to ATM network

ATM (Asynchronous Transfer Mode) is a broadband network technology with the following key features:

        Asynchronous and high-speed

        Connection-oriented

        Highly reliable (low bit-error rate, in-order delivery)

        Fixed-size cell switching instead of variable-size packet routing

        Quality of Service (QoS) and Resource Reservation

The basic ATM network structure consists of a set of ATM switches interconnected by point-to-point ATM links and interfaces, as shown in Fig. 0.

 

 

Motivation for ATP

There are two reasons for designing and implementing a specialized transport-layer protocol (in terms of the seven-layer OSI model) for ATM networks. First, as shown in Fig. 1, there is a huge gap between the bandwidth that network hardware has achieved and the throughput a user actually obtains on the desktop. Realizing a high-speed network from the user's point of view depends on improvement in all four layers. Unless the higher layers also become faster, improving the lower layers alone will not yield significantly higher performance at the application layer.

Second, the currently dominant TCP/IP stack incurs some unnecessary overhead when run over ATM. Hu [6] lists the costs of running the TCP/IP stack, and Goyal [4] proposes some ways to improve TCP performance over ATM-UBR. ATP is designed to carry the same traffic and provide the same function as TCP/IP with a much simpler mechanism. Specifically, it is intended to reduce or eliminate the following TCP/IP costs; the feature of the ATM or ATP layer that makes each cost avoidable is given in parentheses:

        Checksum (low bit-error-rate and CRC in AAL)

        20+byte headers (4 byte ATP header)

        Data copy and context switch (Multiplexing and de-multiplexing the cells of different virtual connections identified by VPI and VCI values on ATM Layer)

        Complex congestion control (simple congestion avoidance mechanism in ATP and flow control mechanism at UNI on ATM Layer)

        Slow start (resource/bandwidth reservation in ATM, quick start-up in ATP)

 

Brief Introduction to ATP

ATP is a specialized protocol for carrying data traffic over high-speed ATM networks. It consists mainly of the following components:

        Simple sending and receiving mechanism

        Simple retransmission mechanism: once a receiver discovers a gap in the sequence numbers of the packets it has received, it sends a NAK, which triggers retransmission by the sender.

        Quick start-up algorithm: ATP employs a sliding window protocol and lets the sending window size jump to a near-optimum level very quickly.

        Congestion control using Raj Jain's CARD (Congestion Avoidance using Round-trip Delay) approach: the mechanism involves only the minimal overhead of recording the round-trip time (RTT) of each packet and a small computation of the sending window size based on the RTT.

        Simple packet header processing: a short, fixed-size (4-byte) header consisting of only three fields (sequence number, last-packet bit and ACK/NAK bit) makes the packet header easy to process.

For a more detailed description of the specification and design of ATP, see Hu [6].
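To illustrate why a fixed 4-byte header with three fields is cheap to process, the following sketch packs the fields into a single 32-bit word. The bit layout (one last-packet bit, one ACK/NAK bit, a 30-bit sequence number, big-endian on the wire) is an assumption made for this example only, and the names atp_header, atp_pack_header and atp_unpack_header are hypothetical; the actual layout is the one specified in Hu [6].

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 * bit 31 = last-packet flag, bit 30 = ACK/NAK flag, bits 29..0 = sequence number. */
#define ATP_LAST_BIT 0x80000000u
#define ATP_NAK_BIT  0x40000000u
#define ATP_SEQ_MASK 0x3FFFFFFFu

struct atp_header {
    uint32_t seq;  /* sequence number (only the low 30 bits are used) */
    int      last; /* non-zero if this is the last packet of a message */
    int      nak;  /* non-zero for NAK, zero for ACK */
};

/* Pack the three fields into 4 bytes in big-endian (network) byte order. */
static void atp_pack_header(const struct atp_header *h, uint8_t out[4])
{
    uint32_t w = h->seq & ATP_SEQ_MASK;
    if (h->last) w |= ATP_LAST_BIT;
    if (h->nak)  w |= ATP_NAK_BIT;
    out[0] = (uint8_t)(w >> 24);
    out[1] = (uint8_t)(w >> 16);
    out[2] = (uint8_t)(w >> 8);
    out[3] = (uint8_t)w;
}

/* Recover the three fields from 4 header bytes. */
static struct atp_header atp_unpack_header(const uint8_t in[4])
{
    uint32_t w = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    struct atp_header h;
    h.seq  = w & ATP_SEQ_MASK;
    h.last = (w & ATP_LAST_BIT) != 0;
    h.nak  = (w & ATP_NAK_BIT) != 0;
    return h;
}
```

With a layout of this kind, parsing a header takes a handful of shifts and masks and no variable-length option processing, which is the property the component list above relies on.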

 

Implementation

 

The full implementation of ATP is written in C on a Linux platform (Red Hat 6.0) installed with the ATM on Linux micro-kernel (version 0.59), which supports raw ATM connections (PVC and SVC), IP over ATM, LAN emulation, etc. The functions in the full implementation form the initial draft of the ATP API, which gives an application program an easy-to-use interface, similar to that of TCP/IP, for accessing high-speed ATM networking. The utility functions are hidden from the application program but are used by the other functions in the ATP API.

 

ATP API

 

The API consists of three main sections: General, Active Side and Passive Side. Hosts on both the active and passive sides can use the functions in the General section. A host that initiates the connection and later sends packets calls the functions in the Active Side section. A host that listens for an incoming connection and receives packets once a connection is established calls the functions in the Passive Side section. Note that a host on the active side takes on the passive-side role when it starts to receive packets.

General

        InitAtpSocket (), initializes all components of an ATPSocket, including its three semaphores, the MTU (Maximum Transfer Unit) size based on the MTU of the underlying AAL5 layer, the previous and current window sizes, and the sending and receiving list lengths. It also sets the QoS (Quality of Service) based on the parameter specified by the application layer.

        Close (), terminates the three threads (ATPSendThread, ATPRecvThread and ATPSendTimerThread), destroys the three semaphores and closes the socket.

        Send (), fragments and packs the message passed from the application layer into one sendItem, puts it on the sending list for ATPSendThread to handle, and returns immediately.

        Recv (), removes the first receiveFragments from the receiving list, extracts and reassembles the whole message and delivers it to the application layer; if there is nothing on the receiving list, it blocks the calling process until a message arrives.

 

Active Side

        Connect (), actively makes a connection to the passive side, calculates the initial RTT (Round Trip Time) from the time taken to set up the connection, and activates the three threads.

 

Passive Side

        AtpBind (), binds an initialized ATPSocket to an ATM socket SVC/PVC address.

        Listen (), listens for an incoming connection from the active side.

        Accept (), after detecting an incoming connection, creates a new ATPSocket and creates and initializes its three threads.

 

The following program is a simple example that demonstrates the use of ATP. In this example, the sender sends one byte to the receiver.

Declaration:

struct sockaddr_atmsvc  satm;
ATPSocket               sock;
struct atm_qos          qos;
int                     atm_interface_number;

(initialize the above four variables)

On the sender side:

initAtpSocket(&sock, &satm, &qos, atm_interface_number);

char* send_buf = (char*)calloc(100, sizeof(char));
if (send_buf == NULL)
        perror("calloc");

pattern(send_buf, 1);

if (Send(&sock, send_buf, 1) < 0)
        perror("send");

Close(&sock);
free(send_buf);

On the receiver side:

initAtpSocket(&sock, &satm, &qos, atm_interface_number);

if (AtpBind(&sock, &satm) != 0)
        perror("AtpBind");

if (Listen(&sock, 5) < 0)
        perror("listen");

ATPSocket* newSock = Accept(&sock);

char* recv_buf = (char*)calloc(100, sizeof(char));
if (recv_buf == NULL)
        perror("calloc");

if (Recv(newSock, recv_buf, 100) < 0)
        perror("receive");

Close(newSock);
Close(&sock);
free(recv_buf);

 

The utility functions are lower-level functions called from within the ATP API functions to handle tasks such as multithreading and semaphore management and to implement the fragmentation, re-assembly, re-transmission and congestion control functionality of ATP.

 

Multiple Threading

Create, activate and terminate the three threads: ATPSendThread, ATPRecvThread and ATPSendTimerThread.

 

Semaphore

Create, activate and destroy the three semaphores (send_turn, recv_turn and close_turn) that help coordinate the threads when sending packets, receiving packets and closing the socket.

Fragmentation and Re-assembly

On the active (sending) side, fragment the message passed down from the application layer into a series of packets, attach the appropriate headers and construct a sendFragments object. On the passive side, strip the headers from the received packets and re-assemble them into a receiveFragments object.
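The arithmetic behind fragmentation can be sketched as follows. This is a minimal illustration, not the code of the implementation: it assumes that the MTU includes the 4-byte ATP header, so each packet carries MTU minus 4 bytes of payload, and the function names are hypothetical.

```c
#include <stddef.h>

#define ATP_HEADER_SIZE 4 /* 4-byte ATP header, as stated in this report */

/* Payload bytes per packet, assuming the MTU includes the header. */
static size_t atp_payload_size(size_t mtu)
{
    return mtu > ATP_HEADER_SIZE ? mtu - ATP_HEADER_SIZE : 0;
}

/* Number of packets needed for msg_len bytes; 0 if the MTU is too small. */
static size_t atp_fragment_count(size_t msg_len, size_t mtu)
{
    size_t payload = atp_payload_size(mtu);
    if (payload == 0)
        return 0;
    if (msg_len == 0)
        return 1; /* assumption: an empty message still uses one packet */
    return (msg_len + payload - 1) / payload; /* ceiling division */
}

/* Describe fragment idx: its offset into the message, its length, and
 * whether it is the last fragment. Returns 0 on success, -1 if idx is
 * out of range. */
static int atp_fragment_at(size_t msg_len, size_t mtu, size_t idx,
                           size_t *offset, size_t *len, int *is_last)
{
    size_t payload = atp_payload_size(mtu);
    size_t count = atp_fragment_count(msg_len, mtu);
    if (idx >= count)
        return -1;
    *offset = idx * payload;
    *len = (msg_len - *offset < payload) ? msg_len - *offset : payload;
    *is_last = (idx + 1 == count);
    return 0;
}
```

Under this assumption a 2,456,789-byte message with MTU = 10 needs 409,465 packets of 6 payload bytes each. The receiver can place each payload at its offset and deliver the message once the fragment marked as last, and every fragment before it, has arrived.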

 

Re-transmission and Congestion Control

Set, increase and decrease the window size on the active (sending) side according to the formula for the sending window size given by Raj Jain [3]. Retransmission is triggered by two events: a timeout for a packet, or a NAK sent by the passive (receiving) side.
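A delay-based window update of the kind described in Jain [3] can be sketched as below. The decision rule used here (decrease when the window change and the round-trip delay change have the same sign, otherwise increase) is a common summary of that paper and is an assumption about how it maps onto ATP; the step sizes of one packet up and 1/8 of the window down are the ones this report gives for ATP. The function name and the lower bound of one packet are hypothetical.

```c
#define CARD_MIN_WINDOW 1.0 /* assumption: never shrink below one packet */

/* One window update.
 *   w_old, d_old: previous window size (packets) and its round-trip delay
 *   w, d:         current window size and its round-trip delay
 * Returns the next window size in packets. */
static double card_next_window(double w_old, double d_old, double w, double d)
{
    double next;
    if ((w - w_old) * (d - d_old) > 0.0)
        next = w - w / 8.0; /* delay moved with the window: decrease by 1/8 */
    else
        next = w + 1.0;     /* otherwise: increase by one packet */
    return next < CARD_MIN_WINDOW ? CARD_MIN_WINDOW : next;
}
```

For example, if growing the window from 10 to 11 packets raised the RTT from 100 ms to 120 ms, the next window would be 9.625 packets; if the RTT stayed at 100 ms, it would be 12 packets. Only the previous and current window sizes and delays need to be kept, which is why the per-packet overhead is small.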

Main Feature

 

Multiple Thread

There are three threads in ATP. The ATP layer needs to perform the following tasks simultaneously:

        Receiving message from and passing message to the application layer.

        Receiving packets (either data or ACK/NAK) from the other side.

        Sending packets (either data or ACK/NAK) to the other side.

        Keeping timer for each packet to implement retransmission and self-regulatory congestion control.

 

When sending data, the application layer passes the message to the ATP layer, which creates a sendItem object, puts it on the sendList (a FIFO queue) and returns immediately. ATPSendThread continually checks the sendList; if it is not empty, the thread removes the element at the front and sends it. ATPRecvThread, meanwhile, waits for incoming ACK/NAK packets. ATPSendTimerThread keeps a timer for each outstanding packet; when a timeout occurs, the corresponding packet is retransmitted automatically. When an ACK is received, ATPRecvThread also computes the round-trip time, and the sending window size is adjusted to implement the congestion control mechanism.

When receiving data, ATPRecvThread waits for incoming data packets, packs them into receiveFragments and puts them on the recvList (a FIFO queue). The application layer removes the front element from the recvList once it finds that the length of the recvList is non-zero.

 

Semaphore

Semaphores are used for two reasons. First, multiple threads access the same data and functions, so mutual exclusion is needed to protect the critical sections. Second, the threads relate to one another as producers and consumers, and semaphores serve as the signaling mechanism that coordinates their work.

 

        Send_turn, blocks ATPSendThread if there is nothing on the sendList; once the application layer puts something on the sendList, it notifies ATPSendThread via send_turn.

        Recv_turn, blocks the application layer if there is nothing on the recvList; once ATPRecvThread has a complete message, it notifies the application layer to retrieve the message.

        Close_turn, blocks the 'Close' function while the 'Send' function is adding items to the sendList or while there are still outstanding packets that have not been ACKed. Once the 'Send' function finishes adding items and ATPRecvThread has received all expected ACKs, closing the socket can proceed.
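The producer and consumer relationship between the application layer and ATPSendThread can be sketched with POSIX semaphores as follows. This is an illustration under stated assumptions rather than the code of the implementation: the queue holds plain integers in place of sendItem objects, the queue is bounded, and the lock and free_slots semaphores as well as all function names are hypothetical. Only send_turn corresponds to a semaphore named in this report. Error handling for interrupted sem_wait calls is omitted.

```c
#include <semaphore.h>

#define SEND_LIST_CAP 64

struct send_list {
    int   items[SEND_LIST_CAP]; /* stand-in for sendItem objects */
    int   head, tail;
    sem_t lock;       /* binary semaphore guarding the critical section */
    sem_t send_turn;  /* counts queued items; the consumer blocks at zero */
    sem_t free_slots; /* counts free slots; the producer blocks at zero */
};

static int send_list_init(struct send_list *q)
{
    q->head = q->tail = 0;
    if (sem_init(&q->lock, 0, 1) != 0) return -1;
    if (sem_init(&q->send_turn, 0, 0) != 0) return -1;
    if (sem_init(&q->free_slots, 0, SEND_LIST_CAP) != 0) return -1;
    return 0;
}

/* Producer side: what a Send-like call would do before returning. */
static void send_list_put(struct send_list *q, int item)
{
    sem_wait(&q->free_slots);
    sem_wait(&q->lock);
    q->items[q->tail] = item;
    q->tail = (q->tail + 1) % SEND_LIST_CAP;
    sem_post(&q->lock);
    sem_post(&q->send_turn); /* wake the sending thread */
}

/* Consumer side: the sending thread blocks here until an item is queued. */
static int send_list_take(struct send_list *q)
{
    int item;
    sem_wait(&q->send_turn);
    sem_wait(&q->lock);
    item = q->items[q->head];
    q->head = (q->head + 1) % SEND_LIST_CAP;
    sem_post(&q->lock);
    sem_post(&q->free_slots);
    return item;
}
```

The same pattern, with the roles reversed, would apply to recv_turn: ATPRecvThread posts the semaphore when a complete message is on the recvList, and the application layer waits on it inside the receive call.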

 

One Data Copy

The implementation uses one data copy each for sending and for receiving a message. On the sending side, when the application layer sends a message, it passes the message down to the ATP layer, which copies it to the sendList and then proceeds with fragmentation and sending. On the receiving side, when the ATP layer receives a series of packets, it reassembles them and copies them to the recvList. Clark [7] states that data copying is the major overhead of TCP/IP implementations; the ATP implementation tries to minimize that cost.

Testing

 

Test bed set-up

 

The test bed is set up at the Advanced Network Computing Lab, University of Hawaii at Manoa. In order to test and compare the performance of ATP over ATM, TCP/IP over ATM and native ATM, the benchmark program 'atptest' was developed for ATP, and the public program 'ttcp', with an extension to support ATM, was downloaded for testing TCP/IP and native ATM.

The hardware setup consists of the following components:

        ATM backbone switch: Forerunner ASX-200BX (switching fabric: 2.5 Gbps, 2 to 32 ports)

        Workstations: one Intel Pentium II MMX 266 MHz PC and one Intel Pentium Pro 200 MHz PC

        NIC: Forerunner LE155 PCI ATM adapter

        Fiber link: OC-3 (155 Mbps) bandwidth


Software setup consists of the following components:

        Linux operating system (including TCP/IP stack)

        ATM on Linux micro-kernel, which supplies the Linux ATM device driver for interacting with the ATM hardware as well as the Linux ATM API for developing higher-layer protocols.

        TCP/IP stack in kernel space and ATP in user space

        Benchmark software: both 'ttcp' and 'atptest' are in user space.

Fig. 3 illustrates the test bed setup in terms of functional layers. Three types of tests are performed: atptest over ATP over ATM, ttcp over TCP/IP over ATM and ttcp over ATM. Note that the hardware is not included in the diagram.

Metrics

 

Raj Jain [1] gives a partial list of performance metrics for ATM: throughput, frame latency, throughput fairness, frame loss ratio, maximum frame burst size, and call establishment. This project focuses on measuring throughput and latency. Before the tests are described, some parameters of TCP/IP and ATP are given here. Older TCP implementations use a fixed timeout of 500 milliseconds for retransmission, but newer implementations, such as the TCP/IP stack in Linux, use the Jacobson/Karels algorithm to calculate the timeout dynamically. The formula is as follows:

Timeout = a*EstimatedRTT + b*Deviation

where a is typically set to 1 and b to 4, based on empirical results (Peterson and Davie [9] give more details). The ATP retransmission timer is fixed at 500 milliseconds. TCP uses the slow start mechanism; its starting window size in this version of Linux was not determined. ATP uses Raj Jain's algorithm to adjust the window size (see Raj Jain [3]). Basically, it is an additive-increase, multiplicative-decrease mechanism: the window is increased by 1 packet or decreased by 1/8 of the old window size. The initial window size is set to 10 packets and the initial round-trip time (RTT) to 100 milliseconds.
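For comparison with ATP's fixed timer, the dynamic timeout formula above can be sketched as follows, using the update rules described in Peterson and Davie [9]. The smoothing gain delta is a parameter between 0 and 1; the value 1/8 used in the example is a typical choice and an assumption here, and the structure and function names are hypothetical.

```c
struct rtt_estimator {
    double estimated_rtt; /* smoothed round-trip time, in seconds */
    double deviation;     /* smoothed mean deviation, in seconds */
};

/* Fold one RTT sample into the estimator and return the new timeout,
 * Timeout = a*EstimatedRTT + b*Deviation with a = 1 and b = 4. */
static double rtt_update(struct rtt_estimator *e, double sample_rtt,
                         double delta)
{
    double diff = sample_rtt - e->estimated_rtt;
    double abs_diff = diff < 0.0 ? -diff : diff;

    e->estimated_rtt += delta * diff;
    e->deviation += delta * (abs_diff - e->deviation);
    return 1.0 * e->estimated_rtt + 4.0 * e->deviation;
}
```

Starting from an estimated RTT of 100 ms with zero deviation, a 200 ms sample with delta = 1/8 gives an estimated RTT of 112.5 ms, a deviation of 12.5 ms and a timeout of 162.5 ms. On a local ATM link with millisecond-scale round trips, such a timeout would be far shorter than a fixed 500 milliseconds.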

 

Before testing, it was expected, on the basis of theory and protocol design, that the throughput of ATP would fall between that of native ATM and that of TCP/IP.

Throughput test procedure and results


Throughput tests were performed under two scenarios. In the first, the MTU (Maximum Transfer Unit) is set to 10 bytes, the sending buffer in 'atptest' to 100 bytes and the receiving buffer in 'atptest' to 300000 bytes. The sender, 'atptest' on one PC, sends one message of a given size to the receiver, 'atptest' on the other PC. Once the sender has sent all packets and received all the corresponding ACKs (acknowledgements) from the receiver, it closes the connection. A timer on each of the sender and the receiver records the time elapsed in sending or receiving the packets. The throughput is obtained by dividing the number of bytes by the time recorded on the receiver side. Because 'ttcp' with the ATM extension does not allow the MTU to be specified, there is no ttcp test for MTU = 10. Table 1 gives the results:

 

Table 1 Throughput Test for MTU = 10 bytes

Number of bytes    Time (seconds)    Throughput (Mb/s)
912,345            130.430048        0.055942
1,234,567          143.916097        0.068627
1,567,892          224.873770        0.055779
1,876,543          266.996203        0.056277
2,000,001          288.177193        0.069401
2,299,999          329.495739        0.055843
2,456,789          352.591978        0.055742


The first scenario is more a correctness test than a performance test. Two results are worth noting. First, the test transmits almost 2.5 MB of data, which is equivalent to about 400,000 packets for MTU = 10 bytes (excluding the 4-byte header). That would amount to more than 350MB of data if the normal MTU size (9180 in our case) were used. Second, the throughput is mostly stable across different amounts of data.
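The throughput calculation described for these tests (bytes received, converted to bits and divided by the elapsed time on the receiver) can be written as the short helper below. It assumes that Mb/s means 10^6 bits per second; the function name is hypothetical.

```c
/* Throughput in Mb/s (10^6 bits per second) for a transfer of `bytes`
 * bytes that took `seconds` seconds; returns 0 for a non-positive time. */
static double throughput_mbps(unsigned long long bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return (double)bytes * 8.0 / seconds / 1e6;
}
```

For the 1,234,567-byte row of Table 1, throughput_mbps(1234567, 143.916097) gives approximately 0.068627 Mb/s.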

 

In the second scenario, the MTU is set to its normal value of 9180 bytes, the sending buffer in 'atptest' to 100000 bytes and the receiving buffer in 'atptest' to 300000 bytes. The three cases are tested sequentially under the same conditions. Because the ATP test results vary, both the low and the high end of the range are listed in Table 2.

 

Table 2 Throughput Test for MTU = 9180 bytes

                   atptest over ATP    atptest over ATP    ttcp over TCP/IP    ttcp over ATM
                   (low end)           (high end)
Number of bytes    time     t-put      time     t-put      time     t-put      time     t-put
1,048,576          0.11     73.05      0.11     76.103     0.087    95.66      0.062    135.08
2,097,152          0.17     101.19     0.16     106.89     0.169    98.70      0.124    135.21
4,194,304          0.32     105.60     0.30     109.19     0.334    100.22     0.248    135.28
8,388,608          0.64     103.48     0.63     105.77     0.664    101.01     0.495    135.32
16,777,216         1.29     103.82     1.24     107.58     1.322    101.50     0.991    135.34
33,554,432         2.53     105.96     2.49     107.52     2.639    101.68     1.983    135.35
67,108,864         5.67     94.57      5.55     96.56      5.275    101.77     3.966    135.36
83,886,080         8.25     81.31      7.66     87.51      6.594    101.75     4.957    135.36

(Time is in seconds and throughput is in Mb/s.)

Fig. 4 plots the above results as three curves.

 

Observation from Throughput Test

        Throughput of TCP/IP and native ATM is more consistent and stable than that of ATP

        ATP outperforms TCP/IP in some cases

        Throughput of ATP is more consistent for MTU = 10 than for the normal MTU (9180)

        Variation in ATP throughput is exacerbated when more packets are retransmitted

 

Latency test procedure and results

 

The latency test is carried out as follows. The sender sends a certain amount of data to the receiver, which sends the same data back to the sender. Note that the roles of sender and receiver are switched after the receiver gets the data and acknowledges it. A timer on the original sender records the total round-trip time for sending and receiving the same amount of data; this time is reported as the latency. The latency test is performed only on ATP. Table 3 summarizes the results.

 

Table 3 Latency Test Results

Message size (bytes)    Latency, low (seconds)    Latency, high (seconds)
1000                    0.00154                   0.00184
2000                    0.00175                   0.00183
3000                    0.00181                   0.00183
4000                    0.00154                   0.00163
5000                    0.00200                   0.00211
6000                    0.00195                   0.00210
7000                    0.00167                   0.00170
8000                    0.00217                   0.00222
9000                    0.00227                   0.00235

 

Performance Evaluation and Analysis

 

Comparing ATP over ATM, TCP/IP over ATM and native ATM in the throughput test, the following factors may explain why the results for TCP/IP and native ATM are consistent while those for ATP vary:

(1) The ATP stack runs in user space while TCP/IP and native ATM run in kernel space. That may affect execution speed.

(2) Fig. 5 shows the change in sending window size during the throughput test for the 83MB message size.

It seems that the quick start algorithm does not let the sender jump to a near-optimum level very quickly. The question is therefore whether the sender fails to send fast enough because it is throttled by the window size.

(3) In the tests, QoS is specified as Unspecified Bit Rate (UBR) because our ATM switch and ATM layer do not seem to support other types of resource reservation. Since the good performance of ATP depends partly on QoS support from the ATM layer, using UBR prevents ATP from taking advantage of those ATM features.

(4) NAK handling may be inefficient. TCP/IP usually retransmits many packets but still performs consistently when transmitting messages of different sizes, whereas ATP cannot maintain its high throughput when packets are retransmitted. This may be related to the static 500 ms retransmission timer.

(5) There is one data copy each for sending and receiving a message. It is possible to avoid data copies on both sides, although that would make the coding more difficult. Because of time constraints, the zero-copy approach was not tried.

(6) As with any benchmark, timer granularity is a problem. The imprecise system clock makes it relatively difficult to measure the transmission time for smaller messages, where a difference of a few milliseconds is significant.
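One way to reduce the timer granularity problem noted in item (6) is to read a monotonic, nanosecond-resolution clock where the platform offers one. The sketch below uses the POSIX clock_gettime call with CLOCK_MONOTONIC; it is not what the benchmarks in this report used, and the helper names are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* request clock_gettime from <time.h> */
#endif
#include <time.h>

/* Difference end - start in seconds. */
static double elapsed_seconds(const struct timespec *start,
                              const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) +
           (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Current monotonic time in seconds, or -1.0 if the clock is unavailable. */
static double monotonic_now(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

A benchmark would call monotonic_now() immediately before the first send and immediately after the last ACK and subtract the two values. For small messages, repeating the transfer many times and dividing the total by the repetition count further reduces the effect of clock resolution.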

 

Future Work

 

(1)   Further streamline the existing implementation and improve the efficiency of the code. For example, in the current implementation the receiver acknowledges (ACKs) each packet. It would be possible to ACK only the packet with the largest in-order sequence number: if the receiver gets packets 7, 8, 9, 10 and 11, then instead of sending five ACKs it could just ACK packet 11.

(2)   Explore other possible resource reservation approaches, such as Available Bit Rate (ABR). An earlier attempt to specify the QoS as ABR in the current implementation consistently produced an error message. Whether the error comes from the Linux ATP API or from other sources has not been investigated.

(3)   Record the sending window size for all packets for messages of different sizes and study whether the congestion control mechanism works efficiently. As stated in the Performance Evaluation and Analysis section, a fixed 500-millisecond timeout is used in the current ATP implementation; other approaches could be explored.

(4)   More benchmark tests with larger messages, such as several hundred megabytes, are needed to better evaluate the performance of ATP and compare it with that of TCP/IP and native ATM. The throughput comparison graph shows throughput versus message size over a relatively small range; extending it over a larger range should make the trend clearer.
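The cumulative acknowledgement proposed in item (1) can be sketched as follows. Given the next sequence number the receiver expects and the sequence numbers it has received since then (in ascending order), the function returns the largest sequence number up to which everything has arrived in order. The function name is hypothetical, and sequence number wrap-around is ignored for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the highest sequence number n such that every packet from
 * next_expected through n is present in `received` (sorted ascending).
 * If next_expected itself is missing, the result is next_expected - 1,
 * meaning there is nothing new to acknowledge. */
static int64_t cumulative_ack(const uint32_t *received, size_t n,
                              uint32_t next_expected)
{
    int64_t ack = (int64_t)next_expected - 1;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((int64_t)received[i] == ack + 1)
            ack++;            /* extends the in-order run */
        else if ((int64_t)received[i] > ack + 1)
            break;            /* gap found: stop at the last in-order packet */
        /* older or duplicate packets are skipped */
    }
    return ack;
}
```

With packets 7, 8, 9, 10 and 11 received and 7 expected next, the function returns 11, so a single ACK suffices. With packets 7, 8, 10 and 11 it returns 8, and the gap at packet 9 would still be reported by a NAK as in the current design.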

 

Conclusion

 

In this paper, I summarize the implementation of ATP, present the test results for ATP, and compare the performance of ATP over ATM with that of TCP/IP over ATM and native ATM. The test results raise some questions about the implementation, the congestion control mechanism, the support provided by the ATM layer, and other matters. Future work should shed more light on these questions.

 

References

 

[1] Gojko Babic, Raj Jain, Arjan Durresi, 1999
ATM Performance Testing and QoS Management

[2] Cisco Systems, 1988-1999
Designing ATM Internetworks
(http://www.cisco.com/univercd/cc/td/doc/cisintwk/idg4/index.htm)

[3] Raj Jain, 1989
A Delay-Based Approach for Congestion Avoidance in Interconnected Heterogeneous Computer Networks
(http://www.cis.ohio-state.edu/~jain/papers/delay.htm)

[4] Rohit Goyal, Raj Jain, Shiv Kalyanaraman, Sonia Fahmy, Bobby Vandalore, 1998
Improving the Performance of TCP over the ATM-UBR service
(http://www.cis.ohio-state.edu/~jain/papers/cc.htm)

[5] Werner Almesberger, 1995
High-speed ATM networking on low-end computer systems
(http://icawww1.epfl.ch/linux-atm/doc.html#lowend)

[6] Xiaochun Hu, Zhifeng Jia, 1998
A Reliable Transport Protocol over ATM
(http://www.ics.hawaii.edu/~esb/prof/proj/atp0.html)

[7] David D. Clark, Van Jacobson, John Romkey, Howard Salwen
An Analysis of TCP Processing Overhead

[8] Edoardo Biagioni, Eric Cooper, Robert Sansom
The Design of a Practical ATM LAN

[9] Larry L. Peterson, Bruce S. Davie
Computer Networks: A Systems Approach