WORST CASE BUFFER REQUIREMENTS FOR TCP OVER ABR1

B. Vandalore, S. Kalyanaraman 2, R. Jain, R. Goyal, S. Fahmy


Dept. of Computer and Information Science,
The Ohio State University,
2015 Neil Ave,
Columbus,
OH 43210-1277, USA
E-mail: {vandalor, shivkuma, jain}@cse.wustl.edu

ATM (asynchronous transfer mode) is the technology chosen for the Broadband Integrated Services Digital Network (B-ISDN). The ATM ABR (available bit rate) service can be used to transport ``best-effort'' traffic. In this paper, we extend our earlier work on the buffer requirements problem for TCP over ABR. We generate a worst case scenario in which TCP sources send a burst of data at a time when their congestion windows are large and the ABR allowed cell rates (ACRs) are high. We find that ABR using the ERICA+ switch algorithm can control the maximum queue lengths (and hence the buffer requirements) even in this worst case. We present analytical arguments for the expected queue length and simulation results for different numbers of sources and different parameter values.


Introduction

ATM is designed to handle different kinds of traffic (voice, audio, video and data) in an integrated manner. ATM uses small fixed-size (53-byte) packets called cells. It provides multiple service categories to support various quality of service (QoS) requirements. The currently specified categories are: constant bit rate (CBR), real-time variable bit rate (rt-VBR), non-real-time variable bit rate (nrt-VBR), available bit rate (ABR), and unspecified bit rate (UBR). The CBR service is aimed at transporting voice and synchronous applications. The VBR (rt- and nrt-) services support video and audio applications which do not require isochronous transfer. The ABR and UBR services provide ``best-effort'' delivery for data applications. The ABR service uses closed-loop feedback to control the rate of sources.

ATM technology is already being used in the backbone of the Internet. The performance of Internet protocols such as TCP/IP over ATM has been addressed in references [1, 2]. ATM switches need buffers to store packets before forwarding them to the next switch. We have addressed the problem of buffer requirements in the context of transporting TCP/IP applications over ABR [3, 4, 5]. In earlier studies, we showed that achieving zero-loss ABR service requires switch buffering that is only a small multiple of the round trip time and the feedback delay. The exact buffering needed depends upon the switch scheme used.

One issue related to buffer sizing is that a source can reach a high ACR and retain it. As long as the source sends a packet before 500 ms elapse, the ``use-it-or-lose-it'' policy (under which a source loses its allocation if it does not use it to send data [6]) is not triggered. The source can then use the high ACR to send a large amount of data suddenly. The effect is amplified when many sources do the same thing.

In this paper, we generate a worst case scenario in which the TCP sources have large congestion windows and all ACRs are high. Under this condition, the TCP sources are made to send a huge burst of data into the network. We show that even under these extreme circumstances, ABR, using a good switch algorithm like ERICA+ [7, 8], performs well and controls the queues. We present simulation results for up to 200 TCP sources.

Congestion Control

In this section, we give a brief introduction to the TCP/IP congestion control and ABR flow control mechanisms.


TCP Congestion Control

TCP provides reliable, connection-oriented service. TCP connections use window-based end-to-end flow control [9]. The receiver's window (rcvwnd) is enforced by the receiver as a measure of its buffering capacity. The congestion window (cwnd) is used at the sender as a measure of the capacity of the network. The sender cannot send more than the minimum of rcvwnd and cwnd. The TCP congestion control scheme consists of the ``slow start'' and ``congestion avoidance'' phases. In the ``slow start'' phase, cwnd is initialized to one TCP segment. The cwnd is incremented by one segment for each acknowledgement received, so cwnd doubles every round trip. The ``congestion avoidance'' phase is entered when cwnd reaches ssthresh (initially 64K bytes). In this phase, cwnd is incremented by 1/cwnd of a segment for every segment acknowledged. If an acknowledgement is not received within the timeout period, the segment is considered lost and the ``slow start'' phase is re-entered: cwnd is set to one segment, and ssthresh is set to max(2, min(cwnd/2, rcvwnd)).
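As a concrete illustration of these rules, the following sketch (ours; the class and method names are purely illustrative, not any particular TCP implementation) tracks the sender window in bytes.

# Minimal sketch (ours) of the TCP window rules described above.
# All sizes are in bytes; the initial ssthresh of 64 Kbytes and the
# min(rcvwnd, cwnd) limit follow the text.

class TcpSenderWindow:
    def __init__(self, mss, rcvwnd):
        self.mss = mss
        self.rcvwnd = rcvwnd          # receiver's advertised window
        self.cwnd = mss               # slow start begins at one segment
        self.ssthresh = 64 * 1024     # initial ssthresh (64 Kbytes)

    def usable_window(self):
        # The sender may not send more than min(rcvwnd, cwnd).
        return min(self.rcvwnd, self.cwnd)

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            # Slow start: one segment per ACK, so cwnd doubles every round trip.
            self.cwnd += self.mss
        else:
            # Congestion avoidance: about 1/cwnd of a segment per ACK.
            self.cwnd += max(1, self.mss * self.mss // self.cwnd)

    def on_timeout(self):
        # Timeout implies loss: remember half the window, restart slow start.
        self.ssthresh = max(2 * self.mss, min(self.cwnd // 2, self.rcvwnd))
        self.cwnd = self.mss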


ABR Flow Control

The ABR service uses closed-loop feedback control to advise the sources about the rate at which they should be transmitting the data. The switches monitor their load, compute the available bandwidth and divide it fairly among the competing connections. The feedback from the switches to the sources is sent in Resource Management (RM) cells which are sent periodically by the sources and turned around by the destinations (see Figure 1).


   
Figure 1: RM cell path

The RM cells flowing in the forward direction are called forward RM cells (FRMs), while those returning from the destination to the source are called backward RM cells (BRMs). The switch computes the rate it can support for each connection and conveys it in the ER (explicit rate) field of the BRM cells passing through it; the source then adjusts its ACR (allowed cell rate) according to the feedback it receives.
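The sketch below illustrates this feedback path in simplified form; the cell fields and the per-VC rate passed in are illustrative placeholders, not the actual ERICA+ computation (which is summarized in a later section). A switch only lowers the ER field, so a source ends up limited by the most congested switch on its path.

# Illustrative sketch (ours) of explicit-rate feedback on RM cells.
# The fields and the rate argument are placeholders; ERICA+ computes
# the actual feedback from measured load and queue length.

from dataclasses import dataclass

@dataclass
class RMCell:
    vc_id: int
    er: float        # explicit rate field, initially set by the source to the rate it requests
    backward: bool   # True after the destination turns the cell around

def stamp_er(cell: RMCell, switch_rate_for_vc: float) -> RMCell:
    # A switch only reduces the ER field, never increases it, so the
    # source receives the minimum rate supported along the path.
    if cell.backward:
        cell.er = min(cell.er, switch_rate_for_vc)
    return cell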


Generating Worst Case TCP Traffic

As previously mentioned, the maximum burst which TCP can send is equal to the maximum congestion window size. TCP takes about log2(max_cwnd / mss) round trips to reach the maximum congestion window. In normal TCP over ABR operation, the ACRs may not be high at the time this maximum congestion window is reached. In order to generate worst case conditions, in which TCP has the maximum congestion window size and the ACRs are high, we have designed the following scenario.

We use a configuration in which N TCP sources are connected to N TCP destinations via two switches and a bottleneck link (see Figure 2).


   
Figure 2: N Sources - N Destinations Configuration

Initially, to build up the congestion window, each source sends one segment of data every t seconds. One thousand such segments are sent by each source, so that the congestion window reaches its maximum for all the sources. The sources send their segments in a staggered manner, i.e., not all sources send segments simultaneously. This is done so that the TCP data can be sent without overloading the network. Since the network is not overloaded, the ACRs are high and the congestion windows reach their maximum values. At this point (about 1000 t seconds, i.e., approximately 1 second for t = 1 millisecond), all the sources synchronize and send a burst of data (burst size equal to the maximum congestion window). This is the worst case burst which can arise from the N TCP sources.
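The schedule can be summarized by the following sketch; the function and its default values mirror the description above, but the code itself is ours and is not the simulator used for the results in this paper.

# Sketch (ours) of the worst-case schedule described above: source i sends
# its j-th build-up segment at time i*g + j*t, and after 1000 segments all
# sources release a burst equal to the maximum congestion window at once.

def worst_case_schedule(n_sources, t=1e-3, g=50e-6, segments=1000,
                        mss=512, max_cwnd=65536):
    events = []                                     # (time in s, source, bytes)
    for i in range(n_sources):
        for j in range(segments):
            events.append((i * g + j * t, i, mss))  # staggered build-up phase
    burst_time = segments * t                       # about 1 second for t = 1 ms
    for i in range(n_sources):
        events.append((burst_time, i, max_cwnd))    # synchronized worst-case burst
    return sorted(events)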

TCP Options and ERICA+ parameters

We use a TCP maximum segment size (MSS) of 512 or 1024 bytes. The MTU size used by IP over ATM is generally 9180 bytes, so there is no segmentation caused by IP. Since our simulations are performed under no-loss conditions, the results hold even for TCP with fast retransmit and recovery or TCP with SACK (Selective Acknowledgements).

The TCP data is encapsulated over ATM as follows. First, a set of headers and trailers are added to every TCP segment. We have 20 bytes of TCP header, 20 bytes of IP header, 8 bytes for the RFC1577 LLC/SNAP encapsulation, and 8 bytes of AAL5 information, for a total of 56 bytes. Hence, every MSS of 512 bytes becomes 568 bytes of payload for transmission over ATM. This payload with padding requires 12 ATM cells of 48 data bytes each.
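A small helper (ours) makes this arithmetic explicit:

# The per-segment overhead above (20 + 20 + 8 + 8 = 56 bytes) and the
# 48 data bytes per cell give the cell count directly.
import math

def cells_per_segment(mss):
    payload = mss + 20 + 20 + 8 + 8        # TCP + IP + LLC/SNAP + AAL5
    return math.ceil(payload / 48)         # AAL5 PDU padded to a whole cell

print(cells_per_segment(512))    # -> 12 cells, as stated above
print(cells_per_segment(1024))   # -> 23 cells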

The ERICA+ algorithm operates at each output port (or link) of a switch. The switch periodically monitors the load on each link and determines quantities such as the load factor, the ABR capacity, and the number of active virtual connections (VCs). A measurement or ``averaging'' interval is used for this purpose. These quantities are used to calculate the feedback, which is indicated in the BRM cells. The measurements are made in the forward direction and the feedback is given in the reverse direction. ERICA+ uses dynamic queue control to quickly drain queues if there is a large queue build-up. A hyperbolic queue control function is used to vary the available ABR capacity as a function of the current queue length.

The ERICA+ algorithm uses the queuing delay as a metric to calculate the feedback. ERICA+ uses four parameters: a target queuing delay (T0 = 500 microseconds), two curve parameters (a = 1.15 and b = 1.05), and a factor which limits the amount of ABR capacity allocated to drain the queues (QDLF = 0.5). In our simulations, the exponential averaging options for averaging N (the number of active VCs) and for averaging the overload are used to smooth out variations in the traffic.
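The exact ERICA+ queue control curves (which use both a and b) are specified in [7, 8]; the single-hyperbola sketch below is only our illustration of the general shape implied by these parameters, not the actual algorithm.

# Illustrative queue control function (ours): returns the fraction of the
# measured ABR capacity offered to the sources as a function of queue length.

def queue_control_fraction(queue_cells, link_cell_rate,
                           t0=500e-6, a=1.15, qdlf=0.5):
    # Queue length (in cells) corresponding to the target delay T0.
    q0 = t0 * link_cell_rate
    # Hyperbolic in the queue length: equals a at q = 0 and 1.0 at q = q0.
    f = a * q0 / ((a - 1.0) * queue_cells + q0)
    # Never go below QDLF, so at most (1 - QDLF) of the capacity is used
    # to drain the queue.
    return max(qdlf, min(a, f))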

Analytical Explanation

In this section, we derive the analytical value of the maximum queue length as a function of the number of sources, the congestion window size, and the congestion status of the network. The maximum queue length is given by two equations, for the underloaded and the overloaded network conditions, as follows:


  Q_max = N × (max_cwnd / mss) × cells_in_mss  ≈  N × max_cwnd / 48  cells,  for N ≤ t/g (underloaded)     (1)

  Q_max ≈ N × t / 2.83 μs  cells,  for N > t/g (overloaded)     (2)

where N is the number of sources, max_cwnd is the maximum congestion window size (in bytes), mss is the maximum segment size, cells_in_mss is the number of ATM cells needed to carry one segment, t is the time between successive segments of a source, and g is the time between two sources sending their segments.

(For t = 1 millisecond, t / 2.83 μs ≈ 353.36 is the number of cells sent in the burst by one source under the overloaded condition. This is explained in the derivation of equation 2 later in this section.)

The derivation of the two equations is as follows: Initially, for a small number of sources, the network is underloaded, so the ACRs are high when the burst occurs. When all the sources send the burst simultaneously, the whole burst can be sent into the network and this results in large switch queues.

Let cells_in_mss be the number of cells required to send one segment and mss be the maximum segment size.

The burst size due to one source = (max_cwnd / mss) × cells_in_mss cells.

So if there are N sources, the expected queue length is


  Q_max = N × (max_cwnd / mss) × cells_in_mss cells

Substituting cells_in_mss ≈ mss / 48 in the above (each cell carries 48 data bytes; the per-segment protocol overhead is neglected), we get equation 1:

  Q_max ≈ N × max_cwnd / 48 cells     (1)

Under the no-loss condition, the TCP congestion window increases exponentially in the ``slow start'' phase until it reaches its maximum value. As the number of sources increases, the load on the network increases. In the simulation, every source sends a segment of size 512 bytes every t seconds. The time between two sources sending their segments is g seconds: at time 0, source 1 sends a segment, at time g seconds source 2 sends, at time 2g seconds source 3 sends, and so on. Source 1 then sends its second segment at time t seconds, and so on.

The network becomes overloaded when the input rate is greater than the output rate. In the simulation, the network gets overloaded if the number of sources

  N > t / g

For t = 1 millisecond and g = 50 microseconds, t / g = 20.

Once the network is overloaded, the ACRs become low and each of the N sources gets approximately 1/N of the bandwidth. Since the ACRs are low, the burst which occurs after 1 second should not give rise to large queues. There are, however, still switch queues which build up initially, before the network has detected that it is overloaded.

The TCP congestion window grows exponentially, increasing on every acknowledgement received. Let d be the length of each link in km. The acknowledgement arrives after a round trip time of about 30 d microseconds (30 ms for d = 1000 km, since each direction traverses three links at roughly 5 microseconds per km). But by this time the source will have generated more segments to send. After each round trip, each source sends twice the number of segments, so a larger round trip can give rise to a larger burst. The network gets overloaded when the number of segments sent just fills up the pipe.

Let k be the number of round trips it takes to overload the network. The time taken to transmit one cell is 2.83 microseconds (at 149.76 Mbps, i.e., an OC-3 link after accounting for SONET overhead). In t seconds, each source initially sends one segment, which gives rise to cells_in_mss cells; after k round trips each source sends 2^k segments in each interval of t seconds. Therefore, the network gets overloaded when


  2^k × cells_in_mss × 2.83 μs > t

i.e.,

  k > log2( t / (cells_in_mss × 2.83 μs) )

For t = 1 millisecond and mss = 512 bytes (cells_in_mss = 12), t / (cells_in_mss × 2.83 μs) ≈ 29.4, so the value of k is 5. Hence, in this case the network detects the overload after 6 (= k+1) round trips if N > 20. The maximum queues should therefore occur after about 6 round trips for N > 20.

The value of the maximum queue in an overloaded network can be derived as follows.

The burst due to one source = 2^k × cells_in_mss ≈ t / 2.83 μs cells (using the overload condition above with equality).

Hence, the burst due to N sources ≈ N × t / 2.83 μs cells     (2)

For t = 1 millisecond, g = 50 microseconds, mss = 512 bytes and max_cwnd = 65536 bytes, we get


  Q_max ≈ 1365 × N cells,  for N ≤ 20     (3)

  Q_max ≈ 353 × N cells,  for N > 20     (4)
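As a consistency check, the following sketch (ours) reproduces the analytical queue sizes listed later in Table 1 from these equations, truncating the per-source burst to whole cells.

# Check (ours): reproduce the "Analytical Queue Size" column of Table 1
# from equations (1)-(4). One cell takes 2.83 us at 149.76 Mbps
# (53 bytes x 8 bits / 149.76e6 bit/s).

CELL_TIME = 2.83e-6                               # seconds per cell

def analytical_queue(n, t=1e-3, g=50e-6, max_cwnd=65536):
    if n <= t / g:                                # underloaded: full-window burst
        per_source = max_cwnd // 48               # equation (3): 1365 cells per source
    else:                                         # overloaded: ACR-limited burst
        per_source = int(t / CELL_TIME)           # equation (4): 353 cells per source
    return n * per_source

for n in (2, 10, 20, 30, 100, 200):
    print(n, analytical_queue(n))
# -> 2730, 13650, 27300, 10590, 35300, 70600 (the analytical column of Table 1)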

Simulation Results

We show the maximum queues at the bottleneck switch for the cases where the number of sources N varies from 2 to 200. The results are shown in Table 1. The plot of the expected queue length and the actual queue length obtained from the simulations versus the number of sources is shown in Figure 3.

The length of all the links is 1000 km, giving rise to a 15 ms propagation delay from source to destination. The round trip time (RTT) is 30 milliseconds. The feedback delay (the delay between the bottleneck link and the source and back) is 10 ms. A round trip of 30 ms corresponds to 11029 cells. The other parameter values are t = 1 millisecond, g = 50 microseconds, mss = 512 bytes, and max_cwnd = 65536 bytes.

For N ≤ 20, the maximum queue length occurred at around 1 second. For N > 20, the maximum queue occurred at around 300 milliseconds (which agrees with the analytical prediction). The queue lengths increase with the number of sources as expected, and also correspond to the burst size sent by the sources; however, this is true only when the number of sources is less than or equal to 20. For N = 30, there is a sharp decrease in the maximum queue length. For N > 40, it starts increasing again, but now it increases at a slower rate.

From Table 1, it can be seen that for N ≤ 20 the simulation agrees well with the analytical values. For N > 30, the queue lengths in the simulation are initially higher than the analytical values, but for larger N they fall below the expected queue length. This can be explained as follows. Initially, the traffic generated in each cycle (t seconds) can be sent within that cycle. But as the number of sources increases further (N > 80), the network gets overloaded and some pending cells are carried over into the next cycles. Because of this, the network detects the overload at lower queue lengths than the analytical value.


   
Figure 3: Queue length Versus Number of Sources


 
 
Table 1: Effect of the number of sources: actual and expected queue lengths

# TCP     Max Queue      Analytical         # TCP     Max Queue      Analytical
Sources   Size (cells)   Queue Size         Sources   Size (cells)   Queue Size
                         (cells)                                     (cells)
   2         1575           2730              100        38088          35300
   3         3149           4095              110        43672          38830
   5         6297           6825              120        44939          42360
  10        14131          13650              130        44708          45890
  20        29751          27300              140        44744          49420
  30        20068          10590              150        46058          52950
  40        19619          14120              160        48880          56480
  50        24162          17650              170        50784          60010
  60        28006          21180              180        49961          63540
  70        30109          24710              190        53366          67070
  80        31439          28240              200        55618          70600
  90        34530          31770               -            -              -

To study the effect of different parameters, a full factorial experiment was carried out. For a given number of sources, the following parameters were varied: the maximum segment size (mss = 512, 1024 bytes), the time between two sources sending segments (g = 50, 100 microseconds), the time between successive segments of a source (t = 1, 10 milliseconds), and the length of the links (d = 1000, 2000 km). The resulting 16 parameter combinations are enumerated in the sketch below.
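# The 2^4 = 16 parameter combinations of the full factorial experiment,
# listed in the same mss/g/t/d order as the rows of Table 2 (sketch is ours).

from itertools import product

MSS = (512, 1024)      # maximum segment size (bytes)
G   = (50, 100)        # stagger between sources (microseconds)
T   = (1, 10)          # time between successive segments (milliseconds)
D   = (1000, 2000)     # link length (km)

for row, (mss, g, t, d) in enumerate(product(MSS, G, T, D), start=1):
    print(f"{row:2d}   {mss}/{g}/{t}/{d}")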


 
 
Table 2: Effect of MSS (mss), distance (d) and time intervals (t, g) on the maximum queue length (cells)

 #   mss/g/t/d          N=3     N=10    N=30    N=40    N=50    N=100
 1   512/50/1/1000      3171    14273   20068   19619   24162   35687
 2   512/50/1/2000      3171    14273   19906   27567   30872   75083
 3   512/50/10/1000     3172    14274   45994   61854   77714   150453
 4   512/50/10/2000     3172    14274   45994   61854   77714   150458
 5   512/100/1/1000     3171    14273   19283   20080   24164   NA
 6   512/100/1/2000     3171    14273   21241   32314   35961   NA
 7   512/100/10/1000    3172    14274   45994   61854   77714   NA
 8   512/100/10/2000    3172    14274   45994   61854   77714   NA
 9   1024/50/1/1000     3040    13680   18650   18824   23542   NA
10   1024/50/1/2000     1542     5612   19131   22934   29163   NA
11   1024/50/10/1000    3040    13680   44080   59280   74480   NA
12   1024/50/10/2000    3041    13681   44081   59281   74481   NA
13   1024/100/1/1000    3040    13680   18591   19600   24314   NA
14   1024/100/1/2000    1403     5556   17471   24412   30533   NA
15   1024/100/10/1000   3040    13680   44080   59280   74480   NA
16   1024/100/10/2000   3041    13681   44081   59281   74481   NA

(NA - data not available)

Table 2 shows the maximum queue sizes in the 16 experiments for N = 3, 10, 30, 40, 50 and 100 sources. The following observations can be made from Table 2; they support the analytical results. The maximum queue length is largely unaffected by the maximum segment size (mss) and by the stagger interval (g). It increases sharply with the inter-segment time (t): with t = 10 milliseconds the overload threshold t/g is at least 100 sources, so the full-window burst of equation 1 applies even for large N. It also increases with the link length (d) in the overloaded cases, since a longer round trip delays the rate feedback and allows a larger burst to accumulate.

Conclusion

We have artificially generated a worst case scenario and studied the buffer requirements for TCP over ABR. The buffer size required depends on the switch scheme used; our simulations are based on our ERICA+ switch scheme. For the worst case scenario generated, we have given both analytical predictions and simulation results for the queue lengths (and hence the buffer requirements), and the simulations agree well with the analytical predictions. The queue lengths are affected by the maximum congestion window size, the round trip time, the congestion status of the network (overloaded or underloaded), and the number of sources. They are not affected by the maximum segment size.


Acknowledgments

This research was sponsored in part by Rome Laboratory/C3BC Contract #F30602-96-C-0156.


References



1
H. Li, K. Y. Siu, H. T. Tzeng, C. Ikeda and H. Suzuki, ``TCP over ABR and UBR Services in ATM'', Proc. IPCCC'96, March 1996.

2
Allyn Romanow, Sally Floyd, ``Dynamics of TCP Traffic over ATM Networks'', IEEE JSAC, May 1995.

3
Shiv Kalyanaraman, Raj Jain, Sonia Fahmy, Rohit Goyal and Seong-Cheol Kim, ``Buffer Requirements For TCP/IP Over ABR,'' Proc. IEEE ATM'96 Workshop, San Francisco 3 , August 23-24, 1996.

4
Shiv Kalyanaraman, Raj Jain, Sonia Fahmy, Rohit Goyal and Seong- Cheol Kim, ``Performance and Buffering Requirements of Internet Protocols over ATM ABR and UBR Services,'' accepted for publication in IEEE Communications Magazine.

5
Shiv Kalyanaraman, Raj Jain, Rohit Goyal, Sonia Fahmy and Seong- Cheol Kim, ``Performance of TCP/IP Using ATM ABR and UBR Services over Satellite Networks,'' IEEE Communication Society Workshop on Computer-Aided Modeling, Analysis and Design of Communication Links and Networks, Mclean, VA, October 20, 1996.

6
``ATM Forum Traffic Management Specification Version 4.0'', http://www.atmforum.com/atmforum/specs/approved.html, April 1996.

7
Raj Jain, Shiv Kalyanaraman, Rohit Goyal, Sonia Fahmy, and Bobby Vandalore, ``The ERICA Switch Algorithm for ABR Traffic Management in ATM Networks'', submitted to IEEE/ACM Transactions on Networking, November 1997.

8
Raj Jain, Shiv Kalyanaraman, Rohit Goyal, Sonia Fahmy, and Ram Viswanathan, ``ERICA Switch Algorithm: A Complete Description'', ATM Forum/96-1172, August 1996.

9
V. Jacobson, ``Congestion Avoidance and Control,'' Proceedings of the SIGCOMM'88 Symposium, pp. 314-32, August 1988.


Footnotes

1. In Proc. of SICON'98, Singapore, June 1998.

2. S. Kalyanaraman is now with the Dept. of ECSE, Rensselaer Polytechnic Institute, Troy, NY 12180-3590.

3. All our papers and ATM Forum contributions are available through http://www.cse.wustl.edu/~jain/index.html