B. Vandalore, S. Kalyanaraman 2, R. Jain, R. Goyal, S. Fahmy
Dept. of Computer and Information Science,
The Ohio State University,
2015 Neil Ave,
Columbus,
OH 43210-1277, USA
E-mail: {vandalor, shivkuma, jain}@cse.wustl.edu
ATM (asynchronous transfer mode) is the technology chosen for the Broadband Integrated Services Digital Network (B-ISDN). The ATM ABR (available bit rate) service can be used to transport ``best-effort'' traffic. In this paper, we extend our earlier work on the buffer requirements problem for TCP over ABR. Here, a worst case scenario is generated in which the TCP sources send a burst of data at a time when the sources have large congestion windows and the ACRs (allowed cell rates) for ABR are high. We find that ABR using the ERICA+ switch algorithm can control the maximum queue lengths (and hence the buffer requirements) even in this worst case. We present analytical arguments for the expected queue length and simulation results for different numbers of sources and parameter values.
ATM is designed to handle different kinds of traffic (voice, audio, video and data) in an integrated manner. ATM uses small, fixed-size (53-byte) packets called cells. It provides multiple service categories to support various quality of service (QoS) requirements. The currently specified categories are: constant bit rate (CBR), real-time variable bit rate (rt-VBR), non-real-time variable bit rate (nrt-VBR), available bit rate (ABR), and unspecified bit rate (UBR). The CBR service is aimed at transporting voice and synchronous applications. The VBR (rt- and nrt-) services provide support for video and audio applications which do not require isochronous transfer. The ABR and UBR services provide ``best-effort'' delivery for data applications. The ABR service uses closed-loop feedback to control the rate of the sources.
ATM technology is already being used in the backbone of the Internet. The performance of Internet protocols such as TCP/IP over ATM has been addressed in references [1, 2]. ATM switches need buffers to store packets before forwarding them to the next switch. We have addressed the problem of buffer requirements in the context of transporting TCP/IP applications over ABR [3, 4, 5]. In earlier studies, we showed that achieving zero-loss ABR service requires switch buffering that is only a small multiple of the round trip time and the feedback delay. The amount of buffering depends upon the switch scheme used.
One issue related to buffer sizing is that it is possible for a source to reach a high ACR and retain it. As long as the source sends a packet before 500 ms elapse, the ``use-it or lose-it'' policy (under which a source loses its allocation if it does not use it to send data [6]) is not triggered. The source can then use the high ACR to send a large amount of data suddenly. The effect is amplified when many sources do the same thing.
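For concreteness, the sketch below shows the source-side check implied by this policy. It is a minimal illustration assuming a TM 4.0-style idle timeout of roughly 500 ms (the ADTF parameter) with a fallback toward an initial cell rate; the function name and exact fallback rule are ours, not details taken from the paper.

```python
# Hedged sketch of a ``use-it or lose-it'' check: a source that stays idle for
# less than the timeout keeps its (possibly high) ACR and can burst at that rate.
# ADTF = 0.5 s and the fallback to ICR are assumptions modelled on TM 4.0.
def allowed_rate_after_idle(acr, icr, idle_time, adtf=0.5):
    """Rate (cells/s) a source may use after being idle for idle_time seconds."""
    if idle_time < adtf:
        return acr           # allocation retained: the worst-case burst can be sent at ACR
    return min(acr, icr)     # allocation lost: rate falls back toward the initial cell rate
```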
In this paper, we generate a worst case scenario in which the TCP sources have large congestion windows and all ACRs are high. Under these conditions, the TCP sources are made to send a huge burst of data into the network. We show that even under these extreme circumstances, ABR, using a good switch algorithm like ERICA+ [7, 8], performs well and controls the queues. We present simulation results for up to 200 TCP sources.
The RM cells flowing in the forward direction are called forward RM cells (FRMs), while those returning from the destination to the source are called backward RM cells (BRMs). When a switch receives a BRM cell, it computes a rate allocation for the connection and places it in the ER (explicit rate) field of that cell; the source then adjusts its ACR (allowed cell rate) based on the returned ER.
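As an illustration of this feedback path, the sketch below shows the usual explicit-rate marking rule for BRM cells: a switch may lower, but not raise, the ER already carried by the cell. The function name is ours.

```python
# Illustrative BRM marking: the switch writes the smaller of its own rate
# allocation and the ER already carried by the cell, so the source ends up
# with the most restrictive allocation along the path.
def mark_brm_er(er_in_cell, switch_allocation):
    return min(er_in_cell, switch_allocation)
```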
As previously mentioned, the maximum burst which TCP can send is equal to the maximum congestion window size. TCP takes a number of round trips to reach the maximum congestion window. In normal TCP over ABR operation, the ACRs may not be high by the time this maximum congestion window is reached. In order to generate worst case conditions, in which TCP has the maximum congestion window size and the ACRs are high, we have designed the following scenario.
We use a configuration in which N TCP sources are connected to N TCP destinations via two switches and a bottleneck link (see Figure 2).
Initially, to build up the congestion window, each source sends one segment of data every t seconds. One thousand such segments are sent by each source, so that the congestion window reaches its maximum for all the sources. The sources send their segments in a staggered manner, i.e., not all sources send their segments simultaneously. This is done so that the TCP data can be sent without overloading the network. Since the network is not overloaded, the ACRs are high and the congestion windows reach their maximum values. At this point (after 1000t seconds, i.e., at about 1 second for t = 1 millisecond), all the sources synchronize and send a burst of data (burst size equal to the maximum congestion window). This is the worst case burst which can arise from the N TCP sources.
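The following sketch (ours, not the authors' simulation code) enumerates the send events of this worst-case pattern using the notation above: staggered warm-up segments every t seconds, then a synchronized burst of one full congestion window per source at 1000t seconds.

```python
# Sketch of the worst-case traffic pattern: N sources, inter-segment time t,
# stagger g between sources, 1000 warm-up segments per source, then a
# synchronized burst of max_cwnd bytes from every source.
def send_events(N, t, g, max_cwnd, mss, warmup_segments=1000):
    events = []  # (time in seconds, source id, bytes sent)
    for src in range(N):
        for seg in range(warmup_segments):
            events.append((src * g + seg * t, src, mss))   # staggered warm-up segments
    burst_time = warmup_segments * t                       # 1 second for t = 1 ms
    for src in range(N):
        events.append((burst_time, src, max_cwnd))         # synchronized worst-case burst
    return sorted(events)

# Example: 20 sources, t = 1 ms, g = 50 microseconds, 64 KB window, 512-byte MSS
events = send_events(N=20, t=1e-3, g=50e-6, max_cwnd=65536, mss=512)
```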
We use TCP maximum segment sizes (MSS) of 512 and 1024 bytes. The MTU size used by IP over ATM is generally 9180 bytes, so no segmentation is caused by IP. Since our simulations are performed under no-loss conditions, the results hold even for TCP with fast retransmit and recovery or TCP with SACK (selective acknowledgements).
The TCP data is encapsulated over ATM as follows. First, a set of headers and trailers is added to every TCP segment: 20 bytes of TCP header, 20 bytes of IP header, 8 bytes of RFC 1577 LLC/SNAP encapsulation, and 8 bytes of AAL5 information, for a total of 56 bytes. Hence, every MSS of 512 bytes becomes 568 bytes of payload for transmission over ATM. This payload, with padding, requires 12 ATM cells of 48 data bytes each.
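The arithmetic above can be summarized in a few lines; the helper below is ours, but the constants (56 bytes of overhead per segment, 48 payload bytes per cell) are the ones stated in the text.

```python
import math

# Worked example of the encapsulation arithmetic above: TCP/IP/LLC-SNAP/AAL5
# headers and trailers add 56 bytes per segment, and each ATM cell carries
# 48 bytes of payload (the last cell is padded).
OVERHEAD_BYTES = 20 + 20 + 8 + 8      # TCP + IP + RFC 1577 LLC/SNAP + AAL5 = 56

def cells_per_segment(mss):
    return math.ceil((mss + OVERHEAD_BYTES) / 48)

print(cells_per_segment(512))    # 12 cells, as stated above
print(cells_per_segment(1024))   # 23 cells for the larger MSS
```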
The ERICA+ algorithm operates at each output port (or link) of a switch. The switch periodically monitors the load on each link and determines quantities such as the load factor, the ABR capacity, and the number of active virtual connections (VCs). A measurement or ``averaging'' interval is used for this purpose. These quantities are used to calculate the feedback, which is indicated in BRM cells. The measurements are made in the forward direction and the feedback is given in the reverse direction. ERICA+ uses dynamic queue control to quickly drain queues if there is a large queue buildup. A hyperbolic queue control function is used to vary the available ABR capacity as a function of the current queue length.
The ERICA+ algorithm uses the queueing delay as a metric to calculate the feedback. ERICA+ uses four parameters: a target queueing delay (T0 = 500 microseconds), two curve parameters (a = 1.15 and b = 1.05), and a queue drain limit factor which limits the amount of ABR capacity allocated to drain the queues (QDLF = 0.5). In our simulations, the exponential averaging options for averaging N (the number of active VCs) and for averaging the overload factor are used to smooth the variance in the traffic.
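To make the role of these parameters concrete, here is a minimal sketch of a hyperbolic queue control function with the qualitative shape described above: roughly a for an empty queue, exactly 1 at the target delay T0, and a decay floored at QDLF for long queues. This is an illustration under our own assumptions; the exact functional form used in ERICA+ differs in its details.

```python
# Minimal sketch (our own, not the exact ERICA+ formula) of a hyperbolic queue
# control function: about a for an empty queue, 1 at the target delay T0, and a
# hyperbolic decay floored at QDLF so that at most half the link capacity is
# diverted to draining queues.
def capacity_fraction(q_delay, T0=500e-6, a=1.15, b=1.05, QDLF=0.5):
    if q_delay <= T0:
        # short queue: allocate slightly more than capacity so the queue grows toward T0
        return a * T0 / (T0 + (a - 1.0) * q_delay)
    # long queue: shrink the ABR allocation, but never below QDLF of capacity
    return max(QDLF, b * T0 / (T0 + (b - 1.0) * q_delay))

print(capacity_fraction(0.0))      # 1.15  (empty queue)
print(capacity_fraction(500e-6))   # 1.0   (queueing delay equal to the target T0)
print(capacity_fraction(50e-3))    # 0.5   (large queue: clamped at QDLF)
```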
In this section, we derive an analytical value for the maximum queue length as a function of the number of sources, the maximum congestion window size, and the congestion status of the network. The maximum queue length is given by two equations, one for underloaded and one for overloaded network conditions:
$$Q_{underload} = N \times \frac{max\_cwnd}{mss} \times cells\_in\_mss \qquad (1)$$
$$Q_{overload} = N \times \frac{t}{2.83 \times 10^{-6}} \qquad (2)$$
where N is the number of sources, max_cwnd is the maximum congestion window size in bytes, mss is the maximum segment size, cells_in_mss is the number of cells required to send a segment, t is the time between successive segments of a source, and 2.83 × 10^-6 seconds is the time taken to transmit one cell on the bottleneck link.
The derivation of the two equations is as follows: Initially, for a small number of sources, the network is underloaded, so the ACRs are high when the burst occurs. When all the sources send the burst simultaneously, the whole burst can be sent into the network and this results in large switch queues.
Let cells_in_mss be the number of cells required to send a segment and mss be the maximum segment size. The burst size due to one source is (max_cwnd/mss) × cells_in_mss cells. So if there are N sources, the expected queue length is N times this per-source burst. Substituting the per-source burst in the above, we get equation (1).
Under the no-loss condition, the TCP congestion window increases exponentially in the ``slow start'' phase until it reaches its maximum value. As the number of sources increases, the load on the network increases. In the simulation, every source sends a segment of size 512 bytes every t seconds. The time between two sources sending their segments is g seconds. At time 0, source 1 sends a segment; at time g seconds, source 2 sends; at time 2g seconds, source 3 sends; and so on. Source 1 sends its second segment at time t seconds, and so on.
The network becomes overloaded when the input rate is greater than the output rate. In the simulation, the network gets overloaded when the number of sources N exceeds t/g; for t = 1 millisecond and g = 50 microseconds, this threshold is t/g = 20.
Once the network is overloaded, the ACRs become low and each of the N sources gets approximately 1/N of the bottleneck bandwidth. Since the ACRs are low, the burst which occurs after 1 second should not give rise to large queues. There are still switch queues which occur initially, before the network has detected that it is overloaded.
The TCP congestion window grows exponentially whenever the source receives an acknowledgement. Let d be the length of the links in km. With three links of length d between source and destination and a propagation delay of 5 microseconds per km, the acknowledgement arrives after a round trip time of 30d microseconds (30 ms for d = 1000 km). By this time, the source will have generated more segments to send. After each round trip, each source will send twice the number of segments. A larger round trip time can give rise to a larger burst. The network gets overloaded when the number of segments sent just fills up the pipe.
Let k be the number of round trips it takes to overload the network. The time taken to transmit one cell is 2.83 microseconds (at 149.76 Mbps, on an OC-3 link, accounting for SONET overhead). In t seconds, each source initially sends one segment, which gives rise to cells_in_mss cells; after k round trips of window doubling, it sends 2^k segments in the same interval. Therefore, the network gets overloaded when
$$2^k \times cells\_in\_mss \times 2.83 \times 10^{-6} \ge t$$
For t = 1 millisecond and mss = 512 bytes, the value of k is 5. Hence, in this case, after 6 (= k+1) round trips, the network detects the overload if N > 20. The maximum queues should occur after 6 round trips for N > 20.
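A quick check of this value of k (our calculation, using the 2.83 microsecond cell time and 12 cells per 512-byte segment from the text):

```python
import math

# Quick check of k: smallest number of window doublings after which one cycle's
# worth of segments no longer fits in t seconds on the bottleneck link.
CELL_TIME = 2.83e-6                     # seconds per cell (149.76 Mbps payload rate)

def rounds_to_overload(t=1e-3, cells_in_mss=12):
    return math.ceil(math.log2(t / (cells_in_mss * CELL_TIME)))

print(rounds_to_overload())             # 5, as stated above for t = 1 ms, mss = 512 bytes
```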
The value of the maximum queue in an overloaded network can be derived as follows.
The burst due to one source is t / (2.83 × 10^-6) cells, i.e., the number of cells that the bottleneck link can transmit in one cycle of t seconds. Hence, the burst due to N sources is N × t / (2.83 × 10^-6) cells, which gives equation (2).
For t = 1 millisecond, g = 50 microseconds, mss = 512 bytes and max_cwnd = 65536 bytes (counting mss/48 cells per segment, i.e., ignoring the per-segment protocol overhead), we get
$$Q_{underload} \approx 1365 \times N \text{ cells} \qquad (3)$$
$$Q_{overload} \approx 353 \times N \text{ cells} \qquad (4)$$
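The two regimes can be combined into a single helper that reproduces the analytical column of Table 1; the 48-bytes-per-cell approximation and the rounding are our reading of the numbers above, not code from the paper.

```python
# Combined form of equations (1)-(4): expected maximum queue length in cells.
CELL_TIME = 2.83e-6                     # seconds to transmit one cell

def expected_queue(N, max_cwnd=65536, mss=512, t=1e-3, g=50e-6):
    if N <= t / g:                      # underloaded: whole windows arrive as one burst
        return N * (max_cwnd / mss) * (mss / 48.0)   # about 1365 cells per source
    return N * t / CELL_TIME            # overloaded: about 353 cells per source

print(round(expected_queue(20)))        # ~27307 (Table 1 lists 27300)
print(round(expected_queue(100)))       # ~35336 (Table 1 lists 35300)
```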
We show the maximum queues at the bottleneck switch for the cases where the number of sources N varies from 2 to 200. The results are shown in Table 1. The plot of the expected queue length and the actual queue length obtained from the simulations versus the number of sources is shown in Figure 3.
The length of all the links is 1000 km, giving rise to a 15 ms propagation delay from source to destination. The round trip time (RTT) is 30 milliseconds. The feedback delay (the delay from the bottleneck link to the source and back) is 10 ms. A round trip of 30 ms corresponds to 11029 cells. The other parameter values are t = 1 millisecond, g = 50 microseconds, mss = 512 bytes, and max_cwnd = 65536 bytes.
For N <= 20 sources, the maximum queue length occurred at around 1 second. For N > 20, the maximum queue occurred at around 300 milliseconds (which agrees with the analytical prediction). The queue lengths increase with the number of sources, as expected. The queue lengths also correspond to the burst size sent by the sources, but this is true only when the number of sources is less than or equal to 20. For N = 30, there is a sharp decrease in the maximum queue length. For N > 40, the maximum queue length starts increasing again, but at a slower rate.
From Table 1, it can be seen that for N <= 20, the simulation agrees well with the analytical values. For N > 30, the queue lengths in the simulation are initially higher than the analytical values, but for larger N they become lower than the expected queue length. This can be explained as follows. Initially, the traffic generated in each cycle (t seconds) can be sent within that cycle. But as the number of sources increases (N > 80), the network gets overloaded, and some pending cells are sent in the next cycles. Because of this, the network detects the overload at lower queue lengths than the analytical value.
Table 1: Maximum queue size at the bottleneck switch versus the number of TCP sources.
# TCP Sources | Max Queue Size (cells) | Analytical Queue Size (cells) | # TCP Sources | Max Queue Size (cells) | Analytical Queue Size (cells)
2 | 1575 | 2730 | 100 | 38088 | 35300 |
3 | 3149 | 4095 | 110 | 43672 | 38830 |
5 | 6297 | 6825 | 120 | 44939 | 42360 |
10 | 14131 | 13650 | 130 | 44708 | 45890 |
20 | 29751 | 27300 | 140 | 44744 | 49420 |
30 | 20068 | 10590 | 150 | 46058 | 52950 |
40 | 19619 | 14120 | 160 | 48880 | 56480 |
50 | 24162 | 17650 | 170 | 50784 | 60010 |
60 | 28006 | 21180 | 180 | 49961 | 63540 |
70 | 30109 | 24710 | 190 | 53366 | 67070 |
80 | 31439 | 28240 | 200 | 55618 | 70600 |
90 | 34530 | 31770 | - | - | - |
To study the effect of different parameters, a full factorial experiment was carried out. For a given number of sources, the following parameters were varied: the maximum segment size (mss = 512, 1024 bytes), the time between two sources sending segments (g = 50, 100 microseconds), the time between successive segments of a source (t = 1, 10 milliseconds), and the length of the links (d = 1000, 2000 km).
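For reference, the 16 combinations can be enumerated directly (a sketch in our notation; the level order matches the rows of Table 2):

```python
from itertools import product

# The 16 parameter combinations of the full factorial experiment described above
# (mss in bytes, g in microseconds, t in milliseconds, d in km).
factors = {
    "mss": [512, 1024],
    "g":   [50, 100],
    "t":   [1, 10],
    "d":   [1000, 2000],
}
experiments = list(product(*factors.values()))
print(len(experiments))                 # 16
print(experiments[0])                   # (512, 50, 1, 1000) -> row 1 of Table 2
```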
Table 2: Maximum queue size (cells) for the 16 parameter combinations.
# | mss/g/t/d (bytes/µs/ms/km) | N=3 | N=10 | N=30 | N=40 | N=50 | N=100
1 | 512/50/1/1000 | 3171 | 14273 | 20068 | 19619 | 24162 | 35687 |
2 | 512/50/1/2000 | 3171 | 14273 | 19906 | 27567 | 30872 | 75083 |
3 | 512/50/10/1000 | 3172 | 14274 | 45994 | 61854 | 77714 | 150453 |
4 | 512/50/10/2000 | 3172 | 14274 | 45994 | 61854 | 77714 | 150458 |
5 | 512/100/1/1000 | 3171 | 14273 | 19283 | 20080 | 24164 | NA |
6 | 512/100/1/2000 | 3171 | 14273 | 21241 | 32314 | 35961 | NA |
7 | 512/100/10/1000 | 3172 | 14274 | 45994 | 61854 | 77714 | NA |
8 | 512/100/10/2000 | 3172 | 14274 | 45994 | 61854 | 77714 | NA |
9 | 1024/50/1/1000 | 3040 | 13680 | 18650 | 18824 | 23542 | NA |
10 | 1024/50/1/2000 | 1542 | 5612 | 19131 | 22934 | 29163 | NA |
11 | 1024/50/10/1000 | 3040 | 13680 | 44080 | 59280 | 74480 | NA |
12 | 1024/50/10/2000 | 3041 | 13681 | 44081 | 59281 | 74481 | NA |
13 | 1024/100/1/1000 | 3040 | 13680 | 18591 | 19600 | 24314 | NA |
14 | 1024/100/1/2000 | 1403 | 5556 | 17471 | 24412 | 30533 | NA |
15 | 1024/100/10/1000 | 3040 | 13680 | 44080 | 59280 | 74480 | NA |
16 | 1024/100/10/2000 | 3041 | 13681 | 44081 | 59281 | 74481 | NA |
Table 2 shows the maximum queue sizes for the 16 experiments, for N = 3, 10, 30, 40, 50 and 100 sources. From Table 2, the following observations can be made. (These observations support the analytical results.)
We have artificially generated a worst case scenario and studied the buffer requirements for TCP over ABR. The buffer size required depends on the switch scheme used. Our simulations are based on our ERICA+ switch scheme. For the worst case scenario generated, both the analytical predictions and the simulation results for the queue lengths (and hence the buffer requirements) are given. The simulation agrees well with the analytical prediction. The queue lengths are affected by the maximum congestion window size, the round trip time, the network congestion state (overloaded or underloaded), and the number of sources. They are not affected by the maximum segment size.