A Survey of Application Layer Techniques for Adaptive Streaming of Multimedia
Bobby Vandalore, Wu-chi Feng, Raj Jain, Sonia Fahmy
Department of Computer and Information Science
The Ohio State University, Columbus, OH, USA
E-mail: jain@cse.wustl.edu
Abstract
The current Internet only supports best-effort traffic. New high-speed technologies such as ATM (asynchronous transfer mode), gigabit Ethernet, fast Ethernet, and frame relay, have spurred higher user expectations. These technologies are expected to support real-time applications such as video-on-demand, Internet telephony, distance education and video-broadcasting. Towards this end, networking methods such as service classes and integrated service models are being developed.
Today's Internet is a heterogeneous networking environment. In such an environment, resources available to multimedia applications vary. To adapt to the changes in network conditions, both networking techniques and application layer techniques have been proposed. In this paper, we focus on the application techniques, including methods based on compression algorithm features, layered encoding, rate shaping, adaptive error control, and bandwidth smoothing. We also discuss operating system methods to support adaptive multimedia. Throughout the paper, we discuss how feedback from lower networking layers can be used by these application-level adaptation schemes to deliver the highest quality content.
Keywords: Adaptive Multimedia applications, QoS, Rate Shaping, Smoothing, Adaptive Error Control
The Internet was designed for best-effort data traffic. With the development of high-speed technologies such as ATM (asynchronous transfer mode), gigabit Ethernet, and frame relay, user expectations have increased. Real-time multimedia applications including video-on-demand, video-broadcast, and distance education are expected to be supported by these high-speed networks. Organizations such as the IETF (Internet Engineering Task Force), the ATM Forum, and the ITU-T (the telecommunication standardization sector of the International Telecommunication Union) are developing new protocols (e.g., the real-time transport protocol), service models (e.g., integrated services and differentiated services), and service classes (e.g., ATM Forum service categories and ITU-T transfer capabilities) to support multimedia application requirements.
The Internet is a heterogeneous environment connecting various networking technologies. Even with networking support through service classes, the network resources available to a multimedia application will be variable. For example, network conditions may change due to differences in link speeds (ranging from 28.8 kbps modem links to 622 Mbps OC-12 links) or variability in a wireless environment caused by interference and mobility. One way of achieving the desired quality of service in such situations is to massively over-provision resources for multimedia applications, but this solution is inefficient. Without over-provisioning, network resources can be used efficiently if multimedia applications are capable of adapting to changing network conditions.
Adaptation of multimedia applications can be done at several layers of the network protocol stack. At the physical layer, adaptive power control techniques can be used to mitigate variations in a wireless environment. At the data link layer, error control and adaptive reservation techniques can be used to protect against variations in error rate and available rate. At the network layer, dynamic re-routing mechanisms can be used to avoid congestion and mitigate variations in a mobile environment. At the transport layer, dynamic re-negotiation of connection parameters can be used for adaptation. Applications can use protocols such as the real-time streaming protocol (RTSP) [1] and the real-time transport protocol (RTP) [2]. At the application layer, the application can adapt to changes in network conditions using several techniques including hierarchical encoding, efficient compression, bandwidth smoothing, rate shaping, error control, and adaptive synchronization.
This paper focuses mainly on the application layer techniques for adaptation. The rest of the paper is organized as follows. Section 2 gives an overview of the compression methods and discusses techniques for adaptation based on these methods. Section 3 discusses the application-level streaming adaptation techniques. These techniques include both reactive and passive methods of adaptation. Throughout the paper, we also discuss how low-level network feedback (such as available capacity or error rate) can be, and in some cases already is, used in adaptation methods.
While the rest of the paper deals with multimedia in general, in this section we briefly examine video compression algorithms and their features which are useful for adaptation. There are two reasons for focusing on video: (1) video requires more bandwidth (100 kbps to 15 Mbps) than audio (8 kbps to 128 kbps), and (2) humans are more sensitive to loss of audio than to loss of video. Hence, we generally should bias towards adapting the video part of the multimedia application.
Transmitting raw video information is inefficient. Hence, video is invariably compressed before transmission. The three main compression techniques used for video are: (1) discrete cosine transform (DCT) based, (2) wavelet transform based, and (3) proprietary methods. Other methods of compressing video not discussed here include vector quantization [3, 4] and content-based compression. Adapting to changing network conditions can be achieved by a number of techniques at the compression level (video encoder), including layered encoding, changing parameters of compression methods, and using efficient compression methods. When bandwidth is scarce, the video source can reduce its encoding rate by temporal scaling (reducing frame rate) or spatial scaling (reducing resolution).
DCT is the compression method used in the popular MPEG (Moving Picture Experts Group) set of standards [5]. MPEG standards cover both audio and video signals. MPEG-2, MPEG-1, and JPEG (an earlier standard for still images) all use the discrete cosine transform, which maps the signal to the frequency domain. The transformed coefficients are quantized using scalar quantization and run-length encoded before transmission. Higher-frequency coefficients are truncated, since the human eye is relatively insensitive to them. The compression relies on two basic methods: intra-frame DCT coding to reduce spatial redundancy, and inter-frame motion compensation to reduce temporal redundancy. MPEG-2 video has three kinds of frames: I, P, and B. I frames are independent frames compressed using only intra-frame coding. P frames are predictive: they carry motion vectors and the signal difference from the previous reference frame. B frames are interpolated, i.e., encoded based on both the previous and the next reference frame. MPEG-2 video is transmitted in group of pictures (GoP) format, which specifies the distribution of I, P, and B frames in the video stream.
There are several aspects of the MPEG compression methods which can be used for adaptation. First, the rate of the source can be changed by using different quantization levels and encoding rates [6, 7]. Second, DCT coefficients can be partitioned and transmitted in two layers with different priorities. The base layer carries the important video information and the additional layer improves the quality. In the event of congestion, the lower priority layer can be dropped to reduce the rate [8, 9, 10].
In wavelet compression, the image is divided into various sub-bands with increasing resolutions. Image data in each sub-band is transformed using a wavelet function to obtain transformed coefficients. The transformed coefficients are then quantized and run-length encoded before transmission. In a sense, wavelet compression results in progressively encoded video. Two common approaches for wavelet compression are to use a motion-compensated 2-dimensional (2-D) wavelet function [11] or a 3-D wavelet [12].
Wavelet compression overcomes the blocking effects of DCT-based methods since the entire image is used in encoding instead of blocks. An important feature of wavelet transforms is the support of scalability for image and video compression. Wavelet transforms coupled with encoding techniques provide support for continuous rate scalability, where the video can be encoded at any desired rate within the scalable range [13]. A wavelet encoder can benefit from network feedback such as available capacity to achieve scalability [14].
Commercial applications such as RealNetworks Inc.'s RealVideo and Intel's Indeo use proprietary methods for compression and adaptation. These proprietary schemes use both DCT-based and wavelet-based techniques for compression. A distinguishing feature of these methods is that they are optimized to work at particular bandwidths such as 28.8 kbps and 56 kbps. Some of the techniques used for adaptation by these applications are discussed later in the paper.
In layered encoding, a passive method, the video information is encoded into several layers. The base layer carries the important video information (the lower-order DCT coefficients) and critical timing information. The higher layers progressively improve the quality of the video. The receiver can get reasonable quality with the base layer alone, and quality improves with reception of higher layers. The encoder assigns priorities to the encoded layers, with the base layer having the highest priority. When the network transmits layered video, it can drop lower priority (higher) layers in the event of congestion.
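The priority-driven dropping decision can be sketched as follows. This is an illustrative sketch, not any specific codec's logic; the layer names and rates are invented for the example:

```python
def drop_layers(layers, capacity):
    """Given (name, rate_kbps) layers ordered base-first (highest priority
    first), keep the longest prefix whose total rate fits the available
    capacity; all remaining lower-priority layers are dropped."""
    kept, total = [], 0
    for name, rate in layers:
        if total + rate > capacity:
            break  # drop this layer and all lower-priority ones above it
        kept.append(name)
        total += rate
    return kept

# Hypothetical three-layer stream: base plus two enhancement layers.
layers = [("base", 100), ("enh1", 150), ("enh2", 250)]  # kbps, illustrative
print(drop_layers(layers, 300))  # enh2 does not fit and is dropped
```

Because layers are cumulative, dropping always proceeds from the highest (least important) layer downward, never removing the base layer before an enhancement layer.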
A discussion of adaptive transmission of multi-layered video is given in [15]. Here the layered encoding method is made reactive by adding or dropping layers based on network feedback. The paper discusses both credit-based and rate-based approaches for providing feedback.
An optimal data partitioning method for MPEG-2 encoded video is discussed in [8]. In this method, the data in the MPEG-2 stream is partitioned into two layers (which can be done even after encoding). The two streams are transmitted over an ATM-based network, with ATM cells of the lower priority stream having their CLP (cell loss priority) bit set. The paper discusses an optimal algorithm to partition the MPEG-2 video stream. The problem is posed as an optimization problem and the Lagrangian optimization technique is used for finding the optimal partitioning for I, P, and B frames. Data partitioning methods can benefit from network feedback. For example, if the network indicates that more bandwidth is available, more data can be sent in the base layer, and conversely data in the base layer can be reduced when bandwidth is scarce.
Receiver-driven Layered Multicast (RLM), a reactive method, was the first published scheme describing how layered video can be transmitted and controlled [16]. In RLM, receivers dynamically subscribe to the different layers of the video stream. Receivers use "probing" experiments to decide when they can join a layer. Specifically, if a receiver detects congestion, it quits the multicast group of the current highest layer (drops a layer), and when extra bandwidth is available, it joins the next layer (adds a layer). Network congestion is detected through packet loss. Extra capacity is detected by join experiments. In a join experiment, the receiver measures the packet loss after joining. The join experiment fails if the packet loss is above a certain threshold. Receiver join experiments are randomized to avoid synchronization. The overhead of join experiments in the presence of a large number of receivers is controlled by receivers learning from the join experiments of others, instead of initiating their own.
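The core receiver loop can be sketched as below. This is a deliberately simplified illustration: the loss threshold is an invented value, and real RLM additionally uses randomized join-timers and shared learning among receivers:

```python
class RLMReceiver:
    """Sketch of RLM-style layer subscription driven by loss reports."""

    def __init__(self, num_layers, loss_threshold=0.05):
        self.num_layers = num_layers
        self.loss_threshold = loss_threshold  # illustrative congestion signal
        self.subscribed = 1                   # the base layer is always kept

    def on_loss_report(self, loss_rate):
        """Process one measurement interval and return the new layer count."""
        if loss_rate > self.loss_threshold:
            # Congestion: leave the multicast group of the highest layer.
            self.subscribed = max(1, self.subscribed - 1)
        elif self.subscribed < self.num_layers:
            # Join experiment: tentatively subscribe to the next layer.
            self.subscribed += 1
        return self.subscribed
```

A loss-free interval triggers a join experiment, while a lossy interval sheds the top layer; the base layer is never dropped, preserving minimal quality.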
Currently, extra capacity is only estimated in RLM. Low-level network feedback can aid the receivers in precisely measuring the available capacity. Hence, this scheme would benefit if network layer feedback were used.
Layered video multicast with retransmission [17] is another method which uses layered video. The issue of inter-session fairness and scalable feedback control of layered video is discussed in [18].
Rate shaping techniques are reactive and attempt to adjust the rate of traffic generated by the video encoder according to the current network conditions. Feedback mechanisms are used to detect changes in the network and control the rate of the video encoder.
Video has traditionally been transported over connections with constant bit rate (e.g., telephone or cable TV networks). The rate of a video sequence changes rapidly due to scene content and motion. The variable rate video is sent to a buffer which is drained at a constant rate. In such a situation, the video encoder can achieve constant rate by controlling its compression parameters based on feedback information such as the buffer occupancy level.
A similar technique is used for adapting video and audio to network changes. In these cases, feedback from the network is used instead of local buffer information. Control mechanisms for audio and video are presented in [7, 19].
Rate shaping of the IVS video coder (which uses the H.261 standard) is discussed in [7]. The rate shaping can be obtained by changing one or more of the following: the frame refresh rate, the quantizer, and the movement detection threshold.
Two modes are used for controlling the rate of the encoder. In Privilege Quality mode (PQ mode), only the refresh rate is changed. In Privilege Rate mode (PR mode), only the quantizer and movement detection threshold are changed. PQ mode control results in higher frame rates, but with lower SNR (signal-to-noise ratio) than PR mode.
Packet loss information is used as feedback. The receiver periodically sends its current loss rate. The following simple control algorithm is used to dynamically control the rate of the video encoder:
If median loss > tolerable loss
    rate = max(rate / 2, min_rate)
else
    rate = min(rate + increment, peak_rate)
This multiplicative decrease, additive increase mechanism adapts well to network changes.
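One control step can be sketched as follows. The numeric parameters (kbps values, gain, and increment) are illustrative defaults, not values taken from the IVS coder:

```python
def adapt_rate(rate, median_loss, tolerable_loss,
               min_rate=64.0, peak_rate=1024.0, gain=2.0, increment=32.0):
    """One step of the loss-driven rate control loop described above."""
    if median_loss > tolerable_loss:
        return max(rate / gain, min_rate)        # multiplicative decrease
    return min(rate + increment, peak_rate)      # additive increase

rate = adapt_rate(512.0, median_loss=0.10, tolerable_loss=0.05)  # -> 256.0
rate = adapt_rate(rate, median_loss=0.01, tolerable_loss=0.05)   # -> 288.0
```

The sharp cut on loss and the gentle ramp-up on recovery give the encoder the same conservative bias toward avoiding congestion that TCP-style controls use.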
Two-dimensional scaling changes both the frame rate and the bit rate based on the feedback [20]. Experimental results show that the system performs well in a rate constrained environment such as the Internet. A heuristic (success rate) is used to decide whether the rate can be increased. Low-level network feedback information, if available, can replace this heuristic.
Rate shaping mechanisms use similar methods but differ in how the rate shaping is achieved. Other rate change approaches include block dropping [21] and frame dropping [22].
The error rate is variable in a wireless network due to interference, and the loss rate is variable in the Internet due to congestion. Multimedia applications need to adapt to changes in error and loss rates. Two approaches to mitigate errors and losses are Automatic Repeat Request (ARQ) and Forward Error Correction (FEC). ARQ is a closed-loop, reactive mechanism in which the destination requests the source to retransmit the lost packets. FEC is an open-loop, passive method in which the source sends redundant information, which can partly recover the original information in the event of packet loss. ARQ increases the end-to-end delay dramatically in networks such as the Internet. Hence, ARQ is not suitable for error control of multimedia applications in the Internet. It may be used in high-speed LANs where round trip latencies are small.
Other error control methods include block erasure codes, convolutional codes, interleaving and multiple description codes.
An adaptive FEC-based error control scheme (a reactive method) for interactive audio in the Internet is proposed in [23]. The FEC scheme used is the "signal processing" FEC mechanism [24]. In this scheme, the (n+1)st packet includes, in addition to its own encoded signal samples, information about packet n which can be used to approximately reconstruct packet n. The IETF recently standardized this scheme for use in Internet telephony. The scheme works only for isolated packet losses, but can be generalized to tolerate consecutive packet losses by adding redundant versions of earlier packets (n-1 and n-2). The FEC-based scheme needs more bandwidth, so it should be coupled with a rate control scheme. The joint rate/FEC scheme can be used to adaptively control the rate and the amount of redundant information to be sent by the FEC method. The inventors of the scheme formulate the problem of choosing the FEC method to use, under the constraints of the rate control scheme, as an optimization problem. A simple algorithm is used to find the optimal scheme. Actual measurements of the scheme for audio applications between France and London have shown that the scheme performs well and the perceptual quality of the audio is good.
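The piggybacking idea can be sketched as below. For brevity the redundant copy is identical to the original samples; in the actual scheme it is a lower-bit-rate re-encoding. The packet layout and function names are invented for the illustration:

```python
def make_packets(samples):
    """Each packet carries its own samples plus a redundant copy of the
    previous packet's samples (here verbatim; in practice lower-bit-rate)."""
    pkts, prev = [], None
    for n, s in enumerate(samples):
        pkts.append({"seq": n, "data": s, "prev": prev})
        prev = s
    return pkts

def playout(pkts, lost):
    """Reconstruct the stream: an isolated lost packet n is approximately
    recovered from the redundant copy carried by packet n+1."""
    received = {p["seq"]: p for p in pkts if p["seq"] not in lost}
    out = []
    for n in range(len(pkts)):
        if n in received:
            out.append(received[n]["data"])
        elif n + 1 in received and received[n + 1]["prev"] is not None:
            out.append(received[n + 1]["prev"])  # approximate reconstruction
        else:
            out.append(None)                     # unrecoverable gap
    return out
```

Running `playout` with a single lost packet recovers the full stream, while two consecutive losses leave a gap, matching the isolated-loss limitation noted above.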
An adaptive FEC-based scheme for Internet video is discussed in [25]. A packet can carry redundant FEC information for up to three previous packets, i.e., packet n can carry redundant information about packets n-1, n-2, and n-3. Let n-i indicate that packet n includes information about packet n-i. The different possible combinations of these methods are: (n), (n, n-1), (n, n-2), (n, n-1, n-2), and (n, n-1, n-2, n-3), numbered combination-1 through combination-5. Different combinations can be used to adapt to network changes. Network changes are detected through packet loss, and a loss threshold (high loss) is used in the adaptation algorithm. The following simple adaptation algorithm was used:
If loss > high loss
    combination = min(combination + 1, 5)
else
    combination = max(combination - 1, 1)
This algorithm adds more error protection when there is more loss and less protection when the losses are low.
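The combination table and the adaptation step can be sketched directly; the dictionary encoding of the combinations is an illustrative representation:

```python
# Offsets covered by each combination: packet n also carries redundant
# information about packet n - i for each offset i listed.
COMBINATIONS = {1: (), 2: (1,), 3: (2,), 4: (1, 2), 5: (1, 2, 3)}

def adapt_combination(comb, loss, high_loss):
    """One step of the loss-driven FEC adaptation described above."""
    if loss > high_loss:
        return min(comb + 1, 5)   # add redundancy under heavy loss
    return max(comb - 1, 1)       # shed redundancy when loss subsides
```

Stepping one combination at a time keeps the redundancy overhead from oscillating wildly when the loss rate hovers around the threshold.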
One way to use network feedback in this method is to couple the available rate and the FEC combination used. For example, information about the available rate and loss rate obtained as feedback from the network can be used to choose the FEC combination for error protection.
Synchronization is an important problem for multimedia applications. Synchronization problems arise due to clock frequency drift, network delay, and jitter. Adaptive synchronization can be used for multipoint multimedia teleconferencing systems [26]. The adaptive synchronization technique proposed in [26] is immune to clock offset and/or clock frequency drift, does not need a global clock, and provides the optimal delay and buffering for the given QoS requirement. The adaptive synchronization technique can be used to solve both intramedia (in a single stream) synchronization and intermedia (among multiple streams) synchronization.
The main idea used in the synchronization algorithm is to divide arriving packets into wait, no wait, and discard categories. Wait packets are displayed after a delay, no wait packets are displayed immediately, and discard packets are dropped.
The basic adaptive synchronization algorithm requires the user to specify the acceptable synchronization error, maximum jitter, and maximum loss ratio. The sender is assumed to put a timestamp in the packets. At the receiver, the playback clock (PBC) and three counters for no wait, wait, and discard packets are maintained. The algorithm specifies that when packets arrive early and enough wait packets have been received, the PBC is incremented. Similarly, when a threshold of no wait or discard packets is reached, the PBC is decremented. This adaptive algorithm is shown to be immune to clock drift. Achieving intramedia synchronization is a straightforward application of the basic algorithm. For intermedia synchronization, a group PBC is used, which is incremented and decremented based on the slowest of the streams to be synchronized.
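The PBC bookkeeping can be sketched as below. This is a simplified illustration: the single counter threshold stands in for the thresholds the paper derives from the user's QoS specification, and timestamps are treated as integer slot numbers:

```python
class PlaybackClock:
    """Sketch of the wait / no-wait / discard playback-clock adjustment."""

    def __init__(self, pbc=0, threshold=3):
        self.pbc = pbc                # current playback clock value
        self.threshold = threshold    # illustrative counter threshold
        self.wait = self.no_wait = self.discard = 0

    def on_packet(self, timestamp):
        """Classify one packet and adjust the PBC when a counter fills."""
        if timestamp > self.pbc:      # early packet: hold before display
            self.wait += 1
            if self.wait >= self.threshold:
                self.pbc += 1         # enough early packets: speed up PBC
                self.wait = 0
            return "wait"
        if timestamp == self.pbc:     # on time: display immediately
            self.no_wait += 1
            kind = "no wait"
        else:                         # late packet: drop it
            self.discard += 1
            kind = "discard"
        if self.no_wait + self.discard >= self.threshold:
            self.pbc -= 1             # enough on-time/late packets: slow PBC
            self.no_wait = self.discard = 0
        return kind
```

Because the clock moves only when a counter fills, isolated jitter does not perturb playback, while a sustained drift steadily pulls the PBC toward the sender's clock.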
The network delay is only estimated in this adaptive algorithm. The adaptation can benefit if the low-level feedback provides accurate information on the delay experienced in the network.
One way to mitigate the rate variations of the multimedia application is to perform shaping or smoothing of the video information transmitted. Recent studies show that smoothing allows for greater statistical multiplexing gain.
For live (non-interactive) video, a sliding window of buffers can be used, and the buffer can be drained at the desired rate. This method is used in SAVE (smoothed adaptive video over explicit rate networks) [6], where a small number of frames (30) is buffered in a window. The video is transmitted over the ATM ABR (available bit rate) service, where the feedback from the network is indicated explicitly. The SAVE algorithm (a reactive method) uses this feedback information to dynamically change the quantizer value of the MPEG-2 encoder. Note that this method already uses the low-level network feedback. Similar approaches have been proposed in [27, 28].
For pre-recorded (stored) video, a priori video (frame) information can be utilized to smooth the video traffic at the source. Bandwidth smoothing (a passive method) can reduce the burstiness of compressed video traffic in video-on-demand applications.
The main idea behind smoothing techniques is to send large frames, which need to be displayed later, ahead of time when there is enough buffer space at the client. There has been considerable research in this area, resulting in several smoothing algorithms [29, 30, 31, 32, 33]. These differ in the optimality condition achieved, and in whether they assume that the rate is constrained or that the client buffer size is limited. A good comparison of bandwidth smoothing algorithms is given in [34]. In the next subsection, we discuss the idea of bandwidth smoothing in more detail.
A compressed video stream consists of n frames, where frame i requires f_i bytes of storage. To permit continuous playback, the server must always transmit video frames ahead to avoid buffer underflow at the client. This requirement can be expressed as:

    sum_{i=1}^{k} c_i >= F_under(k) = sum_{i=1}^{k} f_i

where F_under(k) indicates the amount of data consumed at the client when it is displaying frame k (1 <= k <= n). Similarly, the client should not receive more data than its buffer capacity. This requirement is represented as:

    sum_{i=1}^{k} c_i <= F_over(k) = F_under(k) + b

where b is the client buffer size. Consequently, any valid transmission plan should stay within the river outlined by these vertically equidistant functions. That is,

    F_under(k) <= sum_{i=1}^{k} c_i <= F_over(k)

where c_i is the amount of data transmitted during frame slot i of the smoothed video stream.
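These constraints can be checked directly for a candidate plan. The frame sizes and buffer value below are invented for the example:

```python
from itertools import accumulate

def feasible(frames, plan, buf):
    """Check that a transmission plan stays inside the 'river':
    F_under(k) <= sum_{i<=k} c_i <= F_under(k) + b for every slot k."""
    under = list(accumulate(frames))   # F_under(k): cumulative consumption
    sent = list(accumulate(plan))      # cumulative data transmitted
    return all(u <= s <= u + buf for u, s in zip(under, sent))

frames = [10, 40, 10, 40]              # f_i: bytes per frame, illustrative
print(feasible(frames, [25, 25, 25, 25], buf=30))  # constant-rate plan fits
```

Shrinking the buffer to 10 bytes makes the same constant-rate plan overflow during the first slot, showing how the river narrows as client memory decreases.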
Generating a bandwidth plan is done by finding m consecutive runs, each using a constant bandwidth r_j. Within each run, the frames are transmitted at this constant rate. Rate changes occur to avoid buffer overflow or buffer underflow. Mathematically, the runs of a bandwidth plan must be such that the amount of frame data transferred forms a monotonically increasing, piecewise linear function.
Different bandwidth smoothing algorithms result from different choices of the rate changes among the bandwidth runs. Several optimal bandwidth allocation plan-generating algorithms are discussed in [29]. These algorithms achieve optimality criteria such as the minimum number of bandwidth changes, the minimum peak rate requirement, and the largest minimum bandwidth requirement. We discuss below an online smoothing algorithm, a proactive buffering mechanism, and two algorithms which combine smoothing and rate shaping techniques.
Live video applications such as lecture and news broadcasts are delay tolerant, in the sense that the user does not mind if the video is delayed on the order of a few seconds (or even a minute). For these live video applications, smoothing techniques (passive methods) can significantly reduce the resource variability.
Several window-based online smoothing algorithms (passive methods) are discussed in [35]. In the first approach, a hop-by-hop window smoothing algorithm is used. Here the server stores up to a window of W frames. The smoothing algorithm is performed over this window of frames, taking into consideration the server and client buffer constraints. After the transmission of W frames, the smoothing algorithm is performed for the next set of W frames. This algorithm does not handle an inter-mixture of large I frames among P and B frames well, since the transmission of an I frame can be amortized only within its own window. To mitigate this, consecutive windows can be aligned so that each window ends with an I frame.
While in the hop-by-hop algorithm the server cannot prefetch data across window boundaries, the sliding-window method SLWIN(W) uses a sliding window of size W for smoothing. The smoothing algorithm is re-executed periodically as the window slides forward over the next W frames. The sliding window performs better but is more complex, since the smoothing algorithm is executed more times than in the hop-by-hop method.
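The hop-by-hop variant can be sketched as below. For brevity this sketch transmits each window at its average rate and ignores the server and client buffer constraints that the full algorithm enforces; the sliding-window method would instead re-run the computation as each new frame arrives:

```python
def hop_by_hop_plan(frames, W):
    """Smooth a frame-size sequence window by window: transmit each
    disjoint window of W frames at that window's average rate."""
    plan = []
    for start in range(0, len(frames), W):
        window = frames[start:start + W]
        avg = sum(window) / len(window)   # constant rate for this window
        plan.extend([avg] * len(window))
    return plan

print(hop_by_hop_plan([10, 30, 20, 40], 2))  # two windows, two rates
```

Because windows are disjoint, a large I frame just past a window boundary cannot borrow capacity from the preceding window, which is exactly the limitation the sliding-window method removes.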
Another passive method, rate constrained bandwidth smoothing for stored video, is given in [36]. Here, the rate is assumed to be constrained to a given value (for example, the minimum cell rate (MCR) in the ABR service). The algorithm proactively manages buffers and bandwidth. This method uses the rate constrained bandwidth smoothing algorithm (RCBS) [31], which minimizes the buffer utilization for a given rate constraint. In RCBS, the movie frames are examined in reverse order from the end of the movie. The large frames which require more than the constrained rate are prefetched. These prefetches fill the gaps left by earlier smaller frames.
The proactive method identifies feasible regions of the movie. A feasible range is one where the average rate requirement of the range is less than the constrained rate. The movie frames are examined in reverse order and feasible regions are identified. The algorithm keeps track of the expected buffer occupancy at the client side and finds the maximal feasible regions. When the rate constraint is violated, frames are dropped. The dropped frames are spaced apart to avoid consecutive frame drops. The proactive method maximizes the minimum frame rate for a given rate constraint.
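The reverse-scan prefetching idea can be sketched as follows. This simplified sketch pushes any excess above the rate cap into earlier slots and ignores the client buffer limit that RCBS also respects; a nonzero leftover at the start signals an infeasible cap:

```python
def rcbs_plan(frames, rate_cap):
    """Scan frame sizes in reverse; when a slot exceeds the rate cap,
    carry the excess into earlier slots (prefetching)."""
    plan = list(frames)
    carry = 0
    for i in range(len(plan) - 1, -1, -1):
        plan[i] += carry                    # absorb excess from later frames
        carry = max(0, plan[i] - rate_cap)  # push any overflow earlier
        plan[i] = min(plan[i], rate_cap)
    # carry > 0 here means no plan fits under this rate cap
    return plan, carry

plan, leftover = rcbs_plan([10, 10, 50, 10], rate_cap=30)
```

In this example the 50-byte frame's excess is prefetched into the idle slot just before it, yielding a plan that never exceeds the cap.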
The low-level network feedback can be used in this method as follows: assume that the network can guarantee the constrained rate and inform the source through feedback if any extra bandwidth is available. This extra bandwidth and the current buffer occupancy level can be used to decide whether additional frames can be sent.
Passive adaptation techniques like bandwidth smoothing take advantage of a priori information to reduce the burden on the network; however, they do not actively alter the video stream to make it network-sensitive. Reactive techniques usually do not take advantage of the a priori information and hence may not provide the best possible video quality over best-effort networks. Some recent work has focused on augmenting reactive techniques to take advantage of this a priori knowledge.
A priority-based technique (reactive method) is used to deliver prerecorded compressed video over best-effort networks in [37]. Multiple priority queues are used, in addition to a window at each level, to help smooth the video frame rate while allowing it to change according to changing network conditions. The scheme uses frame dropping (an adaptation technique) and a priori knowledge of frame sizes. The scheme tries to deliver the frames of the highest priority level (base layer) before delivering the frames of enhancement layers. Adaptation is accomplished by dropping frames at the head of the queue when enough resources are not available.
Another algorithm which combines the smoothing and rate changing technique (frame dropping) is discussed in [38]. An efficient algorithm to find the optimal frame discards for transporting stored video over best-effort networks is given. The algorithm uses the selective frame discarding technique. The problem of finding the minimum number of frame discards for a sequence of frames is posed as an optimization problem. A dynamic programming based algorithm and several simpler heuristic algorithms are given to solve this problem.
In this section, we present two commercial adaptive applications: (1) the RealNetworks suite, and (2) Vosaic: video mosaic. These commercial applications are currently available and incorporate some of the adaptation techniques discussed in the previous section. They also incorporate additional optimization techniques. There are several other adaptive multimedia applications and architectures developed by academia, such as Berkeley's vic [16], the videoconferencing system for the Internet (IVS) [7] developed by INRIA in France, Berkeley's continuous media toolkit (CMT) [39], OGI's adaptive MPEG streaming player [40], and MIT's ViewStation [41]. Most of these applications have contributed to the research results on adaptation methods discussed in earlier sections.
RealNetworks' SureStream uses the Adaptive Stream Management (ASM) functionality available in the RealSystem API (application program interface). ASM provides rules to describe the data streams. These rules provide facilities such as marking priorities and indicating average bandwidth for a group of frames. This information is used by the server for achieving adaptability. For example, the server might drop lower priority frames when the available rate decreases. A condition in a rule can specify different client capabilities. For example, it can indicate that the client will be able to receive at 5 to 15 kbps and can tolerate a packet loss of 2.5 percent. If the network conditions change, the clients can subscribe to another appropriate rule.
The techniques used in RealVideo and SureStream can benefit from low-level network feedback. For example, instead of detecting bandwidth changes through measurements, the server can use the available bandwidth information from the lower network layer to choose the appropriate stream to transmit.
Vosaic's video datagram protocol (VDP) uses two connections: an unreliable one for streaming data and a reliable one for control. On the control channel, the client application can issue VCR-like instructions such as play, stop, fast forward, and rewind. An adaptation algorithm is used to adjust the rate of the stream according to network conditions. The client reports two metrics measured at the client, the frame drop rate and the packet drop rate, to the server as feedback over the control channel. The server initially transmits frames at the recorded rate and adjusts the frame rate based upon the feedback received from the client side. Experimental results show that the frame rate improves considerably when the VDP protocol and the adaptation algorithm are used (e.g., the frame rate improved to 9 frames/sec from 0.2 frames/sec).
Vosaic can definitely benefit from low-level network feedback. Currently, the network condition is detected by measurements of the received frame rate at the client and sent to the server. Instead, the server can use network feedback such as available rate to dynamically adjust its frame rate.
Conventional operating systems are not designed for multimedia applications. For example, playback applications need to access CPU resources periodically during playout. This requires that the operating system provide ways for multimedia applications to access resources predictably. To develop adaptive multimedia applications, there is a need for an operating system capability which can provide information about available resources. In this section, we discuss some techniques used in operating systems to support adaptive multimedia streaming.
Multimedia applications use multiple resources, and resources such as CPU availability and bandwidth change dynamically. An integrated QoS management system to manage CPU, network, and I/O resources is proposed in [45]. This cooperative model enables multimedia end-systems and the OS to cooperate dynamically in adaptively sharing end-system resources. The thesis of this work is that end-system resources should be allocated and managed adaptively. The proposed OS architecture, called AQUA (Adaptive Quality of service Architecture), aims to achieve this objective.
In AQUA, when an application starts, it specifies a partial QoS (for example, a video application can specify frame rate without specifying a bandwidth requirement). The OS allocates initial resources such as CPU time based on this QoS specification. As the application executes, the OS and the application cooperate to estimate the resource requirements and the QoS received. Resource changes are detected by measuring the QoS. Then, the OS and the application renegotiate and adapt to provide predictable QoS under the current resource constraints. To enable these functionalities, the AQUA framework includes a QoS manager, a QoS negotiation library, and a usage-estimation library.
The application specifies an adaptation function when the connection is set up. The QoS manager calls this function when it detects changes in QoS. Using this methodology, a CPU-resource manager and a network-I/O manager have been implemented in AQUA. A composite QoS manager uses the services of both the CPU and network-I/O managers. This facilitates an integrated way to manage resources in AQUA.
The AQUA framework can use low-level network feedback to detect current availability of network resources. The QoS measuring function can be enhanced by using the network layer feedback.
Multimedia applications need to periodically access resources such as the CPU. The operating system needs to schedule multimedia applications appropriately to support such needs. The CPU requirement of a multimedia application might change dynamically due to frame rate changes caused by scene changes or network conditions. A framework called Adaptive Rate-Controlled (ARC) scheduling is proposed to solve this problem in [46]. It consists of a rate-controlled online CPU scheduler, an admission control interface, a monitor, and a rate adaptation interface.
ARC operates in an operating system which supports threads (Solaris 2.3). The threads can be of three types: RT (real-time), SYS (system), or TS (timesharing). RT threads have the highest priority in accessing the CPU. The online CPU scheduler schedules the threads belonging to these classes based on their priorities.
Adaptive rate-controlled scheduling is achieved as follows: multimedia threads register with the monitor thread during connection setup. The monitor thread is executed periodically (every 2 seconds). For each registered thread, it estimates the lag (how far the thread is running ahead of or behind its rate) and the laxity (how much of the CPU is unused). This estimate is given as feedback to the multimedia application, which increases or decreases its CPU access rate accordingly.
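The monitor/adjust cycle can be sketched as follows. The function names, the lag and laxity formulas, and the 0.05 step are illustrative assumptions, not the ARC interface:

```python
def monitor_step(used_time, period, reserved_fraction):
    """One monitor pass: positive lag means the thread used more CPU than
    its reserved rate; laxity is the unused share of the CPU this period."""
    actual_fraction = used_time / period
    lag = actual_fraction - reserved_fraction
    laxity = max(0.0, 1.0 - actual_fraction)
    return lag, laxity

def adjust_rate(reserved_fraction, lag, laxity, step=0.05):
    """The application's response to monitor feedback: ask for more CPU
    when there is slack, back off when it is overrunning its reservation."""
    if lag < -step and laxity > step:
        return reserved_fraction + step              # slack: request more
    if lag > step:
        return max(step, reserved_fraction - step)   # overrun: back off
    return reserved_fraction                         # within tolerance
```

A thread reserved at 50% of the CPU that only uses 25% in a period would, under this sketch, shrink toward its actual demand, freeing CPU for other multimedia threads.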
This method can benefit from low-level network feedback. For example, the multimedia application can use the available bandwidth indicated in network layer feedback to change both its encoding rate and its CPU access rate.
In this section, we present a summary of some related work. Most of these works are related to supporting multimedia, though not directly dealing with the problem of adapting multimedia streaming to changing network conditions. When appropriate, we identify whether the work could be used for achieving adaptation of multimedia streaming applications.
Adaptation to changing network conditions can be achieved at several layers of the network protocol stack. In this paper, we surveyed several techniques for achieving adaptation at the application layer. A summary of these is as follows:
For each of these techniques, we discussed whether it could benefit from low-level network feedback. When appropriate, we discussed how the low-level feedback can be used to enhance the adaptation technique.