SIP Working Group                                          Sanjoy Sen
Internet Draft                                      Jayshree Bharatia
draft-sen-sip-earlymedia-00.txt                            Chris Hogg
Category: Informational                                Francois Audet
                                                      NORTEL NETWORKS
Expires: January 2002                                   July 13, 2001

                  Early Media Issues and Scenarios

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [5].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

For potential updates to the above required-text see: http://www.ietf.org/ietf/1id-guidelines.txt

Abstract

Once an INVITE is sent, a UAC should be able to handle incoming media at any time. Hence, a SIP terminal should be able to support early media. However, there are some issues with the way session descriptions (SDP) in provisional responses are handled, the optional use of resource reservation, the complexity introduced by forked INVITEs, and interactions with NAT/firewall traversal. This draft discusses some of these issues along with related IP-PSTN inter-working scenarios.

1 Introduction

Early media is the delivery of a media stream prior to call answer or session establishment. Normally, the term designates media sent before the call is actually completed. In terms of SIP, early media refers to the transmission of media before a 200 OK response is sent for an INVITE.

Early media is generally required to deliver inbound call progress messages when inter-working with the Public Switched Telephone Network (PSTN) or a Private Switched Telephone Network (i.e., a PBX). In the PSTN, a one-way voice path is established to the caller by the Address Complete Message (ACM). In the PBX, a one-way voice path is established to the caller based on the presence of a Progress indicator indicating in-band information. The one-way voice path is used for transmitting early media, such as busy tone, reorder tone, and announcements.

From the SIP perspective, early media can also be useful to avoid clipping of the backwards voice path when a call is answered. Clipping can happen because the audio media may unintentionally arrive at the originating user agent ahead of the 200 OK response to the INVITE.

According to [4], SIP terminals should be ready to receive media any time after sending an INVITE. Hence, a SIP terminal should be able to support early media. However, there are some issues with the way session descriptions (SDP) in provisional responses are handled, the optional use of resource reservation, the complexity introduced by forked INVITEs, and interactions with NAT/firewall traversal. This draft discusses some of these issues along with related IP-PSTN inter-working scenarios.

Although most of the early media scenarios discussed in this document refer to IP-PSTN inter-working, they are applicable to other IP-IP scenarios as well. Also note that, although many current SIP implementations do not support the origination of early media, it is assumed that this support will soon become an intrinsic feature of all SIP terminals. This draft discusses issues in supporting early media when dealing with SIP terminals.

1.1 Terminology

Callee
   Refers to the terminating host.

Caller
   Refers to the originating host.

Media Gateway (MG)
   The media gateway converts media provided in one type of network to the format required in another type of network. For example, an MG could terminate bearer channels from the PSTN (e.g., DS0s) and media streams from a packet network (e.g., RTP streams in an IP network).

Media Gateway Controller (MGC)
   The media gateway controller controls the parts of the call state that pertain to connection control for media channels in a Media Gateway.

ISUP Initial Address Message (IAM)
   This message is used to establish a connection on a specified circuit. It includes all the information required for handling an ISUP call.

ISUP Address Complete Message (ACM)
   This message is the response to an ISUP IAM. It indicates that the call is being processed and that the distant exchange is checking the availability of the called party. It can also mean that the called party is ringing/alerted. In the PSTN, a one-way voice path is established to the caller by the ACM message. This voice path is used to carry voice announcements and to transmit tones.

ISUP Answer Message (ANM)
   This message is sent in the same direction as the ACM to indicate that the called party has answered.

IP Terminal
   The term used to represent all end-user devices that originate and terminate SIP calls.

PBX
   A Private Branch Exchange is a private telephone switch. For the purpose of this document, the protocol used by the PBX is assumed to be Q.931 or a derivative (Q.SIG, etc.).

PSTN
   This is the Public Switched Telephone Network. The PSTN is also sometimes referred to as the GSTN (General Switched Telephone Network).

PSTN Origination
   The originator's ingress MGC receives ISUP from the PSTN network. This request is forwarded to the SIP network using either the SIP-T mechanism or direct translation of parameters from the received ISUP message to a SIP method.

PSTN Termination
   The terminator's egress MGC receives an INVITE from an IP terminal. This request is forwarded to the PSTN network using either the SIP-T mechanism or direct translation of parameters from the received SIP method to the ISUP message.

Q.931 SETUP message
   This message is used to establish a connection on a specified circuit. It includes all the information required for handling ISDN calls.

Q.931 Progress indicator
   A progress indicator can be included in a Q.931 CALL PROCEEDING, PROGRESS or ALERTING message. If the value of the progress indicator is #1, "Call is not end-to-end ISDN", or #8, "In-band information or an appropriate pattern is now available", it means that in-band tones and announcements (such as ringing or busy) may be provided by the far end.

Q.931 ALERTING message
   An ALERTING message indicates that the user is being alerted, and that unless in-band tones and announcements are provided, local ringing shall be provided.

Q.931 CONNECT message
   A CONNECT message indicates that the called party has answered.

2 Early Media Support in current version of SIP

Support of early media is implicit in the current version of SIP [4]. Once an INVITE with SDP is sent, a UAC should be able to handle incoming media at any time. According to [4]:

   "If a 1xx response contains a session description, a UAC SHOULD cease generating local ring-back tone."

It also states:

   "The UAS can remove the media stream by setting the port number to zero in a subsequent session description contained in a provisional response and thus restore normal ring-back behaviour."

If a 1xx is received without SDP, this does not cause any change in the behaviour of the UAC. Also, [4] mandates that for the UAS to send a provisional response with SDP, the UAC needs to send an INVITE with SDP.
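
The provisional-response behaviour described in this section can be sketched in a few lines of Python. This is a minimal illustration, not text from [4]: it assumes the SDP body is available as a string and that only the first m= line matters, and the names first_media_port and ringback_suppressed are hypothetical.

```python
import re
from typing import Optional


def first_media_port(sdp: Optional[str]) -> Optional[int]:
    """Return the port of the first SDP m= line, or None if there is none."""
    if not sdp:
        return None
    match = re.search(r"^m=\S+ (\d+)", sdp, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def ringback_suppressed(suppressed: bool, status: int, sdp: Optional[str]) -> bool:
    """Update the 'local ring-back suppressed' flag for one response.

    Assumed reading of the rule: a 1xx with SDP and a non-zero port
    suppresses local ring-back, a 1xx with port zero restores it, and
    a 1xx without SDP changes nothing.
    """
    if not 100 <= status <= 199:
        return suppressed  # final responses are outside this sketch
    port = first_media_port(sdp)
    if port is None:
        return suppressed  # no SDP: behaviour unchanged
    return port != 0


# Example sequence: 180 without SDP, 183 with SDP, 183 with port zero.
EARLY = "v=0\r\nm=audio 49170 RTP/AVP 0\r\n"
REMOVED = "v=0\r\nm=audio 0 RTP/AVP 0\r\n"
state = False
for status, sdp in [(180, None), (183, EARLY), (183, REMOVED)]:
    state = ringback_suppressed(state, status, sdp)
    print(status, "ring-back suppressed:", state)
```

Keeping the flag as explicit state makes the "no SDP, no change" case visible: the function returns whatever it was given.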

Notifying the UAC of the ensuing early media session in this way (i.e., with an 18x containing SDP) creates problems for PSTN inter-working, as there is no way for the MGC to know, a priori, of any early media session originating from the PSTN.

The SDP received in an 18x provisional response may contain a different session description for early media than the one contained in the 200 OK for the actual media (this may happen, for example, when the early media is generated by an announcement server different from the called end-point). This separation of session descriptions for early media should be supported.

Also, SDP is meant for defining the session description of the entity initiating the request or response. In this context, SDP in the provisional response is treated as an indication that early media will follow. This is a misuse of SDP. It may also cause a problem if another session description mechanism is used instead of SDP.

3 Early Media Scenarios

We will examine two special cases: (1) the INVITE is never forked, and (2) the INVITE is forked at least once at a Proxy or Gateway. Under each of these cases, SIP-PSTN interworking scenarios will be examined. Consideration is given to issues with resource reservation and NAT/firewall traversal.

3.1 Non-forking Proxies

3.1.1 PSTN terminating

In the case of PSTN terminating calls, early media is expected from the PSTN network only after an ACM message is received at the MGC, because, in the PSTN network, a one-way call is established to the caller by the ACM message. Thus the MGC should reserve appropriate resources at the media gateway to allow this media through, even before sending out the IAM message (the ACM message is sent in response to the IAM message). Note that this assumes that the original INVITE from the caller contains an SDP with receiving RTP port information for early media reception.
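
The ordering constraint in this paragraph (reserve the gateway resource before the IAM goes out) can be written down as a small Python sketch. It is only an illustration under stated assumptions: the function name pstn_terminating_actions and the action strings are hypothetical, and sending the IAM without a reservation when the INVITE offers no usable port is an assumption, since the text does not cover that case.

```python
from typing import List, Optional


def pstn_terminating_actions(caller_rtp_port: Optional[int]) -> List[str]:
    """Ordered MGC actions on receipt of an INVITE for a PSTN-terminating call.

    caller_rtp_port is the receiving RTP port taken from the SDP in the
    INVITE; None or 0 means no usable port was offered (assumption).
    """
    actions = []
    if caller_rtp_port:
        # Reserve first, so early media can flow as soon as the ACM arrives.
        actions.append(f"reserve gateway resource toward caller port {caller_rtp_port}")
    actions.append("send IAM")
    actions.append("await ACM")
    return actions


for step in pstn_terminating_actions(49170):
    print(step)
```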

The ACM message from the PSTN is mapped to an 18x message with SDP (port>0) to notify the user of incoming early media from the PSTN. The issue here is how the MGC can distinguish between the cases when early media will be received and when it will not, so that it can send the appropriate SDP information in the 18x. This may be predicted from the Backward Call Indicator bits in the ACM message, which provide status information about the called end-point (see Table 1). Figure 1 provides a partial call set-up flow.

   ----------------------------------------------------
   ACM Backward Call          May be Interpreted as
   Indicator Status
   ----------------------------------------------------
   Free                       No early media
   Busy                       Early Media possible
   Status Unknown             Early Media possible
   ----------------------------------------------------

                         Table 1

 Originating
   SIP UAC           Gateway Controller              PSTN
      |                       |                       |
      |        INVITE         |                       |
      |---------------------->|                       |
      |                       |                       |
      |              Reserve GW Resource              |
      |                       |                       |
      |                       |          IAM          |
      |                       |---------------------->|
      |                       |                       |
      |                       |          ACM          |
      |  183 w/SDP (port>0)   |<----------------------|
      |<----------------------|                       |
      |                       |                       |
      |                  Early Media                  |
      |<==============================================|

                 Figure 1 - PSTN Terminating

3.1.2 PSTN originating

In this case, the terminating SIP end-point can be the originator of early media. When the MGC receives an IAM from the PSTN, for the same reason as described in the previous section, the Media Gateway and the PSTN network should reserve resources to allow the receipt of early media from the callee. This requires that the ACM message (in response to the IAM message) be sent by the gateway controller to the PSTN network after reserving gateway resources (say, port a) and before sending out the INVITE (with SDP carrying port a) to the callee. This implies that the ACM is sent out without receiving any call status information from the UAS.
If the callee wishes to send early media, it SHOULD send a 183 with SDP (port>0) before starting early media transmission. See Figure 2 for a partial call set-up flow.

 Terminating
   SIP UAS           Gateway Controller              PSTN
      |                       |          IAM          |
      |                       |<----------------------|
      |                       |                       |
      |              Reserve GW Resource              |
      |                       |                       |
      |                       |          ACM          |
      |                       |---------------------->|
      |        INVITE         |                       |
      |<----------------------|                       |
      |  183 w/SDP (port>0)   |                       |
      |---------------------->|                       |
      |                       |                       |
      |                  Early Media                  |
      |==============================================>|

                 Figure 2 - PSTN Originating

3.1.3 PBX terminating

In the case of PBX terminating calls, early media is expected from the PBX only if a Progress indicator #1 or #8 is received in any message before the CONNECT message (i.e., CALL PROCEEDING, PROGRESS or ALERTING). Note that this assumes that the original INVITE from the caller contains an SDP with receiving RTP port information for early media reception.

If an ALERTING message is received without a Progress indicator #1 or #8 having been received in any message up to and including the ALERTING message, user alerting (e.g., ringing the phone) has to be applied locally with no early media.

The message including the Progress indicator #1 or #8 from the PBX is mapped to an 18x message with SDP (port>0) to notify the user of incoming early media from the PBX.

 Originating
   SIP UAC           Gateway Controller               PBX
      |                       |                       |
      |        INVITE         |                       |
      |---------------------->|                       |
      |                       |                       |
      |              Reserve GW Resource              |
      |                       |                       |
      |                       |         SETUP         |
      |                       |---------------------->|
      |                       |                       |
      |                       |message with PI=1 or 8 |
      |  183 w/SDP (port>0)   |<----------------------|
      |<----------------------|                       |
      |                       |                       |
      |                  Early Media                  |
      |<==============================================|

                 Figure 3 - PBX Terminating

3.1.4 PBX originating

In this case, the terminating SIP end-point can be the originator of early media.
When the MGC receives a SETUP from the PBX, for the same reason as described in the previous section, the Media Gateway and the PBX should reserve resources to allow the receipt of early media from the callee. This requires that a CALL PROCEEDING message (in response to the SETUP message) be sent by the gateway controller to the PBX after reserving gateway resources (say, port a) and before sending out the INVITE (with SDP carrying port a) to the callee. This implies that the CALL PROCEEDING is sent out without receiving any call status information from the UAS.

If the callee wishes to send early media, it SHOULD send a 183 with SDP (port>0) before starting early media transmission. That 183 should be mapped to a PROGRESS message with Progress indicator #8, since there is no way to know whether the terminating UAS is busy, ringing, or being provided an announcement. See Figure 4 for a partial call set-up flow.

 Terminating
   SIP UAS           Gateway Controller               PBX
      |                       |         SETUP         |
      |                       |<----------------------|
      |                       |                       |
      |              Reserve GW Resource              |
      |                       |                       |
      |                       |    CALL PROCEEDING    |
      |                       |---------------------->|
      |        INVITE         |                       |
      |<----------------------|                       |
      |  183 w/SDP (port>0)   |                       |
      |---------------------->|    PROGRESS (PI=8)    |
      |                       |---------------------->|
      |                  Early Media                  |
      |==============================================>|

                 Figure 4 - PBX Originating

3.1.5 Other Early Media Issues

1. Interaction with Resource Reservation

   [3] proposes extensions to SIP to deal with resource reservation and security set-up negotiation. The proposed SDP extensions allow indication of pre-conditions for sessions, namely, (1) end-to-end resource reservation, and (2) end-to-end security; the sessions are not allowed to proceed until these pre-conditions are met. The successful establishment of a pre-condition (e.g., resource reservation) can be confirmed by either party using a new SIP method called COMET.
   If [3] is used, the QoS reservation and the media resource allocation are completed upon the exchange of the COMET message. For early media reception, the SDP carried in subsequent 18x and 200 OK messages should not be modified after the resource reservation phase is completed (except for the port number). If the SDP information changes in a subsequent 18x or 200 response, the resource reservation must be performed again.

2. Interaction with Firewall/NAT traversal

   When the SIP client is behind a firewall or NAT/NAPT, firewall pinholes need to be opened or NAT/NAPT bindings need to be established in one direction to allow early media before the session establishment is completed [1]. In this scenario, there are potential security loopholes if the firewall/NAT has to establish pinholes/bindings without complete knowledge of the media flow (i.e., the IP address/port of the callee). This is currently being considered by the MIDCOM WG.

3.2 Forking Proxies

The fact that proxies en route can fork a SIP INVITE creates additional issues, because the caller can potentially receive multiple early media streams. The issues can be summarized as follows:

- Need to arbitrate between multiple early media streams
- Need to ensure consistent user behaviour that does not end in the user hanging up the call in between multiple early media sessions
- Partial knowledge about the early-media sources during call set-up
- Arbitration between multiple provisional 18x responses from early media sources
- Potential for race conditions between multiple media streams

The decoupling of SIP call control from the media gives us less control over the ensuing early media sessions, leading to inconsistency in call set-up and undesirable user behaviour. We will discuss these issues in the context of the two kinds of forking scenarios supported by SIP: Parallel and Sequential.
Again, we assume that the SIP terminals have the potential to generate early media.

3.2.1 Parallel Forking

Parallel forking is the most complex scenario. Multiple proxies can be involved in a call, some or all of which can fork an INVITE transaction. In one scenario, the forked destinations can be multiple PSTN gateways. Depending on the call progress in the PSTN networks on the forked legs, the caller can expect one or more simultaneous early media sessions. The issue is how to treat and, if required, arbitrate between the multiple early media sessions.

For example, consider the scenario depicted in Figure 5, where two forked INVITEs reach a media gateway (GW1) and a SIP end-point. GW1 sends back a busy tone, which reaches the caller before the called party answers (through the SIP end-point). In this case, the caller may hang up the call before the callee answers. Thus, such race conditions need to be prevented to avoid undesirable user behaviour during call set-up.

              ----------                            ---------
             |          |----- INVITE (1)--------->|         |
   INVITE--->| Forking  |                          |   GW1   |
             |  Proxy   |                           ---------
              ----------                              PSTN
                  |                                 ----------
                  |                                |          |
                  |                                | SIP End- |
                  +----- INVITE (2)--------------->|  point   |
                                                    ----------

                              Figure 5

The INVITE can be forked multiple times by proxies en route, compounding the race condition problem. This is shown in Figure 6.

              ----------                  ---------
             |          |-- INVITE (1)-->|         |
   INVITE--->| Forking  |                |   GW1   |
             |  Proxy   |                |         |
             |          |                 ---------
              ----------                    PSTN
                  |               ----------               ----------
                  |              |          |             |          |
                  |              | Forking  |-INVITE(3)-->|   GW2    |
                  +- INVITE (2)->|  Proxy   |              ----------
                                  ----------                  PSTN
                                      |
                                      |                    ----------
                                      |                   |          |
                                      |                   | SIP End- |
                                      +--INVITE (4)------>|  point   |
                                                           ----------

                              Figure 6

Here early media from three potential sources can reach the caller at any time and, potentially, at the same time.

The multiple simultaneous early media sessions that can result in these scenarios need to be segregated on the bearer path so that they provide coherent and consistent information to the caller. This can be done either in a media gateway on the media path or at the calling end-point.

Another problem may occur when multiple provisional 18x responses are received by the UAC from early media end-points. The current version of SIP recommends that the UAC cease local ring-back when it receives an 18x response with SDP. Also, the normal ring-back behaviour is resumed if the UAC receives a provisional response with SDP port=0. According to [4], the 18x responses are treated in the order in which they are received. If multiple 18x responses with different SDP are received, the outcome of this behaviour may not be acceptable to the user (e.g., there can be a ring-back tone sandwiched between two announcements).

3.2.2 Sequential Forking

Sequential forking somewhat alleviates the problem caused by multiple parallel early media streams. The forking process may be controlled at the forking proxy, which imposes a certain priority and order on the execution of the early media sessions. Sequential forking can be implemented under policy control, where the forking process is governed by a pre-established priority of the called end-points (assumed to be known at the forking proxy).

There might be a need for the sequential play-out of all the early media sessions (assuming there is more than one). This implies that the forking proxy may need an indication of the end of an early media session and may use this to trigger the next INVITE to another branch. It may be required that this type of branch migration be controlled by either the caller or the called end-points. Note that route-advance is currently triggered [4] either when the party rejects the call with a 4xx or 5xx response, or when the proxy makes a route-advance decision based on a timer.
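
The route-advance behaviour described for sequential forking can be sketched as follows. This is an illustration under assumptions, not a proxy implementation: send_invite is a hypothetical callable that returns the final status code for a branch, or None when the proxy's timer expires; branch priorities are assumed to be known at the proxy; and stopping on any other final response (e.g., 6xx) is an assumption that the text does not state.

```python
from typing import Callable, List, Optional, Tuple

Attempt = Tuple[str, Optional[int]]


def sequential_fork(
    branches: List[Tuple[int, str]],
    send_invite: Callable[[str], Optional[int]],
) -> Tuple[Optional[str], List[Attempt]]:
    """Try branches in priority order (lowest number first).

    Route-advance happens on a 4xx/5xx response or on a timer expiry
    (status None). A 2xx ends the search successfully.
    """
    attempts: List[Attempt] = []
    for _, uri in sorted(branches, key=lambda branch: branch[0]):
        status = send_invite(uri)
        attempts.append((uri, status))
        if status is None or 400 <= status <= 599:
            continue  # route-advance to the next branch
        if 200 <= status <= 299:
            return uri, attempts
        break  # assumption: any other final response stops the search
    return None, attempts


# Hypothetical outcomes per branch: busy gateway, unanswered, answered.
outcomes = {
    "sip:gw1@example.com": 486,
    "sip:voicemail@example.com": None,
    "sip:bob@example.com": 200,
}
winner, attempts = sequential_fork(
    [(2, "sip:voicemail@example.com"),
     (1, "sip:gw1@example.com"),
     (3, "sip:bob@example.com")],
    outcomes.get,
)
print(winner, attempts)
```

An end-of-early-media indication, as discussed above, would be one more trigger for the route-advance branch; it is left out here because no such message is defined.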

4 Some proposed strategies towards solution

4.1 Background on Previous Proposals

The following options to deal with 18x provisional responses were proposed in previous working group meetings and are still under investigation as possible resolutions.

1. Use INFO to pass ACM-related parameters for interworking with ISUP.

2. Eliminate usage of 18x completely (if QoS negotiation is not required, 18x/PRACK can be eliminated). Instead, use a one-way 200 OK to establish a one-way media path and subsequently use a re-INVITE to complete the two-way session establishment.

3. Make the use of 18x optional and negotiable between the clients.

4.2 Early Media with resource reservation

For PSTN terminating calls, it is required that the IAM message (on receipt of an INVITE from the SIP UAC) be sent out to the PSTN by the gateway controller only after receiving confirmation of resource reservation for delivering the early media to the SIP UAC.

For PSTN originating calls, an INVITE carrying SDP with the QoS pre-condition parameters [3] is sent to the SIP UAS by the MGC on receipt of an IAM message from the PSTN. The (early) media session should not be initiated by the SIP UAS until the resource reservation pre-condition is met.

As discussed in Section 3.1.5, in both of the above cases, the SDP parameters in subsequent provisional and final responses after COMET should not be modified, except for the required indication of the early media session in the SDP port parameter. If the SDP information changes in a subsequent 18x or 200 response, the resource reservation must be performed again.

4.3 Strategies to deal with forking

There are multiple ways to deal with the forking issue. In this section, we discuss some of the possible solution strategies. Possible ways of handling multiple early media sessions due to forking are as follows:

1. Allow no early media

2. Allow only one early media session (e.g., the first one)

3. Allow multiple early media sessions in a particular order

For case (1), any SDP information received before the final 200 OK can be blocked at the Proxy. A solution for (2) will, for example, allow the first 18x with SDP and block at the Proxy any other SDP information before the final 200 OK. For case (3), where there is a need to allow multiple early media sessions, the two types of forking scenarios are discussed separately in the following sections.

4.3.1 Sequential forking

The two main issues here are (1) control of the forking process, and (2) triggering of branch migration at the end of an early media session, in the case of multiple sequential early media sessions.

If the forking proxy is aware of the priorities of the end-points (potential early media sources), it can send them INVITEs in a particular order. This priority may be set by the end-user and can be communicated to the Proxy prior to session establishment. When an end-point completes transmission of early media, it may send a message (TBD) to trigger the proxy to route-advance to the next INVITE.

4.3.2 Parallel forking

The main issue here is the potential for the UAC to receive multiple early media streams. The arbitration between the media streams can be done by intelligent handling either at the client terminal or at a gateway on the media path. For example, an intelligent gateway controller can initiate a specific announcement to the client based on its interpretation of messages indicating multiple possible early media sessions.

To prevent the caller from hanging up on receiving the first announcement and missing several important announcements that follow it, it may be necessary to notify the UAC via a SIP provisional response (18x) that multiple early media sessions are possible.
This is possible by adding an indication (e.g., through a new header) to the 18x response at the proxies, if the original INVITE had been forked. Note that this is applicable to both types of forking scenarios.

5 Conclusion

There are no clear solutions for the issues with SDP within 18x provisional responses and the forking problems. While workarounds within the current SIP framework are possible for the 18x issues, the forking problem clearly needs improved signalling between the UA and the proxy (e.g., notifying the user of multiple possible early media sessions). In the end, we would like to offer the end user the choice of how to deal with multiple received early media sessions. This requires that users be reconditioned from their current expectations of the behaviour of traditional PSTN calls.

6 Acknowledgements

The authors of this document would like to acknowledge Mary Barnes and Scott Orton for their input and reflections on this work.

7 References

[1] C. Huitema, "MIDCOM Scenarios", draft-ietf-midcom-scenarios-02 (work in progress), November 2001

[2] Aparna Vemuri, Jon Peterson, "SIP for Telephones (SIP-T): Context and Architectures", draft-vemuri-sip-t-context-02 (work in progress), August 2001

[3] Marshall et al., "Integration of Resource Management and SIP extensions for Resource Management", draft-ietf-sip-manyfolks-resource-01, February 2001

[4] Handley, Schulzrinne, Schooler, Rosenberg, "Session Initiation Protocol", draft-ietf-sip-rfc2543bis-03, November 2001

[5] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

8 Full copyright statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

9 Authors' Addresses

Sanjoy Sen
2375 N. Glenville Drive, Building B,
Richardson, TX-75082
Phone : 972-685-8275
E-mail: sanjoylnetworks.com

Jayshree Bharatia
2201, Lakeside Blvd,
Richardson, TX-75082
Phone : 972-684-5767
E-mail: jayshree@nortelnetworks.com

Chris Hogg
Roxborough Way
Foundation Park,
Maidenhead, SL6 3UD
GB, UK
Phone : + 44-162-843-1720
E-mail: chogg@nortelnetworks.com

Francois Audet
4301 Great American Parkway,
Santa Clara, CA-95054
Phone : 408-495-3756
E-mail: audet@nortelnetworks.com