Internet Draft                                          Bert Culpepper
draft-culpepper-sipping-app-interact-reqs-00.txt  InterVoice-Brite, Inc.
March 26, 2002                               Robert Fairlie-Cuninghame
Expires: September, 2002                    Nuera Communications, Inc.

             Network Application Interaction Requirements

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This particular draft is intended to be discussed in the SIPPING Working Group. Discussion of it therefore belongs on that list. The charter for the SIPPING working group may be found at http://www.ietf.org/html.charters/sipping-charter.html

Abstract

This document defines the requirements for a mechanism based on SIP that allows network entities to request and report user indications that can be used to interact with applications associated with SIP-based services.

1. Motivation

Telecommunications services in circuit-switched networks have utilized end-user indications as the means for users to interact with the services, whether or not the users are engaged in a call. These end-user indications are produced by the user pressing keys on their telephone and are sent end-to-end through each of the network entities participating in the call.
As communications services move to IP networks, the ability for users to interact with their communications services in a real-time-like fashion must also follow. Users of communications services have become accustomed to controlling services through interaction via the communications terminal. The traditional means by which users interact with their communications services in legacy networks is DTMF, generated when the user presses a key on the terminal's keypad. Because of this, there is a significant desire to duplicate the use of DTMF to support user interaction with services tightly associated with IP communications sessions.

The Internet network model for communications separates session control from the session media, in that the devices involved in session control are not necessarily tightly coupled to the devices that process media. Because DTMF is transported in IP networks as a media stream, access to these user indications by the network entities involved in session control is awkward. In addition, limiting user interaction with communications services to input devices that emulate the traditional telephone keypad constrains user devices unnecessarily. For these reasons, a mechanism different from the one used in legacy networks is needed to transport user indications for service (application) interaction in IP networks.

The Session Initiation Protocol (SIP) [2] has been chosen as the session control protocol for multimedia session establishment in IP networks. Because of this choice, it is desirable to have a mechanism supporting user service interaction that works with SIP. As SIP deals with session control and not media transport, the mechanism should not be limited to the media plane.
While other protocol approaches have been proposed, none are seen as supporting dynamic, real-time-like application interaction on many of the devices that are used for personal communications.

2. End-to-end Versus Asynchronous User Activity Indications

The end-to-end user activity indications currently supported in IP networks require "workarounds" in SIP networks so that applications along the session signaling path have access to the indications. The current solution requires either that "DTMF forking" be supported by the endpoint, or that the receiving entity re-generate the indication towards the destination. In many scenarios, the indications meant for the service application are not used at the destination.

User indications needed for application interaction, on the other hand, are only needed between an endpoint/user and the application within the network. Using end-to-end mechanisms for application interaction, when the application is not itself an endpoint in the session, is problematic as indicated above.

3. Low-level Versus High-level Application Interaction

The model of interaction between a user device and an application must be carefully considered. Two models have been suggested, each with its merits and drawbacks. It is the authors' belief that the eventual mechanism should support both.

3.1. High-level Interaction

In this model an end device has embedded application-specific knowledge and configuration. Rather than interacting through a set of key presses, the interaction occurs through an application-specific set of operations, for instance, "Go left", "Go right", "Jump". The device or network must store a mapping from the actual device interface to the application-specific operations. An alternative way of viewing this form of interaction is that the set of operations is simply a set of application-specific stimuli.
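The mapping described above can be pictured with a minimal sketch. It assumes a hypothetical application that understands the three example operations named in the text; the key names, the two mapping tables, and the translate() helper are illustrative assumptions only and are not part of any proposed mechanism.

```python
from typing import Dict, Optional

# Hypothetical mappings from two different device interfaces to the
# same application-specific operations. The key names are assumptions
# chosen for illustration; no specification defines them.
KEYPAD_TO_OPERATION: Dict[str, str] = {
    "4": "Go left",
    "6": "Go right",
    "5": "Jump",
}

KEYBOARD_TO_OPERATION: Dict[str, str] = {
    "ArrowLeft": "Go left",
    "ArrowRight": "Go right",
    "Space": "Jump",
}


def translate(key: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the application-specific operation for a device key,
    or None if the device (or network) stores no mapping for it."""
    return mapping.get(key)


if __name__ == "__main__":
    # Two different interfaces produce the same high-level stimulus.
    print(translate("4", KEYPAD_TO_OPERATION))
    print(translate("ArrowLeft", KEYBOARD_TO_OPERATION))
    # A key with no stored mapping yields no operation at all.
    print(translate("9", KEYPAD_TO_OPERATION))
```

In this sketch the application only ever sees the operation names, so it is independent of the device interface, but every device (or the network) has to hold a table of this kind for each application it wants to use.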
Advantages:
- Application interaction is independent of the device's actual interface.
- Automatons can be used more easily to interact with supported applications.

Disadvantages:
- End devices are forced to incorporate application-specific knowledge or configuration in order to be able to use a service. This severely restricts the development and deployment of future applications.
- Local devices are required to create and store the local mapping between the user interface and the application-specific stimuli (or the mapping must be stored in the network somewhere).

The disadvantages of this approach are as great as the advantages. It is unacceptable to require that a device incorporate application-specific knowledge to be able to use a service; such a requirement runs counter to the design principles of the IETF. For this reason this method of interaction is NOT sufficient on its own. Another drawback of this approach is that it does not intrinsically encourage application interoperability.

This method of application interaction has been suggested by a number of people at IETF meetings; however, the IETF working group needs to decide whether or not this method of interaction should be encouraged.

3.2. Low-level Interaction

In this model an application uses the mechanism to determine the makeup of a device's user interface, and interactions are then driven through stimuli associated with that user interface.

Advantages:
- Devices do not need to incorporate application-specific information or configuration.
- An application can adapt its operation to best suit the interface that a device possesses. For instance, this may mean greater reliance on non-tactile interface methods such as voice recognition. The application is the only entity qualified to make these sorts of decisions.
- In most instances, some level of application interaction is possible, albeit perhaps through a less graceful interface.

Disadvantages:
- User interface widgets cannot automatically be used with an application if the application does not recognize the widget (although this can be sidestepped by using local mapping configuration).

3.3. Summary

Thankfully, these two schemes are not mutually exclusive and the benefits of both can be obtained. A device can utilize an enhanced level of interaction when interacting with an application that the device (or network) has knowledge and/or configuration for; likewise, an application can fall back to low-level interaction if the device (or network) does not possess the required application-specific knowledge.

4. Requirements

R1: The mechanism must support collecting device/user input that is associated with an established SIP session, and must also support collecting device/user input that is outside of any established session.
R2: The mechanism must transport user indications to network elements independently of the media plane.
R3: The transport mechanism must be sensitive to the limited bandwidth of some signaling planes; for instance, reliability through blind retransmission is not acceptable.
R4: The mechanism must support multiple network entities requesting and receiving indications independently of each other.
R5: A network entity desiring user indications must be able to request user indications from another network entity. The entity receiving a request must be able to respond with its capability/intent to transmit user indications.
R6: The mechanism must support filtering so that only user indications of interest are transmitted.
R7: User activity indications must not be generated unless implicitly or explicitly requested by an entity.
R8: The mechanism must support user indications via keys or buttons, and at the very least must define support for user interaction via a standard, generic computer keyboard.
R9: The mechanism must support the definition of device-specific and/or user-specific buttons.
R10: The mechanism must be extensible so that non-key-based user indications can be supported in the future, for instance, sliders, dials or wheels.
R11: A requestor must be able to determine the makeup/contents of the user interface possessed by a target device.
R12: The mechanism must support reliable delivery at least as good as that of the session control protocol.
R13: For key-based indications, the mechanism must provide some form of indication of key-press duration.
R14: For key-based indications, the mechanism must provide some form of indication of relative key-press start time (relative to other key presses).
R15: From the user activity indications it does receive, the receiving application must be able to detect that other indications were lost due to packet loss.
R16: The mechanism must allow for end-to-end security/privacy between source and destination.
R17: Both entities must be able to authenticate each other.

5. Desirables

D1: The mechanism should be simple to implement and execute on devices with simple interfaces.
D2: There should be a separation between the transport mechanism in the signaling plane and the message syntax.
D3: The mechanism should attempt to reduce recovery delays under packet loss scenarios.
D4: The mechanism should support routing and identification that is compatible with use in a SIP-based network.

6. Authors

Robert Fairlie-Cuninghame
Nuera Communications, Inc.
50 Victoria Rd
Farnborough, Hants GU14-7PG
United Kingdom
Phone: +44-1252-548200
Email: rfairlie@nuera.com

Bert Culpepper
InterVoice-Brite, Inc.
701 International Parkway
Heathrow, FL 32746
Phone: 407-357-1536
Email: bert.culpepper@intervoice-brite.com

7. References

[1] S. Bradner, "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP: Session Initiation Protocol", RFC 2543, March 1999.