SIPPING                                                        E. Burger
Internet-Draft                                  SnowShore Networks, Inc.
Expires: April 25, 2004                                         M. Dolly
                                                               AT&T Labs
                                                        October 26, 2003

                    Keypad Stimulus Protocol (KPML)
                      draft-ietf-sipping-kpml-01

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 25, 2004.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

The Key Press Stimulus Protocol uses the SIP SUBSCRIBE/NOTIFY mechanism and Keypad Markup Language (KPML) to provide instructions to SIP User Agents for the reporting of user key presses.

Conventions used in this document

RFC2119 [1] provides the interpretations for the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" found in this document.

Burger & Dolly Expires April 25, 2004 [Page 1] Internet-Draft KPML October 2003

In the narrative discussion, the "user device" is a User Agent that will report stimulus. It could be, for example, a SIP phone, edge media processor, or media gateway. An "application" is a User Agent requesting the user device to report stimulus. The "user" is an entity that stimulates the user device.
In English, the user device is a phone, the application is an application server or proxy server, and the user presses keys to generate stimulus.

Table of Contents

   1.     Introduction
   2.     Keypress Stimulus Protocol
   2.1    Model
   2.2    Operation
   3.     Protocol Machinery
   3.1    Event Package Name
   3.2    Event Package Parameters
   3.3    SUBSCRIBE Bodies
   3.4    Subscription Duration
   3.5    NOTIFY Bodies
   3.6    Notifier Generation of NOTIFY Messages
   3.6.1  SIP Protocol-Generated
   3.6.2  KPML-Generated
   3.6.3  One-Shot vs. Persistent Requests
   4.     Message Format - KPML
   4.1    KPML Request
   4.1.1  Digit Suppression
   4.1.2  One-Shot and Persistent Triggers
   4.1.3  Multiple Patterns
   4.1.4  Multiple, Simultaneous Subscriptions
   4.2    KPML Reports
   4.2.1  Pattern Match Reports
   4.2.2  KPML No Match Reports
   5.     Examples
   5.1    Monitoring for Octothorpe
   5.2    Dial String Collection
   5.3    Interactive Digit Collection
   6.     Call Flow Example
   6.1    INVITE-Initiated Dialog
   6.2    Third-Party Subscription
   6.3    Remote-End Monitoring
   7.     Formal Syntax
   8.     Enumeration of KPML Failure Codes
   9.     IANA Considerations
   9.1    IANA Registration of MIME media type application/kpml+xml
   10.    Security Considerations
          Normative References
          Informative References
          Authors' Addresses
   A.     Contributors
   B.     Acknowledgements
          Intellectual Property and Copyright Statements

1. Introduction

This document describes the Key Press Stimulus Protocol. The Key Press Stimulus Protocol exchanges messages using the SUBSCRIBE and NOTIFY methods of SIP [2] with message bodies formed from the Keypad Markup Language, KPML.

KPML is a markup [12] that enables "dumb phones" to report user key-press events. Colloquially, this mechanism provides for "digit reporting" or "DTMF reporting."

We strongly discourage the use of non-validating XML parsers, as one can expect problems with future versions of KPML. That said, one could envision user devices that only accept SIP reporting and have a fixed parser, rather than a full XML parser. This means that a goal of KPML is to fit in an extremely small memory and processing footprint. Note that KPML has a corresponding lack of functionality.
For those applications that require more functionality, please refer to VoiceXML [13] and MSCML [3].

The name of the markup, KPML, reflects its legacy support role. The public switched telephony network (PSTN) accomplished end-to-end signaling by transporting Dual-Tone, Multi-Frequency (DTMF) tones in the bearer channel. This is in-band signaling.

From the point of view of an application being signaled, what is important is the fact the stimulus occurred, not the tones used to transport the stimulus. For example, an application may ask the caller to press the "1" key. What the application cares about is the key press, not that there were two cosine waves of 697 Hz and 1209 Hz transmitted.

A SIP-signaled [4] network transports end-to-end signaling with RFC2833 [14] packets. In RFC2833, the signaling application inserts RFC2833 named signal packets as well as or instead of generating tones in the media path. The receiving application gets the signal information, which is what it wanted in the first place.

RFC2833 is the only method that can correlate the time the end user pressed a digit with the user's media. However, out-of-band signaling methods, as are appropriate for user device to application signaling, do not need millisecond accuracy. On the other hand, they do need reliability, which RFC2833 does not provide.

An interested application could request notifications of every key press. However, in many of the use cases for such signaling the application is interested in only one or a few keystrokes. Thus we need a mechanism for specifying to the user device what stimulus the application would like notification of.

2. Keypress Stimulus Protocol

2.1 Model

There are three usage models for the protocol. Functionally, they are all equivalent. However, it is useful to understand the use cases driving the signaling.
The first model is that of a SIP User Agent (UA) that directly interacts, on a given dialog, with the end device. Figure 1 shows a two-party SIP dialog. In this scenario, the SIP UA requests the End Point to report on key press events that would normally emanate from End Point port B. In this case, the requesting User Agent requests digit notification on the same dialog established for the call, between SIP ports A and X.

   +-------+        SIP         +-----+
   |       A--------------------X     |
   |  End  |                    | SIP |
   | Point |        RTP         | UA  |
   |       B--------------------Y     |
   +-------+                    +-----+

   Figure 1: Endpoint Model

The second model is that of a third-party application that is interested in entered key presses. Figure 2 shows an established two-party SIP dialog between the End Point and the SIP UA. The requesting application addresses the particular media stream either by referencing the established dialog identifier referring to the dialog between SIP ports A and X or by referencing the SDP, either of port B or port Y. Specifying the SDP for port Y monitors the key presses at the SIP UA, as received by the End Point. Specifying the SDP for port B monitors the key presses at the End Point.

Not all End Point devices are able to monitor the remote media stream. However, the End Point MUST be able to report on local (End Point-generated) key press events.

                              +-------------+
                              | Requesting  |
                          /---| Application |
                         /    +-------------+
                        /
                  SIP  /
   (SUBSCRIBE/NOTIFY) /
                     /
                +---M---+    SIP (INVITE)    +-----+
                |       A--------------------X     |
                |  End  |                    | SIP |
                | Point |        RTP         | UA  |
                |       B--------------------Y     |
                +-------+                    +-----+

   Figure 2: Third-Party Model

The third model is that of a media proxy. A media proxy is a media relay in the terminology of RFC1889 [15]. However, in addition to the RTP forwarding capability of an RFC1889 media relay, the media proxy can also do light media processing, such as tone detection, tone transcoding (tones to RFC2833 [14]), and so on.
If the Requesting Application uses dialog identifiers to identify the stream to monitor, the default is to monitor the media entering the End Point. For example, if the Requesting Application in Figure 3 uses the dialog represented by SIP ports V-C, then the media coming from SIP UAa RTP port W gets monitored. Likewise, the dialog represented by A-X directs the End Point to monitor the media coming from SIP UAb RTP Port Y.

To monitor the reverse direction, from the End Point to one of the User Agents, the Requesting Application MUST specify the SDP of the End Point RTP port to monitor, as in the first example above.

                                            +-------------+
                                            | Requesting  |
                                        /---| Application |
                                       /    +-------------+
                                      /
                                SIP  /
                 (SUBSCRIBE/NOTIFY) /
                                   /
   +-----+        SIP         +---M---+        SIP         +-----+
   |     V--------------------C       A--------------------X     |
   | SIP |                    |  End  |                    | SIP |
   | UAa |        RTP         | Point |        RTP         | UAb |
   |     W--------------------D       B--------------------Y     |
   +-----+                    +-------+                    +-----+

   Figure 3: Media Proxy Model

2.2 Operation

The key press stimulus protocol uses explicit subscription requests and notification requests, using the semantics of SUBSCRIBE/NOTIFY [2].

Following the semantics of SUBSCRIBE, if the user device receives a second subscription on the same dialog, the user device MUST terminate the existing KPML request (if any) and replace it with the new request.

If the user device supports multiple, simultaneous KPML requests, the application registers the separate requests on different SUBSCRIBE-initiated dialogs. An application may register multiple digit patterns in a single KPML request. If the user device does not support multiple, simultaneous KPML requests, it responds with an error response code. See Section 4.1.4 for more information.

A KPML request can be persistent or one-shot.
Persistent requests are active until either the dialog terminates, the client replaces them, or the client deletes them by sending a null document on the user instance. One-shot requests terminate themselves once a match occurs. The persist KPML element specifies whether the subscription remains registered for the duration specified in the SUBSCRIBE message or if it automatically terminates after a pattern matches.

KPML requests route to the user device using standard SIP request routing. A KPML request identifies the leg in question in one of two ways. The first method is to send the request on an existing, INVITE-initiated dialog. The second method is to explicitly identify the call leg by its transport-layer identifiers, such as RTP port number and IP address.

Response messages are KPML documents (messages). If the user device matched a digit map, the response indicates the digits detected and whether the user device suppressed digits. If the user device had an error, such as a timeout, it will indicate that instead.

3. Protocol Machinery

The Key Press Stimulus Protocol uses the SIP [4] SUBSCRIBE/NOTIFY [2] mechanism. The registration of a digit map is simply setting a digit event notification filter. When the device detects the digits, it sends an event notification to the application.

The following sub-sections are the formal specification of the KPML SIP-specific event notification package.

3.1 Event Package Name

The name for the Key Press Stimulus Protocol package is "kpml".

3.2 Event Package Parameters

The "leg" parameter identifies the call leg being monitored.

If the "leg" parameter is not present, the SUBSCRIBE MUST be on an established INVITE-initiated SIP dialog. In this case, the leg the end device monitors is the call leg associated with the established dialog.
If there is no corresponding dialog or call leg, the end device will send a 481 result code in a KPML notification.

   NOTE: The SUBSCRIBE will presumably succeed, resulting in a 200 OK. However, the "current state" will be the KPML 481 result, and the subscription state will be "terminated."

If the application is using SIP-level identifiers, the value of the "leg" parameter is "SIP". If the application is using SDP-level parameters, the value of the "leg" parameter is "SDP".

SIP identifies call legs by their dialog identifier. The dialog identifier is the to:, from:, and call-id: entities. To identify a specific dialog, all three of these parameters MUST be present. The to-tag matches the local address including tag, the from-tag matches the remote address including tag, and the call-id matches the Call-ID.

Note there may be ambiguity in specifying only the SIP dialog to monitor. The dialog may specify multiple SDP streams that could carry key press events. For example, a dialog may have multiple audio streams. Wherever possible, the End Point MAY apply local policy to disambiguate which stream or streams to monitor. However, if the Application desires to specify exactly which stream to monitor, it MUST use the SDP method of specifying which stream to monitor. For most situations, such as a mono point-to-point call with a single codec, the stream to monitor is obvious. In such situations the Application need not specify which stream to monitor.

The BNF for these parameters is as follows. The definitions of callid, token, EQUAL, SWS, and DQUOTE are from RFC3261 [4].

   call-id  = "call-id" EQUAL DQUOTE callid DQUOTE
   from-tag = "from-tag" EQUAL token
   to-tag   = "to-tag" EQUAL token

The call-id parameter is a quoted string. This is because the BNF for word (which is used by callid) allows for characters not allowed within token.
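The BNF above can be exercised with a small formatter. The following is a minimal sketch, not part of the protocol definition: the helper name sip_leg_params is hypothetical, joining the three parameters with ";" is an assumption borrowed from ordinary SIP header parameter syntax, and the sketch simply refuses a Call-ID containing a double quote because the BNF shown here gives no escaping rule.

```python
import re

# RFC 3261 "token": alphanumerics plus - . ! % * _ + ` ' ~
_TOKEN = re.compile(r"[A-Za-z0-9\-.!%*_+`'~]+")


def sip_leg_params(call_id: str, from_tag: str, to_tag: str) -> str:
    """Render call-id, from-tag and to-tag following the BNF above.

    Hypothetical helper. The ';' separator is an assumption; the draft
    text quoted here only defines the three productions themselves.
    """
    for name, value in (("from-tag", from_tag), ("to-tag", to_tag)):
        if not _TOKEN.fullmatch(value):
            raise ValueError(f"{name} is not a valid token: {value!r}")
    if not call_id or '"' in call_id:
        # No escaping rule is given, so reject rather than guess.
        raise ValueError("call-id must be non-empty and contain no double quote")
    # call-id is always quoted; the tags are bare tokens.
    return f'call-id="{call_id}";from-tag={from_tag};to-tag={to_tag}'
```

All three values are required, which mirrors the rule that a specific dialog is identified only when the to-tag, from-tag, and call-id are all present.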
One usually just copies these elements from the Call-ID, To, and From header fields of the SIP INVITE. One can use any method of determining the dialog identifier. One method available, particularly for third-party applications, is to use the SIP Dialog Package [16].

SDP identifies call legs by transport connection information (e.g., IPv6 address) and media address. The identifiers are the c-line and m-line from SDP.

The BNF for these parameters is as follows. The definitions of nettype, addrtype, connection-address, media, port, integer, space, proto, and fmt are from RFC2327 [5] as updated by RFC3266 [6].

   address = "c" EQUAL DQUOTE nettype space addrtype space
             connection-address DQUOTE
   media   = "m" EQUAL DQUOTE media space port ["/" integer]
             [space [proto [1*(space fmt)]]] DQUOTE

All of the c-line attributes are significant. However, for the m-line, only the port (and optional pair mark) is significant.

Note the c-line might not be on the End Point. In this case, the End Point monitors the stream from the specified host. Note there is no requirement on an End Point to be able to monitor remote streams.

3.3 SUBSCRIBE Bodies

Key press filtering requests use KPML, as described in Section 4.1. The MIME type for KPML is application/kpml+xml.

3.4 Subscription Duration

The subscription lifetime should be longer than the expected call time. The default subscription lifetime (Expires value) MUST be 7200 seconds.

This two-hour subscription time is entirely arbitrary. Please contact the editor if you have a better suggestion, and why.

3.5 NOTIFY Bodies

The key press notification uses KPML, as described in Section 4.2. The MIME type for KPML is application/kpml+xml. The default MIME type for the kpml event package is application/kpml+xml.

3.6 Notifier Generation of NOTIFY Messages

3.6.1 SIP Protocol-Generated

The end device (notifier in SUBSCRIBE/NOTIFY parlance) generates NOTIFY requests based on the requirements of RFC3265 [2]. Specifically, unless a SUBSCRIBE request is not valid, all SUBSCRIBE requests will result in an immediate NOTIFY.

The KPML payload distinguishes between a NOTIFY that RFC3265 mandates and a NOTIFY informing of key presses. If there are no digits quarantined at the time of the SUBSCRIBE (see Section 4.1 below), then the immediate NOTIFY MUST return a valid KPML document with a KPML result code of 100. If there are digits quarantined, then the NOTIFY MUST return the appropriate KPML document.

3.6.2 KPML-Generated

During the subscription lifetime, the end device may detect a key press stimulus that triggers a KPML event. In this case, the end device (notifier) MUST return the appropriate KPML document.

3.6.3 One-Shot vs. Persistent Requests

A one-shot kpml subscription is one that the KPML document does not mark as persistent. If the end device detects a key press stimulus that triggers a one-shot KPML event, then the end device (notifier) MUST set the "Subscription-State" in the NOTIFY message to "terminated". At this point the end device MUST consider the subscription destroyed. This means that further SUBSCRIBE requests on the same dialog MUST result in a SIP 481 SUBSCRIPTION DOES NOT EXIST response.

For persistent kpml subscriptions, the KPML document remains active for the lifetime of the subscription.

   NOTE: If the subscription uses the leg="SDP" method of determining the call leg to monitor, be aware that if the call ends, it is the responsibility of the application to unsubscribe the kpml subscription.

4. Message Format - KPML

The Key Press Stimulus Protocol exchanges KPML messages. There are two, mutually exclusive elements to KPML: the request and response.
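The one-shot and persistent behavior described in Section 3.6.3 can be sketched as a small piece of notifier state. This is an illustrative sketch only: the Subscription class and both function names are hypothetical, "active" is the ordinary RFC3265 Subscription-State value for a live subscription, and answering 200 to a SUBSCRIBE on a live subscription is an assumption based on the "presumably succeed" remark in Section 3.2.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    """Hypothetical notifier-side record of one kpml subscription."""
    persistent: bool      # True if the KPML document marks the request persistent
    alive: bool = True    # False once the notifier considers it destroyed


def subscription_state_on_match(sub: Subscription) -> str:
    """Return the Subscription-State to put in the NOTIFY that reports a match."""
    if not sub.alive:
        raise RuntimeError("subscription was already destroyed")
    if sub.persistent:
        return "active"           # persistent: stays registered after the match
    sub.alive = False             # one-shot: destroyed as soon as it fires
    return "terminated"


def status_for_further_subscribe(sub: Subscription) -> int:
    """SIP status for a later SUBSCRIBE on the same dialog."""
    return 200 if sub.alive else 481
```

A one-shot subscription therefore reports "terminated" on its first match and answers 481 afterwards, while a persistent one keeps reporting "active".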

4.1 KPML Request

A KPML document (message) contains a tag with a series of tags. The element specifies a digit pattern for the device to report on.

KPML supports three modes of digit map specification: MSCML [3] regular expressions, MGCP [7] digit maps, and H.248.1 [8] digit maps. The type attribute indicates what kind of digit map appears in the expression.

   regex           The default; use regular expression matching.
   mgcpdigitmap    Use digit maps as specified in MGCP [7].
   megacodigitmap  Use digit maps as specified in H.248.1 [8].

Interface attributes, such as what constitutes a long key press, are implementation matters beyond the scope of this document.

Some devices can buffer entered digits. Subsequent KPML requests first apply their patterns against the buffered digits. Some applications use modal interfaces where the first few key presses determine what the following digits mean. For a novice user, the application may play a prompt describing what mode the application is in. However, "power users" often barge through the prompt.

The protocol provides a flush attribute to the tag. The default is "flush=no". Flushing digits means that the user device flushes any buffered digits. This has the effect of ignoring entered digits before the KPML request.

   NOTE: Protocol action like this imposes an infinite buffer requirement on the End Device. Options are to make buffer depth purely an implementation issue; have a buffer size attribute on the request (and fail if the device cannot honor the request); NOTIFY if the buffer fills; others?

If the user presses a key not matched by the tags, the user device MUST discard the key press from consideration against the current or future KPML messages. However, as described above, once there is a match, the user device quarantines any keys the user enters subsequent to the match.

The end device MAY support an inter-digit timeout value.
This is the amount of time the end device will wait for user input before returning a timeout error result on a partially matched pattern. The application can specify the inter-digit timeout as an integer number of milliseconds by using the interdigittimer attribute to the tag. The default is 1000ms. If the end device does not support the specification of an inter-digit timeout, the end device MUST silently ignore the specification. If the end device supports the specification of an inter-digit timeout, but not to the granularity specified by the value presented, the end device MUST round the requested value to the closest value it can support.

KPML messages are independent. Thus it is not possible for the current document to know if a following document will enable barging or want the digits flushed. Therefore, the user device MUST quarantine all digits detected between the time of the report and the interpretation of the next script, if any. If the next script has "flush=yes", then the interpreter MUST flush all collected digits. If the next script has "flush=no", then the interpreter MUST apply the collected digits (if possible) against the digit maps presented by the script's tags. If there is a match, the interpreter MUST quarantine the remaining digits. If there is no match, the interpreter MUST flush all of the collected digits.

Unless there is a suppress indicator in the digit map, it is not possible to know if the signaled digits are for local KPML processing or for other recipients of the media stream. Thus, in the absence of a digit suppression indicator, the user device transmits the digits to the far end in real time, using either RFC2833, generating the appropriate tones, or both.

The section Digit Suppression (Section 4.1.1) describes the operation of the suppress indicator.

4.1.1 Digit Suppression

Under basic operation, a KPML endpoint will transmit in-band tones (RFC2833 [14] or actual tone) in parallel with digit reporting.
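The quarantine and flush rules stated in Section 4.1 can be sketched as a single function applied when a new KPML request arrives. This is a rough illustration under stated assumptions: the function names are hypothetical, the default matcher is a placeholder that treats each pattern as a literal prefix (real digit map matching is richer), patterns are tried in order rather than by longest match, and the sketch treats a partially matched buffer as no match, which the text above does not address.

```python
from typing import Callable, List, Optional, Tuple


def literal_prefix_len(pattern: str, digits: str) -> Optional[int]:
    """Placeholder matcher: match when the digits begin with the pattern."""
    return len(pattern) if digits.startswith(pattern) else None


def apply_quarantine(
    quarantined: str,
    flush: bool,
    patterns: List[str],
    match_len: Callable[[str, str], Optional[int]] = literal_prefix_len,
) -> Tuple[Optional[str], str]:
    """Return (matched digits or None, digits that remain quarantined)."""
    if flush:
        # flush=yes: discard everything collected before this request.
        return None, ""
    for pattern in patterns:
        n = match_len(pattern, quarantined)
        if n is not None:
            # Match: report it and keep the remainder quarantined.
            return quarantined[:n], quarantined[n:]
    # No match: all collected digits are flushed.
    return None, ""
```

For example, with "*9123" quarantined and a request for "*9" with flush=no, the sketch reports "*9" and leaves "123" quarantined for the next request.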

   NOTE: If KPML did not have this behavior, then a user device executing KPML could easily break called applications. For example, take a personal assistant that uses "*9" for attention. If the user presses the "*" key, KPML will hold the digit, looking for the "9". What if the user just enters a "*" key, possibly because they accessed an IVR system that looks for "*"? In this case, the "*" would get held by the user device, because it is looking for the "*9" pattern. The user would probably press the "*" key again, hoping that the called IVR system just did not hear the key press. At that point, the user device would send both "*" entries, as "**" does not match "*9". However, that would not have the effect the user intended when they pressed "*".

On the other hand, there are situations where passing through tones in-band is not desirable. Such situations include call centers that use in-band tone spills to effect a transfer.

For those situations, KPML adds a digit suppression attribute, "pre", to the tag. There MUST NOT be more than one pre in any given .

If there is only a single and a single , the suppression processing is straightforward. The end-point passes digits until the stream matches the regular expression pre. At that point, the endpoint will continue collecting digits, but will suppress the generation or pass-through of any in-band digits.

If the endpoint suppresses digits, it MUST indicate this by including the attribute "suppressed" with a value of "yes" in the digit report.

A KPML endpoint MAY perform digit suppression. If it is not capable of digit suppression, it ignores the digit suppression attribute and will never send a suppressed indication in the digit report. In this case, it will match concatenated patterns of pre+value.

At some point in time, the endpoint will collect enough digits to the point it hits a pre pattern.
The interdigittimer attribute indicates how long to wait once the user enters digits before reporting a time-out error. If the interdigittimer expires, the endpoint MUST issue a time-out report and transmit the suppressed digits on the media stream.

After digit suppression begins, it may become clear that a match will not occur. For example, take the expression . At the point the endpoint receives "*8", it will stop forwarding digits. Let us say that the next three digits are "408". If the next digit is a zero or one, the pattern will not match.

   NOTE: It is critically important for the endpoint to have a sensible inter-digit timer. This is because an errant dot (".") may suppress digit sending forever. See Section 4.1 for setting the inter-digit timer.

Applications should be very careful to indicate suppression only when they are fairly sure the user will enter a digit string that will match the regular expression. In addition, applications should deal with situations such as no-match or time-out. This is because the endpoint will hold digits, which will have obvious user interface issues in the case of a failure.

4.1.2 One-Shot and Persistent Triggers

The KPML document specifies if the patterns are to be persistent by setting the persistent attribute to the tag to "true". Otherwise, the request will be a one-shot subscription.

If the end device does not support persistent subscriptions, it returns a KPML document with the KPML result code set to 531.

4.1.3 Multiple Patterns

Some end devices may support multiple regular expressions in a given pattern request. In this situation, the application may wish to know which pattern triggered the event.

KPML provides a "tag" attribute to the tag. The "tag" is an opaque string that the end device sends back in the notification report upon a match in the digit map.
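When several patterns match, this section resolves the tie by longest match first and then by document order. A minimal sketch of that selection follows. The function names and the tag values used in the usage note are hypothetical, and the pattern syntax is an assumption taken from the dial string example in Section 5.2: "x" stands for any single digit and every other character is literal.

```python
from typing import List, Optional, Tuple


def match_length(pattern: str, digits: str) -> Optional[int]:
    """Length matched when digits begin with pattern, else None.

    Assumed syntax: 'x' matches any one digit 0-9; other characters
    must match literally. Real KPML expressions may allow more.
    """
    if len(digits) < len(pattern):
        return None
    for p, d in zip(pattern, digits):
        if p == "x":
            if d not in "0123456789":
                return None
        elif p != d:
            return None
    return len(pattern)


def select_tag(patterns: List[Tuple[str, str]], digits: str) -> Optional[str]:
    """patterns holds (pattern, tag) pairs in document order."""
    best_tag: Optional[str] = None
    best_len = -1
    for pattern, tag in patterns:
        n = match_length(pattern, digits)
        # Strict '>' keeps the earlier pattern when lengths tie.
        if n is not None and n > best_len:
            best_tag, best_len = tag, n
    return best_tag
```

With the patterns "9401xxxxxxx" and "9xxxxxxxxxx" listed in that order and the digits "94015551212", both patterns match eleven digits, so the sketch returns the tag of the first one.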
In the case of multiple matches, the end device MUST choose the longest match in the KPML document. If multiple matches match the same length, the end device MUST choose the first expression listed in the subscription KPML document based on KPML document order.

If the end device does not support multiple regular expressions in a pattern request, the end device MUST return a KPML document with the KPML result code set to 532.

4.1.4 Multiple, Simultaneous Subscriptions

Some end devices may support multiple key press event notification subscriptions at the same time. In this situation, the end device honors each subscription individually and independently.

If the end device does not support multiple, simultaneous subscriptions, the end device MUST return a KPML document with the KPML result code set to 533.

4.2 KPML Reports

When the user enters key press(es) that match a tag, the end device will issue a report.

After reporting, the interpreter terminates the KPML session unless the subscription has a persistence indicator. If the subscription does not have a persistence indicator, the end device MUST set the state of the subscription to "terminated" in the NOTIFY report.

If the subscription does not have a persistence indicator, to collect more digits the requestor must issue a new request.

   NOTE: This highlights the "one shot" nature of KPML, reflecting the balance of features and ease of implementing an interpreter. If your goal is to build an IVR session, we strongly suggest you investigate more appropriate technologies such as VoiceXML [13] or MSCML [3].

KPML reports have two mandatory attributes, code and text. These attributes describe the state of the KPML interpreter on the end device. Note the KPML code is not necessarily related to the SIP result code.
An important example of this is where a legal SIP subscription request gets a normal SIP 200 OK followed by a NOTIFY, but there is something wrong with the KPML request. In this case, the NOTIFY would include the KPML failure code in the KPML report. Note that from a SIP perspective, the SUBSCRIBE and NOTIFY were successful. Also, if the KPML failure is not recoverable, the end device will most likely set the Subscription-State to terminated. This lets the SIP machinery know the subscription is no longer active.

4.2.1 Pattern Match Reports

If a pattern matches, the end device will emit a KPML report. Since this is a success report, the code is "200" and the text is "OK".

The KPML report includes the actual digits matched in the digit attribute. The digit string uses the conventional characters '*' and '#' for star and octothorpe respectively. The KPML report also includes the tag attribute if the regex that matched the digits had a tag attribute.

If the subscription requested digit suppression (Section 4.1.1) and the end device suppressed digits, the suppressed attribute indicates "true". The default value of suppressed is "false".

   NOTE: KPML does not include a timestamp. There are a number of reasons for this. First, what timestamp would it include? Would it be the time of the first detected key press? The time the interpreter collected the entire string? A range? Second, if the RTP timestamp is a datum of interest, why not simply get RTP in the first place? That all said, if it is really compelling to have the timestamp in the response, it could be an attribute to the tag.

4.2.2 KPML No Match Reports

There are a few circumstances in which the end device will emit a no match report. They are an immediate NOTIFY in response to a SUBSCRIBE request (no digits detected yet), a request for service not supported by the end device, or a failure of a digit map to match a string (timeout).

4.2.2.1 Immediate NOTIFY

The NOTIFY in response to a SUBSCRIBE request results in a KPML code of 100. An example of this is in Figure 6.

   NOTIFY sip:application@example.com SIP/2.0
   Via: SIP/2.0/UDP proxy.example.com
   Max-Forwards: 70
   To:
   From:
   Call-Id: 439hu409h4h09903fj0ioij
   CSeq: 49851 NOTIFY
   Content-Type: application/kpml+xml
   Content-Length: 79
   Event: kpml

   Figure 6: Immediate NOTIFY Example

   NOTE: We should give serious thought to just having an empty body mean this message was protocol generated. Since Section 4.2.2.3 describes all the message bodies on match failure, including time-out, which has no digits returned, an empty body is probably a much better route to go.

4.2.2.2 Unsupported Service

See discussion above on 5xx errors.

4.2.2.3 Match Failure

Discuss timeouts here. Timeouts result in a NOTIFY with a descriptive code and text.

5. Examples

5.1 Monitoring for Octothorpe

A common need for pre-paid and personal assistant applications is to monitor a conversation for a signal indicating a change in user focus from the party they called through the application to the application itself. For example, if you call a party using a pre-paid calling card and the party you call redirects you to voice mail, digits you press are for the voice mail system. However, many applications have a special key sequence, such as the octothorpe (#, or pound sign) or *9, that terminates the called party leg and shifts the user's focus to the application.

Figure 7 shows the KPML for long octothorpe. Note that the href is really on one line, but divided for clarity.

   Figure 7: Long Octothorpe Example

The regex value Z indicates the following digit needs to be a long-duration key press. F, from the H.248.1 DTMF package, is the octothorpe key. In fact, KPML supports all digits, 1-9, *, #, A-D from the H.248.1 DTMF package.
5.2 Dial String Collection

In this example, the user device collects a dial string. The application uses KPML to quickly determine when the user enters a target number. In addition, KPML indicates what type of number the user entered.

Figure 8: Dial String KPML Example Code

Note the use of the "tag" attribute to indicate which regex matched the dialed string.

The interesting case here is if the user entered "94015551212". This string matches both the "9401xxxxxxx" and "9xxxxxxxxxx" regular expressions. By following the rules described in Section 4.1.3, the KPML interpreter will pick the "9401xxxxxxx" string, as it occurs first in document order (both expressions match the same length). Figure 9 shows the response.

Figure 9: Dial String KPML Response

5.3 Interactive Digit Collection

This is an example where one would probably be better off using a full scripting language such as VoiceXML [13] or MSCML [3] or a device control language such as H.248.1 [8].

In this example, an application requests the user device to send the user's signaling directly to the platform in HTTP, rather than monitoring the entire RTP stream. Figure 10 shows a voice mail menu, where presumably the application played a "Press K to keep the message, R to replay the message, and D to delete the message" prompt. In addition, the application does not want the user to be able to barge the prompt.

Figure 10: IVR KPML Example Code

NOTE: This usage of KPML is clearly inferior to using a device control protocol like H.248.1. From the application's point of view, it has to do the low-level prompt-collect logic. Granted, it is relatively easy to change the key mappings for a given menu. However, often more of the call flow than a given menu mapping gets changed. Thus there would be little value in such a mapping to KPML.
We STRONGLY suggest using a real scripting language such as VoiceXML or MSCML for this purpose.

6. Call Flow Example

6.1 INVITE-Initiated Dialog

This section describes a successful subscription and notification from an Application with an End Device ("User A") in an INVITE-Initiated dialog. Note the Application can be a Record-Route Proxy, a B2BUA, or another end device.

User A        Application
  |                |
  |   INVITE F1    |
  |--------------->|
  |     100 F2     |
  |<---------------|
  |     180 F3     |
  |<---------------|
  |   200 OK F4    |
  |<---------------|
  |     ACK F5     |
  |--------------->|
  | Media Session  |
  |<==============>|
  |  SUBSCRIBE F6  | Application Subscribes to "***" from User A
  |<---------------|
  |   200 OK F7    |
  |--------------->|
  |   NOTIFY F8    | Immediate Notify indicating monitoring
  |--------------->|
  |   200 OK F9    |
  |<---------------|
  |       .        |
  |       :        |
  |   NOTIFY F10   |
  |--------------->| Notification of detection of "***"
  |   200 OK F11   |
  |<---------------|
  |                |

Connection setup between User A and an Application subscribing to a DTMF event of "***" at User A.

F1 INVITE User A --> Application

INVITE sip:UserB@subB.example.com SIP/2.0
Via: SIP/2.0/UDP client.subA.example.com:5060;branch=z9hG4bK74
Max-Forwards: 70
From: ;tag=1234567
To:
Call-ID: 12345601@subA.example.com
CSeq: 1 INVITE
Contact:
Route:
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, SUBSCRIBE, NOTIFY
Allow-Events: kpml
Supported: replaces
Content-Type: application/sdp
Content-Length: ...
v=0
o=UserA 2890844526 2890844526 IN IP4 client.subA.example.com
s=Session SDP
c=IN IP4 client.subA.example.com
t=3034423619 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000

F2 100 Trying Application --> User A

SIP/2.0 100 Trying
Via: SIP/2.0/UDP client.subA.example.com:5060;branch=z9hG4bK74
 ;received=192.168.12.22
From: ;tag=1234567
To:
Call-ID: 12345601@subA.example.com
CSeq: 1 INVITE
Content-Length: 0

F3 180 Ringing Application --> User A

SIP/2.0 180 Ringing
Via: SIP/2.0/UDP client.subA.example.com:5060;branch=z9hG4bK74
 ;received=192.168.12.22
Record-Route:
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 1 INVITE
Contact:
Content-Length: 0

F4 200 OK Application --> User A

SIP/2.0 200 OK
Via: SIP/2.0/UDP client.subA.example.com:5060;branch=z9hG4bK74
 ;received=192.168.12.22
Record-Route:
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 1 INVITE
Contact:
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, SUBSCRIBE, NOTIFY
Supported: replaces
Content-Type: application/sdp
Content-Length: ...
v=0
o=UserB 2890844527 2890844527 IN IP4 client.subB.example.com
s=Session SDP
c=IN IP4 client.subB.example.com
t=3034423619 0
m=audio 3456 RTP/AVP 0
a=rtpmap:0 PCMU/8000

F5 ACK User A --> Application

ACK sip:UserB@subB.example.com SIP/2.0
Via: SIP/2.0/UDP client.subA.example.com:5060;branch=z9hG4bK74
Max-Forwards: 70
Route:
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 1 ACK
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces
Content-Length: 0

F6 SUBSCRIBE Application --> User A

SUBSCRIBE sip:UserA@subA.example.com SIP/2.0
Max-Forwards: 70
[JVD: Swap To: and From: for new request]
From: ;tag=567890
To: ;tag=1234567
Call-ID: 12345601@subA.example.com
CSeq: 1 SUBSCRIBE
Contact:
Event: kpml
Subscription-State: active;expires=3600
Accept: application/kpml+xml
Content-Type: application/kpml+xml
Content-Length: ...

F7 200 OK User A --> Application

SIP/2.0 200 OK
To: ;tag=1234567
From: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 1 SUBSCRIBE
Contact:
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, SUBSCRIBE, NOTIFY
Supported: replaces
Content-Length: 0

F8 NOTIFY User A --> Application

NOTIFY sip:UserB@subB.example.com SIP/2.0
Max-Forwards: 70
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 2 NOTIFY
Content-Type: application/kpml+xml
Content-Length: ...
Event: kpml

F9 200 OK Application --> User A

SIP/2.0 200 OK
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
CSeq: 2 NOTIFY
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, SUBSCRIBE, NOTIFY
Supported: replaces
Content-Type: application/sdp
Content-Length: 0

F10 NOTIFY User A --> Application

NOTIFY sip:UserB@subB.example.com SIP/2.0
Max-Forwards: 70
From: ;tag=1234567
To: ;tag=567890
Call-ID: 12345601@subA.example.com
[Increment CSeq]
CSeq: 3 NOTIFY
Content-Type: application/kpml+xml
Content-Length: ...
Event: kpml

F11 200 OK Application --> User A

SIP/2.0 200 OK
From: ;tag=1234567
To:
Call-ID: 12345601@subA.example.com
CSeq: 3 NOTIFY
Contact:
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, SUBSCRIBE, NOTIFY
Supported: replaces
Content-Type: application/sdp
Content-Length: 0

6.2 Third-Party Subscription

Coming soon!

6.3 Remote-End Monitoring

Coming soon!

7. Formal Syntax

The following syntax in Figure 13 uses the XML Schema [9].

IETF Keypad Markup Language

Figure 13: XML Schema for KPML

8. Enumeration of KPML Failure Codes

Coming soon.

9. IANA Considerations

9.1 IANA Registration of MIME media type application/kpml+xml

MIME media type name: application

MIME subtype name: kpml+xml

Required parameters: none

Optional parameters: charset

charset: This parameter has identical semantics to the charset parameter of the "application/xml" media type as specified in XML Media Types [10].

Encoding considerations: See RFC3023 [10].

Interoperability considerations: See RFC3023 [10] and this document.

Published specification: This document.

Applications which use this media type: Session-oriented applications that have primitive user interfaces.

Intended usage: COMMON

10.
Security Considerations

KPML presents no further security issues beyond the startup issues addressed in the companion documents to this document. As an XML markup, all of the security considerations of RFC3023 [10] apply.

Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Roach, A., "Session Initiation Protocol (SIP)-Specific Event Notification", RFC 3265, June 2002.

[3] Burger, E., Van Dyke, J. and A. Spitzer, "Media Server Control Markup Language (MSCML) and Protocol", draft-vandyke-mscml-02 (work in progress), June 2003.

[4] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[5] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.

[6] Olson, S., Camarillo, G. and A. Roach, "Support for IPv6 in Session Description Protocol (SDP)", RFC 3266, June 2002.

[7] Andreasen, F. and B. Foster, "Media Gateway Control Protocol (MGCP) Version 1.0", RFC 3435, January 2003.

[8] Groves, C., Pantaleo, M., Anderson, T. and T. Taylor, "Gateway Control Protocol Version 1", RFC 3525, June 2003.

[9] Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn, "XML Schema Part 1: Structures", W3C REC REC-xmlschema-1-20010502, May 2001.

[10] Murata, M., St. Laurent, S. and D. Kohn, "XML Media Types", RFC 3023, January 2001.

[11] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

Informative References

[12] Bray, T., Paoli, J., Sperberg-McQueen, C. and E. Maler, "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C REC REC-xml-20001006, October 2000.
[13] World Wide Web Consortium, "Voice Extensible Markup Language (VoiceXML) Version 2.0", W3C Working Draft , April 2002, .

[14] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals", RFC 2833, May 2000.

[15] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.

[16] Rosenberg, J. and H. Schulzrinne, "An INVITE Initiated Dialog Event Package for the Session Initiation Protocol (SIP)", draft-ietf-sipping-dialog-package-02 (work in progress), June 2003.

[17] Burger (Ed.), E., Van Dyke, J. and A. Spitzer, "Basic Network Media Services with SIP", draft-burger-sipping-netann-07 (work in progress), September 2003.

[18] Hunt, A. and S. McGlashan, "Speech Recognition Grammar Specification Version 1.0", W3C CR CR-speech-grammar-20020626, June 2002.

Authors' Addresses

Eric Burger
SnowShore Networks, Inc.
285 Billerica Rd.
Chelmsford, MA 01824-4120
USA

EMail: e.burger@ieee.org

Martin Dolly
AT&T Labs

EMail: mdolly@att.com

Appendix A. Contributors

Jeff Van Dyke worked enough hours and wrote enough text to be considered an author under the old rules.

Robert Fairlie-Cuninghame, Cullen Jennings, Jonathan Rosenberg, and I were the members of the Application Stimulus Signaling Design Team. All members of the team contributed to this work. In addition, Jonathan Rosenberg postulated DML in his "A Framework for Stimulus Signaling in SIP Using Markup" draft.

This version of KPML has significant influence from MSCML, the SnowShore Media Server Control Markup Language. Jeff Van Dyke and Andy Spitzer were the primary contributors to that effort.

That said, any errors, misinterpretation, or fouls in this document are my own.

Appendix B. Acknowledgements

Hal Purdy and Eric Cheung of AT&T Laboratories helped immensely through many conversations and challenges.
Steve Fisher of AT&T Laboratories helped with the digit suppression logic and syntax.

Terence Lobo of SnowShore Networks made it all work.

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.