Draft: draft-ietf-sipping-toip-04.txt
Reviewer: taylor@nortel.com
Review Date: Thursday 3/23/2006 6:27 AM CST
Additional Review Date: Thursday 5/4/2006 1:06 PM CST
Review Milestone: Post WGLC/07 April 2006
Summary: This draft has serious issues, described in the review, and needs to be rethought.

I volunteered to review the subject draft during the meeting yesterday. I find the draft reasonably well structured, but I have a large number of editorial suggestions which will be mostly moot if what I am about to suggest is accepted. Leaving these aside, I have also identified some substantive issues.

The key issue is that of the proper scope of the document. It is clear that the intent of the editor was that this be a system specification for text-on-IP systems. With that in mind, I looked for a precedent for such a document in the IETF archives. The first thing that came to mind was RFCs 1122 and 1123, "Host Requirements". However, these documents really consist of a suite of protocol profiles. draft-ietf-sipping-toip-04.txt covers a range of issues, only some of them relating to protocols.

Searching the RFC database using the term "system specification", I came across RFC 1297, "NOC Trouble Ticket Requirements". Compared with RFC 1297, the subject draft is very protocol-oriented. On the other hand, RFC 1297 is concerned with running the network, while the present draft is about a network application. So I'm not sure that RFC 1297 is a suitable precedent as an indicator of the extent of IETF interest in defining application systems.

My gut feeling is that the present scope of the document is suitable to an organization like the TIA or the ITU-T. On the other hand, an applicability statement describing the use of SIP, SDP, and specific RTP payload types to support ToIP would lie squarely within the IETF's sphere of interest. In fact, I believe there are a few requirements for new SDP attributes buried in here, and further examination might reveal new SIP requirements too.
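
As an illustration of the kind of protocol usage such an applicability statement would pin down, here is a minimal sketch of an SDP offer that carries audio alongside real-time text using the RFC 4103 payload format. It is my own example and not text from the draft: the function name build_offer, the address 192.0.2.1, the port numbers, and the dynamic payload type 98 are all assumptions chosen for illustration.

```python
def build_offer(ip="192.0.2.1", audio_port=49170, text_port=49172, text_pt=98):
    """Build an illustrative SDP offer with one audio and one real-time text stream.

    All default values are placeholders picked for this sketch.
    """
    lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",
        "t=0 0",
        # Audio stream: static RTP payload type 0 is PCMU (G.711 mu-law).
        f"m=audio {audio_port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        # Real-time text stream: T.140 text over RTP per RFC 4103,
        # which uses the "text" media type and a 1000 Hz clock rate.
        f"m=text {text_port} RTP/AVP {text_pt}",
        f"a=rtpmap:{text_pt} t140/1000",
    ]
    # SDP lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_offer(), end="")
```

Any new SDP attributes that the draft turns out to need would show up as further "a=" lines in an offer of this general shape.
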
I therefore believe it is in the interest of the WG to extract an applicability statement from the current content of the draft.

If the WG chooses to accept the current intent of the draft, I have a number of substantive comments which should be fairly easily addressed. I record them below so they won't be lost, but the question of scope should be dealt with before people read further.

---------------------------

1. In the definition of transcoding service in section 4, I have a feeling the editor meant transcoding to apply especially to conversions between different media types. Could this be confirmed?

2. Requirement R4 in section 5.2.1 currently reads:

   "R4: Systems SHOULD allow the user to specify a preferred mode of communication, with the ability to fall back to alternatives that the user has indicated are acceptable."

   The introduction to 5.2.1 talks about people who are constrained to one media type in one direction and a different one in the other. It seems that this directionality should be captured in R4 by adding the words "in each direction" after "communication" at the start of the second line.

3. Requirement R5 immediately following gives two reasons why the system may be unable to provide simultaneous real-time text and voice: lack of system capability and network constraints. It then goes on to require best-effort establishment of communication. I suspect this best effort was meant to apply to the network-limited case only. If this is the intention, the words "either because the system only supports alternate modalities or" -- that is, the case of lack of system capability -- should be deleted from the requirement. If something else was intended, it must be better spelled out. One can't quite visualize how a system not designed to do something will go about making a best effort to do it anyway.

4. Requirement R8 seems to be a special case of requirements R1 and perhaps R2. As such it is redundant and should be removed.

5. Requirement R9 is missing capitalization of MUST and RECOMMENDED. Further to that, in terms of protocol requirements the ability to specify the sign language should be a MUST, even if it is only RECOMMENDED for system implementation. This strikes me as a need for a new SDP attribute.

6. I wasn't sure whether the ability to support international character sets was a set-up issue (section 5.2.1) or a transport issue (its current section, 5.2.2). I think I see the argument for its current placement, but the editor may wish to comment.

7. Requirement R15 as written is circular: "Where possible, it must be possible ...". It sounds like the intention was to say that it SHOULD be possible to send and receive real-time text simultaneously.

8. R19 in section 5.2.3 reads as follows:

   "R19: Adding or removing a relay service MUST be possible without disrupting the current session."

   I'm a little unclear what is meant by this requirement. Adding or removing a relay station is definitely going to disrupt some media flows within the session. The modality of communication between the participants will change, in general. I suspect the requirement means to say that (1) communication between the end participants should continue after the addition or removal of the relay service, and (2) the effect of the change should be limited in the users' perception to the direct effect of having or not having the transcoding service in the connection.

I will do more work on the document when the WG makes its decision on the scope.

Additional Review Comments:
===========================

General fixes proposed:
=======================

1. Your apostrophes (') all appear as non-ASCII characters. You need to do a global substitution.

2. Your headings all come out on two lines, with the number on the first line followed by text on the second. Usually headings are presented on a single line (e.g., TAB ).

3. I recommend NOT capitalizing terms for different services, as you have a tendency to do.
It just confuses the reader. Example: "Text Bridging" rather than "text bridging". I call these out in my individual remarks below.

Abstract
========

Repetitious -- you say twice that the document provides a framework. I propose that you delete the first sentence, and start the second sentence with "This document ...". If you want to capture the greater detail in the discarded sentence, you can change "existing protocols and techniques" in the new first sentence to "the Session Initiation Protocol (SIP) and the Real-Time Transport Protocol (RTP)".

Section 1
=========

Second para: add two sentences that should be moved here from section 5 (will also be noted in comments on that section):

"Real-time text conversation can be combined with other conversational services like video or voice. ToIP also offers an IP equivalent of analog text telephony services as used by deaf, hard of hearing, speech-impaired and mainstream users."

Third para, second sentence: has redundancy. Change "the user's requirements, including those of ..." to "the requirements of ...". Add "of" before "mainstream".

Section 2
=========

Before the bullets: I suggest "provides:" rather than "defines the:".

In the bullets: start with small letters rather than capitals, since they continue a sentence rather than starting a new one. In the first one (trivial point) "requirements for" rather than "requirements of" to be consistent with the second bullet. I also would not capitalize "Real-time" in the first bullet.

Section 4
=========

Definition of audio bridging is awkwardly phrased. How about:

"Audio bridging: a function of an audio media bridge server, gateway or relay service that sends to each destination the combination of audio from all participants in a conference excluding the participant(s) at that destination. At the RTP level, this is an instance of the mixer function as defined in RFC 3550 [4]."

Definition of half duplex: add "simultaneously" after "directions".
Change "errors can" to "errors may".

Definition of interactive text: same thing as real-time text, as far as I can see, so why not point to the definition:

"Interactive text: another term for real-time text, as defined below."

Definition of real-time text: I suggest you expand the citation of F.700 a bit, as follows: "is defined in the ITU-T Framework for multimedia services, Recommendation F.700 [25]."

Definition of text relay service: small initial letters for "relay service". Delete the comma after "people", since it has the effect of including voice telephone users in the preceding list rather than making them the other party of the "between".

Definition of text bridging: delete one copy, as noted by Gonzalo. I suggest the other be rewritten to take advantage of the extended definition already given for audio bridging, as follows:

"Text bridging: a function of the text media bridge server, gateway or relay service analogous to that of audio bridging as defined above, except that text is the medium of conversation."

Definition of total conversation: add "streams" after "media" in the second sentence.

Definition of transcoding services: a bunch of small fixes and one suggested additional sentence as indicated below:

"Transcoding service: a service provided by a third-party user agent that transcodes one stream into another. Transcoding can be done by human operators, in an automated manner, or by a combination of both methods. Within this document the term particularly applies to conversion between different types of media. A text relay service is an example of a transcoding service which converts between real-time text and audio."

Definition of TTY: a suggested bit of additional information, and applicability to Canada as well as the USA:

TTY: originally, an abbreviation for "teletype".
Often used in North America as an alternative designation for a text telephone or textphone. Also called TDD, Telecommunication Device for the Deaf.

Definition of video relay service: small initial letters for "relay service". Delete comma after "people" to make the logic right.

Section 5
=========

Opening sentence: I suggest changing "This framework" to "The framework described in section 6".

Second and third paras: more suitable in the introduction, as indicated in my previous note.

Last para: effectively repeats the first one. Delete.

Section 5.1
===========

Para 4 (just before bullets): modify the end of the sentence as follows: "in terms of conversationality to those provided by voice."

Bullets a-e: small initial letters since the bullets continue rather than begin a sentence.

Next para after bullets a-e: I'll give you capitalized "Total Conversation", since it is a special term created by Gunnar. Suggest adding "as" immediately after it. Beginning of second sentence: replace "Users could ..." by "Total Conversation allows participants to ...".

Following para, all bullets: small initial letters. End all bullets but the last one with semicolon rather than period.

Third bullet of that para: Add missing closing parenthesis after "recording". I suggest the rest of the bullet be modified a bit as follows: "for legal purposes, for clarity, or for flexibility;"

Final bullet of that para: delete "thus" (which seems redundant) before "creating".

Final para of section: delete the first "or" (before "join") and add a comma after "several users".

Section 5.2
===========

First para: "sections list" rather than "sections lists". I suggest you add to this paragraph as follows:

"The requirements are organized under the following headings:
• session set-up and session control;
• transport;
• use of transcoding services;
• interworking."
Second para: "unique" rather than "uniquely".

Section 5.2.1
=============

First para: the first sentence is totally unrelated to the topic of the next sentence and the following paragraph (i.e., the need for modes besides text). I propose the first sentence and the word "However" beginning the next sentence be deleted. Then you can combine the remaining sentence of the first para with the second para for a concise motivation of the requirements.

Second para first sentence: "use" appears followed by "is used" followed by "users". To avoid so much repetition, delete "use of" after "alternating", and substitute "people" for "users".

R3: add "in" before "mid-conversation"?

R4: substantive: add "in each direction" after "communication".

R5: as noted earlier, a bit of a contradiction as written. I suggest deleting "either because the system only supports alternate modalities or".

R7: as written, the requirement is unclear. Could I suggest deleting the first paragraph, which is too general? Then rewrite the second paragraph as a requirement:

"It MUST be possible to use real-time text in conferences both as a medium of discussion between individual participants (for example, for sidebar discussions in text while listening to the main conference audio) and for central support of the conference with real time text interpretation of speech."

R8: as I indicated in my first note a month ago, seems redundant once you satisfy R1 and R2. Mark "deleted"?

R9: capitalize MUST and RECOMMENDED.

Section 5.2.2
=============

SUBSTANTIVE: R10: an exchange with Gunnar suggested to me that the delay requirement given here is end-to-end. If that is indeed what is intended, insert "end-to-end" before "delay time" at the start of the second sentence. Insert "as" before "good" at the end of that sentence.

R11 first sentence: hyphenate "speech-to-text"? Change "conversation text" to "conversational text". In the second sentence, insert "as" before "sufficient".

R15: too wishy-washy.
Start the sentence with "It SHOULD be possible ...".

Section 5.2.3
=============

SUBSTANTIVE: R18: I think you have two requirements:

"R18A: It MUST be possible to negotiate the requirements for transcoding services in real time in the process of setting up a call."

"R18B: It MUST be possible to negotiate the requirements for transcoding services in mid-call, for the immediate addition of those services to the call."

SUBSTANTIVE: R19: As I noted a month ago, this needs clarification. Does the following capture the intent?

"R19: Communication between the end participants SHOULD continue after the addition or removal of a text relay service, and the effect of the change should be limited in the users' perception to the direct effect of having or not having the transcoding service in the connection."

R20: I suggest "specify" rather than "determine" in the first sentence.

Section 5.2.4
=============

R22: capitalize MUST.

SUBSTANTIVE: R23: the two paragraphs deal with different requirements. The first talks about how the services work. The second talks about what call features are available. I suggest splitting them into two requirements, with an example added to the first one, as follows:

"R23A: Where real-time text is used in conjunction with other media, exposure of user control functions through the User Interface needs to be done in an equivalent manner for all supported media. For example, it must be possible for the user to select between audio or visual prompts, or both must be supplied."

"R23B: Where certain session services are available for the audio media part of a session, these functions MUST also be supported for the real-time text media part of the same session. For example, call transfer must act on all media in the session."

R23B may belong in section 5.2.1, since it has to do with session control rather than user interface.

R24: Suggest beginning with "If available" rather than "If present".

R26: capitalize SHOULD.

R28: capitalize MUST.
R32: capitalize SHOULD.

Section 5.2.5
=============

Second para, second sentence: change "it is" to "they are" to match "facilities" in the first sentence. "In the PSTN" or "When they operate in the PSTN" rather than "On the PSTN".

Third para, second sentence: change "could" to "can".

SUBSTANTIVE: R35 isn't clear. Does it refer to the need for correlation between sessions on the two sides of a gateway, or does it refer to what is presented to a user? I suspect the latter is what is intended, in which case the requirement should be rewritten in more concrete terms. I expect the ToIP user is meant to see the telephone number at the other end, but what is the PSTN user supposed to see?

R36: add "differences in" before "transmission speeds".

Section 5.2.5.1
===============

No comment.

Section 5.2.5.2
===============

Second para first sentence: change "have been developed" to "were developed". Change "was mandated" to "were mandated", to agree with "services".

Third para has a SHOULD that is an unnumbered requirement. I suggest using a small-letter "should" here.

Section 5.2.5.3
===============

No comment.

Section 6
=========

This note is a beginning on section 6, which has given me considerable difficulty because so much of it seems to be written at the same level as section 5 but says different things. What I see in there is:

-- some motivational text;
-- implementation details;
-- a lot of requirements language.

It seems to me that requirements language in this section should be limited to the selection of implementation options. I would also be inclined to suggest the use of lower-case "must" etc. in this section. Anything that is more in the way of a system requirement should be in section 5. I think you'll find that a lot of redundancy (and occasional contradictions) will get removed in that way.

The document is generally well organized around the components of the complete system for providing ToIP services.
I would suggest tightening up the last little bit by explicitly listing those components in a section before the present section 5, then organizing both section 5 and section 6 around it. Here is my picture of those system components:

1. User interface
   Sub-topics: configuration, session initiation and control, alerts and indicators.

2. Communications subsystem
   Sub-topics: signalling capabilities, rendezvous capabilities, media capabilities

3. Storage
   Sub-topics: message taking, recording

4. Transcoding support

5. Conferencing support

6. Interworking
   Sub-topics: PSTN text phones, cellular, instant messaging.

I guess I will proceed from here by commenting on and classifying the contents of section 6, paragraph by paragraph. There will generally be no point in my suggesting alternate text, since if I am right in what I've said above there is a fair amount of moving of text and rewriting to be done.

Having proposed an organization around system components in my previous note (text above), I will assume that this is what we are aiming for. This means, as one example, that most SIP and SDP-related aspects will be dealt with in the section on signalling capabilities within the communications subsystem description. However, interactions between SIP and other functions will be dealt with in the respective sections -- for example, the relationship between what is presented to the user and the content of the SIP INVITE gets covered under "user interface".

One overall remark: section 6 is missing a description of the implementation of many of the section 5 requirements. Where I suggest discarding material below because it is covered in section 5, the authors may sometimes want to generate new text dealing with implementation instead.

Section 6
=========

No comment.

Section 6.1
===========

I'm not sure how much this section should say beyond the first paragraph, in which case it can be moved down as introductory text in section 6.2.
The reason is that the content of RFC 3351 and the remainder of section 5.1 are really covered by the more detailed requirements in the rest of section 5.

The initial paragraph indicating the protocol selections is useful because as indicated above it will be referred to in most of the subsequent sections. May I suggest a slight change in wording:

"This framework specifies the use of the Session Initiation Protocol (SIP) [3] to set up, control and tear down the connections between ToIP users whilst the real-time text medium is transported ..."

The reason for the wording change is that I doubt you want to actively prevent someone from specifying an implementation of ToIP using some other signalling protocol by defining it to be SIP-based only.

The second paragraph beyond the first sentence describes motivation. Since this ground is already covered in 5.1, it can be discarded.

The third paragraph belongs in the section discussing implementation of signalling within the communications subsystem.

The fourth paragraph on the virtues of T.140 is more motivation and can also be discarded.

Section 6.2
===========

No comment.

Section 6.2.1
=============

The use of SIP is presumably an assumption, not a requirement. The paragraph is part of the communications subsystem: signalling capabilities topic.

Section 6.2.1.1
===============

This section belongs in the communications subsystem: rendezvous capabilities topic. Does this section respond to R17 in section 5?

Section 6.2.1.2
===============

This looks like a continuation of section 6.2.1.1, and belongs under the same topic. Why was it put into a separate section?

Section 6.2.1.3
===============

This doesn't seem particularly related to ToIP. Discard.

Section 6.2.1.4
===============

The first paragraph falls under the communications subsystem: signalling capabilities topic.

The second para has some content relating to this topic, other content relating to user interface.
However, the paragraph is a statement of system requirements rather than implementation options, and belongs in the appropriate parts of section 5. Probably already covered by R5, R23 and R28.

Section 6.2.1.5
===============

This section repeats R23B as I suggested it. Discard.

Section 6.2.2
=============

First para: system requirement -- belongs in section 5, but really looks like a tautology.

Second, third, and fourth paras: belong under the communications subsystem: media capabilities topic.

Fifth para belongs under the user interface: presentation topic.

Sixth para seems to repeat and contradict ("default" vs. "MUST") earlier material -- discard.

Seventh para: belongs under the communications subsystem: media capabilities topic.

Eighth para: replace "MUST be" by "is" -- this is an example! Belongs under the communications subsystem: signalling capabilities topic.

Ninth para: belongs under the communications subsystem: media capabilities topic.

Tenth para: system requirement. Put it into section 5 following or as part of R10.

Final para: system requirement. Contradicts R11.

Section 6.2.3
=============

First para: belongs under the transcoding topic.

Second and third paras: covered by definitions. Discard.

Fourth para: first sentence is a system requirement. It is implied by RFC 3351, but could be added to section 5.2.3. Second sentence is related to the final sentence of 6.2.1.1. Perhaps they should be brought together under the transcoding topic. Remaining sentences belong under the transcoding topic.

Section 6.2.4
=============

No comment.

Section 6.2.4.1
===============

This entire section presents system requirements and therefore belongs in section 5. Most is covered already. What the authors might want to put under the user interface: alerting and indications topic instead is notes on the indications that should follow specific signalling events, where these aren't obvious.

Section 6.2.4.2
===============

Content falls under the user interface: alerting and indications topic. The text is not clear to me, but that can wait until the document is reshaped (assuming it will be).

Section 6.2.4.3
===============

System requirements. Move to section 5.2.4. Modify R29 and R30 to be consistent with the content of this section -- they are too specific in their placement of the answering machine function. Storage: message taking topic.

Section 6.2.4.4
===============

First para: first sentence repeats R32 and should refer to that requirement, e.g., "Requirement R32 requires that, in the display of text conversation, users be able to distinguish easily between different speakers." Remainder is an implementation note. User interface: presentation topic.

Second para: system requirement -- add to section 5.2.4.

Section 6.2.4.5
===============

Relates to the storage: recording topic. Should refer to requirement R31.

Section 6.2.5
=============

Interworking topic.

Section 6.2.5.1
===============

Paras 1-5: Implementation notes, I suppose. Interworking: PSTN topic. I'm a little dubious whether the requirement to support SIP and RFC 4103 in para 5 is necessary, or is covered by the general specification under communication subsystem.

Para 6: restates R38. Discard.

Para 7: system requirement -- doesn't belong here. Use this instead of the existing wording of R35.

Remaining paras: implementation notes under interworking: PSTN topic.

Section 6.2.5.2 and subsections: implementation notes under the interworking: cellular topic, I suppose. A bit repetitive from section 5.2.5.2.

Section 6.2.5.3
===============

OK for interworking: instant messaging topic.

Section 6.2.5.4
===============

Should probably be merged into 6.2.5. Repeats some of that content.

Section 6.2.5.5
===============

Is there a motivation for the requirement to support all possible combinations? It doesn't seem to be something that has to be standardized.

Section 6.2.5.6
===============

No comments.