[Apologies to the author:  This document slipped through the gen-art
last call review net and so I only got to volunteer on this one 24 hours
ago.  Later... Having slept on the comments I see there are a couple that were 
items that I missed and should be modified or removed.  In particular 
about whitespace (where I missed the definition in terminology).
Further I have now read (very quickly) the address draft which clarifies
some matters on what are possible JIDs but doesn't totally solve some of the
issues.]

Document: draft-ietf-xmpp-3920bis-19
Reviewer: Elwyn Davies
Review Date: 29 November 2010
IETF LC End Date: 
IESG Telechat date: 2 December 2010

Summary:
This is a very well written document that is mostly easy to read and
follow.  However I believe it is not (quite) ready for the IESG for the
following reasons:

It has (IMO) one major issue as an EXTENSIBLE system - the three level-1
stanzas are hard coded and it is not immediately possible to add new
types of level-1 stanza should this be thought useful in future.  This is, 
I guess a philosophical issue, and as a -bis document it probably isn't 
going to get changed at this stage, so this need not concern us deeply.  
However I think that it might be useful to provide an explanation of
the general semantics of the three types of stanza early on in the document -
not having participated in XMPP, it wasn't until I stepped back from
the details of the document (as well as finding out what IQ stood for
when I reached section 8) that I realized that the three existing 
categories of stanza are  effectively semantic types that are appropriate
for the three classes of message for which they are named, but could 
be used for other sorts of function - this might avoid people thinking that
they need extra stanza types and complaining that it isn't extensible :-( . 


There are several more minor issues with these three being important:

There are a largish number of SHOULD requirements where the
alternative is not discussed or specified.

There seems to be a lack of clarity or definiteness about the effects of
support of SASL being mandatory.  There are a number of places in the
document where the text is written as if SASL were not mandatory, and
others where the reverse is true but the surrounding text doesn't seem
to think it is mandatory..  All these places need to be looked at
critically to ensure that the wording is appropriate for SASL being
mandatory and that the whole is consistent.

The separation of the address format into a separate document means
that there should be some information about the terms that are 
imported in the terminology section.  The early sections on stream setup 
talking about 'from' and 'to' addresses have some difficulties with the 
use of hostname versus the term used in the JID/addressing draft of 
domainname.  Almost the whole of this document implies that a JID
*necessarily*
has a localpart, whereas the definition in the addressing draft allows
it to be just a bare (or mere!) domainname.  The couple of places where 
JIDs may or in some cases, must not have a localpart confuse matters because
they don't describe the hostname only forms as JIDs in the main text, but they 
are suddenly transformed into JIDs in summaries (in particular the summary
table in s4.6.6
where it turns out that, yes, all the 'to' and 'from' addresses are actually
JIDs but (1) only restricted forms are allowed for some interactions
and (2) these restricted forms were carefully *not* described as JIDs earlier
in s4.6).
Again, the term 'bare JID' is almost everywhere shown as
<localpart@domainname>
except for in s8.1.2.1.  This term is not actually defined in the addressing
draft
and appears only in one place as a term for <localpart@domainname>.   

Apart from these there are a number of other minor issues and various
nits/editorial issues.

Caveat: As usual I haven't tried to verify the XML in Appendix A, and my
scan of the examples in s9 was very cursory.

Major issues:
s4.1, Definition of XML Stanza: Given that xmpp is supposed to be
extensible, is there a really good reason why the first-level
stanza-defining elements (message, presence, iq) are hard-coded into the
specification rather than being possibly extensible? There is already a
stream features mechanism where additional first-level items could be
signaled.  This might be alleviated by describing the semantic types of
message that are covered by the available stanzas - the naming covers up
that they are quite general types of message that could be used for other 
purposes.


Minor issues:
s2.5, first bullet describing server respnsibilities:  The term 'local
clients' is slightly confusing.  My immediate understanding of 'local'
would be something attached to the server from the local network or
machine.  In this case it is presumably intended to imply clients that
have a session connected to this server rather than via some other
server but may of course be very remote as regards physical and network
location.  Maybe 'attached clients' or 'associated clients' (this term
is used elsewhere in the document) might be clearer, or maybe this just
needs a terminology definition for how 'local' is used - there is the
use of 'local entities' previously in s2.5 but this didn't jar so much.

s2.5, last para: Does the use of 'local services' imply anything about
the disposition of the code implementing the service? This use of
'local' may or may not conflict with the implications of 'local
mentioned in the last comment.  Terminology needs to be made a little
more rigorous.

s4.2.2: Towards the end of this section, there are two statements
stating 'the initiating entity is cleared to send XML stanzas'. Should
this be qualified with 'unless a stream restart is required as a result
of the feature negotiation' (at least at the first appearance)?  This
probably depends on whether voluntary-to-negotiate features can require
stream restarts if agreed but not otherwise.

s4.2.3 and s4.4: In s4.2.3, if a stream has to be restarted, the parties
are supposed to consider the the stream 'replaced'.  This presumably
means that the old stream is closed.  It is not absolutely clear whether
this requires that a <stream/> is exchanged or definitely not exchanged
in this case. [OK: This is cleared up for the specific TLS case at item 
3 in s5.4.3.3 and for SASL in s6.4.6 - a general note at s4.2.3 would be
appropriate since these are not necessarily the only restart cases.]

s4.2.6, paras 2 and 3: The first parts of these paragraphs are written
as if SASL autrhentication was mandatory (which it is) and is the only
possibility (which it might not be).  Para 2 (the client-to-server case)
does not appear to allow for any other options.  Para 3, OTOH, goes on
to deal with Server Dialback but the first part still reads as if SASL
was the only option.  Both paras need to be rewritten to cater for other
options (and the version 0.9 case?). 

s4.6.1, para 6: 
  'For initial stream headers in server-to-server communication, a
   server MUST include the 'from' attribute and MUST set the value to
   the domainpart of the 'from' attribute of the stanza that caused the
   stream to be established...':  Is it absolutely always the case that 
a server-to-server connection would be established because some stanza 
came in (presumably from an attached client)?  Could this be indirectly
through some local service requiring a connection for any other reason?
It is clear why the server needs to specify 'from' but the constraint
on the origin seems overly restrictive.

s5.3.5: 'Because it is relatively inexpensive to establish streams in XMPP,
   for the first two cases it is preferable to use an XMPP stream reset
   (as described under Section 4.8.3.16) instead of performing TLS
   renegotiation.':
Is this always true?  Probably so for simple instant messaging but if 
the client is using some other services provided by the server, this may 
an unwarranted generalization.  I am not sufficiently expert in either
XMPP or TLS renegotiation to know what the trade-offs would be.

s5.3.5: 'However, the third case is sufficiently rare that XMPP entities
   SHOULD NOT blindly attempt TLS renegotiation.' (the third case is 
separating client authentication from server authentication):
I don't understand what is being said here.  Is this an opinion that the
risk of client credential leakage is sufficiently low that the expense of 
two stage negotiation is not really worth it?  Or is this something that 
should be controlled by a configurable client policy?  Or is this an
implementation choice?  Some clarification would help.

s5.4.3.1, rule 3: Why should the receiving entity not request a certificate?
Why would the initiating entity not send its certificate if it had one? (Is 
this related to the renegotiation issue from the last comment?)

s5.4.3.1, rule 4: How should the receiving entity make the choice given the 
'to' attribute (presumably some field in the certificate)? What happens if
it decides not to follow this should?

s5.4.3.2, para 2: The comment about not sending </stream> relates to the 
comment on s4.3.2 above. However, the reasoning for not sending </stream>
when the TCP connection is closed due to TLS failure differs from reasoning
given elsewhere (e.g., s5.3.5, para 8 - here the error is said to be at the
TLS layer and not the XMPP layer so that it is not appropriate to send
</stream>).
A consistemt story is needed here.  

s5.4.3.3, item 5:  Given that SASL support is mandatory, why is offering the
SASL 
feature not a MUST?  Is this to do with possible future alternatives?

s6.3.8, para 1: I suspect the two SHOULDs in this para ought to be MUSTs.
What
other options does initiating entity have if it does/does not want to act on
behalf of another entity and SASL is being used?  However maybe all this
should be 
qualified by 'If SASL is being used...'

s6.4.1, para 5: 'If the receiving entity supports SASL, the stream features
   SHOULD include an advertisement for support of SASL negotiation,...':
s6.2 says that XMPP entities MUST support SASL. I think that this sentence 
probably needs to be turned round making it something like 'Unless support for
SASL
negotiation is only allowed once STARTLS negotiation has been completed, the 
stream features MUST include an advertisement for the support of SASL.'  Aside
from
the STARTLS case (or equivalent cases with future secure stream mechanisms),
what 
happens if the SASL advertisement is not made?  

s6.4.2, last para: '... the server SHOULD discard the ongoing handshake...':
What happens if the server (receiving entity) chooses not to
comply with the new proposed SASL mechanism choice?

s6.4.5, paras 4 and 6:  'SHOULD close the stream':  What would the entity do
instead?

s6.4.6, para 1: The two SHOULDs in this paragraph ought to be MUSTs.  (if
there is 
a 'from' address then the receiving entity MUST do the correlation - what else 
could it do?  If it does the match and they don't match, what would do other
than
terminate the attempt?)

s8.1.2.1, item 2; also s10.5.1 and s10.5.2:  s8.1.2.1, item 2 appears to
introduce 
the idea that the <domainname> is the 'bare JID' of a server.  This concept
does 
not appear anywhere else in this document (or from a brief Google search
elsewhere).  
It was not used in RFC 3920.  It appears that the term is used once but not
explicitly 
defined in the addressing draft - there it is used for the
<localpart@domainname> form.
It also doesn't match the 'definition' of Bare JID in
s10.5.3.2.  Previously in this document this was known as the 'hostname'.
However 
s10.5.1 refers to a 'Mere Domain' as a JID and s10.5.2 to a
<domain/resourcepart> as a 
JID.  This seems rather inconsistent although technically it is covered by the
addressing 
draft.  This all probably be helped by summarizing the terminology used in the
addressing
draft in the terminology section, and defining the term 'bare JID' properly
(in the addressing
draft). 

s8.1.2.1, items 2 and 4:  As written, both items refer to stanzas 'generated
by the
server'.  I take it that the distinction is that item 2 refers to messages
from the 
server that are not generated by the server as a proxy for the client account
but 
are some sort of generic control message or the like, whereas item 4 refers to
server
generated messages that are generated by the server as such a proxy for the
client 
account.  The wording is somewhat confusing at present.  Presumably the lack
of a 
'from' attribute is intended to distinguish item 1 messages from item 4
messages -
this could be spelt out more positively if this is correct.

s13.7.1.1. Both instances of item 4: 'The signatureAlgorithm SHOULD be the 
PKCS #1 version 1.5 signature algorithm with SHA-256 as defined by
[PKIX-ALGO].':
What happens if it isn't? 

s13.7.1.1: First item 5: 'The certificate SHOULD include an Authority 
Information Access (AIA) extension that specifies the address of an Online
Certificate Status Protocol [OCSP] responder.'
What happens if it doesn't?

s13.9.5 and s15: The requirement for support of TLS-NEG if TLS Renegotiation
is to be allowed 
should be mentioned here.  This is also a conformance requirement.

s14: I seem to remember that giving a working group as a registrant contact
for an URI is deprecated as it is likely to go out of existence fairly
shortly.

Nits/editorial comments:


s3.2.1, bullet labeled '5': 'The initiating entity uses the IP
address(es) from the first successfully resolved hostname...': 'first'
implies that it depends on how fast the answers come back.  Presumably
what is wanted is the IP address of the highest priority SRV record that
is successfully resolved, maybe subject to some local policies?

s3.2.1,bullets labelled '3' and '8':  Bullet 3 suggests that the s3.2.2
fallaback process 'SHOULD' be used whereas bullet 8 has an implicit MAY
(that might be better if was explicit).  Why the distinction?
[Presumably, having read on, there is little point in trying the first
possibility in the fallback if the initiating entity has already tried
the default port, but the alternatives might still work.] Section 3.2.2
could mention that there is no point in trying the default ports if they
were already tried under s3.2.1.

s3.2.2: A pointer to s15.7 where the default port numbers are registered
would be useful.

s3.2.4: Where is the appropriate XML stanza for defining add-on services
defined?

s4.1, Definition of XML Stanza: The term/title 'iq'/'IQ' is not expanded
until s8.2.3 in this document where it is expanded as info/query.  It
would be useful to mention this here (first occurrence) and indeed maybe
to put all three of message, presence and iq into the terminology
section.  The usage of IQ stanzas would then be clearer and their use
in examples would be more transparent - currently they just appear with
no introduction (e.g, in s4.1 and then in s4.8.3.11).

s4.1, final set of bullets: Adding section references to the five topics
would be helpful.

s4.2.2, first para: A pointer to s4.6.5 which is very prescriptive about
how to do the version comparison would be useful.

Figure 3: It would help if the figure could be made one or two lines
shorter so that the title appears on the same page.  (e.g., move the
bottom line 'yes' and 'no' above the transition lines that they apply
to, remove one line below the 'process complete' box.)

s4.6.1, para 8 (also s4.6.2):  Although it may seem a bit pedantic, it
is probably desirable to clarify that by 'hostname' the document means a
fully qualified DNS name.  The term is used ambiguously by many
operating systems. (The document could also stick to the 'domainname' term
used in the addressing draft.)

s4.6.6: The entries for to/initiating to receiving and from/receiving to
initiating are hostnames which are rstricted forms of JID.  The other two to
and from
entries are hostnames rather than arbitrary JIDs for the server-to-server
case.
[But the comments on s8.1.2.1 item 2 and ss10.5.1/2 under minor issues
may have some bearing here.]

[Comment withdrawn:  I missed the definition of whitespace in the terminology
section:
s5.3.3 and s6.3.5: Does 'whitespace' mean all the 'white space' items
referred to in Section 2.10 of the current version of the XML spec
(i.e., including line separators).  The term whitespace is used earlier
whan referring just to spaces used in the context of keep alives.  [It
turns out much later that s11.7 clarifies this but maybe doesn't totally
resolve s2.10.] Note that the XML specification uses 'white space'
rather than 'whitespace'.]

s6.4., last para:  For consistency and avoiding confusion between XMPP
server and SASL server: s/server/receiving entity/

s8.2.1: A reference to s10.3.1, where the meaning of a message with 'to'
attribute is explained, would clarify the first SHOULD.  Adding 'if
possible' to route or deliver it to the intended recipient' would
explain the second SHOULD which shoudl probably be owercase 'should' as this
isn't a conformance issue but a matter of whether the connections are
available.

s8.2.2: The second and third SHOULDs probably need 'if possible' to
explain them and they should probably be lowercase 'should' as this
isn't a conformance issue but a matter of whether the connections are
available.

s8.3.2/s8.3.3:  All the error specifications in s8.3.3 contain a SHOULD
specification of the error type recommended.  A note in s8.3.2 or s8.3.3
that the values of the error type given in s8.3.3 are the usual expected
error type, but that there are conceivably circumstances when some other
value would be more appropriate. (the only one I copuld think of so far,
apart from the remote-server-timeout case already in the draft, was
associated with payment-required where a modify or cancel error type
might be appropriate if the reason was not that the requestor wasn't 
actually authorized but that it didn't have enough credit.)

s11.4: 'A client SHOULD NOT rely on the ability to send data that does
not conform to the schemas':  This is not controllable by the protocol.
s/SHOULD NOT/should not/

s11.4: What should the client do if it doesn't ignore non-conformant
elements?

s8.3.3.15, Security Note: s/communicated/communicate/ 
 
s14. The IANA Considerations mostly duplicate those in RFC 3920.  Is the
intention that IANA should update the references to RFC3920 to point at
this document?  If so some statement should be made about this. 

s14: Presumably 'XXXX' is to be replaced by the RFC nnnn identifier for
the eventual RFC.  This isn't stated.  Should the various Namespace
names refer to document sections?