Document: draft-ietf-eai-imap-utf8-07
Reviewer: David L. Black
Review Date: August 31, 2009
IETF LC End Date: August 31, 2009
IESG Telechat Date: September 10, 2009

Summary: This draft is on the right track, but has open issues,
described in the review.

The draft appears to be in good shape, although one has to be an IMAP
expert to understand all of its implications (and I am not an IMAP
expert). I found two open issues:

- The header upconversion behavior specification for non-UTF-8
  mailstores appears to be incomplete.
- The recommendation to support MIME header upconversion for "Other
  widely deployed MIME charsets" strikes me as too vague to be useful
  guidance to implementers.

Major issues:

Section 3.2, upconversion behavior specification appears to be
incomplete:

   If the mailstore is not UTF-8 header native and the SELECT or EXAMINE
   command with UTF-8 header modifier succeeds, then the server MUST
   return results as if the mailstore was UTF-8 header native with
   upconversion requirements as described in Section 8.

What happens if a header that is not upconverted is accessed with a
UTF-8 comparison string (e.g., by SEARCH)? I presume that no matches
occur courtesy of the charset mismatch, but that needs to be explained,
as it will be a surprise to users.

Section 8 lists a number of 8859 character sets for which upconversion
of MIME headers MUST be supported, and then says "Other widely deployed
MIME charsets SHOULD be supported." How does an implementer figure out
which character sets those would be? As an alternative, I suggest saying
something along the lines of: any server-supported character set that is
a superset of ASCII should be supported for upconversion. That probably
leads to fewer client surprises caused by UTF-8 not working as expected.

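
To make the Section 3.2 question concrete, here is a minimal Python
sketch of the behavior I presume. It is not taken from the draft: the
UPCONVERTED set, the upconvert() and search_matches() helpers, and the
sample headers are all hypothetical, and the case-insensitive substring
test is only a rough stand-in for a real SEARCH implementation.

```python
import base64
from email.header import decode_header, make_header

# Hypothetical set of charsets this imaginary server upconverts.
# The draft's actual MUST list is in its Section 8; this is illustrative.
UPCONVERTED = {"us-ascii", "utf-8", "iso-8859-1"}


def upconvert(raw_header: str) -> str:
    """Decode RFC 2047 encoded-words if every charset is supported.

    If any encoded-word uses an unsupported charset, the header is
    returned exactly as stored (that is, not upconverted).
    """
    parts = decode_header(raw_header)
    for _, charset in parts:
        if charset is not None and charset.lower() not in UPCONVERTED:
            return raw_header
    return str(make_header(parts))


def search_matches(raw_header: str, key: str) -> bool:
    """Rough stand-in for SEARCH: case-insensitive substring match."""
    return key.casefold() in upconvert(raw_header).casefold()


# A header in a supported charset, and one in an unsupported charset.
latin1_header = "=?iso-8859-1?Q?Caf=E9?="
koi8_header = (
    "=?koi8-r?B?"
    + base64.b64encode("Привет".encode("koi8-r")).decode("ascii")
    + "?="
)

print(search_matches(latin1_header, "café"))   # True: header was upconverted
print(search_matches(koi8_header, "Привет"))   # False: charset mismatch
```

Under these assumptions the second search silently finds nothing even
though the message visibly contains the search string, which is the
user surprise I think the draft needs to explain.
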
Minor issues:

Section 3.1: the next-to-last paragraph needs a couple of RFC 2119
keywords:

   Mailbox names must comply with the Net-Unicode Definition (section 2 of
   MUST >------->^^^^
   [RFC5198]) with the specific exception that they may not contain
   MUST NOT >-------------------------------------->^^^^^^^
   control characters (0000-001F, 0080-009F), delete (007F), line
   separator (2028) or paragraph separator (2029).

The ABNF in this draft consists of extensions to ABNF specified
elsewhere. I hope that the combined ABNF grammars have been run through
an ABNF checker, but didn't see any mention of that in the IESG comment
log. This would normally be covered by a shepherd's report, but I did
not see one.

Section 7 recommends that all IMAP clients be modified to display a
clear error when the server advertises UTF8=ONLY. What's the expected
behavior of existing, unmodified clients?

Nits/editorial comments:

Section 2 ought to introduce what's being added to the protocol.
Adaptations of the first two sentences in Section 10 (IANA
Considerations) would suffice.

While not strictly a security consideration, it would be useful for
section 11 to point out the potential for user confusion caused by
SEARCH command match strings that have different UTF-8 representations
but display identically or similarly (strings that look like they should
match don't).

idnits 2.11.12 found a few things (I've deleted a couple of obviously
incorrect "Missing Reference:" warnings):

  Checking nits according to http://www.ietf.org/ID-Checklist.html:
  ----------------------------------------------------------------------------

  ** There is 1 instance of too long lines in the document, the longest one
     being 14 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC2045' is defined on line 475, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2183' is defined on line 486, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 1341 (Obsoleted by RFC 1521)
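
As a concrete illustration of two comments above (the Section 3.1
mailbox-name exception and the section 11 note on look-alike SEARCH
strings), here is a minimal Python sketch. The helper name
has_excluded_code_point() and the sample strings are mine, not the
draft's. The check covers only the code points listed in the quoted
Section 3.1 text; it does not attempt full Net-Unicode validation.

```python
import unicodedata

# Code points excluded by the quoted Section 3.1 text.
_EXCLUDED = (
    range(0x0000, 0x0020),  # control characters 0000-001F
    range(0x007F, 0x0080),  # delete 007F
    range(0x0080, 0x00A0),  # control characters 0080-009F
    range(0x2028, 0x202A),  # line separator 2028, paragraph separator 2029
)


def has_excluded_code_point(name: str) -> bool:
    """True if the name contains any code point from the exception list."""
    return any(ord(ch) in r for ch in name for r in _EXCLUDED)


# Two strings that display identically but differ in UTF-8 encoding.
precomposed = "caf\u00e9"   # 'e with acute' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(has_excluded_code_point("Entw\u00fcrfe"))    # False
print(has_excluded_code_point("INBOX\u2028x"))     # True
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))  # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)    # True
```

The last two lines show the confusion I have in mind: a byte-wise
comparison treats the two strings as different, while normalizing both
sides to the same form first would make them compare equal.
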