Document: draft-bryan-metalinkhttp-19 Reviewer: Ben Campbell Review Date: 2011-02-11 IETF LC End Date: 2011-02-11 Summary: This document is on the right track for a proposed standard, but there are open issues and editorial issues that should be addressed prior to publication. --- Major issues: none. --- Minor issues: -- Section 1, paragraph 1, first sentence: Is this really intended as an alternative to Metalink/XML? It seems like more of a complementary approach than an alternative one. -- section 2, third paragraph: If they must have the same eTag policy, should this document not specify at least one mandatory-to-implement policy? -- section 2, last paragraph: "Metalink clients MAY make use of digital signatures if they are offered." That seems too weak. We usually recommend that people check signatures on downloads when available. Wouldn't we want an automated system to treat this as at least a SHOULD? -- section 3, 4th paragraph: "[[Some organizations have many mirrors. Only send a few mirrors, or only use the Link header fields if Want-Digest is used?]]" Open issue? -- section 3.3, 1st paragraph:"([draft-ietf-ftpext2-hash] allows for FTP mirrors to be coordinated and provide file hashes)." Do you mean to allow such FTP servers to be preferred? -- section 4.1 "This is particularly useful for providing metadata such as cryptographic hashes of parts of a file, allowing a client to recover from partial errors (see Section 7.1.2)." Seems like this might warrant a SHOULD. -- section 5, last paragraph: "Metalink clients MAY support the use of OpenPGP signatures." MAY seems weak here. This means that the publisher of a file has no reasonable expectation that clients will check the signature. -- section 6, 3rd paragraph: "Digest: SHA-256=..." Example needs more context -- section 7, 4th paragraph from end: "Downloads from mirrors that do not have the same file size as the Metalink server are considered unusable and the client can deal with it as it sees fit." What if the client sees fit to use it? Is there a normative "client SHOULD NOT" implied here? -- section 7.1.1, 1st paragraph:"the requirement is that merging of ranges from multiple responses must be verified" The requirement for what? Is this intended to be normative? -- section 7.1.2, 2nd paragraph:"If no partial cryptographic hashes are available, then the client MUST fetch the complete object from other mirrors." Does this imply that the client also MUST fetch from other mirrors if the hashed are present but fail verification? -- section 8, last paragraph: "The size of chunks chosen by the client should be sufficiently large that the chunk request header fields and reponse header fields represent neglible overhead, and sufficiently large that they can be pipelined effectively without needing a very high rate of chunk requests." Can you provide guidance on what "sufficiently large", and "negligible overhead" mean, either in terms of examples or real numbers? "Note that Range requests impose an overhead on servers and clients need to be aware of that and not abuse them." What constitutes abuse? Can you provide more specific guidance on when range should not be used, since all the other guidance seems to suggest using it? -- section 10.2: What can an implementor do to mitigate spoofing? -- section 10.4: This section should include a warning that clients only have a MAY level requirement to support signature verification. "Metalinks should include digital signatures, as described in Section 5." Normative? -- section 11.1, [BITTORRENT] Is bittorrent.org an acceptable SDO for the purposes of normative references? (I'm not saying it's not--just asking) [draft-ietf-ftpext2-hash] -- is this a normative downref? --- Nits/editorial comments: -- section 2, 4th paragraph: "HTTP mirror servers SHOULD share the same ETag policy as the originating Metalink server." -- section 2, 5th paragraph: "Metalink clients use the mirrors provided by a Metalink server with Link header fields [RFC5988]." Sentence seems ambiguous. Do you mean "_in_ Link header fields." -- section 3, 2nd paragraph: "A brief Metalink server response with two mirrors only:" Sentence fragment -- section 3, last paragraph: "fieldss" Typo. (Note same typo repeats throughout) "Such a decision could be a hard-coded limit, a random selection, based on file size, or based on server load." I assume this is not meant to be exhaustive? -- section 3.3, last paragraph: Paragraph is redundant with similar normative statements in section 2 -- section 4.1: "as shown in Section 4." .. as shown in the example in section 4. -- section 6, 1st paragraph: "Metalink servers MUST provide Instance Digests in HTTP [RFC3230] for files they describe with mirrors via Link header fields." Redundant with previously mentioned normative requirement. -- section 7, paragraph starting with: "If no range limit was given in the original request then work from the tail of the object (the first request is still running and will eventually catch up), otherwise continue after the range requested in the first request." Implied "you". Restate as "the client works from the tail" (Please avoid use of 2nd person pronouns in an RFC, implied or otherwise) (Note that this usage is sprinkled throughout ) "If no Range was provided," ... in the original request,... Also, be careful with the passive voice in this paragraph. Who undertakes HEAD requests first? Who undertakes Range-based requests? -- section 7, paragraph 10: "If-Match conditions based on the ETag SHOULD be used..." Clients should use If-Match conditions... "If no indication of ETag syncronisation/ knowledge is given then If-Match should not be used, and optimally there will be an Instance Digest in the mirror response which we can use to detect a mismatch early, and if not then a mismatch won’t be detected until the completed object is verified." Is there any way other than the pref parameter to indicate this? If not, it would be better to say "if no pref parameter is present..." Is this sentence intended to be normative? Also, "... The client should not use..." Finally, the sentence is hard to follow--please break it up. -- section 7.1.1, general: This whole section switches to an almost stream-of-consciousness style. It has very long sentences with many comma separated clauses that are hard to follow. It really needs to be rewritten to be more readable and precise. -- section 7.1.1, paragraph 3: Paragraph seems self contradictory. How is guaranteeing correct content different than verifying integrity? -- paragraph 4: "This guarantees that a mismatch will be detected by using only the synchronized ETag from a master server and mirror server, even alerted by the mirror servers themselves by responding with an error, preventing accidental merges of ranges from different versions of files with the same name." I can't parse this sentence. "This even includes many malicious attacks where the data on the mirror has been replaced by some other file, but not all." Antecedent of "this" is ambiguous -- 2nd to last paragraph: This is the first mention of e idea of slave and master mirror servers. Is this the same as preferred and normal mirrors? Please use consistent terms. -- section 7.1.2, 2nd paragraph: "If the object cryptographic hash does not match the Instance Digest then fetch the Metalink/XML if available, where partial file cryptographic hashes can be found, allowing detection of which server returned incorrect data." I can't parse this sentence. -- section 8, paragraph after bullet list: Please avoid first person pronouns in an RFC. "Based on laboratory experiments,..." Reference? "we suggest a good default number of simultaneous connections is probably four," That's very weak language--can you remove "we suggest" and "probably"? -- Section 9, 1st paragraph: "registration to the Link Relation Type registry" Can you provide the URL to the registry? -- section 10.2 "In that case, this could deceive unaware downloaders that they are downloading a malicious or worthless file." Sentence does not make sense. Does the attacker with to deceive them into downloading a malicious file, or convince them that they are when they are not?