Draft: draft-ietf-sipping-profile-datasets-01
Reviewer: Dale R. Worley
Review Date: 11 Sept. 2008
Review Deadline: 10 Sept. 2008
Status: pre-WGLC

Summary: This draft is on the right track but has open issues, described in the review.

I have the following major concerns:

1. The draft is not tightly written, in that it talks *about* the user agent profile dataset system but does not carefully define the system; it presumes prior knowledge. E.g., several significant terms are not defined in the glossary; the "application profile" is mentioned in a few places, but its source and purpose are nowhere described; and no clear distinction is kept between the profile dataset system, the various profile dataset schemas, and individual profile dataset instances. All of these problems seem to be of the sort caused by authors who are so familiar with the subject matter as to be unaware of what the reader might not know, and they could be cured by careful editing.

2. The intended scope of the RFC is unclear. One possibility is that the RFC is intended to be a conceptual guide to the profile dataset system, giving little more information than that profiles are separated into datasets and that the UA must merge multiple profiles, and listing all of the information that must be specified in the dataset definition RFCs. If this possibility is chosen, much of the material in the draft should not be present, as it consists of long-winded examples of matters about which the dataset definitions have complete freedom: e.g., the allowed value spaces of properties, the common setting attributes, and the merge algorithms. In short, essentially all of section 5 should be omitted, because it is not normative or binding on the writer of a dataset definition.
The other possibility (and the one that I vastly prefer) is that the RFC defines the operation of the profile dataset system clearly enough that from it one could implement a UA profile dataset toolkit: a toolkit into which one could plug the dataset schemas to produce (without further programming effort) software that would handle all the processing that is discussed in the draft. To implement this second possibility would require that the draft specify a number of matters that it does not at present specify, including:

- the datatypes allowed for the settings, and how they are specified by the dataset schemas

  In particular, the draft seems to envision that all settings are either "scalar" datatypes or subsets of the values of a "scalar" datatype. The allowed set of scalar datatypes is not specified. (Section 5.12 has a placeholder for this specification.) The subset types are not clearly defined, so I am certain that various odd special cases will lead to interoperability problems. How the subset types would be specified in the dataset schemas is also not specified.

- what merge algorithms will be allowed, and how they are to be specified

  The least desirable choice is that each dataset definition will have a free hand to define its merge algorithms in English, as that makes it impossible to build a toolkit to handle the merge process. However, it seems unlikely that this RFC could specify a fixed set of merge algorithms without the danger of omitting some algorithm that would turn out to be essential. One possibility is that the specification could require that the merge algorithms be specified in the schemas using XSLT or another standardized language.

In any case, the authors need to decide what the intention of the RFC is and adjust the text to match.

The remaining two problems are questions regarding the overall data model used for profile information: that all the data can be modeled as triples "schema-URN/setting-name -> scalar value".
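To make the toolkit idea concrete, here is a minimal sketch of the kind of merge step such a toolkit could perform. All names, the data layout, and the source precedence order are hypothetical (the draft specifies none of them); the point is only that the RFC would have to pin down the merge algorithms, or a language for expressing them, precisely enough that a registry like MERGE_ALGORITHMS below could be populated mechanically from the dataset schemas.

```python
# Hypothetical sketch of a profile dataset merge toolkit.
# Assumes each profile source yields a map {schema_urn: {setting: value}},
# and that each dataset schema can register a merge algorithm for its URN.

# Assumed precedence, lowest to highest; the draft does not fix this order.
SOURCE_ORDER = ["local-network", "device", "application", "user"]

# Registry of merge algorithms keyed by schema URN.  The RFC would need to
# enumerate the allowed algorithms (or define a language for writing them)
# for this table to be filled in without per-dataset programming.
MERGE_ALGORITHMS = {}

def register_merge(schema_urn, fn):
    MERGE_ALGORITHMS[schema_urn] = fn

def override_merge(low, high):
    """Simplest conceivable algorithm: the higher-priority setting wins."""
    merged = dict(low)
    merged.update(high)
    return merged

def merge_profiles(profiles_by_source):
    """profiles_by_source: {source: {schema_urn: {setting: value}}}.
    Returns one merged dataset per schema URN."""
    merged = {}
    for source in SOURCE_ORDER:
        for urn, dataset in profiles_by_source.get(source, {}).items():
            algo = MERGE_ALGORITHMS.get(urn, override_merge)
            merged[urn] = algo(merged.get(urn, {}), dataset)
    return merged
```

With only the default higher-priority-wins algorithm, a setting in the user profile overrides the same setting from the device profile; a real toolkit would plug in per-schema algorithms (e.g., set union or constrained override) taken from the dataset definitions themselves.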
3. The handling of non-scalar settings is not clear. In all UAs that I know of, some aspects of the configuration information are conceptually a "structure containing an array of structures". E.g., there are a number of "lines", and for each line the UA needs a set of data: user-part, domain, outgoing-proxy, auth-user, auth-password, registration-interval, etc. There is no clear mapping from this conceptual structure into the datatypes that seem to be envisioned: named scalars and named sets of scalars.

4. The handling of repeated datasets is not clear. In the framework, it seems to be assumed that each of the 4 sources of profiles (user, application, device, local-network) will provide a set of datasets, and that within each set, no two profiles will have the same schema URN. The schema URNs allow the UA to determine which datasets are to be merged with each other, resulting in a set of merged datasets, each of which has a unique schema URN. From the merged datasets, datasets that are not needed by the UA are discarded, and from those that are not discarded, the non-supported extensions are discarded. However, this system assumes that it would never be meaningful to have more than one (merged) dataset with the same schema URN. In practice, this means that if a UA can contain an "entity" that is configured by a dataset, no UA will ever want to contain more than one entity of that type, because there would be no way to provide configuration to each of the entities separately. This assumption of no duplication seems (to me) unlikely to hold in practice.
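To illustrate concern 3, here is a hypothetical sketch (not anything the draft specifies) of the mapping problem: per-line configuration is naturally an array of structures, and to express it as the envisioned named scalars it would have to be flattened somehow, e.g. by folding the array index into the setting name. The encoding shown is invented for illustration; the draft gives no rules for any such mapping.

```python
# Conceptual UA configuration: a structure containing an array of
# structures (one element per "line").  Values here are invented.
lines = [
    {"user-part": "alice", "domain": "example.com", "auth-user": "alice1"},
    {"user-part": "bob", "domain": "example.net", "auth-user": "bob1"},
]

# One conceivable (but unspecified) flattening: fold the array index into
# the setting name, yielding flat "setting-name -> scalar value" pairs.
flat = {}
for i, line in enumerate(lines, start=1):
    for name, value in line.items():
        flat["line[%d].%s" % (i, name)] = value
```

Without the RFC defining such an encoding (and its interaction with the merge algorithms), every dataset definition would invent its own, defeating the purpose of a common data model.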