<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc compact="yes"?>
<?rfc subcompact="yes"?>
<?rfc toc="yes"?>
<rfc ipr="full3667" docName="draft-ietf-sip-content-indirect-mech-04">
	<front>
		<title abbrev="Content Indirection in SIP Messages">
A Mechanism for Content Indirection in Session Initiation Protocol
(SIP) Messages
		</title>
		<author initials="D" surname="Willis" fullname="Dean Willis" role="editor">
			<organization>dynamicsoft Inc.</organization>
			<address>
				<email>dean.willis@softarmor.com</email>
				<uri>http://www.softarmor.com</uri>
			</address>		
		</author>				
		<date month="July" year="2004"/>
		
		<area>Transport</area>
		<workgroup>Session Initiation Protocol</workgroup>
		<keyword>indirect</keyword>
		<keyword>content</keyword>
		<keyword>I-D</keyword>
		<keyword>Internet-Draft</keyword>
		<keyword>SIP</keyword>
		<abstract>
			<t>This document proposes an extension to the URL MIME External-Body Access-Type to satisfy the content indirection requirements for SIP. These extensions are aimed at allowing any MIME part in a SIP message to be referred to indirectly via a URI.</t>
		</abstract>
	</front>
	<middle>
		<section title="Terminology">
			<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 <xref target="RFC2119"/>.
			</t>
		</section>
		<section title="Introduction">
			<t>The purpose of the Session Initiation Protocol <xref target="RFC3261"/> (SIP) is to create, modify, or terminate sessions with one or more participants. SIP messages, like HTTP, are syntactically composed of a start line, one or more headers, and an optional body. Unlike HTTP, SIP is not designed as a general-purpose transport of data.</t>
			<t>There are numerous reasons why it might be desirable to indirectly specify the content of the SIP message body. For bandwidth-limited applications such as cellular wireless, indirection provides a means to annotate the (indirect) content with metadata, which the recipient may use to determine whether or not to retrieve the content over the resource-limited link.</t>
			<t>The content to be transferred may also be large enough to overwhelm intermediate signaling proxies, thereby unnecessarily increasing network latency. For time-sensitive SIP applications, this may be unacceptable. Indirect content can remedy this by moving the transfer of this content out of the SIP signaling network and into a potentially separate data transfer channel.</t>
			<t>There may also be scenarios where the session-related data (body) that needs to be conveyed does not directly reside on the endpoint or User Agent. In such scenarios, it is desirable to have a mechanism whereby the SIP message can contain an indirect reference to the
   desired content. The receiving party would then use this indirect reference to retrieve the content via a non-SIP transfer channel such as HTTP, FTP, or LDAP.</t>
			<t>The purpose of content indirection is purely to provide an alternative transport mechanism for SIP MIME body parts.  With the exception of the transport mechanism, indirected body parts are equivalent, and should have the same treatment, as in-line body parts.</t>
			<t>Previous attempts at solving the content indirection problem made use of the text/uri-list <xref target="RFC2169"/> MIME type. While attractive for its simplicity (a list of URIs delimited by end-of-line markers), it fails to satisfy a number of the requirements for a more general-purpose
   content indirection mechanism in SIP. Most notably lacking is the ability to specify various attributes on a per-URI basis. These attributes might include version information, the MIME type of the referenced content, etc.</t>
			<t>RFC2017 <xref target="RFC2017"/> offers a strong candidate to replace the text/uri-list MIME type. It defines an extension to the message/external-body MIME type originally defined in RFC2046 <xref target="RFC2046"/>. The extension allows a generic URI to specify the location of the content, rather than the protocol-specific parameters for FTP, etc. originally defined in RFC2046. While providing most of the functionality needed for a SIP content indirection mechanism, RFC2017 by itself is not a complete solution. This document specifies the usage of RFC2017 necessary to fulfill the requirements outlined for content indirection.</t>
			<t>The requirements can be classified as applying either to the URI which indirectly references the desired content or to the content itself. Where possible, existing MIME parameters and entity headers are used to satisfy those requirements. MIME (Content-Type) parameters are the preferred manner of describing the URI, while entity headers are the preferred manner of describing the (indirect) content. See RFC 2045 <xref target="RFC2045"/> for a description of most of these entity headers and MIME parameters.</t>
		</section>
		<section title="Example Use Cases">
			<t>

   There are several example uses of such a content indirection
   mechanism. These are examples only and are not intended to limit the
   scope or applicability of the mechanism.
</t>
			<section title="Presence Notification">
				<t>
   The information carried in a presence document could potentially
   exceed the recommended size for a SIP (NOTIFY) request, particularly
   if the document carries aggregated information from multiple
   endpoints. In such a situation, it would be desirable to send the
   NOTIFY request with an indirect pointer to the presence document
   which could then be retrieved by, for example, HTTP.
</t>
				<figure title="Example information flow for presence notification">
					<artwork><![CDATA[

             Watcher                 Presence Server
                |                           |
                |         SUBSCRIBE         |
                |-------------------------->|
                |          200 OK           |
                |<--------------------------|
                |                           |
                |          NOTIFY           |
                |<--------------------------|
                |          200 OK           |
                |-------------------------->|
                |                           |
                |      NOTIFY (w/URI)       |
                |<--------------------------|
                |           200             |
                |-------------------------->|
                |                           |
                |         HTTP GET          |
                |-------------------------->|
                |                           |
                | application/cpim-pidf+xml |
                |<--------------------------|
                |                           |

]]></artwork>
				</figure>
				<t>

   In this example, the presence server returns an HTTP URI pointing to
   a presence document on the presence server which the watcher can then
   fetch using an HTTP GET.
</t>
			</section>
			<section title="Document Sharing">
				<t>
   During an instant messaging conversation, a useful service is
   document sharing wherein one party sends an IM (MESSAGE request) with
   an indirect pointer to a document which is meant to be rendered by
   the remote party. Carrying such a document directly in the MESSAGE
   request is not appropriate for most documents. Furthermore, the
   document to be shared may reside on a completely independent server
   from the originating party.
</t>
				<figure title="Example information flow for document sharing">
					<artwork><![CDATA[

               UAC                  UAS         Web Server
                |                    |                |
                |   MESSAGE w/URI    |                |
                |------------------->|                |
                |        200         |                |
                |<-------------------|                |
                |                    |                |
                |                    |    HTTP GET    |
                |                    |--------------->|
                |                    |   image/jpeg   |
                |                    |<---------------|
                |                    |                |

]]></artwork>
				</figure>
				<t>
   In this example, a user wishes to exchange a JPEG image that she has
   stored on her web server with another user with whom she is having
   an IM conversation. The JPEG is intended to be rendered inline in the
   IM conversation. The recipient of the MESSAGE request issues an HTTP
   GET request to the web server to retrieve the JPEG image.
</t>
			</section>
		</section>
		<section title="Requirements">
			<t>
				<list style="symbols">
					<t>
      It MUST be possible to specify the location of content via a URI. Such URIs MUST conform to RFC2396
      <xref target="RFC2396"/> or its successors, such as <xref target="I-D.fielding-uri-rfc2396bis"/>.</t>
					<t>
      It MUST be possible to specify the length of the indirect content.
</t>
					<t>

      It MUST be possible to specify the type of the indirect content.
</t>
					<t>

      It MUST be possible to specify the disposition of each URI
      independently.
</t>
					<t>It MUST be possible to label each URI to identify if and when the content referred to by that URI has changed. Applications of this mechanism may send the same URI more than once. The intention of this requirement is to allow the receiving party to determine if the content referenced by the URI has changed without having to actually retrieve that content. Example ways the URI could be labelled include a sequence number, timestamp, version number, etc. When used with HTTP, an entity-tag (ETAG) mechanism as defined in RFC2068 <xref target="RFC2068"/> may be appropriate. Note that we are not labeling the URI itself, but the content to which the URI refers, and that the label is therefore effectively "metadata" of the content itself.</t>
					<t>

      It MUST be possible to specify the timespan for which a given URI
      is valid. This may or may not be the same as the lifetime for the
      content itself.

</t>
					<t>
      It MUST be possible for the UAC and the UAS to indicate support of
      this content indirection mechanism. A fallback mechanism SHOULD be
      specified in the event that one of the parties is unable to
      support content indirection.

</t>
					<t>
      It MUST be possible for the UAC and UAS to negotiate the type of
      the indirect content when using the content indirection mechanism.

</t>
					<t>
      It MUST be possible for the UAC and UAS to negotiate support for
      URI scheme(s) to be used in the content indirection mechanism.
      This is in addition to the ability to negotiate the content type.

</t>
					<t>
      It SHOULD be possible to ensure the integrity and privacy of the
      URI when it is received by the remote party.

</t>
					<t>
      It MUST be possible to process the content indirection without
      human intervention.

</t>
					<t>
      It MUST allow for indirect transference of content in any SIP
      message which would otherwise carry that content as a body.
</t>
				</list>
			</t>
		</section>
		<section title="Application of RFC2017 to the Content Indirection Problem">
			<t>

   The following text describes the application of RFC2017 to the
   requirements for content indirection.
</t>
			<section title="Specifying support for content indirection">
				<t>
   A UAC/UAS may indicate support for content indirection through an
   Accept header containing the message/external-body MIME type. The
   UAC/UAS must supply additional values in the Accept header to
   indicate the content types that it is willing to accept either
   directly or through content indirection. User-Agents supporting
   content indirection MUST support content indirection of the
   application/sdp MIME type.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


         Accept: message/external-body, image/*, application/sdp
]]></artwork>
				</figure>
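				<t>As an illustration only, a sender might test a peer's Accept header for message/external-body before using content indirection. The following Python sketch assumes the Accept header value has already been extracted from the SIP message; the function names are hypothetical and not part of this specification.</t>
				<figure>
					<artwork><![CDATA[
```python
def accepted_media_types(accept_value):
    """Return the lowercased media ranges listed in an Accept header value."""
    ranges = []
    for item in accept_value.split(","):
        # Drop parameters such as ";q=0.8" and surrounding whitespace.
        media_range = item.split(";", 1)[0].strip().lower()
        if media_range:
            ranges.append(media_range)
    return ranges


def supports_content_indirection(accept_value):
    """True if the peer advertised message/external-body in Accept."""
    return "message/external-body" in accepted_media_types(accept_value)
```
]]></artwork>
				</figure>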
			</section>
			<section title="Mandatory support for HTTP URI">
				<t>
   Applications which use this content indirection mechanism MUST
   support at least the HTTP URI scheme. Additional URI schemes MAY be
   used, but a UAC/UAS MUST support receiving an HTTP URI for indirect
   content if it advertises support for content indirection.
</t>
				<t>
   The intention is to establish a baseline of support to further
   strengthen interoperability. Implementors may design for the most
   common case (HTTP) without having to worry about negotiation of
   support for this particular URI scheme.
</t>
			</section>
			<section title="Rejecting content indirection">
				<t>
   If a UAS receives a SIP request which contains a content indirection
   payload, and the UAS cannot or does not wish to support such a
   content type, it MUST reject the request with a 415 Unsupported Media
   Type response as defined in section 21.4.13 of SIP 
   <xref target="RFC3261"/>. In
   particular, the UAC should treat the absence of the
   message/external-body MIME type in the Accept header of this response
   as an indication that the UAS does not support content indirection.
</t>
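				<t>A minimal sketch of how a UAS might form such a rejection is shown below. The helper name and its argument are hypothetical, and the header fields that a real response copies from the request (Via, From, To, Call-ID, CSeq) are omitted for brevity.</t>
				<figure>
					<artwork><![CDATA[
```python
def build_415_response(supported_types):
    """Build the status line and Accept header of a 415 response.

    supported_types is a hypothetical list of the media types this UAS
    accepts. message/external-body is filtered out so that its absence
    tells the UAC that content indirection is not supported.
    """
    accept = [t for t in supported_types
              if t.lower() != "message/external-body"]
    lines = ["SIP/2.0 415 Unsupported Media Type"]
    if accept:
        lines.append("Accept: " + ", ".join(accept))
    # SIP header lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```
]]></artwork>
				</figure>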
			</section>
			<section title="Specifying the location of the content via a URI">
				<t>
   The URI for the indirect content is specified in a "URL" parameter of
   the message/external-body MIME type. An access-type parameter with
   the value "URL" indicates that the external content is referenced by
   a URI.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


         Content-Type: message/external-body;
                       access-type="URL";
                       URL="http://www.example.com/the-indirect-content"
]]></artwork>
				</figure>
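				<t>For illustration, a recipient could extract the URI with the MIME parameter parser in the Python standard library. This is a sketch under the assumption that the Content-Type header value is available as a string; the function name is hypothetical.</t>
				<figure>
					<artwork><![CDATA[
```python
from email.message import Message


def indirect_content_url(content_type_value):
    """Return the URL parameter of a message/external-body Content-Type.

    Returns None when the value is not an external-body reference
    whose access-type is "URL".
    """
    header = Message()
    header["Content-Type"] = content_type_value
    if header.get_content_type() != "message/external-body":
        return None
    # Parameter names are matched case-insensitively by get_param().
    access_type = header.get_param("access-type")
    if access_type is None or access_type.upper() != "URL":
        return None
    return header.get_param("URL")
```
]]></artwork>
				</figure>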
			</section>
			<section title="Specifying versioning information for the URI">
				<t> In order to determine whether or not the content indirectly
   referenced by the URI has changed, a Content-ID entity header is
   used. The syntax of this header is defined in RFC2045 
   <xref target="RFC2045"/>. Changes in
   the underlying content referred to by a URI MUST result in a change
   in the Content-ID associated with that URI. Multiple SIP messages
   carrying URIs that refer to the same content SHOULD reuse the same
   Content-ID to allow the receiver to cache this content and avoid
   unnecessary retrievals. The Content-ID is intended to be globally
   unique and SHOULD be temporally unique across SIP dialogs.</t>

			<figure>
					<artwork><![CDATA[
   For example:


         Content-ID: <4232423424@www.example.com>
]]></artwork>
				</figure>
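				<t>One way a receiver might use the Content-ID to avoid unnecessary retrievals is sketched below. The class and the caller-supplied fetch function are hypothetical; cache eviction and URI expiration are deliberately left out.</t>
				<figure>
					<artwork><![CDATA[
```python
class IndirectContentCache:
    """Cache retrieved bodies keyed by Content-ID."""

    def __init__(self, fetch):
        # fetch is a caller-supplied function mapping a URL to bytes.
        self._fetch = fetch
        self._bodies = {}

    def get(self, content_id, url):
        # An unchanged Content-ID means unchanged content, so the
        # cached copy is reused instead of retrieving it again.
        if content_id not in self._bodies:
            self._bodies[content_id] = self._fetch(url)
        return self._bodies[content_id]
```
]]></artwork>
				</figure>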
			</section>
			<section title="Specifying the lifetime of the URI">
				<t>
   The URI supplied in the Content-Type header is not required to be
   accessible or valid for an indefinite period of time.  Rather, the
   supplier of the URI MUST specify the time period for which this URI
   is valid and accessible.  This is done through an "EXPIRATION"
   parameter of the Content-Type.  The format of this expiration
   parameter is an RFC1123 date-time value.  This is further restricted
   in this application to use only GMT time, consistent with the Date:
   header in SIP.  This is a mandatory parameter. Note that the
   date-time value can range from minutes to days or even years.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


         Content-Type: message/external-body;
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT"

]]></artwork>
				</figure>
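				<t>A recipient might check the expiration parameter before attempting retrieval. The sketch below uses the date parser in the Python standard library; the function name is hypothetical, and treating a missing or unparseable value as expired is an assumption of this example rather than a requirement stated here.</t>
				<figure>
					<artwork><![CDATA[
```python
from datetime import datetime, timezone
from email.message import Message
from email.utils import parsedate_to_datetime


def uri_is_expired(content_type_value, now=None):
    """True if the expiration parameter is missing, invalid, or past."""
    header = Message()
    header["Content-Type"] = content_type_value
    expiration = header.get_param("expiration")
    if expiration is None:
        # Assumption: the parameter is mandatory, so absence is
        # handled the same way as an expired URI.
        return True
    try:
        expires_at = parsedate_to_datetime(expiration)
    except (TypeError, ValueError):
        return True
    if expires_at.tzinfo is None:
        # Assumption: a value without a usable zone is read as GMT.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now >= expires_at
```
]]></artwork>
				</figure>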
			</section>
			<section title="Specifying the type of the indirect content">
				<t>
   To support existing SIP mechanisms for the negotiation of content
   types, a Content-Type entity header SHOULD be present in the entity
   (payload) itself. If the protocol (scheme) of the URI supports its
   own content negotiation mechanisms (e.g. HTTP), this header may be
   omitted. The sender MUST however be prepared for the receiving party
   to reject content indirection if the receiver is unable to negotiate
   an appropriate MIME type using the underlying protocol for the URI
   scheme.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


         Content-Type: message/external-body; access-type="URL";
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                       URL="http://www.example.com/the-indirect-content"
         <CRLF>
         Content-Type: application/sdp
         <CRLF>

]]></artwork>
				</figure>
			</section>
			<section title="Specifying the size of the indirect content">
				<t>
   When known in advance, the size of the indirect content should be
   supplied via a size parameter on the Content-Type header. This is an
   extension of RFC2017 but in line with other access types defined for
   the message/external-body MIME type in RFC2046. The content size is
   useful for the receiving party to make a determination about whether
   or not to retrieve the content. As with directly supplied content, a
   UAS may return a 513 (Message Too Large) error response if the
   content size is too large. This is an optional parameter.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


         Content-Type: message/external-body; access-type="URL";
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                       URL="http://www.example.com/the-indirect-content";
                       size=4123

]]></artwork>
				</figure>
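				<t>The retrieval decision described above could look like the following sketch. The function names and the max_bytes limit are hypothetical, and allowing content of unknown size through is a local policy choice assumed for this example.</t>
				<figure>
					<artwork><![CDATA[
```python
from email.message import Message


def declared_size(content_type_value):
    """Return the size parameter as an int, or None if absent/invalid."""
    header = Message()
    header["Content-Type"] = content_type_value
    raw = header.get_param("size")
    try:
        size = int(raw)
    except (TypeError, ValueError):
        return None
    return size if size >= 0 else None


def should_retrieve(content_type_value, max_bytes):
    """Decide whether to fetch the indirect content.

    Assumption: content of unknown size is allowed through; a stricter
    receiver could refuse it instead.
    """
    size = declared_size(content_type_value)
    return size is None or size <= max_bytes
```
]]></artwork>
				</figure>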
			</section>
			<section title="Specifying the purpose of the indirect content">
				<t>
   A Content-Disposition entity header SHOULD be present for all
   indirect content. In the absence of an explicit
   Content-Disposition header, a content disposition of "session" should
   be assumed.
</t>
				<figure>
					<artwork><![CDATA[
   For example:

         Content-Type: message/external-body; access-type="URL";
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                       URL="http://www.example.com/the-indirect-content"
         <CRLF>
         Content-Type: image/jpeg
         Content-Disposition: render

]]></artwork>
				</figure>
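				<t>The default described above can be applied when reading the entity headers carried in the body of the message/external-body part. The following sketch assumes those headers are available as a string; the function name is hypothetical.</t>
				<figure>
					<artwork><![CDATA[
```python
from email import message_from_string


def effective_disposition(entity_headers):
    """Return the disposition of the indirect content.

    Falls back to "session" when no Content-Disposition is present.
    """
    headers = message_from_string(entity_headers)
    # get_content_disposition() returns the lowercased disposition
    # token without parameters, or None when the header is absent.
    return headers.get_content_disposition() or "session"
```
]]></artwork>
				</figure>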
			</section>
			<section title="Specifying multiple URIs for content indirection">
				<t>

   If there is a need to send multiple URIs for the purpose of content
   indirection, an appropriate multipart MIME type 
   <xref target="RFC2046"/> should be used.
   Each URI should be contained in a single entity. Indirect content may
   be mixed with directly supplied content. This is particularly useful
   with the multipart/alternative MIME type.
</t>
				<figure>
					<artwork><![CDATA[
   For example:


        MIME-Version: 1.0
        Content-Type: multipart/mixed; boundary=boundary42

        --boundary42
        Content-Type: text/plain; charset=us-ascii

        The company announcement for June, 2002 follows:
        --boundary42
        Content-Type: message/external-body;
                      access-type="URL";
                      expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                      URL="http://www.example.com/announcements/07242002";
                      size=4123

        Content-Type: text/html
        Content-Disposition: render

        --boundary42--

]]></artwork>
				</figure>
			</section>

			<section title="Specifying a hash value for the indirect content">
				<t>If the specific content being referenced by the indirection is known to the sender, and the sender wishes the recipient to be able to validate that this content has not been altered from that intended by the sender, the sender includes a SHA-1 <xref target="RFC3174"/> hash of the content. If included, the hash is encoded by extending the MIME syntax <xref target="RFC2046"/> to include a "hash" parameter for the content type "message/external-body", the value of which is a base-64 encoding of the hash.</t>
				<figure>
					<artwork><![CDATA[
   For example:

         Content-Type: message/external-body;
                       access-type="URL";
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                       URL="http://www.example.com/the-indirect-content.au";
                       size=52723;
                       hash=10AB568E91245681AC1B
         <CRLF>

]]></artwork>
				</figure>
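				<t>As a sketch of the computation described above, the hash value can be produced and checked with the Python standard library. The function names are hypothetical, and the example assumes the content is hashed exactly as the bytes that were retrieved, with no further canonicalization.</t>
				<figure>
					<artwork><![CDATA[
```python
import base64
import hashlib


def content_hash(content):
    """Return the base64-encoded SHA-1 digest of content (bytes)."""
    digest = hashlib.sha1(content).digest()
    return base64.b64encode(digest).decode("ascii")


def content_matches_hash(content, expected_hash):
    """Compare retrieved content against the hash parameter value."""
    return content_hash(content) == expected_hash
```
]]></artwork>
				</figure>
				<t>A SHA-1 digest is 20 octets long, so its base-64 encoding is always 28 characters including one padding character.</t>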

			</section>
			
			
			<section title="Supplying additional comments about the indirect content">
				<t>Optional, freeform text may be supplied to comment on the indirect content. This should be supplied in a Content-Description entity
header. This text may be displayed to the end user but MUST NOT be used by other elements to determine disposition of the body, as such usage would result in unreviewed extension to the Content-Type and Content-Disposition header field functions.</t>
				<figure>
					<artwork><![CDATA[
   For example:

         Content-Type: message/external-body;
                       access-type="URL";
                       expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                       URL="http://www.example.com/the-indirect-content";
                       size=52723
         <CRLF>
         Content-Description: Multicast gaming session
]]></artwork>
				</figure>
			</section>
			<section title="Relationship to Call-Info, Error-Info, and Alert-Info Headers">
				<t>
   SIP <xref target="RFC3261"/>
   defines three headers which are used to supply additional
   information with regard to a session, a particular error response, or
   alerting. All three of these headers allow the UAC or UAS to indicate
   additional information through a URI. They may be considered a form
   of content indirection. The content indirection mechanism defined in
   this document is not intended as a replacement for these headers.
   Rather, the headers defined in SIP MUST be used in preference to this
   mechanism where applicable because of the well defined semantics of
   those headers.
</t>
			</section>
		</section>
		<section title="Examples">
			<section title="Single Content Indirection">
				<figure>
					<artwork><![CDATA[

        INVITE sip:boromir@example.com SIP/2.0
        From: <sip:gandalf@nwt.com>;tag=347242
        To: <sip:boromir@example.com>
        Call-ID: 3573853342923422@nwt.com
        CSeq: 2131 INVITE
        Accept: message/external-body, application/sdp
        Content-Type: message/external-body;
                      ACCESS-TYPE=URL;
                      URL="http://www.nwt.com/party/06/2002/announcement";
                      EXPIRATION="Thu, 20 Jun 2002 12:00:00 GMT";
                      size=231
        Content-Length: ...

        Content-Type: application/sdp
        Content-Disposition: session
        Content-ID: <4e5562cd1214427d@nwt.com>

]]></artwork>
				</figure>
			</section>
			<section title="Multipart MIME with Content Indirection">
				<figure>
					<artwork><![CDATA[

        MESSAGE sip:boromir@example.com SIP/2.0
        From: <sip:gandalf@nwt.com>;tag=34589882
        To: <sip:boromir@example.com>
        Call-ID: 9242892442211117@nwt.com
        CSeq: 388 MESSAGE
        Accept: message/external-body, text/html, text/plain, 
                image/*, text/x-emoticon
        MIME-Version: 1.0
        Content-Type: multipart/mixed; boundary=zz993453

        --zz993453
        Content-Type: message/external-body;
                      access-type="URL";
                      expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                      URL="http://www.nwt.com/company_picnic/image1.png";
                      size=234422

        Content-Type: image/png
        Content-ID: <9535035333@nwt.com>
        Content-Disposition: render
        Content-Description: Kevin getting dunked in the wading pool

        --zz993453
        Content-Type: message/external-body;
                      access-type="URL";
                      expiration="Mon, 24 Jun 2002 09:00:00 GMT";
                      URL="http://www.nwt.com/company_picnic/image2.png";
                      size=233811

        Content-Type: image/png
        Content-ID: <1134299224244@nwt.com>
        Content-Disposition: render
        Content-Description: Peter on his tricycle

        --zz993453--


]]></artwork>
				</figure>
			</section>
		</section>
		<section title="Security Considerations">
			<t>Any content indirection mechanism introduces additional security concerns. By its nature, content indirection requires an extra processing step and information transfer. There are a number of potential abuses of a content indirection mechanism:</t> 

			<t>
				<list style="symbols">
					<t>Content indirection allows the initiator to choose an alternative protocol with weaker security or known vulnerabilities for the content transfer. For example, asking the recipient to issue an HTTP request which results in a Basic authentication challenge.</t>
					<t>Content indirection allows the initiator to ask the recipient to consume additional resources in the information transfer and content processing, potentially creating an avenue for denial of service attacks. For example, an active FTP URL consuming 2 connections for every indirect content message.</t>
					<t>Content indirection could be used as a form of port scanning attack where the indirect content URL is actually a bogus URL pointing to an internal resource of the recipient. The response to the content indirection request could reveal information about open (and vulnerable) ports on these internal resources.</t>
					<t>A content indirection URL can disclose sensitive information about the initiator such as an internal user name (as part of an HTTP
      URL) or possibly geolocation information.</t>
				</list>
			</t>
			
			<t>Fortunately, all of these potential threats can be mitigated through careful screening of both the indirect content URIs that are received
   as well as those that are sent. Integrity and privacy protection of the indirect content URI can prevent additional attacks as well.</t>
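			<t>The kind of screening meant here might be sketched as follows. The allowed scheme list and the function name are hypothetical local policy, not requirements of this document, and the example does not resolve host names, so a name that maps to an internal address would need a separate check after resolution.</t>
			<figure>
				<artwork><![CDATA[
```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: local policy permits only these schemes.
ALLOWED_SCHEMES = {"http", "https"}


def url_is_acceptable(url):
    """Screen an indirect content URL before retrieving it."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if not host:
        return False
    if parts.username or parts.password:
        # User information embedded in the URL is refused.
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; only the obvious local name is refused.
        return host.lower() != "localhost"
    # Refuse literals that point at internal resources.
    return not (address.is_private or address.is_loopback
                or address.is_link_local)
```
]]></artwork>
			</figure>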
   
			<t>For confidentiality, integrity, and authentication, this content indirection mechanism relies on the security mechanisms outlined in    RFC3261. In particular, the usage of S/MIME as defined in section 23 of RFC3261 provides the necessary mechanism to ensure integrity
   protection and privacy of the indirect content URI and associated parameters.</t>
   
			<t>Securing the transfer of the indirect content is the responsibility of the underlying protocol used for this transfer. If HTTP is used, applications implementing this content indirection method MUST support the HTTPS URI scheme for secure transfer of content and must support the upgrading of connections to TLS using starttls. Note that a failure to complete HTTPS or starttls (for example, due to a certificate or encryption mismatch) after having accepted the indirect content in the SIP request is not the same as rejecting the SIP request, and may require additional user-user communication for correction.</t>
			
			<t>Access control to the content referenced by the URI is not defined by this specification. Access control mechanisms may be defined by the protocol for the scheme of the indirect content URI.</t>

			<t>If the UAC knows the content in advance, the UAC SHOULD include a hash parameter in the content indirection. The hash parameter is a base64-encoded SHA-1 <xref target="RFC3174"/> hash of the indirected content. If a hash value is included, the recipient MUST check the indirect content against that hash and indicate any mismatch to the user.</t>
	
			<t>In addition, if the hash parameter is included, and the target URI involves setting up a security context using certificates, the UAS MUST ignore the results of the certificate validation procedure, and instead verify that the hash of the (canonicalized) content received matches the hash presented in the content-indirection hash parameter.</t>

			<t>If the hash parameter is NOT included, the sender SHOULD use only schemes which offer message integrity (such as https:). When the hash parameter is not included and security using certificates is used, the UAS MUST verify any server certificates using the UAS's list of trusted top-level certificate authorities.</t>

			<t>If hashing of indirected content is not used, the possibility exists that the content returned to the recipient by exercise of the indirection has been altered from that intended by the sender.</t>

		</section>
		
		<section title="IANA Considerations">
			<t>This document raises no new IANA considerations.</t>
		</section>
	
	    <section title="Contributions">
			<t>The vast majority of this document, including editorship through the first IESG review, was provided by Sean Olson, seanol@microsoft.com.</t>
			    </section>
	</middle>

	<back>
		<references title="Normative References">
			<?rfc include="reference.RFC.2017" ?>
			<?rfc include="reference.RFC.2045" ?>
			<?rfc include="reference.RFC.2046" ?>
			<?rfc include="reference.RFC.2068" ?>
			<?rfc include="reference.RFC.2119" ?>
			<?rfc include="reference.RFC.2169" ?>
			<?rfc include="reference.RFC.2396" ?>
            <?rfc include="reference.RFC.3174" ?>
			<?rfc include="reference.RFC.3261" ?>
			
		</references>
		<references title="Informative References">
			<?rfc include="reference.I-D.fielding-uri-rfc2396bis" ?>
		</references>
	</back>
</rfc>
