<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc toc="yes"?>
<rfc ipr="full3667" docName="draft-sparks-sip-nit-actions-03">

<front>

<title abbrev="SIP non-INVITE Actions">
Actions addressing identified issues with the Session Initiation Protocol's non-INVITE Transaction
</title>

<author initials="R." surname="Sparks" fullname="Robert J. Sparks">
 <organization>Xten</organization>
 <address>
  <postal>
   <street>5100 Tennyson Parkway</street>
   <street>Suite 1000</street>
   <city>Plano</city> <region>TX</region> <code>75024</code> 
  </postal>
  <email>RjS@xten.com</email>
  </address>
</author>

<date month="Jan" year="2005"/>

<abstract>
 <t>
  This draft describes modifications to the Session Initiation Protocol (SIP)
  to address problems that have been identified with the SIP
  non-INVITE transaction. These modifications reduce the probability
  of messages losing the race condition inherent in the non-INVITE
  transaction and reduce useless network traffic. They also improve
  the robustness of SIP networks when elements stop responding. These
  changes update behavior defined in RFC 3261.
 </t>
</abstract>

</front>
<middle>
<section title="Introduction">
<t>
There are a number of unpleasant edge conditions created by the
SIP non-INVITE transaction (NIT) model's fixed duration. The negative
aspects of some of these are exacerbated by the effect provisional
responses have on the non-INVITE transaction state machines. These
problems are documented in <xref target="I-D.sparks-sip-nit-problems"/>.
In summary:
<list>
  <t>A non-INVITE transaction must complete immediately or risk
     losing a race</t>
  <t>Losing the race will cause the requester to stop sending
     traffic to the responder (the responder will be temporarily
     blacklisted)</t>
  <t>Provisional responses can delay recovery from lost final
     responses</t>
  <t>The 408 response is useless for the non-INVITE transaction</t>
  <t>As non-INVITE transactions through N proxies time out,
     there can be an O(N^2) storm of the useless 408 responses</t>
</list>
</t>
<t>This draft specifies updates to <xref target="RFC3261">RFC 3261</xref>
to improve the behavior
of SIP elements when these edge conditions arise. 
</t>
</section>

<section title="Improving the situation when responses are only delayed">
<t>
  There are two goals to achieve when we constrain the problem to
  those cases where all elements are ultimately responsive and networks
  ultimately deliver messages:
  <list style="symbols">
     <t>Reduce the probability of losing the race, preferably to the point
        that it is negligible</t>
     <t>Reduce or eliminate useless messaging</t>
  </list>
</t>
<section title="Action 1: Make the best use of provisional responses">
<t><list style="symbols">
<t>Disallow non-100 provisionals to non-INVITE requests</t>
<t>Disallow 100 Trying to non-INVITE requests before Timer E reaches T2 (for UDP hops)</t>
<t>Allow 100 Trying after Timer E reaches T2 (for UDP hops)</t>
<t>Allow 100 Trying for hops over reliable transports</t>
</list></t>


<t> Since non-INVITE transactions must complete rapidly 
(<xref target="I-D.sparks-sip-nit-problems"/>), 
any information beyond "I'm here" (which can be provided
by a 100 Trying) can be just as usefully delayed to the final response. Sending
non-100 provisionals wastes bandwidth.</t>

<t>As shown in
<xref target="I-D.sparks-sip-nit-problems"/>, 
sending any provisional response inside
a NIT before Timer E reaches T2 damages recovery from failure of an unreliable
transport.</t>

<t> Without a provisional, a late final response is the same as no response at
all and will likely result in blacklisting the late responding element 
(<xref target="I-D.sparks-sip-nit-problems"/>). 
If an element is delaying its final response at all,
sending a 100 Trying after Timer E reaches T2 prevents this blacklisting
without damaging recovery from unreliable transport failure.  </t>
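<t>As a non-normative illustration, the point at which Timer E reaches T2
can be worked out from the retransmission schedule in
<xref target="RFC3261"/>: Timer E starts at T1 and doubles each time it fires,
capped at T2. The sketch below assumes that schedule; the function name
is hypothetical, and the defaults are the RFC 3261 default values
T1 = 500 ms and T2 = 4 s.</t>
<figure><artwork><![CDATA[
```python
def time_until_timer_e_reaches_t2(t1: float = 0.5, t2: float = 4.0) -> float:
    """Return seconds from the first request until Timer E is reset to T2.

    Hypothetical helper: models Timer E starting at T1 and doubling on
    each firing, capped at T2.
    """
    elapsed = 0.0
    interval = t1  # Timer E is initially set to T1
    while interval < t2:
        elapsed += interval               # Timer E fires; request is retransmitted
        interval = min(2 * interval, t2)  # timer is reset to twice its value, capped at T2
    return elapsed


if __name__ == "__main__":
    print(time_until_timer_e_reaches_t2())
```
]]></artwork></figure>
<t>With the default values, the function returns 3.5 seconds, so under
these assumptions a 100 Trying sent 3.5 seconds or more after the request
arrives does not interfere with the client's early retransmissions.</t>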

<t>Blacklisting on a late response occurs even over reliable transports. Thus,
if an element processing a request received over a reliable transport is
delaying its final response at all, sending a 100 Trying well in advance of the
timeout will prevent blacklisting. Sending a 100 Trying immediately will not
harm the transaction as it would over UDP, but a policy of always sending such
a message results in unnecessary traffic. A policy of sending a 100 Trying
after the period of time in which Timer E would have reached T2, had this been a UDP hop, is
one reasonable compromise.</t>
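<t>The four rules of Action 1 can be read as a single decision, sketched
below for illustration only. The function and parameter names are
hypothetical. The default threshold of 3.5 seconds is an assumption: it is
the time at which Timer E is first reset to T2 under the RFC 3261 default
values of T1 = 500 ms and T2 = 4 s.</t>
<figure><artwork><![CDATA[
```python
def may_send_provisional(status_code: int, reliable_transport: bool,
                         elapsed: float, e_reaches_t2: float = 3.5) -> bool:
    """Return True if this 1xx response to a non-INVITE request is allowed.

    elapsed      -- seconds since the request was received
    e_reaches_t2 -- assumed time for a client's Timer E to reach T2
    """
    if not 100 <= status_code <= 199:
        raise ValueError("not a provisional status code")
    if status_code != 100:
        return False  # non-100 provisionals are disallowed
    if reliable_transport:
        return True   # 100 Trying is allowed at any time over a reliable transport
    # Unreliable transport (e.g. UDP): only once Timer E has reached T2
    return elapsed >= e_reaches_t2


if __name__ == "__main__":
    print(may_send_provisional(100, reliable_transport=False, elapsed=1.0))
    print(may_send_provisional(100, reliable_transport=False, elapsed=4.0))
```
]]></artwork></figure>
<t>An element that follows the compromise policy described above would
apply the same threshold on reliable transports as well, even though an
earlier 100 Trying is permitted there.</t>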

</section>
<section title="Action 2: Remove the useless late-response storm">
<t><list style="symbols">
<t>Disallow 408 to non-INVITE requests</t>
<t>Absorb stray non-INVITE responses at proxies</t>
</list></t>
 <t> A 408 to a non-INVITE request will always arrive too late to be useful
     (<xref target="I-D.sparks-sip-nit-problems"/>).
     The client already has full knowledge
     of the timeout. The only information this message would convey
     is whether or not the server believed the transaction timed out.
     However, with the current design of the NIT, a client can't do
     anything with this knowledge. Thus, the 408 simply wastes
     network resources and contributes to the response bombardment
     illustrated in 
     <xref target="I-D.sparks-sip-nit-problems"/>. 
</t>
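<t>To give a rough sense of scale, the sketch below counts the messages
in such a storm under a deliberately simplified model. It is an assumption
of this sketch, not a statement from the problem analysis, that each of
the N proxies on the path generates exactly one 408 and that every proxy
closer to the client forwards it. The function name is hypothetical.</t>
<figure><artwork><![CDATA[
```python
def count_408_hops(n_proxies: int) -> int:
    """Count 408 message transmissions under the simplified model.

    Proxies are numbered 1..N starting from the one nearest the client.
    """
    total = 0
    for k in range(1, n_proxies + 1):
        total += k  # the 408 from proxy k crosses k hops to reach the client
    return total


if __name__ == "__main__":
    for n in (1, 4, 10):
        print(n, count_408_hops(n))
```
]]></artwork></figure>
<t>Under this model the total is N(N+1)/2, which grows quadratically
with the number of proxies.</t>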
 <t>
    Late non-INVITE responses by definition arrive after the client
    transaction's Timer F has fired and the client transaction has
    entered the Terminated state. Thus, these responses cannot be
    distinguished from strays. Changing the protocol behavior to
    prohibit forwarding non-INVITE stray responses stops the late
    response storm. It also improves the proxy's defenses against
    malicious users counting on the RFC 3261 requirement to forward
    such strays.
 </t>
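<t>A minimal sketch of the absorbing behavior follows, for illustration
only. The class, its attributes, and the transaction key are hypothetical;
the key stands in for however a real proxy matches a response to its
server transaction. INVITE handling is left unchanged, since this action
only concerns non-INVITE responses.</t>
<figure><artwork><![CDATA[
```python
from enum import Enum


class State(Enum):
    TRYING = "Trying"
    PROCEEDING = "Proceeding"
    COMPLETED = "Completed"
    TERMINATED = "Terminated"


class Proxy:
    """Toy transaction-stateful proxy that absorbs stray non-INVITE responses."""

    def __init__(self):
        self.server_transactions = {}  # hypothetical: transaction key -> State
        self.forwarded = []            # responses passed upstream, for inspection

    def on_response(self, key, method, status):
        """Return True if the response is forwarded, False if absorbed."""
        state = self.server_transactions.get(key)
        if method != "INVITE" and (state is None or state is State.TERMINATED):
            return False  # stray or late non-INVITE response: drop it
        self.forwarded.append((key, status))
        return True


if __name__ == "__main__":
    proxy = Proxy()
    proxy.server_transactions["txn-1"] = State.TERMINATED
    print(proxy.on_response("txn-1", "OPTIONS", 200))
```
]]></artwork></figure>
<t>Because a late response finds either no server transaction or one in
the Terminated state, it stops at the first proxy that receives it rather
than travelling the rest of the way to the client.</t>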
</section>
</section>
<section title="Improving the situation when an element is not going to respond">
<t>When we expand the scope of the problem to also deal with element or network
failure, we have more goals to achieve:
<list style="symbols">
<t>Identifying when an element is non-responsive</t>
<t>Minimizing or eliminating the false identification of responsive elements as non-responsive</t>
<t>Avoiding non-responsive elements with future requests</t>
</list></t>
<t>
Action 1 helps with the first two goals, dramatically improving an element's
ability to distinguish between failure and delayed response from the next 
downstream element. 
Some response, either provisional or final, will almost certainly be 
received before the transaction times out. So, an element can more safely
assume that no response at all indicates the peer is not available and follow
the existing requirements in <xref target="RFC3261"/> and <xref target="RFC3263"/>
for that case.
</t>
<t>
Achieving the third goal requires more aggressive changes to the protocol. As noted
in <xref target="I-D.sparks-sip-nit-problems"/>, future non-INVITE transactions are
likely to fail again unless the implementation takes steps beyond what is defined
in <xref target="RFC3261"/> and <xref target="RFC3263"/> to remember non-responsive
destinations between transactions. Standardizing these extra steps is left to future
work.
</t>
</section>
<section title="Normative Updates to RFC 3261">
<section title="Action 1">
<t>A SIP element MUST NOT send any provisional response with a 
   Status-Code other than 100 to a non-INVITE request.</t>
<t>A SIP element MUST NOT respond to a non-INVITE request with a 
   Status-Code of 100 
   over any unreliable transport, such as UDP, 
   before the amount of time it takes a client transaction's Timer E
   to be reset to T2.</t> 
<t>A SIP element MAY respond to a non-INVITE request with a Status-Code of 100
   over a reliable transport at any time.</t>
<t>Without regard to transport, a SIP element MUST respond to a 
   non-INVITE request with a 
   Status-Code of 100 if it has not otherwise responded after the amount of
   time it takes a client transaction's Timer E to be reset to T2.
   </t>
</section>
<section title="Action 2">
<t>A transaction-stateful SIP element MUST NOT send a response with 
   Status-Code of 408 to a non-INVITE request. As a consequence, an 
   element that cannot respond before the transaction 
   expires will not send a final response at all.</t>
<t>A transaction-stateful SIP proxy MUST NOT send any response to
   a non-INVITE request unless it has a matching server transaction
   that is not in the Terminated state. As a consequence, this proxy
   will not forward any "late" non-INVITE response.</t>
</section>
</section>
	
<section title="Security Considerations">
<t>
This document makes a number of small changes to the core SIP specification <xref target="RFC3261"/> to improve the robustness of SIP non-INVITE transactions. Many of these actions also prevent flooding and denial-of-service attacks.
</t><t>
One change prohibits proxies and User Agents from sending 408 responses to non-INVITE transactions. Without this change, proxies automatically generate a storm of useless responses as described in <xref target="I-D.sparks-sip-nit-problems"/>.  An attacker could capitalize on this by enticing User Agents to send non-INVITE requests to a black hole (through social engineering or DNS poisoning) or by selectively dropping responses.
</t><t>
Another change prohibits proxies from forwarding late responses. Without this change, an attacker could easily forge messages which appear to be late responses. All proxies compliant with RFC 3261 are required to forward these responses, wasting bandwidth and CPU and potentially overwhelming target User Agents (especially those with low speed connections).
</t><t>
The remainder of these changes do not affect the security of SIP.
</t>
</section>
	
<section title="IANA Considerations">
<t>
This document requires no action by IANA.
</t>
</section>

<section title="Contributors">
<t>
Rohan Mahy provided the Security Considerations section.
</t>
</section>
	
</middle>

<back>
<references title="References">
<?rfc include="../rfcrefs/reference.RFC.3261" ?>
<?rfc include="../rfcrefs/reference.RFC.3263" ?>
<?rfc include="reference.I-D.sparks-sip-nit-problems"?>
</references>
</back>
</rfc>
