SIP Forum                                                     H. Kaplan 
SFSIW-1 Paper                                               Acme Packet 
Intended status: Informational                        November 28, 2007 
    
    
            A Brief List of Common SIP Interoperability Issues 
                    draft-kaplan-sip-interop-issues-00 
    
    
Copyright Notice 
    
   Copyright (C) Hadriel Kaplan (2007). 
    
Abstract 
    
   This document identifies several commonly found interoperability 
   issues with SIP, for the purpose of stimulating discussion at the 
   first SIP Forum SIP Interoperability Workshop. 
 
Table of Contents 
    
   1.    Introduction................................................1 
   2.    Applicability...............................................2 
   3.    General Interoperability Issues.............................2 
      3.1.   Configurable settings..................................2 
      3.2.   Legacy RFCs and expired drafts usage...................2 
      3.3.   Response code issues...................................3 
      3.4.   SIP field lengths......................................4 
      3.5.   SIP and TEL URI formats................................4 
   4.    Specific Interoperability Issues............................5 
      4.1.   Offer-less Invites and Re-Invites......................5 
      4.2.   REGISTER response behavior.............................5 
      4.3.   DTMF Exchange methods..................................6 
      4.4.   IETF vs. 3GPP uses of Service-Route header.............6 
      4.5.   Competing NAT traversal techniques.....................6 
      4.6.   Call-hold signaling....................................7 
      4.7.   Early and on-hold media................................7 
   5.    References..................................................8 
   Author's Address..................................................9 
    
    
1. Introduction 
    
   SIP has grown both in terms of vendor/customer adoption and protocol 
   complexity, with numerous implementations, and differing 
   assumptions, leading to numerous interoperability issues.  Unlike 
   some other protocols, it suffers from a lack of either a single 
   dominant vendor, or of a single autocratic standards body.  The 
   large number of vendors involved, from different regions of the 
   world, and the differences in needs and wants of the customers of 
 
 
Kaplan                   Expires May 1, 2008                 [Page 1] 




                     SIP Interoperability Issues        November 2007 
 
 
   those vendors, have led to a complicated interoperability problem 
   space. 
    
   This paper is a brief list of some of the more common 
   interoperability issues my company has encountered recently.  It is 
   not an exhaustive list by any means, and while Session Border 
   Controller (SBC) products try to "fix" these issues, there may be 
   better ways of addressing them in the long term, so that they don't 
   need to be "repaired".  
    
    
2. Applicability 
    
   This draft is focused on SIP interop issues only.  Although interop 
   issues exist for SDP, MIME, RTP, RTCP, etc., they are out of scope 
   for this document. 
 
 
3. General Interoperability Issues 
    
3.1. Configurable settings 
    
   One of the most difficult challenges with interworking SIP devices 
   is the fact that so much of the protocol machinery and extension 
   usage is provisioned, vs. dynamically learned.  Most people think of 
   this in the context of endpoint configuration, but there is 
   significant provisioning performed on proxies, app servers, and 
   other middle-boxes - to the extent that one cannot simply say 
   "device X interoperates with device Y" without specifying the 
   configuration of each device at each end, and those in between. 
    
   This is not a new problem - HTTP and other protocols have similar 
   issues - but the number of hops a SIP message traverses, the variety 
   of device implementations, and the number of available extensions 
   are so much greater for SIP than for perhaps any protocol before it, 
   at such an early stage of the protocol's life, that they endanger 
   the adoption of SIP itself.  I am not sure this is really 
   addressable, short of defining very specific profiles a la the SIP 
   Forum SIPconnect spec. [Note that the attempts at doing so thus far 
   have not been completely plug-and-play, in my opinion, because they 
   still define/allow optional behavior.] 
 
3.2. Legacy RFCs and expired drafts usage 
    
   There are numerous cases where legacy (obsoleted) RFCs and expired 
   drafts end up in deployed systems and present interoperability 
   problems for newer systems.  While it is doubtful this is a fixable 
   problem, one wonders if, for the most common cases, it would not be 
   beneficial to document them so that newer systems are designed to 
   expect the legacy behavior as well as the new.  Ignoring legacy 
   usage has not seemed to succeed so far in general.  The problem is 
   that systems are deployed, and new vendors trying to sell products 
   need to make their systems work with the legacy ones, not the other 
   way around.   
    
   For example, the Diversion header outlined in [draft-diversion] is 
   far more prevalent than [RFC4244] History-Info, it seems.  The 
   problem with this type of legacy usage is one cannot merely support 
   receiving the older syntax, because one needs to figure out what to 
   generate as well, as a request gets forwarded (e.g., convert 
   Diversion to History-Info, or add Diversions, etc.).  
    
3.3. Response code issues  
    
   There are numerous reasons a given response code may be sent, and in 
   some cases more than one response code may be appropriate, which has 
   led to differing expectations and behaviors.  The need to resolve 
   such conflicts between domains of proxies has led to middle-boxes 
   changing the response codes, which may well exacerbate the problem 
   in the future. 
    
   In general the interoperability problems that arise are where the 
   upstream proxies or UAC perform automatic re-attempts to alternate 
   paths for certain response codes but not others, and such action 
   cannot be known in advance by the downstream device.  For example a 
   404 Not Found is commonly returned by a proxy when it cannot find a 
   route to the target for any number of reasons, and this response 
   causes some upstream nodes to try alternate paths and some not to.  
   Because a 404 can be returned for a variety of reasons, some of 
   which should cause a re-route and some not, some vendors send 
   different response codes than 404 for those conditions: response 
   codes which are more explicit about whether a re-route is the 
   appropriate action.   
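
   The upstream decision described above can be sketched as a small 
   policy table.  This is only an illustration in Python: the names 
   REROUTE_POLICY and should_reroute, and the table contents, are 
   assumptions and do not come from any RFC or product.  The point is 
   that the policy lives in the upstream node, so the downstream device 
   that picks the response code cannot see it. 

```python
# Hypothetical per-deployment policy: which final response codes make
# this upstream node try an alternate path. Values are illustrative.
REROUTE_POLICY = {
    404: True,   # some nodes re-route on 404, others do not
    480: False,
    486: False,
    503: True,
}


def should_reroute(status_code, policy=REROUTE_POLICY):
    """Return True if this node would try an alternate path."""
    if 200 <= status_code < 300:
        return False          # success: nothing to re-route
    if status_code >= 600:
        return False          # global failure: do not try elsewhere
    return policy.get(status_code, False)
```

   Two nodes running the same code with different tables will react 
   differently to the same 404, which is exactly the ambiguity the 
   downstream device faces. 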
    
   Another example is 503, which seems to cover everything from 
   temporary overload conditions to administrative-down state to 
   permanent failure, and also serves as a catch-all for anything not 
   easily identified by other codes.  Some devices treat this response 
   code as a semi-permanent condition of the next-hop, and avoid 
   sending any subsequent requests to that next-hop for a sustained 
   period of time, which may or may not be the correct action to take.  
   Unfortunately the upstream nodes have no idea which downstream proxy 
   actually generated the 503. 
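
   A minimal sketch of the "avoid the next-hop for a while" behavior 
   follows.  The class name NextHopTable, its methods, and the 30 
   second default are hypothetical; real devices differ in how long 
   they hold down a next-hop and in whether they look at Retry-After 
   at all. 

```python
class NextHopTable:
    """Tracks next hops that are temporarily avoided after a 503."""

    def __init__(self, default_hold_down=30.0):
        # default_hold_down is an assumed value, in seconds.
        self.default_hold_down = default_hold_down
        self._avoid_until = {}

    def on_503(self, next_hop, now, retry_after=None):
        # Use Retry-After when the response carried one; otherwise fall
        # back to the locally provisioned hold-down period.
        if retry_after is None:
            delay = self.default_hold_down
        else:
            delay = retry_after
        self._avoid_until[next_hop] = now + delay

    def usable(self, next_hop, now):
        return now >= self._avoid_until.get(next_hop, 0.0)
```

   Note that the table is keyed by the next-hop the request was sent 
   to, not by the node that generated the 503, because the upstream 
   node cannot tell the two apart. 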
    
   [See "REGISTER response behavior" section for related problems] 
    
3.4. SIP field lengths 
    
   While the RFCs do not define any maximum lengths for SIP header 
   fields (values, parameters, etc.), the reality of computing 
   technology is such that vendors often do feel compelled to impose 
   maximum lengths for received fields.  Whether it's due to security 
   concerns, product architecture, logging constraints, or whatever, 
   the fact is there are many systems which cannot or will not handle 
   fields as large as other systems can generate.  Although [RFC3261] 
   does define some specific response codes (413/414/513) for this 
   case, it does not fix the underlying interop issue.  Devices cannot 
   simply stop sending larger fields based on a SIP response code.   
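
   As an illustration, a receiver that imposes local limits might map 
   them onto the three response codes as sketched below.  The limit 
   values and the function name length_rejection are assumptions made 
   for this example; the RFCs define no such values. 

```python
# Assumed local limits, in bytes; the RFCs define no such values.
MAX_URI = 256
MAX_BODY = 4096
MAX_MESSAGE = 8192


def length_rejection(request_uri, body, raw_message):
    """Return the RFC 3261 status code for an over-long request, or None."""
    if len(raw_message) > MAX_MESSAGE:
        return 513  # Message Too Large
    if len(request_uri) > MAX_URI:
        return 414  # Request-URI Too Long
    if len(body) > MAX_BODY:
        return 413  # Request Entity Too Large
    return None
```

   The sender learns only that something was too long for this 
   particular receiver; the sketch has no way to say what length would 
   have been accepted, which is the gap a BCP could fill. 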
    
   This issue has been appearing more frequently lately, with the use 
   of embedded cookies in URIs and parameter growth.  I believe a BCP 
   may help here, if it can define some recommended minimum blob 
   lengths that any SIP device should be able to accept, for defined 
   and unknown blobs.  Customers can then demand that their vendors 
   comply with the BCP. 
    
3.5. SIP and TEL URI formats 
    
   Despite all RFC wording to the contrary, the SIP URI format has seen 
   widespread use as essentially the semantic equivalent of the TEL URI 
   [RFC3966], albeit with different syntax.  Many provider systems treat
   sip:16035551212@example.net as logically equivalent to 
   tel:+16035551212, even though the former has local scope to 
   example.net only, and the latter has global scope.  Part of the 
   reason for this, I believe, is that originating UA's have no real 
   way of knowing when a URI should be one or the other - the user 
   pressed digit buttons and hit "send", and all the UAC can do is send 
   the request to sip:[digits]@[local-domain].  It doesn't know the 
   numbers pressed were global in scope, or even E.164 numbers.  Only 
   the routing proxies know this, and even then they only know the 
   numbers they're each responsible for.  Thus we see the domain 
   portion of SIP URIs getting replaced by middle-boxes at provider 
   boundaries, if the username portion looks like an E.164 number.   
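
   The "looks like an E.164 number" treatment can be sketched as 
   follows.  The function name sip_to_tel and the digit-count rule are 
   assumptions for illustration only; real middle-boxes apply 
   provisioned dial plans rather than a single pattern. 

```python
import re

# Assumption: a user part of 8 to 15 digits, optionally preceded by
# "+", is treated as an E.164 number. Real deployments use dial plans.
_E164_USER = re.compile(r"^\+?([1-9][0-9]{7,14})$")


def sip_to_tel(uri):
    """Return a tel: URI for a SIP URI whose user part looks like E.164,
    or None when no such mapping applies."""
    match = re.match(r"^sips?:([^@;]+)@", uri)
    if match is None:
        return None
    digits = _E164_USER.match(match.group(1))
    if digits is None:
        return None
    return "tel:+" + digits.group(1)
```

   With this rule, sip:16035551212@example.net maps to 
   tel:+16035551212, while sip:alice@example.net is left alone.  The 
   mapping discards the domain, and with it the local scope that the 
   SIP URI formally carried. 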
    
   Furthermore, many systems have either been designed or provisioned 
   to handle only one scheme type (i.e., SIP URIs).  This has led to 
   cases where requests are rejected unless the appropriate URI scheme 
   is used, and frequently that single common scheme needs to be used 
   in more than just the request-URI (e.g., To and From URI's as well).  
   This wholesale replacement of schemes and domain names in URIs leads 
   to interop issues when the same URIs are expected to be used for 
   end-to-end purposes, in headers or XML bodies the middle-boxes do 
   not or cannot change.  The most recent example is [RFC4474] SIP 
   Identity, whose signature covers the From and To URIs and therefore 
   fails to verify once a middle-box has rewritten them. 
    
 
 
4. Specific Interoperability Issues 
    
4.1. Offer-less Invites and Re-Invites 
    
   Although this is clearly a device implementation issue (i.e., a 
   "bug"), we have seen numerous devices from different vendors have 
   trouble handling Invites or re-Invites without SDP.  For initial 
   Invites without SDP, often the root cause for failure is that 
   specific request routing or admission decision logic of intermediate 
   devices depends on the SDP; for example devices which route calls 
   based on codec, or bandwidth allocation devices, or 3PCC transcoding 
   devices which themselves send out offer-less Invites but didn't 
   expect to receive such. (Apparently they never considered that a 
   call could cross two such systems!) 
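
   A device whose routing or admission logic depends on SDP could at 
   least detect the offer-less case and fall back to a default path.  
   The check below is a rough sketch: the function name 
   is_offerless_invite is hypothetical, and the parsing ignores header 
   line folding and multipart bodies for brevity. 

```python
def is_offerless_invite(raw):
    """Return True if raw (a str) is an INVITE that carries no SDP body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    if not lines[0].startswith("INVITE "):
        return False
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # "c" is the compact form of Content-Type.
        if name.strip().lower() in ("content-type", "c"):
            if value.split(";")[0].strip().lower() == "application/sdp":
                return not body.strip()
    return True
```

   The same test applies to re-Invites, since they use the same method 
   name on the request line. 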
    
   For re-Invites, the delayed SDP offer model is performed for very 
   specific use cases which are common, but were simply not envisioned 
   by the developers of the UA's.   
    
   The only recommendation that this paper puts forth is that 
   offer-less Invite usage be specifically documented in a separate RFC 
   or BCP, in the hope that vendors will be aware of such usage and 
   customers will ask for compliance with the RFC explicitly.  Don't 
   hide this in a generic "Invite call flows BCP". 
    
4.2. REGISTER response behavior 
    
   Another form of the interop problems that arise from responses is 
   the behavior of UA's with regard to Registration and Subscribe 
   response handling.  For example, only a minority of UA's properly 
   support 3xx redirects for REGISTER, even though it would be a useful 
   mechanism for load-balancing.  For REGISTER requests specifically, 
   it would be beneficial if there were explicit documentation of what 
   actions should be performed by the UAC. 
    
   To reinforce this point, consider that UA's perform Registrations 
   and Subscriptions in a fairly automatic fashion with little user 
   interaction, and so the way in which they treat specific response 
   codes can have dramatic consequences.  For example, it is not 
   well-defined what a UA should do when its REGISTER is rejected with 
   a 404, or even a 503, and hardly any UA's honor the Retry-After 
   header.  A very few UA's will give up altogether and wait for user 
   input; some UA's will wait a few minutes and try again, 
   indefinitely; some will re-attempt their Registration almost 
   immediately and never give up.  This creates numerous problems in 
   large network deployments, and has led SBC vendors to implement 
   various protection schemes - from dynamic hardware ACLs, to even 
   sending a 200 OK just to shut the UA up. 
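
   One well-behaved alternative, shown purely as a sketch, is for the 
   UA to back off exponentially and never retry sooner than a received 
   Retry-After value.  The function name next_register_delay and the 
   base and ceiling values are assumptions for this example and are not 
   taken from any specification. 

```python
def next_register_delay(attempt, retry_after=None,
                        base=30.0, ceiling=1800.0):
    """Seconds to wait before re-sending a rejected REGISTER.

    attempt counts consecutive failures, starting at 1. base and
    ceiling are assumed values, not taken from any specification.
    """
    backoff = min(ceiling, base * (2 ** (attempt - 1)))
    if retry_after is not None:
        # Never retry sooner than the server asked.
        return max(backoff, float(retry_after))
    return backoff
```

   With these defaults the UA waits 30, 60, 120 seconds and so on, up 
   to a 30 minute ceiling, instead of retrying almost immediately. 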
 
 
4.3. DTMF Exchange methods 
    
   One of the most basic functions that consumers expect from "phone 
   calls" is DTMF, yet this still has interoperability issues in the 
   real world.  [RFC2833] defines how to perform DTMF notification in 
   the media-plane, while [RFC4730] KPML defines how to perform such in 
   the signaling-plane.  Unfortunately, the most common signaling-plane 
   mechanism we have found is exchanged in INFO messages - but it is 
   not documented in an RFC and lacks the ability to perform 
   negotiation of support. (Whether adding such support will succeed 
   remains to be seen.) 
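
   For reference, the media-plane form is compact: each [RFC2833] 
   telephone-event payload is four bytes carrying the event code, an 
   end bit, a reserved bit, a 6-bit volume, and a 16-bit duration.  The 
   sketch below packs one such payload; the function name 
   encode_dtmf_event and the default volume are assumptions, and the 
   surrounding RTP header is not shown. 

```python
import struct

# Event codes for DTMF digits in the telephone-event payload.
_EVENTS = {c: i for i, c in enumerate("0123456789*#ABCD")}


def encode_dtmf_event(digit, duration, end=False, volume=10):
    """Pack one telephone-event payload (4 bytes).

    duration is in RTP timestamp units; volume is 0..63 (in -dBm0).
    """
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags = (0x80 if end else 0x00) | volume   # E bit, R bit = 0, volume
    return struct.pack("!BBH", _EVENTS[digit], flags, duration)
```

   The INFO-based mechanisms carry the same information (digit and 
   duration) as a message body instead, but without an agreed format 
   or a way to negotiate it. 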
    
4.4. IETF vs. 3GPP uses of Service-Route header 
    
   While there are undoubtedly entire classes of interop problems 
   associated with the IETF vs. 3GPP/TISPAN models, only one is 
   mentioned here: [RFC3608] Service-Route.  The [RFC3608] mechanism 
   defined by the IETF leads an IETF-compliant UA to route requests 
   based on the received Service-Route header, but in 3GPP the Service-
   Route header does not include the P-CSCF first-hop proxy that must 
   actually be traversed. [note: and what's more, for IMS-AKA, the P-
   CSCF's port changes after Registration, so the Service-Route would 
   be wrong if it did include the P-CSCF] 
    
   Furthermore, [RFC3608] states that the Service-Route applies to 
   the entire Address-of-Record, which implies the same one for all 
   contacts of that AoR.  In specific load-balancing and visited 
   network scenarios, however, two registering contacts for the same 
   AoR may traverse two different sets of outbound Proxies or even 
   Registrars and need different Service-Routes per contact.  Some 
   Registrars, Proxies, and/or UA's comply with the RFC verbatim and 
   essentially break the path of one of the contacts. 
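
   The two deviations discussed in this section can be sketched 
   together: storing the learned Service-Route per contact rather than 
   per AoR, and optionally prepending a provisioned first hop that the 
   Service-Route does not list.  The class ServiceRouteStore and its 
   method names are hypothetical, and this is not the behavior 
   [RFC3608] itself specifies. 

```python
class ServiceRouteStore:
    """Stores the Service-Route learned at registration, per contact."""

    def __init__(self):
        self._routes = {}

    def learned(self, aor, contact, service_route):
        # Keyed by (AoR, contact) rather than by AoR alone, so two
        # contacts of one AoR can keep different paths.
        self._routes[(aor, contact)] = list(service_route)

    def route_set(self, aor, contact, first_hop=None):
        routes = list(self._routes.get((aor, contact), []))
        # first_hop models a provisioned outbound proxy (for example a
        # P-CSCF) that the Service-Route itself does not list.
        if first_hop is not None:
            routes.insert(0, first_hop)
        return routes
```

   An implementation that keys the store by AoR alone would overwrite 
   the first contact's route when the second contact registers, which 
   is the breakage described above. 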
 
4.5. Competing NAT traversal techniques 
    
   Until the mechanism in [sip-outbound] achieves widespread 
   deployment, vendors employ multiple techniques for NAT traversal of 
   SIP signaling, which can lead to interoperability problems.  Many 
   server-side vendors employ a REGISTER refresh approach, whereby the 
   UA is told a short REGISTER expires time in order to keep the NAT 
   pinhole open; client-side vendors instead attempt to auto-discover 
   that a NAT exists, by looking at the Via received parameter in 
   responses, or assuming a local RFC 1918 address means the UA is 
   behind a NAT, or by having user-settable check-boxes, and then send 
   either OPTIONS or CRLF or proprietary Methods to keep the pinhole 
   open.   
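
   The two automatic client-side heuristics can be sketched as below.  
   The function name looks_natted and its parameters are assumptions 
   for illustration; actual UA implementations vary. 

```python
import ipaddress


def looks_natted(sent_by_ip, received_ip=None):
    """Guess whether a UA is behind a NAT, using two common heuristics.

    sent_by_ip is the address the UA placed in its Via header;
    received_ip is the Via "received" parameter from the response.
    """
    if received_ip is not None and received_ip != sent_by_ip:
        return True
    # Fall back to assuming that a private local address implies a NAT.
    private_nets = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
    addr = ipaddress.ip_address(sent_by_ip)
    return any(addr in ipaddress.ip_network(n) for n in private_nets)
```

   The second heuristic is only a guess: a UA and server that share a 
   private network with no NAT between them are still reported as 
   NATed, which is one way the client and server can come to disagree. 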
    
   Unfortunately when the client and server-side implementations don't 
   agree, it sometimes leads to unexpected consequences, such as attack 
   detection and dynamic blacklisting on the server side, or failure of 
   media NAT traversal (e.g., if the server-side does not believe the 
   UA is behind a NAT because the UA corrects for the NAT itself in 
   signaling, but not in media).  In response to this, server-side 
   vendors have created counter-measures to make the UA not detect that 
   it's behind a NAT when it really is.  Hopefully [sip-outbound] will 
   do away with this continual arms race.  
    
4.6. Call-hold signaling 
    
   The legacy mechanism defined in [RFC2543] for call-hold by setting 
   the SDP connection address to 0.0.0.0 is unfortunately far from 
   obsolete in usage, despite the superior direction attribute concept 
   of [RFC3264].  To increase interoperability, some devices send both 
   types in the re-Invite, which defeats the purpose of using a 
   direction attribute (e.g., keeping RTCP flowing).  Other vendors 
   send the direction attribute first, and if the SDP answer does not 
   mirror it they use the legacy approach, which leads to extraneous 
   signaling overhead. An IETF recommendation/BCP for this is probably 
   warranted.  In hindsight [RFC3264] should have been backwards 
   compatible (e.g., still using the 0.0.0.0 syntax with some new 
   attribute for on-hold connection address, which would be ignored by 
   legacy devices but used by newer ones). [note: I recognize this is 
   SDP not SIP, but it's a big deal and was caused by [RFC2543]] 
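
   A receiver that wants to interoperate with both styles has to 
   recognize either one.  The sketch below does so for a single-stream 
   SDP; the function name is_hold_offer is hypothetical, and the code 
   deliberately ignores per-stream scoping of attributes and IPv6 
   connection addresses. 

```python
def is_hold_offer(sdp):
    """Return True if the SDP signals hold in either style."""
    legacy = False
    direction = "sendrecv"          # default when no attribute is present
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c=IN IP4 0.0.0.0"):
            legacy = True           # RFC 2543 style
        elif line in ("a=sendonly", "a=inactive",
                      "a=recvonly", "a=sendrecv"):
            direction = line[2:]    # RFC 3264 style
    return legacy or direction in ("sendonly", "inactive")
```

   An offer that carries both the 0.0.0.0 address and a direction 
   attribute is also reported as hold, but as noted above the zero 
   address leaves nowhere to send RTCP. 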
    
4.7. Early and on-hold media 
    
   Several issues with early-media were discussed in [stucker-early-
   media] and [stucker-middleboxes] which are yet to be resolved.  It 
   is not clear if there is consensus that there even is a problem, but 
   there is one.  The fact is that there are things that go bump in the 
   night (or bump in the wire, as it were).  There are constrained 
   network resources, issues with fraud, and general security concerns; 
   and architectures which "solve" these issues with gates.  What's 
   more, NATs themselves cause similar issues. 
    
   Furthermore, forking and on-hold scenarios have led to issues with 
   the media that is played.  For example one not-uncommon on-hold 
   scenario leads to a media server sending music RTP to the on-hold 
   party, which works fine in a closed environment but breaks down when 
   the call put on hold also traversed the PSTN or another domain, 
   whereby multiple parties end up sending music.  Some UA's choose one 
   stream to render, others play both simultaneously with poor results.  
   The issue, I believe, is that these media servers send media without 
   actually being part of the SDP offer/answer exchange as UA's, and 
   instead assume they can simply send media as an unidentified third 
   party (which is technically valid, but not realistically sound).  
   The "correct" thing for them to do I believe is to be true B2BUA's. 
    
 
 
5.   References 
    
   [RFC2543]  Handley, M., Schulzrinne, H., Schooler, E., and J. 
              Rosenberg, "SIP: Session Initiation Protocol", RFC 2543, 
              March 1999. 
    
   [RFC2833]  Schulzrinne, H., Petrack, S., "RTP Payload for DTMF 
              Digits, Telephony Tones and Telephony Signals", RFC 
              2833, May 2000. 
    
   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 
              A., Peterson, J., Sparks, R., Handley, M., and E. 
              Schooler, "SIP: Session Initiation Protocol", RFC 3261, 
              June 2002. 
    
   [RFC3264]  Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model 
              with the Session Description Protocol (SDP)", RFC 3264, 
              June 2002. 
    
   [RFC3608]  Willis, D., Hoeneisen, B., "Session Initiation Protocol 
              (SIP) Extension Header Field for Service Route Discovery 
              During Registration", RFC 3608, October 2003. 
    
   [RFC3966]  Schulzrinne, H., "The tel URI for Telephone Numbers", RFC 
              3966, December 2004. 
    
   [RFC4244]  Barnes, M., "An Extension to the Session Initiation 
              Protocol (SIP) for Request History Information", RFC 
              4244, November 2005. 
    
   [RFC4474]  Peterson, J., Jennings, C., "Enhancements for 
              Authenticated Identity Management in the Session 
              Initiation Protocol (SIP)", RFC 4474, August 2006. 
    
   [RFC4730]  Burger, E., Dolly, M., "A Session Initiation Protocol 
              (SIP) Event Package for Key Press Stimulus (KPML)", RFC 
              4730, November 2006. 
    
   [sip-outbound]  Jennings, C., Mahy, R., "Managing Client Initiated 
              Connections in the Session Initiation Protocol (SIP)", 
              draft-ietf-sip-outbound-11.txt, 2007. 
    
   [draft-diversion]  Levy, S., Yang, J.R., "Diversion Indication in 
              SIP", draft-levy-sip-diversion-08.txt, August 2004. 
    
   [stucker-early-media]  Stucker, B., "Coping with Early Media in the 
              Session Initiation Protocol (SIP)", draft-stucker-
              sipping-early-media-coping-03.txt, October 2006. 
 
 
   [stucker-middleboxes]  Stucker, B., Tschofenig, H., "Analysis of 
              Middlebox Interactions for Signaling Protocol 
              Communication along the Media Path", draft-sipping-
              stucker-media-path-middleboxes-00.txt, November 2007. 
    
 
Author's Address 
    
   Hadriel Kaplan 
   Acme Packet 
   71 Third Ave. 
   Burlington, MA 01803, USA 
   Email: hkaplan@acmepacket.com 