Packet analysis and software like Wireshark are powerful first-responder tools.
I recently had a customer open a support request. The primary complaint was that the corporate office could not call the branch office over the VoIP network.
The branch office could call the corporate office, and for 3-5 minutes afterwards the corporate office could call the branch. Otherwise, corporate could not call the branch office (the caller would hear 8 seconds of silence and then voice-mail).
As the customer's network engineer, I at first did what any sensible network engineer would do… I blamed the phone vendor! Especially since, at the exact same time phone calls were failing, the customer could ping, from the corporate office, the exact same branch office phone for which calls were failing!
However, I guess doing what’s in the customer's best interest got the best of me, so I enabled Cisco’s packet-capture feature on the customer’s corporate ASA firewall and captured traffic there to observe what a good call looked like…
In the good call trace above, notice Packet#9, Packet#10, and the packets that follow… this is the phone in the branch office responding with normal SIP call setup messages.
I then re-captured traffic on the firewall to observe what a bad call looked like…
In the bad call trace above, notice the absence of those normal SIP call setup messages? All we see in the bad call trace is the corporate office VoIP server trying to invite the branch office phone to a call.
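The good-versus-bad comparison can be sketched in a few lines of Python. This is only an illustration, not the actual capture: the `Packet` type, the IP addresses, the extension number, and the packet lists are all made up for the example. The one protocol fact it relies on is that SIP responses begin with `SIP/2.0` followed by a status code, while requests begin with a method name such as `INVITE`.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str
    first_line: str  # first line of the SIP message


# Hypothetical addresses, not taken from the real capture.
SERVER = "10.1.1.10"  # corporate VoIP server
PHONE = "10.2.2.20"   # branch office phone


def responses_from(packets, ip):
    """Return the SIP status lines sent by `ip` (responses start with 'SIP/2.0')."""
    return [p.first_line for p in packets
            if p.src == ip and p.first_line.startswith("SIP/2.0")]


# Made-up stand-ins for the two traces.
good_call = [
    Packet(SERVER, PHONE, "INVITE sip:201@10.2.2.20 SIP/2.0"),
    Packet(PHONE, SERVER, "SIP/2.0 100 Trying"),
    Packet(PHONE, SERVER, "SIP/2.0 180 Ringing"),
    Packet(PHONE, SERVER, "SIP/2.0 200 OK"),
]
bad_call = [
    Packet(SERVER, PHONE, "INVITE sip:201@10.2.2.20 SIP/2.0"),
    Packet(SERVER, PHONE, "INVITE sip:201@10.2.2.20 SIP/2.0"),  # retransmitted INVITE
]

print("good call responses:", responses_from(good_call, PHONE))
print("bad call responses: ", responses_from(bad_call, PHONE))
```

An empty list for the bad call is the same observation as in the trace: the server keeps inviting, and the phone never answers.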
With this network x-ray in hand, I unfortunately could no longer pretend this was the phone system vendor's problem, because it was clear the SIP Invite packet (Packet#5) was being sent by the VoIP server.
After doing another trace on the branch office firewall, I observed that the SIP Invite packet never showed up there.
Analyzing the corporate firewall configuration showed that an ACL for NATing non-VPN Internet-in traffic was being matched first, ahead of the VPN ACL that identifies traffic to be encrypted in the VPN tunnel bound for the branch office.
After moving those NAT ACLs below the VPN ACLs in the order of processing, the customer's issue was resolved.
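Why the order mattered can be shown with a toy first-match rule table. This is a deliberately simplified model, not how the ASA actually evaluates NAT and VPN ACLs internally. The subnets, rule names, and the `first_match` helper are assumptions made up for the illustration.

```python
from ipaddress import ip_address, ip_network


def first_match(rules, src, dst):
    """Return the action of the first rule whose source and destination networks match."""
    for action, src_net, dst_net in rules:
        if ip_address(src) in ip_network(src_net) and ip_address(dst) in ip_network(dst_net):
            return action
    return "no-match"


# Hypothetical rules: a broad NAT rule and a narrower VPN rule.
NAT_RULE = ("nat", "10.1.1.0/24", "0.0.0.0/0")
VPN_RULE = ("encrypt-in-vpn", "10.1.1.0/24", "10.2.2.0/24")

broken_order = [NAT_RULE, VPN_RULE]  # broad rule first: it shadows the VPN rule
fixed_order = [VPN_RULE, NAT_RULE]   # specific rule first

# Corporate VoIP server calling the branch phone (made-up addresses).
print(first_match(broken_order, "10.1.1.10", "10.2.2.20"))
print(first_match(fixed_order, "10.1.1.10", "10.2.2.20"))
```

With the broad rule listed first, traffic bound for the branch matches it and never reaches the VPN rule. Swapping the order lets the more specific rule win, while other traffic still falls through to NAT.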
The ASA default 3-minute timeout on SIP Invite connections was the reason for the somewhat intermittent problem where the corporate office could call the branch office for a short period of time after the branch office originated a call.
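The timing can be sketched as a simple window check. This is a rough model under a stated assumption: that the state created by a branch-originated call is what lets corporate-originated calls through, and that it disappears once the 3-minute timeout expires. The function name and the sample elapsed times are made up for illustration.

```python
SIP_INVITE_TIMEOUT_S = 3 * 60  # the 3-minute default mentioned above


def corporate_call_expected_to_work(seconds_since_branch_call,
                                    timeout_s=SIP_INVITE_TIMEOUT_S):
    """True while the state left by the branch-originated call is assumed to still exist."""
    return 0 <= seconds_since_branch_call < timeout_s


# Made-up elapsed times (seconds) since the branch office last called corporate.
for elapsed in (30, 170, 200, 600):
    print(elapsed, corporate_call_expected_to_work(elapsed))
```

The 3-5 minutes seen in practice is a little longer than this fixed window. That would be consistent with the timer restarting on later activity, but the sketch does not model that.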
Had Wireshark not been used as a first-responder tool, it’s almost certain the network engineer (me) would have blamed the VoIP vendor and vice versa. And the customer would have gone for days trying to resolve the issue and could have gotten a whopping bill from both vendors.