Tuesday, March 4, 2014

Another Day in the Trenches


Wow, the past couple of days I've been having really strange issues between the DataCenter where the hosted VoIP servers live and customer VPNs.

This morning I have a customer VPN go down (it has been up and running for over 6 months with no problems), so I do some verification on both sides (I’m able to SSH into the customer firewall)…

DataCenter: 
# show crypto isakmp sa
AM_WAIT_MSG3

Customer: 
# show crypto isakmp sa
AM_WAIT_MSG2

So basically the DataCenter firewall received the initial DH public key sent to it and responded, but the response did not get to the customer's firewall.
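The reasoning above can be sketched as a small lookup. This is only an illustration: the state names come from the outputs above, the descriptions assume the usual three-message aggressive mode exchange, and the helper names (`AM_STATES`, `describe`, `diagnose`) are made up for this post.

```python
# Hypothetical helper, not a Cisco tool. State names are taken from the
# 'show crypto isakmp sa' outputs above; the meanings assume the standard
# three-message IKE aggressive mode exchange.
AM_STATES = {
    "AM_WAIT_MSG2": ("initiator", "sent message 1, waiting for message 2"),
    "AM_WAIT_MSG3": ("responder", "got message 1, sent message 2, waiting for message 3"),
    "AM_ACTIVE": ("either", "phase 1 negotiation complete"),
}


def describe(device, state):
    """Return a one-line explanation of a device's phase 1 state."""
    role, meaning = AM_STATES.get(state, ("unknown", "state not in this table"))
    return f"{device}: {state} -> role={role}; {meaning}"


def diagnose(state_a, state_b):
    """Guess which message is going missing from the pair of states."""
    states = {state_a, state_b}
    if states == {"AM_WAIT_MSG2", "AM_WAIT_MSG3"}:
        return "message 2 (responder -> initiator) is not arriving or not being accepted"
    if states == {"AM_ACTIVE"}:
        return "phase 1 is up on both sides"
    return "no rule for this combination"


print(describe("DataCenter", "AM_WAIT_MSG3"))
print(describe("Customer", "AM_WAIT_MSG2"))
print(diagnose("AM_WAIT_MSG3", "AM_WAIT_MSG2"))
```

With the two states seen this morning, the sketch points at the responder-to-initiator direction, which matches the DataCenter being unable to reach the customer firewall.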

So I do some ping tests…

 

DataCenter Firewall is unable to ping the Customer Firewall.

But DataCenter Firewall CAN ping the Customer default gateway (and vice versa).

From a 3rd off-site location (my remote office) I am able to ping ALL IPs!!

So the issue is very unusual and only exists between DataCenter firewall and Customer firewall.
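To make that conclusion a bit more mechanical, here is a rough sketch that takes the ping results described above and flags failures where both endpoints still work with some other peer. The host labels (`dc_fw`, `cust_fw`, `cust_gw`, `remote_office`) are shorthand invented for this sketch, and the "vice versa" test is left out because its exact endpoints are not spelled out.

```python
# Ping results from the tests above: (source, destination) -> reachable?
# Labels are hypothetical shorthand, not real hostnames.
results = {
    ("dc_fw", "cust_fw"): False,
    ("dc_fw", "cust_gw"): True,
    ("remote_office", "dc_fw"): True,
    ("remote_office", "cust_fw"): True,
    ("remote_office", "cust_gw"): True,
}


def classify(results):
    """Label each failed ping as path-specific or inconclusive."""
    findings = []
    for (src, dst), ok in results.items():
        if ok:
            continue
        # Is the destination reachable from any other source?
        dst_alive = any(r for (s, d), r in results.items() if d == dst and s != src)
        # Can the source reach any other destination?
        src_alive = any(r for (s, d), r in results.items() if s == src and d != dst)
        if dst_alive and src_alive:
            verdict = "path-specific: both ends work with other peers"
        else:
            verdict = "inconclusive: an endpoint may be down"
        findings.append((src, dst, verdict))
    return findings


for src, dst, verdict in classify(results):
    print(f"{src} -> {dst}: {verdict}")
```

The only failure is DataCenter firewall to customer firewall, and both ends answer other peers, so the problem sits somewhere on the path between those two specific IPs.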

So I call Brighthouse, and the second the technician picks up the phone it starts working again.

Nobody can explain why.

Monday, March 3, 2014

Encrypt, Decrypt, Ucrypt, WeAllcrypt

 

Wow, very strange VPN-related issue today. A customer opens a trouble ticket and complains that network connectivity is failing (VoIP phones failing to register with the hosted CME through the VPN tunnel).

Hmm… let’s see if we can ping the outside interface of the customer firewall. Yup no problem there.

OK… well let’s see if we can SSH to the customer’s firewall – nope, SSH times out. Strange. Can ping, cannot SSH. So I gain access to the customer’s firewall inside interface through an alternate network connection and reload it. After the reload I can successfully SSH. Strange.

OK… let’s see if IKE PHASE1 SA is up… ‘show crypto isakmp sa’. Yup, we got PHASE1 ‘AM_ACTIVE’ on both sides.

OK… let’s clear the IPSEC SA counters and then see if IPSEC is encrypting and decrypting on both sides. Nope, we got some problems…

Data-Center:
#pkts encaps: 276
#pkts decaps: 272

Customer:
#pkts encaps: 354
#pkts decaps: 0

How weird… the Data-Center shows it encapping but the customer firewall shows it’s not receiving those encaps (decaps are zero).

How can that be?!? PHASE1 SA is up and PHASE2 SA is up but customer side is not decapping?
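The counter comparison can be sketched in a few lines. This is just an illustration that parses the trimmed counter lines as pasted above (not the full CLI output), and the function names `parse_counters` and `lost_direction` are made up for this post.

```python
import re


def parse_counters(text):
    """Pull the '#pkts encaps' / '#pkts decaps' values out of pasted output."""
    pairs = re.findall(r"#pkts (encaps|decaps):\s*(\d+)", text)
    return {name: int(value) for name, value in pairs}


def lost_direction(local_name, local, remote_name, remote):
    """List directions where one side encapsulates but the peer decapsulates nothing."""
    notes = []
    if local["encaps"] > 0 and remote["decaps"] == 0:
        notes.append(f"{local_name} -> {remote_name}: {local['encaps']} encapsulated, 0 decapsulated")
    if remote["encaps"] > 0 and local["decaps"] == 0:
        notes.append(f"{remote_name} -> {local_name}: {remote['encaps']} encapsulated, 0 decapsulated")
    return notes


# Counter values copied from the outputs above.
datacenter = parse_counters("#pkts encaps: 276\n#pkts decaps: 272")
customer = parse_counters("#pkts encaps: 354\n#pkts decaps: 0")

for note in lost_direction("Data-Center", datacenter, "Customer", customer):
    print(note)
```

Only the Data-Center to Customer direction gets flagged: the customer's packets are arriving and being decapsulated at the Data-Center, but nothing coming back is being decapsulated on the customer side.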

Let’s pull a Wireshark capture on the Data-Center outside interface and see what we see…

Welp we are in fact sending AND receiving IPSEC ESP traffic to and from both ends.

So why is the customer firewall not decapping? Good question. Don’t know. And don’t have a Cisco service contract so can’t call TAC.

The solution was to build an L2L IPSEC tunnel on a secondary firewall at the customer’s location.