Frequent IPsec disconnects with 2.2


  • We have seen frequent disconnects on IPsec.
    Both sides are pfSense 2.2, the current configuration is:

    • IKEv2
    • Mutual RSA
    • AES 256 / SHA 384 / DH 14 (2048)
    • Lifetime 28800

    Multiple P2

    • ESP
    • AES256-CGM / auto
    • SHA 384 / PFS 14
    • Lifetime 3600

    The log below is from while we cannot ping hosts on the other end, and when the tunnel on the status page says "Connecting"

    Feb 1 09:16:53 	charon: 12[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:53 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:49 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:49 	charon: 12[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:45 	charon: 12[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:45 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:43 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:43 	charon: 12[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:37 	charon: 12[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:37 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:27 	charon: 13[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:27 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:24 	charon: 06[NET] sending packet: from 1.1.1.1[500] to 2.2.2.2[500] (440 bytes)
    Feb 1 09:16:24 	charon: 06[IKE] retransmit 4 of request with message ID 0
    Feb 1 09:16:24 	charon: 06[IKE] <con1|24> retransmit 4 of request with message ID 0
    Feb 1 09:16:22 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:22 	charon: 13[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:07 	charon: 13[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:16:07 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:16:00 	charon: 06[NET] sending packet: from 1.1.1.1[500] to 2.2.2.2[500] (440 bytes)
    Feb 1 09:16:00 	charon: 06[IKE] retransmit 3 of request with message ID 0
    Feb 1 09:16:00 	charon: 06[IKE] <con1|24> retransmit 3 of request with message ID 0
    Feb 1 09:15:58 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:58 	charon: 11[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:53 	charon: 11[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:53 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:49 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:49 	charon: 11[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:47 	charon: 11[NET] sending packet: from 1.1.1.1[500] to 2.2.2.2[500] (440 bytes)
    Feb 1 09:15:47 	charon: 11[IKE] retransmit 2 of request with message ID 0
    Feb 1 09:15:47 	charon: 11[IKE] <con1|24> retransmit 2 of request with message ID 0
    Feb 1 09:15:45 	charon: 11[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:45 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:40 	charon: 15[NET] sending packet: from 1.1.1.1[500] to 2.2.2.2[500] (440 bytes)
    Feb 1 09:15:40 	charon: 15[IKE] retransmit 1 of request with message ID 0
    Feb 1 09:15:40 	charon: 15[IKE] <con1|24> retransmit 1 of request with message ID 0
    Feb 1 09:15:39 	charon: 15[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:39 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:36 	charon: 06[CFG] ignoring acquire, connection attempt pending
    Feb 1 09:15:36 	charon: 15[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:36 	charon: 15[NET] sending packet: from 1.1.1.1[500] to 2.2.2.2[500] (440 bytes)
    Feb 1 09:15:36 	charon: 15[ENC] generating IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) ]
    Feb 1 09:15:36 	charon: 15[IKE] initiating IKE_SA con1[24] to 2.2.2.2
    Feb 1 09:15:36 	charon: 15[IKE] <con1|24> initiating IKE_SA con1[24] to 2.2.2.2
    Feb 1 09:15:36 	charon: 06[KNL] creating acquire job for policy 1.1.1.1/32|/0 === 2.2.2.2/32|/0 with reqid {1}
    Feb 1 09:15:36 	charon: 14[KNL] unable to delete SAD entry with SPI cd9cb978: No such file or directory (2)
    Feb 1 09:15:36 	charon: 14[KNL] unable to delete SAD entry with SPI 00000fca: No such file or directory (2)
    Feb 1 09:15:36 	charon: 14[IKE] giving up after 5 retransmits
    Feb 1 09:15:36 	charon: 14[IKE] <con1|21> giving up after 5 retransmits
    Feb 1 09:14:20 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (172 bytes)
    Feb 1 09:14:20 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (540 bytes)
    Feb 1 09:14:20 	charon: 16[IKE] retransmit 5 of request with message ID 2
    Feb 1 09:14:20 	charon: 16[IKE] <con1|21> retransmit 5 of request with message ID 2
    Feb 1 09:13:38 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (172 bytes)
    Feb 1 09:13:38 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (540 bytes)
    Feb 1 09:13:38 	charon: 16[IKE] retransmit 4 of request with message ID 2
    Feb 1 09:13:38 	charon: 16[IKE] <con1|21> retransmit 4 of request with message ID 2
    Feb 1 09:13:15 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (172 bytes)
    Feb 1 09:13:15 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (540 bytes)
    Feb 1 09:13:15 	charon: 16[IKE] retransmit 3 of request with message ID 2
    Feb 1 09:13:15 	charon: 16[IKE] <con1|21> retransmit 3 of request with message ID 2
    Feb 1 09:13:10 	charon: 16[KNL] creating rekey job for ESP CHILD_SA with SPI c7e7c5d1 and reqid {1}
    Feb 1 09:13:02 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (172 bytes)
    Feb 1 09:13:02 	charon: 16[NET] sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (540 bytes)
    Feb 1 09:13:02 	charon: 16[IKE] retransmit 2 of request with message ID 2
    Feb 1 09:13:02 	charon: 16[IKE] <con1|21> retransmit 2 of request with message ID 2</con1|21></con1|21></con1|21></con1|21></con1|21></con1|24></con1|24></con1|24></con1|24></con1|24>
    

  • Bump


  • Do you have "prefer old SAs" enabled on either side? Should be disabled on both.


  • This is disabled on both sides


  • Do you have logs from the opposite side at the same time as what you showed above? Doesn't seem to be anything too telling there, it was trying to negotiate and seemingly got now answer.

    Try adding a new tunable under System>Advanced, Tunables, for net.inet.ipsec.debug and set it to 1. That'll log some additional information to the system log that might be useful.

    What type of hardware is this? You using AES-NI?


  • I will do that, thanks.
    This is ESX 5.5 on both sides, AES-NI is enabled, but not used by pfSense AFAIK. (Processors would be capable)


  • Attached logs from both Firewall endpoints as requested and with the debug option = 1

    FW-LEFT.txt
    FW-RIGHT.txt


  • I see you have compression enabled on both sides can you check the value of net.inet.ipcomp.ipcomp_enable tunable?

    Just issue sysctl net.inet.ipcomp.ipcomp_enable and see its value.
    Since i see your compression enabled i expect that value to be 1 can you set that to 1 if its not and see if it improves situation?

    Also check if you have any settings for this on system->advanced->tunable


  • I actually disabled compression before this test and I had rebootet both hosts because I thought it could be the issue.

    It did not change the situation, I have the disconnects/timeouts with it being enabled or disabled.

    The current value for net.inet.ipcomp.ipcomp_enable is 0 on both hosts currently.
    No settings in Tunables.

    Should I still enable it again and make sure the value is 1?


  • Yes since hosts are still trying to use that!


  • OK, I have enabled it again and made sure the sysctl call returned 1.

    Will post another set of logs when it happens the next time


  • Attached another set of logs after a disconnect.
    This time with compression ON

    I can also see this on the console:
    ipcomp_output_cb: compressions was useless 104 - 20 <= 86

    1.1.1.1.txt
    2.2.2.2.txt