IPsec tunnel error



  • Hello everyone,

    Context:
    Professional environment
    Expertise: I am not a beginner in networking or in using Unix systems. I am Red Hat and Windows certified.
    I use Windows, Linux and FreeBSD systems almost daily.
    I have been using pfSense for 2 years personally and for 1 year professionally (in production).
    I am not an expert, but I understand the subject and what I am doing.

    Feel free to correct me if my understanding is wrong, we are only human after all :)

    Need:
    For several weeks I have been running into a problem with an IPsec configuration between 2 sites.
    Both internet connections are stable with good response times (around 20-30 ms).
    The tunnel comes up and is stable, and rekeying happens without any problem.
    The routers are installed in a production environment and are used to improve availability between the 2 sites.
    Router B is used with 5 other pfSense routers running version 2.3.3 release p1 with the same type of IPsec configuration.
    Router B is the central point of 'the web'.

    As far as I can tell, the problem lies at the ESP protocol level.
    The SAD entries used/kept on router B are sometimes too numerous for a single configuration.
    Forcing a restart of a tunnel that has too many entries (say, the tunnel from site C to B)
    resolves the problem on my tunnel between sites A and B.
    In the logs I find the following at random for the 6 tunnels (in this case configuration no. 6):

    <con6|2534>unable to query SAD entry with SPI c69555cb: No such file or directory (2)
    <con6|2534>unable to delete SAD entry with SPI c69555cb: No such file or directory (2)

    Diagram:

    Router A                     Router B
      (X)---------internet---------(X)

    Router A configuration

    version 2.3.3 release p1
    wan: 91.X.X.X, PPPoE connection provided by the ISP
    lan: 192.168.50.0/24

    IPsec configuration:
    conn con3
            fragmentation = yes
            keyexchange = ikev2
            reauth = yes
            forceencaps = no
            mobike = no

            rekey = yes
            installpolicy = yes
            type = tunnel
            dpdaction = restart
            dpddelay = 10s
            dpdtimeout = 60s
            auto = route
            left = 109.Y.Y.Y
            right = 91.X.X.X
            leftid = 109.Y.Y.Y
            ikelifetime = 28800s
            lifetime = 3600s
            ike = aes256-sha1-modp1024!
            esp = aes256-sha1,aes192-sha1,aes128-sha1!
            leftauth = psk
            rightauth = psk
            rightid = 91.X.X.X
            rightsubnet = 192.168.50.0/24
            leftsubnet = 192.168.10.0/24

    Router B configuration

    version 2.3.3 release p1
    wan: 109.Y.Y.Y, static IP (/30) provided by the ISP
    lan: 192.168.10.0/24

    IPsec configuration:
    conn con1
            fragmentation = yes
            keyexchange = ikev2
            reauth = yes
            forceencaps = no
            mobike = no

            rekey = yes
            installpolicy = yes
            type = tunnel
            dpdaction = restart
            dpddelay = 10s
            dpdtimeout = 60s
            auto = route
            left = 91.X.X.X
            right = 109.Y.Y.Y
            leftid = 91.X.X.X
            ikelifetime = 28800s
            lifetime = 3600s
            ike = aes256-sha1-modp1024!
            esp = aes256-sha1,aes192-sha1,aes128-sha1!
            leftauth = psk
            rightauth = psk
            rightid = 109.Y.Y.Y
            rightsubnet = 192.168.10.0/24
            leftsubnet = 192.168.50.0/24

    Question:
    Is my configuration correct? How can I improve it to reduce or even resolve my problem?

    Is it possible to tune/configure/manage the number of SAD entries?
    If so, within what limits? And is there a limit at all?
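
    To put a number on "too many", here is a minimal sketch that counts SAD entries per source/destination pair. It assumes the text was saved from `setkey -D` on the router and that it follows the usual FreeBSD layout (an unindented "src dst" line, then indented detail lines, one of which contains `spi=`). The sample text and the function name `count_sad_entries` are illustrative only, not real output from my routers.

```python
from collections import Counter


def count_sad_entries(setkey_output):
    """Count SAD entries per (src, dst) pair in `setkey -D` style text.

    Assumed layout: an unindented line "src dst" opens an entry, and one of
    the indented lines that follow carries "spi=".
    """
    counts = Counter()
    pair = None
    for line in setkey_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: expected to be "src dst".
            fields = line.split()
            pair = (fields[0], fields[1]) if len(fields) == 2 else None
        elif pair is not None and "spi=" in line:
            counts[pair] += 1
    return counts


# Illustrative sample (made-up SPIs, addresses masked as in the post).
SAMPLE = "\n".join([
    "109.Y.Y.Y 91.X.X.X",
    "\tesp mode=tunnel spi=3405691582(0xcafebabe) reqid=17(0x00000011)",
    "\tseq=0x00000000 replay=4 flags=0x00000000 state=mature",
    "109.Y.Y.Y 91.X.X.X",
    "\tesp mode=tunnel spi=3735928559(0xdeadbeef) reqid=17(0x00000011)",
    "\tseq=0x00000000 replay=4 flags=0x00000000 state=dying",
    "91.X.X.X 109.Y.Y.Y",
    "\tesp mode=tunnel spi=4096(0x00001000) reqid=17(0x00000011)",
    "\tseq=0x00000000 replay=4 flags=0x00000000 state=mature",
])

for pair, n in count_sad_entries(SAMPLE).most_common():
    print(pair, n)
```

    Running this periodically against saved output would show which peer accumulates entries over time; it only measures the symptom and does not answer whether a limit exists.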

    For now the workaround is easy: I just force a restart of every tunnel on router B via
    a cron script run outside office hours, but that does not solve the problem.
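
    For reference, the workaround can be sketched roughly as follows. This is not my exact script: it assumes the strongSwan `ipsec down <name>` / `ipsec up <name>` commands are available on the router, and the connection list is a placeholder built from the names that appear in this post (con3, con6); the other tunnels would have to be added.

```python
import subprocess

# Placeholder list: only the names visible in this post. Adjust to the
# connection names actually present on the router.
CONNECTIONS = ["con3", "con6"]


def restart_commands(conn_names):
    """Build the command lines that take each tunnel down, then up again."""
    commands = []
    for name in conn_names:
        commands.append(["ipsec", "down", name])
        commands.append(["ipsec", "up", name])
    return commands


def restart_tunnels(conn_names, run=subprocess.call):
    """Run every restart command and return (argv, exit_code) pairs.

    `run` is injectable so the function can be exercised without touching
    a real router.
    """
    return [(argv, run(argv)) for argv in restart_commands(conn_names)]


# Dry run: only print what would be executed.
for argv in restart_commands(CONNECTIONS):
    print(" ".join(argv))

# On the router itself, from cron, one would call:
# restart_tunnels(CONNECTIONS)
```

    Keeping command construction separate from execution makes it possible to check the list before letting cron run it in production.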

    Logs:

    Router A:

    Jun 12 11:41:30 routerA charon: 08[IKE] <con1|66>INFORMATIONAL request with message ID 0 processing failed
    Jun 12 11:41:35 routerA charon: 13[IKE] <con1|66>retransmit 2 of request with message ID 0
    Jun 12 11:41:35 routerA charon: 13[NET] <con1|66>sending packet: from 91.X.X.X[500] to 109.Y.Y.Y[500] (336 bytes)
    Jun 12 11:41:35 routerA charon: 05[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:41:48 routerA charon: 13[IKE] <con1|66>retransmit 3 of request with message ID 0
    Jun 12 11:41:48 routerA charon: 13[NET] <con1|66>sending packet: from 91.X.X.X[500] to 109.Y.Y.Y[500] (336 bytes)
    Jun 12 11:41:56 routerA charon: 08[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:41:56 routerA charon: 13[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:09 routerA charon: 13[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:42:09 routerA charon: 08[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:11 routerA charon: 08[IKE] <con1|66>retransmit 4 of request with message ID 0
    Jun 12 11:42:11 routerA charon: 08[NET] <con1|66>sending packet: from 91.X.X.X[500] to 109.Y.Y.Y[500] (336 bytes)
    Jun 12 11:42:18 routerA charon: 13[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:42:18 routerA charon: 08[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:40 routerA charon: 08[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:42:40 routerA charon: 13[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:45 routerA charon: 13[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:42:45 routerA charon: 14[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:53 routerA charon: 14[IKE] <con1|66>retransmit 5 of request with message ID 0
    Jun 12 11:42:53 routerA charon: 14[NET] <con1|66>sending packet: from 91.X.X.X[500] to 109.Y.Y.Y[500] (336 bytes)
    Jun 12 11:43:06 routerA charon: 07[KNL] creating acquire job for policy 91.X.X.X/32|/0 === 109.Y.Y.Y/32|/0 with reqid {3}
    Jun 12 11:43:06 routerA charon: 08[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:43:24 routerA charon: 07[NET] <con1|66>received packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (40 bytes)
    Jun 12 11:43:24 routerA charon: 07[ENC] <con1|66>payload type NOTIFY was not encrypted
    Jun 12 11:43:24 routerA charon: 07[ENC] <con1|66>could not decrypt payloads
    Jun 12 11:43:24 routerA charon: 07[IKE] <con1|66>integrity check failed
    Jun 12 11:43:24 routerA charon: 07[IKE] <con1|66>INFORMATIONAL request with message ID 0 processing failed

    Router B:

    Jun 12 11:41:26 routerB charon: 01[IKE] <con3|2575>retransmit 5 of request with message ID 0
    Jun 12 11:41:26 routerB charon: 01[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:41:26 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:41:37 routerB charon: 01[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:41:37 routerB charon: 08[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:15 routerB charon: 01[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:42:15 routerB charon: 09[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:37 routerB charon: 09[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:42:37 routerB charon: 10[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:42:42 routerB charon: 10[IKE] <con3|2575>giving up after 5 retransmits
    Jun 12 11:42:42 routerB charon: 10[IKE] <con3|2575>peer not responding, trying again (2/3)
    Jun 12 11:42:42 routerB charon: 10[IKE] <con3|2575>initiating IKE_SA con3[2575] to 91.X.X.X
    Jun 12 11:42:42 routerB charon: 10[ENC] <con3|2575>generating IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) N(FRAG_SUP) N(HASH_ALG) N(REDIR_SUP) ]
    Jun 12 11:42:42 routerB charon: 10[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:42:42 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:42:46 routerB charon: 09[IKE] <con3|2575>retransmit 1 of request with message ID 0
    Jun 12 11:42:46 routerB charon: 09[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:42:46 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:42:53 routerB charon: 15[IKE] <con3|2575>retransmit 2 of request with message ID 0
    Jun 12 11:42:53 routerB charon: 15[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:42:53 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:43:06 routerB charon: 09[IKE] <con3|2575>retransmit 3 of request with message ID 0
    Jun 12 11:43:06 routerB charon: 09[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:43:06 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:43:18 routerB charon: 09[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:43:18 routerB charon: 09[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:43:29 routerB charon: 12[IKE] <con3|2575>retransmit 4 of request with message ID 0
    Jun 12 11:43:29 routerB charon: 12[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:43:29 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:43:47 routerB charon: 14[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:43:47 routerB charon: 07[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:44:11 routerB charon: 07[IKE] <con3|2575>retransmit 5 of request with message ID 0
    Jun 12 11:44:11 routerB charon: 07[NET] <con3|2575>sending packet: from 109.Y.Y.Y[500] to 91.X.X.X[500] (336 bytes)
    Jun 12 11:44:11 routerB charon: 04[NET] error writing to socket: Permission denied
    Jun 12 11:44:12 routerB charon: 13[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:44:12 routerB charon: 08[CFG] ignoring acquire, connection attempt pending
    Jun 12 11:44:33 routerB charon: 08[KNL] creating acquire job for policy 109.Y.Y.Y/32|/0 === 91.X.X.X/32|/0 with reqid {17}
    Jun 12 11:44:33 routerB charon: 06[CFG] ignoring acquire, connection attempt pending

    Also on router B, as mentioned above, I find:

    Jun 12 10:34:23 routerB charon: 08[CFG] <con6|2566>selected peer config 'con6'
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>authentication of '194.Z.Z.Z' with pre-shared key successful
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>received ESP_TFC_PADDING_NOT_SUPPORTED, not using ESPv3 TFC padding
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>authentication of '109.Y.Y.Y' (myself) with pre-shared key
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2534>destroying duplicate IKE_SA for peer '194.Z.Z.Z', received INITIAL_CONTACT
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c69555cb: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c76aaa53: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c71ba1a6: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c2ad329a: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI cf8496fa: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c2f57537: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c49944c5: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI cec7a50c: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c02bb364: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI ce28510b: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c25d4530: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c1c9afb6: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c91262c9: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c523fbae: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI cd2fb749: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI cd2fb749: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c523fbae: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c91262c9: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c1c9afb6: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c25d4530: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI ce28510b: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c02bb364: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI cec7a50c: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c49944c5: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c2f57537: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI cf8496fa: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c2ad329a: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c71ba1a6: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c76aaa53: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c69555cb: No such file or directory (2)
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>IKE_SA con6[2566] established between 109.Y.Y.Y[109.Y.Y.Y]…194.Z.Z.Z.Z[194.Z.Z.Z.Z]
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>scheduling reauthentication in 27782s
    Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>maximum IKE_SA lifetime 28322s

    This tells me which configuration to restart.
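
    That lookup, which I currently do by eye, could be sketched like this: scan the charon log lines for the "unable to query/delete SAD entry" messages and count the distinct SPIs per connection name. The regular expression is written against the lines pasted above; the function name is made up for the example.

```python
import re

# Matches e.g. "<con6|2534>unable to query SAD entry with SPI c69555cb".
# \s* tolerates an optional space after the "<conN|id>" prefix.
SAD_ERROR = re.compile(
    r"<(?P<conn>con\d+)\|\d+>\s*unable to (?:query|delete) "
    r"SAD entry with SPI (?P<spi>[0-9a-f]+)"
)


def stale_spis_per_connection(lines):
    """Return {connection name: number of distinct SPIs reported missing}."""
    spis = {}
    for line in lines:
        match = SAD_ERROR.search(line)
        if match:
            spis.setdefault(match.group("conn"), set()).add(match.group("spi"))
    return {conn: len(found) for conn, found in spis.items()}


sample = [
    "Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c69555cb: No such file or directory (2)",
    "Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to query SAD entry with SPI c76aaa53: No such file or directory (2)",
    "Jun 12 10:34:23 routerB charon: 08[KNL] <con6|2534>unable to delete SAD entry with SPI c69555cb: No such file or directory (2)",
    "Jun 12 10:34:23 routerB charon: 08[IKE] <con6|2566>scheduling reauthentication in 27782s",
]

print(stale_spis_per_connection(sample))  # {'con6': 2}
```

    The connection with the highest count is the candidate to restart. The query and delete messages for the same SPI are counted once.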

    Is anything missing? Do you need other information? Is the explanation of the problem clear?

    Thanks for reading all the way to the end :D

    Regards,

    Mathieu



  • Hello everyone,

    I am back with new information.

    Attached is what I find on the main router. Each color corresponds to one router.

    I think, though I am not certain, that I have found the answer to my problem in RFC 3706
    http://www.ietf.org/rfc/rfc3706.txt

    I will get back to you soon.

    Regards,

    Mathieu




  • Hello everyone,

    As mentioned above, I think I found the answer in the RFC, but my problem is still not solved.
    One side of the tunnel can use aggressive DPD settings (pfSense default: delay=10sec, maxfailure=5)
    while the other side is not required to use the same settings; it can be delay=60sec, maxfailure=5
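
    A note on timing, as a sketch rather than a diagnosis: with keyexchange = ikev2, my understanding of the strongSwan documentation is that dpdtimeout only applies to IKEv1 and that the retransmission timeout decides when a peer is declared dead. Assuming the strongSwan defaults (retransmit_timeout = 4.0, retransmit_base = 1.8, retransmit_tries = 5; I have not checked whether pfSense overrides them), the delays between attempts would be:

```python
def retransmit_schedule(timeout=4.0, base=1.8, tries=5):
    """Delays in seconds between successive sends of one IKE request.

    Entry n is the wait after attempt n (0 = original send); the last entry
    is the wait after the final retransmit before giving up. The defaults
    are the assumed strongSwan values, not something read from pfSense.
    """
    return [timeout * base ** n for n in range(tries + 1)]


delays = retransmit_schedule()
print([round(d) for d in delays])  # [4, 7, 13, 23, 42, 76]
print(round(sum(delays)))          # 165
```

    The rounded delays match the gaps in the router B log from my first post (11:42:42, 11:42:46, 11:42:53, 11:43:06, 11:43:29, 11:44:11, with "giving up" 76 s after the previous fifth retransmit), so changing the DPD delay alone may not change how long an unresponsive peer takes to be detected.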

    I made the change in production on the 2 tunnels where I hit the error most often, and it does not solve my problem.

    If any of you has an idea, I am all ears :)

    Since this topic was posted in the French-language section, it might be wise to bring it to the English-language section as well, what do you think? (I will take care of the translation)

    Regards,

    Mathieu