Netgate Discussion Forum

    Odd network behavior after upgrading from 1.2 to 1.2.3

    Problems Installing or Upgrading pfSense Software
    14 Posts 3 Posters 5.4k Views
      wallabybob

      The header length errors: the reported length generally seems to differ from a correct length (>= 20) by a single bit. I'd check whether a memory fault or a power supply problem is causing intermittent memory errors. That the header length error occurs intermittently and in bursts is particularly nasty. Running memtest86 (or memtest86+) for an extended period might give some clues, and reseating the memory modules might help. Does your memory have ECC and is it enabled?

      The block out on em2: Is 195.41.114.39.143 your mail server? Are these packets also blocked because they are badly formed (e.g. incorrect header length)? If not, can you suggest why they are blocked? Since firewall rules allegedly are applied only on reception, I'm left to suspect the problem is in the packet formatting rather than the packet matching a firewall rule.

        madsenandersc

        A memory error is not very likely since a) it is indeed ECC RAM and b) the problem persists even if I do a failover to the secondary router. Granted, both physical servers may have a memory error, but I don't really believe that… :)

        I've had a window running with the filter output for an extended period and I'm seeing header lengths of 0, 4, 8, 12 and 16. I've noticed that the problem starts with a header length of 0, then progresses to 4, 8, 12 and eventually 16 before the problem clears again.
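As a rough check of the single-bit idea against these values, here is a minimal sketch. It assumes the reported header length is the IPv4 IHL field times 4 bytes and that the correct value for these packets is 20 bytes (IHL 5, no options); neither assumption is confirmed in the thread, and the helper name `bit_flips` is made up for illustration.

```python
# Assumption: reported header length = IPv4 IHL field * 4 bytes,
# and the correct header length is 20 bytes (IHL = 5, no IP options).
CORRECT_IHL = 5


def bit_flips(reported_bytes: int, correct_ihl: int = CORRECT_IHL) -> int:
    """Count the bits that differ between the reported and the correct IHL."""
    reported_ihl = reported_bytes // 4
    return bin(reported_ihl ^ correct_ihl).count("1")


# Header lengths reported in the filter log, in the order they appeared.
observed = [0, 4, 8, 12, 16]
flips = {length: bit_flips(length) for length in observed}

for length, n in flips.items():
    print(f"header length {length:2d} -> IHL {length // 4:04b}, "
          f"{n} bit(s) away from 0101")
```

Under these assumptions only 4 and 16 are a single bit away from a correct IHL; 0 and 12 differ by two bits and 8 by three. The steady 0, 4, 8, 12, 16 progression may therefore look more like a field being stepped through than like random single-bit corruption, although this sketch alone cannot settle that.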

        195.41.114.39.143 is our mailserver and I'm unable to see why they are blocked unless someone can tell me what is in the default rule in pfSense. This is the raw output from the firewall for some of the blocked packets:

        Jun 15 14:30:39 pf: 004489 rule 561/0(match): block out on em2: (tos 0x0, ttl 63, id 54638, offset 0, flags [none], proto TCP (6), length 93) 195.41.114.39.143 > 83.92.176.48.52980: P, cksum 0x11bd (correct), 1723590538:1723590591(53) ack 1602018465 win 1728
        Jun 15 14:30:39 pf: 000122 rule 561/0(match): block out on em2: (tos 0x0, ttl 63, id 3707, offset 0, flags [none], proto TCP (6), length 40) 195.41.114.39.143 > 83.92.176.48.52980: F, cksum 0x1619 (correct), 53:53(0) ack 1 win 1728
        Jun 15 14:30:39 pf: 001928 rule 561/0(match): block out on em2: (tos 0x0, ttl 63, id 17419, offset 0, flags [none], proto TCP (6), length 93) 195.41.114.39.143 > 83.92.176.48.52980: FP, cksum 0x11bc (correct), 0:53(53) ack 1 win 1728
        Jun 15 14:30:40 pf: 194432 rule 561/0(match): block out on em2: (tos 0x0, ttl 63, id 42404, offset 0, flags [none], proto TCP (6), length 93) 195.41.114.39.143 > 83.92.176.48.52980: FP, cksum 0x11bc (correct), 0:53(53) ack 1 win 1728
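To make the fields in these log lines easier to read, here is a small parsing sketch. The regular expression was written only against the four lines quoted above, so other pf or tcpdump output variants may need adjustments; the function names `parse` and `split_host_port` are hypothetical helpers, not part of pfSense.

```python
import re

# Pattern written against the four log lines quoted above (an assumption:
# other pf/tcpdump output variants may not match it).
LINE_RE = re.compile(
    r"rule (?P<rule>\d+)/\d+\(match\): "
    r"(?P<action>\w+) (?P<direction>in|out) on (?P<iface>\w+): "
    r".*?proto (?P<proto>\w+) \(\d+\), length (?P<length>\d+)\) "
    r"(?P<src>[\d.]+) > (?P<dst>[\d.]+): (?P<flags>[A-Z.]+),"
)


def split_host_port(endpoint: str) -> tuple:
    """Split tcpdump-style 'a.b.c.d.port' into ('a.b.c.d', port)."""
    host, _, port = endpoint.rpartition(".")
    return host, int(port)


def parse(line: str) -> dict:
    """Extract rule number, action, interface, endpoints and TCP flags."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("unrecognised log line")
    fields = m.groupdict()
    fields["src"] = split_host_port(fields["src"])
    fields["dst"] = split_host_port(fields["dst"])
    return fields


sample = (
    "Jun 15 14:30:39 pf: 004489 rule 561/0(match): block out on em2: "
    "(tos 0x0, ttl 63, id 54638, offset 0, flags [none], proto TCP (6), "
    "length 93) 195.41.114.39.143 > 83.92.176.48.52980: P, cksum 0x11bd "
    "(correct), 1723590538:1723590591(53) ack 1602018465 win 1728"
)
record = parse(sample)
print(record)
```

Read this way, the address written as 195.41.114.39.143 is host 195.41.114.39 with source port 143 (the registered IMAP port), so the blocked packets are replies from the mail server to a client at 83.92.176.48, leaving through em2.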

        Thanks for your help so far, BTW, it is very much appreciated.

        Best regards,
        Anders C. Madsen

          wallabybob

          @madsenandersc:

          A memory error is not very likely since a) it is indeed ECC RAM

          And ECC is enabled in the BIOS? (On a couple of my home systems installing ECC memory is not sufficient to enable ECC, it has to be specifically enabled in the BIOS. But those systems are not DELL systems so they are probably not good predictors for the behaviour of DELL systems.)

          Is em2 a common component of the reports? How is em2 different from your other interfaces (e.g. different chipset?, different bus type?, different bus?) Please provide the output from the shell command pciconf -l

          Thanks for your help so far, BTW, it is very much appreciated.

          Doesn't seem I've helped much yet, but thanks for the appreciation.

            madsenandersc

            @wallabybob:

            And ECC is enabled in the BIOS? (On a couple of my home systems installing ECC memory is not sufficient to enable ECC, it has to be specifically enabled in the BIOS. But those systems are not DELL systems so they are probably not good predictors for the behaviour of DELL systems.)

            The server will not even boot with non-ECC RAM so that is a given. :)

            Is em2 a common component of the reports? How is em2 different from your other interfaces (e.g. different chipset?, different bus type?, different bus?) Please provide the output from the shell command pciconf -l

            Well, em2 is present in many of the reports, but since it is WAN, the majority of the traffic is passing through it. Interfaces em0 through em6 are identical (Intel PRO/1000 cards).

            pciconf -l

            hostb0@pci0:0:0:0: class=0x060000 card=0x80868086 chip=0x25d88086 rev=0x92 hdr=0x00
            pcib1@pci0:0:2:0: class=0x060400 card=0x00000000 chip=0x25e28086 rev=0x92 hdr=0x01
            pcib6@pci0:0:3:0: class=0x060400 card=0x00000000 chip=0x25e38086 rev=0x92 hdr=0x01
            pcib7@pci0:0:4:0: class=0x060400 card=0x00000000 chip=0x25e48086 rev=0x92 hdr=0x01
            pcib9@pci0:0:5:0: class=0x060400 card=0x00000000 chip=0x25e58086 rev=0x92 hdr=0x01
            pcib10@pci0:0:6:0: class=0x060400 card=0x00000000 chip=0x25f98086 rev=0x92 hdr=0x01
            pcib11@pci0:0:7:0: class=0x060400 card=0x00000000 chip=0x25e78086 rev=0x92 hdr=0x01
            hostb1@pci0:0:16:0: class=0x060000 card=0x01b81028 chip=0x25f08086 rev=0x92 hdr=0x00
            hostb2@pci0:0:16:1: class=0x060000 card=0x01b81028 chip=0x25f08086 rev=0x92 hdr=0x00
            hostb3@pci0:0:16:2: class=0x060000 card=0x01b81028 chip=0x25f08086 rev=0x92 hdr=0x00
            hostb4@pci0:0:17:0: class=0x060000 card=0x80868086 chip=0x25f18086 rev=0x92 hdr=0x00
            hostb5@pci0:0:19:0: class=0x060000 card=0x80868086 chip=0x25f38086 rev=0x92 hdr=0x00
            hostb6@pci0:0:21:0: class=0x060000 card=0x80868086 chip=0x25f58086 rev=0x92 hdr=0x00
            hostb7@pci0:0:22:0: class=0x060000 card=0x80868086 chip=0x25f68086 rev=0x92 hdr=0x00
            pcib12@pci0:0:28:0: class=0x060400 card=0x01b81028 chip=0x26908086 rev=0x09 hdr=0x01
            uhci0@pci0:0:29:0: class=0x0c0300 card=0x01b81028 chip=0x26888086 rev=0x09 hdr=0x00
            uhci1@pci0:0:29:1: class=0x0c0300 card=0x01b81028 chip=0x26898086 rev=0x09 hdr=0x00
            uhci2@pci0:0:29:2: class=0x0c0300 card=0x01b81028 chip=0x268a8086 rev=0x09 hdr=0x00
            uhci3@pci0:0:29:3: class=0x0c0300 card=0x01b81028 chip=0x268b8086 rev=0x09 hdr=0x00
            ehci0@pci0:0:29:7: class=0x0c0320 card=0x01b81028 chip=0x268c8086 rev=0x09 hdr=0x00
            pcib14@pci0:0:30:0: class=0x060401 card=0x00000000 chip=0x244e8086 rev=0xd9 hdr=0x01
            isab0@pci0:0:31:0: class=0x060100 card=0x00000000 chip=0x26708086 rev=0x09 hdr=0x00
            atapci0@pci0:0:31:1: class=0x01018a card=0x01b81028 chip=0x269e8086 rev=0x09 hdr=0x00
            pcib2@pci0:4:0:0: class=0x060400 card=0x00000000 chip=0x35008086 rev=0x01 hdr=0x01
            pcib5@pci0:4:0:3: class=0x060400 card=0x00000000 chip=0x350c8086 rev=0x01 hdr=0x01
            pcib3@pci0:5:0:0: class=0x060400 card=0x00000000 chip=0x35108086 rev=0x01 hdr=0x01
            pcib4@pci0:5:1:0: class=0x060400 card=0x00000000 chip=0x35148086 rev=0x01 hdr=0x01
            em0@pci0:7:0:0: class=0x020000 card=0x135e8086 chip=0x105e8086 rev=0x06 hdr=0x00
            em1@pci0:7:0:1: class=0x020000 card=0x135e8086 chip=0x105e8086 rev=0x06 hdr=0x00
            em2@pci0:8:1:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
            em3@pci0:8:2:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
            em4@pci0:9:0:0: class=0x020000 card=0x135e8086 chip=0x105e8086 rev=0x06 hdr=0x00
            em5@pci0:9:0:1: class=0x020000 card=0x135e8086 chip=0x105e8086 rev=0x06 hdr=0x00
            pcib8@pci0:10:0:0: class=0x060400 card=0x00000000 chip=0x032c8086 rev=0x09 hdr=0x01
            mpt0@pci0:11:8:0: class=0x010000 card=0x1f091028 chip=0x00541000 rev=0x01 hdr=0x00
            pcib13@pci0:2:0:0: class=0x060400 card=0x00000000 chip=0x01031166 rev=0xc3 hdr=0x01
            bce0@pci0:3:0:0: class=0x020000 card=0x01b81028 chip=0x164c14e4 rev=0x12 hdr=0x00
            vgapci0@pci0:14:13:0: class=0x030000 card=0x01b81028 chip=0x515e1002 rev=0x02 hdr=0x00

            Doesn't seem I've helped much yet, but thanks for the appreciation.

            Sometimes just asking the right questions is a big part of finding the solution. :)

            At least I have a feeling of going forward right now - that is a nice change from the last five days.

            Best regards,
            Anders C. Madsen

              wallabybob

              @madsenandersc:

              Interfaces em0 through em6 are identical (Intel PRO/1000 cards).

              There is quite a number of different chipsets used in PRO/1000 cards with different bus interfaces and different physical interfaces. Not all "em" NICs are the same.

              Note that em2 and em3 have different card numbers and chip numbers from em0, em1, em4 and em5. Can you move WAN to another interface and does that make a difference?

                wallabybob

                If my decoding is correct em0, em1, em4 and em5 have the 82571 chip while em2 and em3 have the 82541 chip. From data on the Intel web site it appears the 82541 was first introduced in Q3 2003 and the 82571 in Q3 2005. The 82541 has one port per chip, the 82571 two ports per chip. I think there are significant enough differences to warrant trying another interface to see if you get a different result.
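The decoding can be sketched as follows. In `pciconf -l` output the `chip` value carries the device ID in its upper 16 bits and the vendor ID in its lower 16 bits. The `chip` values below are copied from the listing earlier in the thread, and the chip family names are the ones given in this post; `decode_chip` and the `FAMILY` table are illustrative names, not an existing tool.

```python
def decode_chip(chip: int) -> tuple:
    """Split a pciconf 'chip' value into (device_id, vendor_id)."""
    return chip >> 16, chip & 0xFFFF


# chip= values copied from the pciconf -l output earlier in the thread.
em_chips = {
    "em0": 0x105E8086, "em1": 0x105E8086,
    "em2": 0x107C8086, "em3": 0x107C8086,
    "em4": 0x105E8086, "em5": 0x105E8086,
}

# Device ID to chip family, using the family names stated in this post.
FAMILY = {0x105E: "82571", 0x107C: "82541"}

groups = {}
for name, chip in em_chips.items():
    device_id, vendor_id = decode_chip(chip)
    assert vendor_id == 0x8086  # Intel's PCI vendor ID
    groups.setdefault(FAMILY[device_id], []).append(name)

print(groups)
```

Grouping the interfaces this way makes it easy to pick a port on the other chip family when moving WAN for a comparison test.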

                  madsenandersc

                  @wallabybob:

                  If my decoding is correct em0, em1, em4 and em5 have the 82571 chip while em2 and em3 have the 82541 chip. From data on the Intel web site it appears the 82541 was first introduced in Q3 2003 and the 82571 in Q3 2005. The 82541 has one port per chip, the 82571 two ports per chip. I think there are significant enough differences to warrant trying another interface to see if you get a different result.

                  Your decoding is probably correct - at least as far as the number of ports goes, so the rest is also very likely true.

                  I tried the following:

                  • Moved WAN to em0 - no difference.
                  • Moved WAN back to em2 to keep the number of variables to a minimum.
                  • Did a completely fresh install of the firmware, using version 1.2.2-RELEASE in case the problem was in the FreeBSD 7.2 drivers or software.
                  • Restored configuration from backup and fired it up. Same problem, although it took about two hours before it surfaced - after that things were just as weird as they used to be with 1.2.3-RELEASE.
                  • Did a completely fresh install of the firmware using version 1.2-RELEASE and restored the configuration from backup.
                  • Fired it up and watched all the red lights disappear one by one. Mail and webservers are back to normal, SSH and VPN are stable, packets are no longer being dropped by the default rule.

                  I have absolutely no clue as to why this is so, but for some reason our Dell PowerEdge 2900 with two single-port Intel PRO/1000 cards and two dual-port Intel PRO/1000 cards is incompatible with pfSense above version 1.2-RELEASE. It may be that we have configured something in a non-standard way, or it may be a problem with the firmware in the NICs, the motherboard BIOS, or something else, but there is no doubt that it is the case. We will consider what to do from here; although we've been very happy with pfSense, it is probably not a good idea to run software that we know we cannot upgrade in the future, so alternatives will have to be considered.

                  Wallabybob, thanks a million for your help - regardless of the outcome I'm very grateful for your time.

                  Best regards,
                  Anders C. Madsen

                    wallabybob

                    @madsenandersc:

                    it is probably not a good idea to run software that we know we cannot upgrade in the future

                    That is probably a premature judgement if you haven't yet tried pfSense 2.0 BETA.

                    Wallabybob, thanks a million for your help - regardless of the outcome I'm very grateful for your time.

                    Thanks. It's a puzzling problem. Anecdotal evidence suggests quite a number of people are using Intel PRO/1000 NICs on pfSense without seeing this problem.

                      madsenandersc

                      @wallabybob:

                      @madsenandersc:

                      it is probably not a good idea to run software that we know we cannot upgrade in the future

                      That is probably a premature judgement if you haven't yet tried pfSense 2.0 BETA.

                      You're right - but to be honest my fear is that we're looking at some kind of incompatibility with FreeBSD 7.x which I assume is the foundation for pfSense 2.0 (haven't checked yet). Right now I guess I'm just so relieved to have our systems stable again that I don't want to touch a single thing on the routers. Ever. :)

                      Wallabybob, thanks a million for your help - regardless of the outcome I'm very grateful for your time.

                      Thanks. It's a puzzling problem. Anecdotal evidence suggests quite a number of people are using Intel PRO/1000 NICs on pfSense without seeing this problem.

                      I know, and that was actually why we chose those cards in the first place: In general, Intel PRO/1000 is perceived to be about as thoroughly tested as it comes. Frankly I have a hard time believing that they are the cause of all the problems but on the other hand I can't see where else to look. SMP? Dell BIOS/MB? A ton of those out there as well, most likely humming along nicely too.

                      Best regards,
                      Anders C. Madsen

                        jimp Rebel Alliance Developer Netgate

                        @madsenandersc:

                        You're right - but to be honest my fear is that we're looking at some kind of incompatibility with FreeBSD 7.x which I assume is the foundation for pfSense 2.0 (haven't checked yet). Right now I guess I'm just so relieved to have our systems stable again that I don't want to touch a single thing on the routers. Ever. :)

                        2.0 is based on what will be FreeBSD 8.1. A major difference from 7.x in many regards.

                          madsenandersc

                          @jimp:

                          2.0 is based on what will be FreeBSD 8.1. A major difference from 7.x in many regards.

                          Ah - that was good news indeed. OK, we'll give it a whirl once it's been released and see how it goes.

                          Best regards,
                          Anders C. Madsen

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.