Netgate Discussion Forum

    Strange WRITE_DMA errors when switching on network port

    HA/CARP/VIPs
      ehuk

      Hi Guys,

      I have two pfSense boxes on which CARP was working fine. However, I have now changed my switches from two Cisco 2950s to two stacked Cisco 3750Gs.

      I have one interface that carries all VLANs. The first firewall and first switch run fine; the port is a dot1q trunk.

      However, as soon as I enable the switchport on sw2 (connected to fw2), I see the following errors:

      Jun 15 18:18:04 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=1129359
      Jun 15 18:18:04 kernel: g_vfs_done():ad0s1a[WRITE(offset=578191360, length=16384)]error = 5
      Jun 15 18:18:04 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=1505711
      Jun 15 18:18:04 kernel: g_vfs_done():ad0s1a[WRITE(offset=770883584, length=16384)]error = 5
      Jun 15 18:18:05 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=2258575
      Jun 15 18:18:05 kernel: g_vfs_done():ad0s1a[WRITE(offset=1156349952, length=16384)]error = 5
      Jun 15 18:18:11 kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=3387471
      Jun 15 18:18:11 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=5645583
      Jun 15 18:18:11 kernel: g_vfs_done():ad0s1a[WRITE(offset=2890498048, length=16384)]error = 5
      Jun 15 18:18:11 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=6398287
      Jun 15 18:18:11 kernel: g_vfs_done():ad0s1a[WRITE(offset=3275882496, length=16384)]error = 5
      Jun 15 18:18:11 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=7151599
      Jun 15 18:18:11 kernel: g_vfs_done():ad0s1a[WRITE(offset=3661578240, length=16384)]error = 5
      Jun 15 18:18:12 kernel: ad0: FAILURE - WRITE_DMA status=51 <ready,dsc,error>error=4 <aborted>dma=0x06 LBA=7527343
      Jun 15 18:18:12 kernel: g_vfs_done():ad0s1a[WRITE(offset=3853959168, length=16384)]error = 5
      Jun 15 18:18:18 kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=376655

      Once I turn the switchport off, the errors disappear, but obviously I can't access my VLANs.

      I have tried everything I can think of, including reinstalling pfSense and even creating a whole new config.
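To compare how often the errors occur with the switchport on versus off (or before and after a change such as disabling DMA), the kernel messages above can be tallied. This is only a minimal sketch: the function name `tally_ata_events` and the regular expression are illustrative assumptions based on the log lines quoted in this post, not part of pfSense or FreeBSD.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted in this thread, e.g.
#   kernel: ad0: FAILURE - WRITE_DMA status=51 ... LBA=1129359
#   kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=3387471
# The pattern is an assumption derived from those samples only.
ATA_EVENT = re.compile(
    r"kernel: (?P<dev>\w+): (?P<kind>FAILURE|TIMEOUT) - (?P<cmd>\w+)"
)


def tally_ata_events(log_text):
    """Count (device, kind, command) triples found in a system log dump."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ATA_EVENT.search(line)
        if match:
            counts[(match["dev"], match["kind"], match["cmd"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe log text into the script on stdin.
    for (dev, kind, cmd), n in sorted(tally_ata_events(sys.stdin.read()).items()):
        print(dev, kind, cmd, n)
```

Running it against a log captured with the port enabled and again with it disabled gives a rough error count for each case. The `g_vfs_done()` lines are deliberately not counted, since each one follows a FAILURE line for the same write.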

      Any ideas what is causing this?

      Many Thanks,

        jimp Rebel Alliance Developer Netgate

        Your HDD controller might be sharing an IRQ with that port. You can check with:

        vmstat -i
        

        Run it at a shell prompt or from Diagnostics > Command.

        You might have to change some options in the BIOS to fix that, or shut off DMA for the hard drive.

        Usually that error indicates a hard drive, cable, or controller fault (typically one of them is faulty), but if it only happens when you enable something else, there may be hope.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

          ehuk

          Hi Jimp,

          Thanks for that information, I have disabled DMA and the exact same thing is happening. When I enable the switchport the errors appear.

          I don't think it is sharing an IRQ either. The HDD is a 4GB CF card connected with a SATA-CF converter, and it had been working fine until we upgraded to 1.2.3 and changed our switches.

          Do you have any idea what else could be causing the problem?

          Output from vmstat -i:

          $ vmstat -i
          interrupt                          total       rate
          irq1: atkbd0                          12          0
          irq14: ata0                         2539         10
          irq16: re3 uhci3                      35          0
          irq18: re1 uhci2                     521          2
          irq19: re2 uhci1                    3634         14
          irq23: uhci0 ehci0                     1          0
          cpu0: timer                       500965       1995
          irq256: re0                         5466         21
          cpu1: timer                       500908       1995
          cpu3: timer                       500908       1995
          cpu2: timer                       500909       1995
          Total                            2015898       8031
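For anyone checking the same thing on their own box, the shared-IRQ check can be done mechanically. The sketch below assumes the `vmstat -i` layout shown above (`irqN: dev [dev ...] total rate`); the helper names `shared_irqs` and `disk_shares_irq`, and the `ata` prefix default, are hypothetical and chosen only for this example.

```python
def shared_irqs(vmstat_output):
    """Map each IRQ line that lists more than one device to its devices.

    Assumes each interrupt line looks like
    "irq16: re3 uhci3    35    0", i.e. the last two fields are the
    total and the rate, and everything before them is a device name.
    """
    shared = {}
    for line in vmstat_output.splitlines():
        name, sep, rest = line.strip().partition(":")
        # Skip the header, the "Total" row, and per-CPU timer rows.
        if not sep or not name.startswith("irq"):
            continue
        fields = rest.split()
        devices = fields[:-2]  # drop the total and rate columns
        if len(devices) > 1:
            shared[name] = devices
    return shared


def disk_shares_irq(vmstat_output, disk_prefix="ata"):
    """Return True if a device starting with disk_prefix shares its IRQ."""
    return any(
        any(dev.startswith(disk_prefix) for dev in devices)
        for devices in shared_irqs(vmstat_output).values()
    )


if __name__ == "__main__":
    import sys

    # Usage: vmstat -i | python this_script.py
    text = sys.stdin.read()
    for irq, devices in shared_irqs(text).items():
        print(irq, "shared by", ", ".join(devices))
    print("disk controller shares an IRQ:", disk_shares_irq(text))
```

On the output above, the network ports share interrupts with the USB controllers, while `ata0` sits alone on irq14, which matches the reading that the disk controller is not sharing an IRQ.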

          Many Thanks,

            jimp Rebel Alliance Developer Netgate

            Try editing /boot/loader.conf and adding this line:

            hw.ata.ata_dma=0
            

            Then reboot.

            CF converters are not known for their great DMA compatibility…


              ehuk

              Hi Jimp,

              I tried that; it reduced the errors, but they were still there. As a last-ditch attempt I put in a 160GB SATA disk I had lying around, and that worked perfectly. So it must have been something strange with the converter.

              The strange thing is that I have the exact same setup on my primary firewall, with a 4GB CF card and converter; I upgraded that to 1.2.3 and it worked without any problems. So I am not sure why I had issues with the backup firewall. It would be a very strange coincidence if there was a hardware failure at the same time as the software upgrade.

              Either way things are back up and running, thanks for your help, much appreciated.

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.