Netgate Discussion Forum

Web server pages rendering slowly going out WAN but fine internally - HELP PLS!

NAT
32 Posts 7 Posters 12.9k Views
  • J
    jobsoft
    last edited by Feb 2, 2007, 12:55 AM

    Hello,

    I am brand new to pfsense/pf, a little familiar with freebsd, 25 year unix veteran.  I recently set up pfsense with WAN, LAN, and DMZ.  Everything seemed hunky-dory until I went to access my web-based stuff from my home across cable internet.  Before I installed pfsense, I was running fedora/shorewall with just LAN and WAN - web servers on LAN and WAN served up lightning fast on the same internet connection.  Once I swapped in pfsense, remote desktop via the inbound NAT plugs seemed a little more delayed, but not enough to really be bothered with.  Remote ssh sessions seemed fine.  File transfers were maybe a little slower (in terms of average Kbytes/sec to my office at home).  While I did not compare before and after empirically, nothing sent up red flags.

    now, the web-based stuff.  a couple of sites all of a sudden took 1-2 minutes (at times) to serve up to my remote browser at my house!  Take a look at:

    http://www.barfield.com
    http://www.jobsoft.com

    The first one used to render in 1-3 seconds with no delays with "Items Remaining…"  Now, it seems to hang in finishing the render to the browser.  I even tried this from another site in Vermont - same exact behavior.  The problem seems to become more apparent if you F5 refresh for it to render fresh again.

    So, I started searching these archives and Google and tried everything I could remotely relate to what would cause it to be like this.  It really appears to be in the area of packet fragmentation, sizing, timing, etc, as, even though the hardware is older, there is nothing that really indicates the box is under any load, certainly not from rendering these measly web site pages!

    Here are some specifics:

    Hardware: AMD K62-350, Abit KT7A MB, 384MB RAM, 3 nics

    Notice no shared IRQs and all are 100BaseTx - One is Full duplex as the comcast SMC gateway device is a 10/100 switch.  I did try forcing that one to HD - no difference.  I swapped around the roles (ie, xl0 to WAN and xl1 to LAN, etc) - no difference.  I played with Device Polling off/on - no difference.  I tried to clear fragmented bit - no difference.

    The problem reminds me of an MTU issue we encountered last summer on an AIX box when we moved to gigabit interfaces and switches.  A web app running on Apache 1.2 started behaving a lot like what I am getting here.  We upgraded to Apache 1.3 and that problem went away.  Never really got to the bottom of why.  Packet sniffs showed the AIX box with apache 1.2 sending out packets > MTU on the gigabit interface.  End result was the same behavior as with my web servers now.  I also tried it from a web server on the LAN and not the DMZ - no difference.

    Below are various system information dumps FWIW.  I was going to try and do some packet captures with tcpdump from a server on the DMZ and from the pfsense WAN port to see what might be happening using something like Wireshark (formerly ethereal).

    Overall, though, I am stumped.  As you can see below, the system loading is nil.  there are a few errors on the DMZ and some collisions, so, I wonder if a NIC is bad somewhere.

    I decided to go ahead and post here as maybe one of you can nip this in the bud before I go chasing all sorts of things!  :-)  What else can I look at?  If some of you are getting the same delayed completion on the above sites, how does one go about figuring out why it is behaving so???

    My next moves will be to try all different NICs, but, I suspect that won't solve it as when moving everything around, it seemed to make no difference.  I might also try a more current, beefier server just on the off chance bus throughput is a problem.  but, of course, why does VNC, RDP, ssh, file xfer, etc, all seem to be acceptable and only the web pages aren't.

    There has to be something fundamental and probably easy to address.  FWIW, when remote to an internal XP box with RDP, there seem to be absolutely no issues between the LAN and DMZ when rendering these same pages!  The problem only surfaces when the packets go across the WAN.  Again, before PFsense, these sites rendered just fine across the same WAN.

    Thanks very much!!!

    ===============================================================
    Output: dmesg | grep ^xl

    xl0: <3Com 3c905-TX Fast Etherlink XL> port 0xd800-0xd83f irq 10 at device 18.0 on pci0
    xl0: Ethernet address: 00:60:97:d0:14:fe
    xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xdc00-0xdc7f mem 0xe8801000-0xe880107f irq 5 at device 19.0 on pci0
    xlphy0: <3Com internal media interface> on miibus1
    xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    xl1: Ethernet address: 00:10:4b:37:d3:5d

    xl2: <3Com 3cSOHO100-TX OfficeConnect> port 0xe000-0xe07f mem 0xe8800000-0xe880007f irq 11 at device 20.0 on pci0
    xlphy1: <3Com internal media interface> on miibus2
    xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    xl2: Ethernet address: 00:50:04:76:95:f5
    xl0: link state changed to UP
    xl1: link state changed to UP
    xl2: link state changed to UP
    xl0: link state changed to DOWN
    xl0: link state changed to UP

    ===============================================================
    Output: netstat -i

    Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
    xl0    1500 <Link#1>      00:60:97:d0:14:fe    49546     0    50706     0     0
    xl0    1500 70-90-228-184 70-90-228-189-Nas      786     -      788     -     -
    xl0    1500 fe80:1::260:9 fe80:1::260:97ff:        0     -        2     -     -
    xl1    1500 <Link#2>      00:10:4b:37:d3:5d    28018     0    27273     0    19
    xl1    1500 192.168.1     gate                  2048     -     2312     -     -
    xl1    1500 fe80:2::210:4 fe80:2::210:4bff:        0     -        1     -     -
    xl2    1500 <Link#3>      00:50:04:76:95:f5    27100    35    25759     0    76
    xl2    1500 172.21/24     172.21.0.2             315     -      775     -     -
    xl2    1500 fe80:3::250:4 fe80:3::250:4ff:f        0     -        1     -     -
    pflog 33208 <Link#4>                               0     0        0     0     0
    lo0   16384 <Link#5>                               0     0        0     0     0
    lo0   16384 your-net      localhost               32     -        0     -     -
    lo0   16384 localhost     ::1                      0     -        0     -     -
    lo0   16384 fe80:5::1     fe80:5::1                0     -        0     -     -
    pfsyn  2020 <Link#6>                               0     0        0     0     0

    ===============================================================
    Output: top (first few lines)

    last pid:  4046;  load averages:  0.08,  0.04,  0.01                                                                      up 0+00:49:46  18:07:35
    39 processes:  1 running, 38 sleeping
    CPU states:  1.8% user,  0.0% nice,  0.6% system,  0.0% interrupt, 97.6% idle
    Mem: 29M Active, 8832K Inact, 22M Wired, 12M Buf, 309M Free
    Swap: 1024M Total, 1024M Free

    PID USERNAME  THR PRI NICE  SIZE    RES STATE    TIME  WCPU COMMAND
    4044 root        1  96    0  2280K  1532K RUN      0:00  0.17% top
      314 root        1  96    0  2480K  2136K select  0:07  0.00% inetd
    2842 root        1  96    0  2344K  1596K select  0:04  0.00% top
      376 root        1  4    0 21092K 18328K accept  0:03  0.00% php
      516 root        1  8  20  1648K  1160K wait    0:02  0.00% sh

    ===============================================================
    Output: ifconfig

    ifconfig

    xl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
            options=8<VLAN_MTU>
            inet 70.90.228.189 netmask 0xfffffff8 broadcast 70.90.228.191
            inet6 fe80::260:97ff:fed0:14fe%xl0 prefixlen 64 scopeid 0x1
            ether 00:60:97:d0:14:fe
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
    xl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
            options=9<RXCSUM,VLAN_MTU>
            inet 192.168.1.254 netmask 0xffffff00 broadcast 192.168.1.255
            inet6 fe80::210:4bff:fe37:d35d%xl1 prefixlen 64 scopeid 0x2
            ether 00:10:4b:37:d3:5d
            media: Ethernet autoselect (100baseTX)
            status: active
    xl2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
            options=9<RXCSUM,VLAN_MTU>
            inet 172.21.0.2 netmask 0xffffff00 broadcast 172.21.0.255
            inet6 fe80::250:4ff:fe76:95f5%xl2 prefixlen 64 scopeid 0x3
            ether 00:50:04:76:95:f5
            media: Ethernet autoselect (100baseTX)
            status: active
    pflog0: flags=100<PROMISC> mtu 33208
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
            inet 127.0.0.1 netmask 0xff000000
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
    pfsync0: flags=41<UP,RUNNING> mtu 2020
            pfsync: syncdev: lo0 maxupd: 128

    • S
      sullrich
      last edited by Feb 2, 2007, 1:03 AM

      Try swapping xl0 and xl2's roles.

      • J
        jobsoft
        last edited by Feb 2, 2007, 1:12 AM

        I have done that already!  :-)  I have tried all the possible combinations I could think of.  It is always the same behavior.

        One thing comes to mind.  If I can get the tcpdump from both sides LAN/DMZ and then the WAN, what could I look for in Wireshark that might indicate a timeout/retry/fragment/etc that could cause this sort of delay that then trips up the browser???

        • G
          Gandalf
          last edited by Feb 2, 2007, 1:13 AM

          This isn't a pfSense problem but a DNS problem (perhaps you forgot port 53 tcp/udp ??)
          Take a look at http://www.dnsstuff.com/tools/dnsreport.ch?domain=barfield.com and you will see that the nameserver located at 12.107.230.110 is not responding, and since it's your primary DNS server, the delay is to be expected.  Fixing your DNS will fix your website :)

          • S
            sullrich
            last edited by Feb 2, 2007, 1:14 AM

            Yes, that would be my 3rd test.

            Next test would be to lower the MTU to around 1400 on the WAN.  Test again, if the situation improves, keep moving the MTU back higher and higher until you find the sweet spot that works the best.
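
One way to put numbers on the MTU test suggested above is to probe the path with pings that carry the Don't Fragment bit. The sketch below only prints candidate commands instead of running them. It assumes FreeBSD's `ping`, where `-D` sets DF (Linux `ping` uses `-M do` instead); `www.example.com` and the list of candidate MTUs are placeholders, not values from this thread.

```shell
#!/bin/sh
# ICMP echo payload that makes an IPv4 packet exactly <mtu> bytes long:
# 20-byte IPv4 header + 8-byte ICMP header + payload = MTU.
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print (do not run) one DF-bit ping per candidate MTU, largest first.
# -D = Don't Fragment on FreeBSD ping, -s = payload size, -c = packet count.
print_mtu_probes() {
    target=$1
    for mtu in 1500 1492 1472 1452 1400; do
        echo "ping -D -c 3 -s $(payload_for_mtu "$mtu") $target"
    done
}

# www.example.com is a placeholder for a host on the far side of the WAN.
print_mtu_probes www.example.com
```

The largest payload that still gets replies, plus 28, would be the usable path MTU, which could then be tried on the WAN interface with something like `ifconfig xl0 mtu <value>`.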

            • J
              jobsoft
              last edited by Feb 2, 2007, 1:34 AM

              I can see that on DNS for an initial, but, that should cache once things get rolling.  I will check on that, but, again, nothing has changed except swapping fedora/shorewall for pfsense.  Well, what used to be servers on WAN under shorewall are now servers on DMZ.  I am sure if I moved them back to WAN they would work fine.

              I forgot to mention that my DMZ is setup with 1:1 NAT with a public IP mapped to each DMZ server in the same way they were originally setup and known on WAN under shorewall.  I first was thinking only of DMZ being the culprit, but, then it occurred to me to try a web server that was setup with Inbound NAT to a LAN box and the same behavior was present.

              I will try the MTU suggestion too.

              Mark

              • J
                jobsoft
                last edited by Feb 2, 2007, 1:45 AM

                Also, I am certain that I have no double NAT going on as the SMC switches automatically to bridge mode as soon as it detects one of the configured Public IPs on the LAN ports (which I thought was pretty slick).  The SMC gateway has been in the picture for a long time anyways.

                One thing that I wonder about is if the MTU corrected/compensated for the problem, a) why would it even be an issue and b) how does that play into causing the problematic behavior with delivering HTML to remote browsers?  Why would it be an issue on the WAN but not on the LAN?  Just strikes me as curious and I would like to get my hands around it.

                • J
                  jobsoft
                  last edited by Feb 2, 2007, 2:53 AM

                  OK, DNS is not an issue on other web pages on the "home domain", and they have the same issues, so, while it may be a contributing factor initially on www.barfield.com, it is still an aside to the main issue.

                  I tried various settings on the MTU and it made no difference at all.  :-(

                  Thanks though for all your suggestions and thoughts so far!

                  • S
                    sullrich
                    last edited by Feb 2, 2007, 3:01 AM

                    Couple other things that I would check:

                    Status -> Interfaces .. See any errors or collisions?

                    • J
                      jobsoft
                      last edited by Feb 2, 2007, 3:14 AM

                      There are a few In errors on the xl2 (LAN) and some collisions on each.  The web server with www.barfield.com is on xl1 (DMZ), so, the In errs on LAN should not affect DMZ–>WAN.  I did not display WAN again here as it has no errors and no collisions.

                      I suppose an error or collision would trash an outbound http packet, but, would it cause it to delay so much???  I suppose also that a stream of http would attempt to max out the packet size, so, this could be a problem that manifests itself near or at the MTU.  I need to look at some tcpdumps and see.

                      LAN interface (xl2)
                      Status up
                      MAC address 00:50:04:76:95:f5
                      IP address 192.168.1.254 
                      Subnet mask 255.255.255.0
                      Media 100baseTX
                      In/out packets 90732/106145 (40.64 MB/16.73 MB)
                      In/out errors 24/0
                      Collisions 74

                      DMZ interface (xl1)
                      Status up
                      MAC address 00:10:4b:37:d3:5d
                      IP address 172.21.0.2 
                      Subnet mask 255.255.255.0
                      Media 100baseTX
                      In/out packets 76319/74887 (11.61 MB/16.21 MB)
                      In/out errors 0/0
                      Collisions 142

                      • S
                        sullrich
                        last edited by Feb 2, 2007, 3:24 AM Feb 2, 2007, 3:22 AM

                        One other thing is to verify that the speed and duplex are matching up on all pieces of equipment.

                        BTW: both sites loaded in under 15 seconds here.

                        • B
                          billm
                          last edited by Feb 2, 2007, 3:24 AM

                          FWIW, both those sites come up instantly for me.  Seems like your customers shouldn't notice.  The issue is only when you try to access them from behind the same firewall right?

                          –Bill

                          pfSense core developer
                          blog - http://www.ucsecurity.com/
                          twitter - billmarquette

                          • J
                            jobsoft
                            last edited by Feb 2, 2007, 3:35 AM

                            No, from behind the same firewall (all on LAN), they are fine.  It is from the WAN side, from my house and my partner's office in Vermont (I had him try; both remote PCs are themselves behind NAT), that they are slow, and it was the same from both.  And, sometimes they do pop up much quicker and at other times they drag.  That was why I suggested the F5 to refresh and see the varying performance.

                            While I agree they may not notice, when it does take a while to load, it looks broken and some times I have even had the browser time out and just leave the spots with broken images icons.  Not good.  Some times it has timed out when not enough HTML was delivered to even render the page intelligently.

                            The crux of the issue here is that the previous Fedora/Shorewall setup had no problems.  Clearly SOMETHING in the chain with pfsense is different (and this very well could descend through m0n0wall to freebsd to the xl drivers).  It could also be something else hardware-wise.  But nonetheless, there is a degradation and the new setup has to be contributing to it.

                            pfsense and/or m0n0wall are super cool tools!!  And, what I am doing is nothing major.  And, surely others have similar setups without issues or all kinds of heck would be all over these forums.  So, my culprit is, I think, atypical, which is going to make it all the more elusive!

                            I really want to try and stay with pfsense, AND there has to be some way to at least define why the pages are rendering the way they are from a packet-level view.

                            • Y
                              yoda715
                              last edited by Feb 2, 2007, 7:25 AM

                              I went to both of your webpages and both appeared within 3 seconds. I continually hit shift-refresh (which reloads entire webpage) and noticed no hit in performance. Can you confirm that this happens at all times of day? Only thing I can think of based upon what I've read so far is that it might be utilization related. Meaning that there might be 10,000 people trying to pull up your webpage at the same time you were, and that caused it to slow down. Just a theory. Test this by going to the webpage at different times of the day. 12pm, 9pm, 1am, etc. See if that points to anything.

                              • J
                                jobsoft
                                last edited by Feb 2, 2007, 10:46 AM

                                Very interesting indeed.  What is your Internet setup configuration there?  The two places that I tested it from (here and from Vermont) were both on cable internet and both behind NAT routers (each with a Linksys WRT54G running dd-wrt v23 SP2!), and both behaved the same way.  I did ask the people at Barfield to test it out and advise of any performance or other problems, and they said it looked great to them too.

                                So, I wonder if the WRT54G's could be a factor in this anomaly???  I will have to rig my laptop direct to my cable this morning and see.

                                Also, as yet another "try this", I pulled up firefox here at my house from a fedora linux desktop and tried www.jobsoft.com.  same thing!  :-(  But, I went ahead and captured some screen shots for the page render "progress" after the 1st, 2nd and 3rd minutes:

                                http://www.jobsoft.com/Screenshot_Jobsoft_Design_and_Development_1st_Minute.png
                                http://www.jobsoft.com/Screenshot_Jobsoft_Design_and_Development_2nd_Minute.png
                                http://www.jobsoft.com/Screenshot_Jobsoft_Design_and_Development_3rd_Minute.png

                                This is what I get no matter when I try it and from where and what here in the house behind the WRT54G NAT router.  Notice in the 3rd minute the browser had given up and was "Done".

                                I can also remote VNC to a linux desktop at a customer's site that has T1 and a cisco router this morning as well and see what it does from there.

                                Thanks for the feedback!  It has helped to shift focus a bit.

                                • J
                                  jobsoft
                                  last edited by Feb 2, 2007, 10:50 AM

                                  One quick followup.  Since I can packet capture on each end of this through the same event period, can anyone suggest what I might look for in wireshark that would be enlightening as to not necessarily what caused the problem to begin with, but what packet situation is resulting in the delays???
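
As a rough starting point for that question, the sketch below prints one `tshark` command per symptom that typically accompanies stalls like this. It assumes a current Wireshark/tshark, where `-Y` applies a display filter (the same filter strings can be typed into the Wireshark filter bar); the `show` helper is hypothetical, and the capture file name is the WAN capture posted later in this thread.

```shell
#!/bin/sh
# Capture file to inspect (the WAN-side capture named in this thread).
cap=packet-watch-wan-filtered.cap

# Print (do not run) a tshark command for one display filter, with a label.
show() {
    printf "# %s\ntshark -r %s -Y '%s'\n" "$1" "$cap" "$2"
}

show "segments that had to be sent again"     "tcp.analysis.retransmission"
show "duplicate ACKs (receiver missed data)"  "tcp.analysis.duplicate_ack"
show "gaps the capture itself never saw"      "tcp.analysis.lost_segment"
show "fragmented IP packets"                  "ip.flags.mf == 1 || ip.frag_offset > 0"
show "ICMP 'fragmentation needed' replies"    "icmp.type == 3 && icmp.code == 4"
```

Running the same filter against the DMZ, WAN, and remote captures and comparing which side first shows a retransmission should indicate where packets are being lost or rejected.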

                                  • J
                                    jobsoft
                                    last edited by Feb 2, 2007, 1:48 PM

                                    OK,

                                    I have done the tcpdumps from 3 places:

                                    http://www.jobsoft.com/packet-watch-dmz-filtered.cap
                                    http://www.jobsoft.com/packet=watch-eth0-filtered.cap
                                    http://www.jobsoft.com/packet-watch-wan-filtered.cap

                                    All tcpdumps were 'tcpdump -s 1500 -i <iface> -w <capfile>.cap' and run simultaneously while I exercised the web pages from windows and linux here at my house.  The anomalies did manifest themselves.
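
A side note on the snaplen used in that command: on Ethernet, a packet that fills a 1500-byte MTU is 1514 bytes on the wire once the 14-byte Ethernet header is counted, so `-s 1500` cuts the tail off the largest frames. Whether that plays any part in what Wireshark reports here is not established, but it may be worth ruling out. A minimal sketch, with a hypothetical output file name, that prints a capture command keeping whole frames:

```shell
#!/bin/sh
# Smallest snaplen that keeps a full-size frame: IP MTU + 14-byte Ethernet
# header (untagged frames; an 802.1Q VLAN tag would add 4 more bytes).
snaplen_for_mtu() {
    echo $(( $1 + 14 ))
}

iface=xl0              # WAN interface in this thread
capfile=wan-full.cap   # hypothetical output file name

# Print (do not run) a capture command that keeps whole packets.
echo "tcpdump -s $(snaplen_for_mtu 1500) -i $iface -w $capfile"
# "tcpdump -s 0" asks for the maximum snaplen and has the same effect here.
```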

                                    I then brought all 3 into Wireshark and filtered out only the packets to/from the web server and my external cable ip address and then saved those filtered sets back to the files above.  I am making them available above as well in case anyone else wants to peek at them too, however, I certainly am already!  :-)

                                    DMZ was on the pfsense box xl1/DMZ interface.  WAN was on the xl0/WAN interface.  ETH0 was on the linux server at my house that I had firefox running from and it was listening on the inside wired lan.

                                    What stood out on ETH0 was that several of the larger packets with HTTP payloads had checksum errors.  While I have only just looked at these initially, something like that would trigger a retry.  I also saw some "TCP DUP ACKs".  What I will have to go back and do is trace one of these packets with the failed checksum back through WAN and DMZ and then see what followed.  Ideally, if I could correlate the pauses in page rendering with the HTML contained in these retried packets, that would at least tie the browser behavior to the packet conditions.  I will also hook up my laptop direct to cable and then capture packets in the same way (just wireshark directly off the laptop on my house side).

                                    The whole thing still puzzles me.  ???

                                    • H
                                      hoba
                                      last edited by Feb 2, 2007, 2:21 PM

                                      Do you see lots of errors or collisions at status>interfaces at one of the nics?

                                      • J
                                        jobsoft
                                        last edited by Feb 2, 2007, 3:36 PM

                                        netstat -i

                                        Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
                                        xl0    1500 <Link#1>      00:60:97:d0:14:fe   792325     0   813918     0     0
                                        xl0    1500 fe80:1::260:9 fe80:1::260:97ff:        0     -        2     -     -
                                        xl0    1500 70-90-228-184 70-90-228-189-Nas     4167     -     5948     -     -
                                        xl1    1500 <Link#2>      00:10:4b:37:d3:5d   595782    26   554584     0   842
                                        xl1    1500 fe80:2::210:4 fe80:2::210:4bff:        0     -        1     -     -
                                        xl1    1500 172.21/24     172.21.0.2            3187     -     7657     -     -
                                        xl2    1500 <Link#3>      00:50:04:76:95:f5   280118   133   294471     0   402
                                        xl2    1500 fe80:3::250:4 fe80:3::250:4ff:f        0     -        1     -     -
                                        xl2    1500 192.168.1     gate                  1885     -     2897     -     -
                                        pflog 33208 <Link#4>                               0     0        0     0     0
                                        lo0   16384 <Link#5>                               9     0        9     0     0
                                        lo0   16384 your-net      localhost              445     -        0     -     -
                                        lo0   16384 localhost     ::1                      0     -        0     -     -
                                        lo0   16384 fe80:5::1     fe80:5::1                0     -        0     -     -
                                        pfsyn  2020 <Link#6>                               0     0        0     0     0

                                        Some Ierrs on xl1 (DMZ) and xl2 (LAN - not being considered at the moment) - none on xl0 (WAN)
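
To put those counters in proportion, the sketch below turns the Ierrs and Coll columns into percentages of the packets seen on each interface. The `rate` helper is hypothetical; the numbers are copied from the `netstat -i` output above.

```shell
#!/bin/sh
# Print <count> as a percentage of <total>, to four decimal places.
rate() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.4f%%\n", 100 * n / d }'
}

echo "xl1 (DMZ) input errors: $(rate 26 595782) of Ipkts"
echo "xl1 (DMZ) collisions:   $(rate 842 554584) of Opkts"
echo "xl2 (LAN) input errors: $(rate 133 280118) of Ipkts"
echo "xl2 (LAN) collisions:   $(rate 402 294471) of Opkts"
```

The rates are small, but collisions are only expected on half-duplex links, and the earlier `ifconfig` output shows xl1 and xl2 at `100baseTX` without `full-duplex`, which fits the suggestion elsewhere in the thread to check speed and duplex on every device.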

                                        • S
                                          sullrich
                                          last edited by Feb 2, 2007, 3:48 PM

                                          Ahh yes.  Checksum offloading errors.

                                          From a shell:

                                          ifconfig xl0 -rxcsum
                                          ifconfig xl1 -rxcsum
                                          ifconfig xl2 -rxcsum

                                          These seem like older cards, eh?  I bet the checksum offloading is busted in FreeBSD.
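
Before disabling anything, it may help to check which interfaces actually advertise receive checksum offload. FreeBSD's `ifconfig` spells the capability `rxcsum`, as it appears in the `options=` line, and turns it off with `-rxcsum`. The sketch below is only an illustration: `has_rxcsum` is a hypothetical helper, and the interface/options pairs are copied from the `ifconfig` dump in the first post rather than read from a live system.

```shell
#!/bin/sh
# Succeeds when an ifconfig "options=" line lists receive checksum offload.
has_rxcsum() {
    case "$1" in
        *RXCSUM*|*rxcsum*) return 0 ;;
        *) return 1 ;;
    esac
}

# "interface options-line" pairs taken from the ifconfig output in this thread.
while read -r ifname opts; do
    if has_rxcsum "$opts"; then
        echo "$ifname: offload enabled, candidate for: ifconfig $ifname -rxcsum"
    else
        echo "$ifname: rxcsum not listed, nothing to disable"
    fi
done <<'EOF'
xl0 options=8<VLAN_MTU>
xl1 options=9<RXCSUM,VLAN_MTU>
xl2 options=9<RXCSUM,VLAN_MTU>
EOF
```

Going by that dump, only xl1 and xl2 list the capability, so those are the two interfaces where turning it off could change anything.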

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.