Dashboard gateways can't gather data - 2.0-BETA4 (i386) built on Fri Oct 1



  • Subject says it all.
    Image attached.



  • Hi.

    I also have that error. I assumed it was related to the WIMAX service and the fact that my ISP is using DHCP to assign me a static IP Address.

    …waiting to see.

    Jits.



  • There were some very large changes to gateways and monitoring on Oct 1. You probably have a snapshot build before those changes were made. You didn't post the build time.

    You can find out by comparing the build time to the commit logs at rcs.pfsense.org, or just wait a day and get a new snap.

    GB



  • No, it's still broken. If you re-save one of your gateway settings it will work. I can't believe some of this stuff goes unnoticed in these recent snaps.



  • @gnhb:

    There were some very large changes to gateways and monitoring on Oct 1. You probably have a snapshot build before those changes were made. You didn't post the build time.

    You can find out by comparing the build time to the commit logs at rcs.pfsense.org, or just wait a day and get a new snap.

    GB

    Well, it's built on Fri Oct 1 02:32:27 EDT 2010.
    I'm going to update right now, so we will see if it's fixed.

    EDIT - UPDATE:

    Still broken with the 2.0-BETA4 (i386) snapshot built on Fri Oct 1 19:15:57 EDT 2010.


  • Rebel Alliance Developer Netgate

    It's a known issue; it should be fixed shortly.



  • hi,

    mine shows status ok; 2.0-BETA4 (i386) built on Sat Oct 2 16:42:16 EDT 2010 FreeBSD 8.1-RELEASE-p1

    cheers,



  • Status not working for one of two dynamic gateways (Oct 5 build).



  • The recent changes to the gateways code have changed where monitor IP data is stored in some cases, so you will certainly need to re-save every gateway entry that stores a monitor IP for a dynamic gateway (or delete and re-create it).

    townsenk, not all developers have all ISP connectivity configurations, so we can't test every permutation. That's why this is still BETA code and why issues go unnoticed. We'll get it sorted out.

    GB



  • Everything seemed to come together nicely on the latest snapshot, and the issues I reported are now fixed. Your efforts are appreciated.



  • Mine is still broken on 2.0-BETA4 (i386) built on Wed Oct 6 09:26:29 EDT 2010 FreeBSD 8.1-RELEASE-p1
    I have two "gateways": one that does AV scanning, and the default gateway. The regular gateway always works, but the AV "gateway" (really a local machine that then connects through the default gateway) shows up wrong.
    The table gets messed up as shown.



  • Rebel Alliance Developer Netgate

    @BlueMatt:

    Mine is still broken on 2.0-BETA4 (i386) built on Wed Oct 6 09:26:29 EDT 2010 FreeBSD 8.1-RELEASE-p1
    I have two "gateways" one that does AV Scanning and then the default gateway.  The regular gateway always works, but the AV "Gateway" (really a local machine that then connects through the default gateway) shows up wrong.
    The table gets messed up as shown.

    Did you edit/save on each gateway after you updated to the new snapshot? IIRC if your dynamic gateway entry shows up red, you may also want to try deleting it and seeing if it comes back OK. (gnhb may know better about that bit)



  • @jimp:

    Did you edit/save on each gateway after you updated to the new snapshot? IIRC if your dynamic gateway entry shows up red, you may also want to try deleting it and seeing if it comes back OK. (gnhb may know better about that bit)

    No, the problem still happens. Also, the dynamic gateway is actually pinging 8.8.8.8, because my dynamic provider's gateway does not respond to pings. (The status only recently started showing the actual gateway instead of the ping target. I'm assuming this is a new feature, and it is much more helpful to see the gateway than the target.)



  • n1ko posted a link to his config in the other thread on this topic.
    "Changing Gateway configuration results in broken gateway" (http://forum.pfsense.org/index.php/topic,28599.0.html)

    He has two DHCP gateways, and his /tmp/apinger.status file has only one line, for the first gateway, so that's the source of the "always just gathering data" problem. (The /tmp/apinger.status file is the source for the data on the dashboard and on the Status menu=>Gateways page.)

    I'm not sure if it's an apinger bug, but it seems like it is.

    I've seen what bluematt is reporting, but I think that's just a data parsing problem or an HTML problem somewhere.

    GB
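The missing-line symptom described in this post could be checked with a small shell sketch like the one below. It assumes only what the post states: /tmp/apinger.status carries one line per monitored gateway. The function names count_status_lines and check_gateway_status are hypothetical helpers, not part of pfSense or apinger, and nothing here parses the contents of the lines.

```shell
# Count the non-empty lines in a status file.
# Assumption (from the post above): one line per monitored gateway.
count_status_lines() {
    status_file="$1"
    if [ ! -r "$status_file" ]; then
        echo 0
        return 0
    fi
    # grep -c prints the number of lines that contain at least one character;
    # it exits non-zero when that number is 0, hence the "|| true".
    grep -c . "$status_file" || true
}

# Compare the number of reported gateways with the number you expect.
# $1 = expected gateway count, $2 = status file (defaults to the path from the post)
check_gateway_status() {
    expected="$1"
    status_file="${2:-/tmp/apinger.status}"
    found=$(count_status_lines "$status_file")
    if [ "$found" -lt "$expected" ]; then
        echo "only $found of $expected gateways reported in $status_file"
        return 1
    fi
    echo "all $expected gateways reported in $status_file"
}
```

With two dynamic gateways configured, running check_gateway_status 2 would return non-zero in the situation n1ko's config shows, where the file holds a single line.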


  • Rebel Alliance Developer Netgate

    Yeah the formatting is just a minor problem with the AJAX update of the gateway widget. Cosmetic only.

    I have two dynamic gateways (PPPoE DSL and DHCP Cable) right now and a good status for both, but I'm on a snap from Saturday.



  • This would be fixed if apinger were restarted at some point. If you just run
        killall -9 apinger; /usr/local/sbin/apinger -c /var/etc/apinger.conf
    it starts working again.



  • I can confirm this fixes the status and the cosmetic layout of the table. I am monitoring this to see if it also resolves my outbound load balancing.

    (using cron to restart the pinger hourly might be a good idea)


  • Rebel Alliance Developer Netgate

    I committed a couple of fixes just now that will correct the AJAX updating of the gateways widget. There were a couple of problems before: (1) If the status was "gathering data", the output would become corrupted. (2) The last cell of the table was never updated.



  • The problem here is that, for some very odd reason, apinger does not quit when sent the TERM signal; it needs the KILL signal. This only happens after initial boot: if you restart apinger manually, the new process will die from the TERM signal. A (potentially bad) option would be to replace all the killbypid calls with sigkillbypid(*, "KILL") in /etc/inc/gwlb.inc. A gentler option might be to send TERM, wait, and send KILL only if it has not quit.
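The "send TERM, wait, then send KILL" idea from this post could look roughly like the shell sketch below. It is not the pfSense killbypid or sigkillbypid code from /etc/inc/gwlb.inc. The function name stop_with_fallback and the default grace period are hypothetical, and the caller supplies the pid file path.

```shell
# Stop the process whose PID is stored in a pid file.
# Sends TERM first, waits up to $2 seconds (default 5), then falls back to KILL.
# Hypothetical helper for illustration; not part of pfSense.
stop_with_fallback() {
    pid_file="$1"
    timeout="${2:-5}"

    [ -r "$pid_file" ] || return 0              # no pid file: nothing to stop
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 0                   # empty pid file: nothing to stop

    kill -TERM "$pid" 2>/dev/null || return 0   # process already gone

    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # it exited after TERM
        sleep 1
        waited=$((waited + 1))
    done

    # Still running after the grace period: force it.
    kill -KILL "$pid" 2>/dev/null || true
    return 0
}
```

A call might be stop_with_fallback /var/run/apinger.pid 5, where /var/run is only an assumed value for the varrun_path used in gwlb.inc. Compared with always sending KILL, this gives a well-behaved apinger the chance to exit cleanly.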


  • Rebel Alliance Developer Netgate

    Not sure if it will make any difference, but I just recompiled apinger on both of the builders (amd64 and i386), so the next snapshot should have a fresh binary.

    It may have to wait until Ermal can have a look, but he's at EuroBSDCon so his availability is limited for a while.



  • I checked and replacing killbypid("{$g['varrun_path']}/apinger.pid"); with sigkillbypid("{$g['varrun_path']}/apinger.pid","KILL");
    on lines 43 and 225 of /etc/inc/gwlb.inc fixes the issue.  Obviously this is not a good solution, but if anyone wants apinger to work until the actual root of the problem is found, you could do this.



  • Hi there,

    I'm running :

    2.0-BETA4  (i386)
    built on Fri Oct 8 02:16:27 EDT 2010
    FreeBSD 8.1-RELEASE-p1

    I have 3 WAN connections, 2 x PPPoE (1 DSL, 1 VDSL) and 1 DHCP (cable).

    The DHCP/cable connection correctly displays online, while my 2 PPPoE connections still show Gathering Data in the GW Status screen, as well as on the GW Groups page (blue indicator with "gathering data").

    It doesn't seem to impact functionality from what I can tell so far, but I have detected a new issue.

    A while back the "disconnect" button for PPPoE interfaces wasn't working, and then it was fixed. Something has again broken the functionality here: pressing disconnect now results in a loss of IP address, but the interface still shows as online with an IP of "0.0.0.0". As the interface still thinks it is online, there is no "connect" button, and the only way to restore the connection seems to be a reboot.

    So there definitely seems to be something going on with PPPoE connections, gateways and the interface screen.

    Thanks,

    – Phob



  • @Phobia:

    Something has again broken the functionality here as well as a press of disconnect now results in a loss of IP address, but the interface still shows as online with an IP of "0.0.0.0".  As the interface still thinks it is online, there is no "connect" button, and the only way to restore the connection seems to be a reboot.

    I see the same situation with DHCP, but a reboot does not help: after the reboot the IP is still 0.0.0.0.

