Frequent package install issues
-
At home where my internet is often crud, like right now, I get package install issues that now happen much more frequently on 2.2.3-DEVELOPMENT - I guess this is an effect of the change of default connection_timeout from 60 to 5 seconds in download_file().
The disturbing thing is that a "failed" message appears when a file fails to download (good), and the install then immediately continues (not trying to download other files, which is to be expected) but seems to barrel through the rest of the attempted install and give a normal-looking "Package reinstalled" message. I expected it would error out and try to uninstall whatever bits of the package it had managed to install:
Removing mailreport components...
Package XML... done.
Configuration... done.
Beginning package installation for mailreport .
Downloading package configuration file... done.
Saving updated package information... done.
Downloading mailreport and its dependencies...
Loading package configuration... done.
Configuring package components...
Loading package configuration... done.
Additional files... mail_reports_generate.php failed.
Custom commands...
Menu items... done.
Writing configuration... done.
Package reinstalled.
PS: I have got sick of the packet loss and down/up of my current ISP. The payment runs out this week, so changing to another ISP tomorrow - we will see if I get any reliability.
-
The behavior of continuing on regardless after a file download has failed seems to be due to the fact that pkg_fetch_additional_files() has code in it to return status (true/false/-1) at various points. But none of the calls to pkg_fetch_additional_files() checks the return status. It seems that when pkg_fetch_additional_files() was created, previous code was cut/pasted into it but the "links" in the chain of return status checking were not made.
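To illustrate the missing "links" in the chain, here is a minimal sketch in Python (not the actual pfSense PHP). The names fetch_additional_files, install_package, and the download and uninstall callbacks are hypothetical stand-ins I made up for illustration. The only point is that the caller checks the returned status and rolls back instead of carrying on.

```python
def fetch_additional_files(files, download):
    """Return True only if every file downloads; stop at the first failure.

    `download` is a hypothetical callback that returns True on success.
    """
    for name in files:
        if not download(name):
            print(f"{name} failed.")
            return False
    return True


def install_package(name, files, download, uninstall):
    """Install a package, rolling back if any additional file fails.

    `uninstall` is a hypothetical callback that removes a partial install.
    """
    # This is the check that the thread says is missing: act on the status.
    if not fetch_additional_files(files, download):
        uninstall(name)
        print(f"Package {name} install failed; partial install removed.")
        return False
    print(f"Package {name} reinstalled.")
    return True
```

With a download callback that fails on one file, install_package returns False and calls uninstall once, rather than printing a success message.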
Time for sleep in my TZ. I might have a look at this tomorrow, if no-one else mods it first.
-
Well, I've seen it fail at the very beginning on "fetching" the list of packages, and then it successfully reinstalled (read - miserably failed) all the packages.
-
Hm, if you bump the timeout back up, does it then succeed? 5 seconds to establish an HTTPS connection is a really long time even on the worst of Internet connections.
-
Well that's not really the problem, the problem is that it states something got reinstalled when it has not.
I'm randomly having issues with downloading the package list, even in the GUI. Fails to load, works a minute later. I think it's got something to do with IPv6.
-
Well that's not really the problem, the problem is that it states something got reinstalled when it has not.
I'm randomly having issues with downloading the package list, even in the GUI. Fails to load, works a minute later. I think it's got something to do with IPv6.
I was referring to Phil's post; yours does sound different.
What do you get for:
fetch -6 https://packages.pfsense.org/
fetch -4 https://packages.pfsense.org/
Spot checked that from a variety of places outside our network and it works fine both ways every time.
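If it helps to repeat that comparison over time rather than by hand, here is a rough Python sketch of the same idea as the two fetch commands: time a plain TCP connection once over IPv4 and once over IPv6. The function name time_connect and the 5 second default are my own choices for illustration, and it only tests the TCP connect, not the HTTPS download itself.

```python
import socket
import time


def time_connect(host, port, family, timeout=5.0):
    """Return seconds taken to open a TCP connection over the given address
    family, or None if name resolution or the connection fails or times out."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return None  # no address of this family could be resolved
    addr = infos[0][4]  # first resolved socket address for this family
    start = time.monotonic()
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(addr)
    except OSError:  # includes timeouts and refused connections
        return None
    return time.monotonic() - start
```

For example, calling time_connect("packages.pfsense.org", 443, socket.AF_INET6) and then the same with socket.AF_INET every minute or so would show whether only one address family fails intermittently.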
-
I wish I could get some meaningful info, afraid I'd have to be capturing packages all day long. IMO it's the same sort of thing like people complaining about the dashboard update check, no one could figure it out beyond PEBKAC cases. Is packages.pfsense.org behind nginx as well?
-
I wish I could get some meaningful info, afraid I'd have to be capturing packages all day long. IMO it's the same sort of thing like people complaining about the dashboard update check, no one could figure it out beyond PEBKAC cases.
I've never seen issues there outside of PEBKAC. There's enough PEBKAC there that I tend to ignore those threads, but in the many instances we've run into that with customers it's never turned out to be anything other than some kind of config issue or general Internet connectivity problem.
Is packages.pfsense.org behind nginx as well?
Served by nginx, yes. Not behind as in being reverse proxied or something.
-
Oh well. Maybe it's just that nginx hates me. :D (No trouble with the dashboard though.)
If I find something out, I'll post here or on Redmine…
-
@cmb:
Hm, if you bump the timeout back up, does it then succeed? 5 seconds to establish an HTTPS connection is a really long time even on the worst of Internet connections.
I didn't get a chance to try any of this. I was completely sick of my ISP, so I have changed to another ISP - bit more expensive but now I can "ping -t 8.8.8.8" and get solid response. With the previous ISP there was always some random packet loss, and also just disappearing for 30 seconds and coming back.
With the new good ISP packages download and install fine, both after firmware upgrade and manually from the GUI. Dashboard update check always succeeds… A reliable ISP is always good to have.
I do have the old ISP device and access up to Sunday. I am guessing that the connection timing issue is related to packet loss, rather than latency. If a few packets get lost and have to be retransmitted in the protocol setup negotiation then 5 seconds is going to elapse more easily.
Otherwise I will put another pfSense in the upstream WAN path of the good link and give it a dummynet limiter with packet loss - that might help break it in a controlled way.
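As a back-of-envelope check on the packet loss guess, here is a small Python simulation. It is only a sketch under assumptions that are mine, not measured: connection setup takes 3 round trips, each packet is lost independently, and a lost packet costs a retransmission timeout that starts at 1 second and doubles on each consecutive loss (the values suggested in RFC 6298; real stacks differ). The function names are made up for illustration.

```python
import random


def setup_time(rtt, loss, round_trips=3, initial_rto=1.0, rng=random):
    """Simulate one connection setup and return the elapsed seconds.

    Each round trip needs one packet out and one packet back. If either is
    lost, the sender waits for a retransmission timeout and tries again,
    doubling the timeout on every consecutive loss (assumed model).
    """
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1)")
    elapsed = 0.0
    for _ in range(round_trips):
        rto = initial_rto
        # Loss of the outgoing packet or of the reply forces a retransmit.
        while rng.random() < loss or rng.random() < loss:
            elapsed += rto
            rto *= 2
        elapsed += rtt
    return elapsed


def timeout_rate(rtt, loss, timeout=5.0, trials=20000, seed=1):
    """Estimate the fraction of setups that take longer than `timeout`."""
    rng = random.Random(seed)
    over = sum(setup_time(rtt, loss, rng=rng) > timeout for _ in range(trials))
    return over / trials
```

Comparing, say, timeout_rate(0.3, 0.02) against timeout_rate(0.3, 0.2) gives a feel for how quickly a 5 second limit starts to bite as loss goes up, while with zero loss the setup time is just the round trips times the RTT.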
-
If I find something out, I'll post here or on Redmine…
Please do. Not saying there isn't an issue, but I sure haven't seen one.
-
I didn't get a chance to try any of this. I was completely sick of my ISP, so I have changed to another ISP - bit more expensive but now I can "ping -t 8.8.8.8" and get solid response. With the previous ISP there was always some random packet loss, and also just disappearing for 30 seconds and coming back.
With the new good ISP packages download and install fine, both after firmware upgrade and manually from the GUI. Dashboard update check always succeeds… A reliable ISP is always good to have.
Those short drop outs would be my guess as well. You're close to the exact opposite side of the world from Austin, so the latency is about as bad as it could get even under ideal circumstances (still probably ~200-250 ms I'd guess). Combine that with even a short drop out, and you'd hit that 5 second timeout without much trouble.
That'd be a good thing to have a controlled test case for, especially as Renato continues pkg work on 2.3 in getting rid of PBIs. There might be more we can do in 2.3 to make it as resilient as possible without causing excessive delays when connectivity has failed.
-
(still probably ~200-250 ms I'd guess)
C:\Users\phil.davis>ping pfsense.org
Pinging pfsense.org [208.123.73.69] with 32 bytes of data:
Reply from 208.123.73.69: bytes=32 time=314ms TTL=43
Reply from 208.123.73.69: bytes=32 time=315ms TTL=43
Reply from 208.123.73.69: bytes=32 time=309ms TTL=45
Reply from 208.123.73.69: bytes=32 time=336ms TTL=45
Ping statistics for 208.123.73.69:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 309ms, Maximum = 336ms, Average = 318ms
The above is at the better end, from the office on a good quality connection. Typically latency to the USA is from 300 to 400ms, but there are times when it gets worse - often in the evening (6-10pm) when there seems to just be a bandwidth bottleneck in/out of Nepal - too many people sitting at home eating rice and lentils while watching YouTube :)
I will try some dummynet tweaking of delay and packet loss to make it break, since as you say it would be good to know where the limits are.
IMHO this thread no longer has an impact on 2.2.3 - the 5 second timeout works fine for me from Nepal as long as I have an ISP that actually provides a reliable connection. So it should work for anyone else on the planet as long as their ISP also gives decent service.
The only exception I can think of is satellite - are there many pfSense installs with the main/only WAN hanging off the end of a satellite link?