LCDProc 0.5.4-dev
-
Steve,
can you try the LCDproc package (I mean not the "-dev" package) and see if the problem occurs? I guess with the "LCDproc" package you either didn't have any problem or were not running it?
Thanks,
Michele
-
I don't know if it can help, but some time ago I had endless startup processes from mailscanner that exhausted machine resources at every boot.
I noted that at bootup, mailscanner startup was called several times.
Maybe there is something related here, with multiple lcdproc scripts/processes opened.
-
I don't know if it can help, but some time ago I had endless startup processes from mailscanner that exhausted machine resources at every boot.
I noted that at bootup, mailscanner startup was called several times.
Maybe there is something related here, with multiple lcdproc scripts/processes opened.
Thanks Marcello, but from what I understood it is only the LCDd process that, after a certain amount of time, eats all the resources…
-
I never ran the original LCDproc package (except while trying to develop my own package, and then only for a few hours) because it never included the sdeclcd driver.
Before you added the driver to the LCDproc-dev package all firebox users were running a manual installation that consisted of:
LCDd 0.53
lcdproc client with manual command line options for screens.
The old sdeclcd driver.
A simple startup script that ran the server and client once from /usr/local/etc/rc.d
I never saw this crash out on any box. It was distributed as a tarball with an install script. Here.
I never tested this with pfSense 2.0.1.
Steve
-
Hi, I'm having a problem with the display running on the alix2d13 hardware.
U204FB-A1 20x4 Display
What is the driver for this LCD?
-
Before you added the driver to the LCDproc-dev package all firebox users were running a manual installation that consisted of:
LCDd 0.53
lcdproc client with manual command line options for screens.
The old sdeclcd driver.
A simple startup script that ran the server and client once from /usr/local/etc/rc.d
Can I install the old driver and run the LCDproc-dev scripts against it? Would it work this way? Or maybe replace LCDd instead?
I think the problem is with either the driver or LCDd, not the client scripts. I've never seen an issue with the client, but every time the display has stopped working for me, the LCDd process was at 100%.
-
Just what I'm going to try after work.
You will need both LCDd and sdeclcd.so from the tarball. I've never tried running it since the 2.0.1 update but I see no reason why anything should have changed.
Steve
Edit: No compatibility issues, testing now.
-
I've been testing using 0.54 versions of LCDd and sdeclcd.so and there is no difference.
New test driver:
https://github.com/downloads/fmertz/sdeclcd/sdeclcd.so
I removed the call to the process scheduler. Give it a try…
-
Steve,
I looked at the LCDd.conf in the tarball and found no difference that could cause this… BUT at the same time I have probably had the same issue on my secondary machine.
The machine is running the screens: Uptime, Load, States, Mbuf and Interface Traffic (WAN).
Can you please select only the Interface Traffic (WAN) screen and tell me if it hangs again? That way we can exclude one screen...
Thanks,
Michele
-
I removed the call to the process scheduler. Give it a try…
Ah, that sounds interesting.
Can you please select only the Interface traffic (WAN) and tell me if it hangs again? So we exclude one
You want me to run only the Interface Traffic screen? Currently I'm running Uptime and Time.
Too many tests, not enough boxes! :P
Steve
-
hehe! sorry buddy, if some watchguard representative sends me a couple of Fireboxes I can test them also! :D
-
New test driver:
https://github.com/downloads/fmertz/sdeclcd/sdeclcd.so
I removed the call to the process scheduler. Give it a try…
Downloaded this driver and going to try it. Left everything else unchanged from the .9 dev package and will see if the driver alone makes any difference in the morning. If the driver doesn't help, I will restore the original .9 driver and change to LCDd 0.53 in stephenw10's manual package. Seems a methodical approach should help me narrow this down.
I haven't had any resource issues other than LCDd locking CPU to 100% until I kill it. Even in my box with only 256M, I still have over 128M free and no swap in use. In fact, today it ran for 8 hours at 100% while I was at work and continued to route and firewall properly.
load averages: 10.06, 9.81, 9.30
101 processes: 13 running, 76 sleeping, 12 waiting
CPU: 20.4% user, 0.0% nice, 78.6% system, 1.0% interrupt, 0.0% idle
Mem: 62M Active, 12M Inact, 35M Wired, 25M Buf, 125M Free
Swap: 512M Total, 512M Free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
12019 nobody 74 r30 3368K 1496K RUN 528:54 100.00% LCDd
Lastly, my math was off in a previous post: my failures all seem to start at around 9 hours (+/- 1 hour) of uptime (not 16 as previously reported).
I will report status in the morning and with any luck the new driver resolves this.
-
I'm trying to run stuff for at least 24hrs so I can be relatively sure there's no problem / is a problem.
For reference I'm now coming up to 21hrs running LCDd 0.53 with the old driver, no problems.
last pid: 21474; load averages: 1.65, 1.37, 1.43 up 10+14:12:56 14:02:28
107 processes: 5 running, 85 sleeping, 1 zombie, 16 waiting
CPU: 46.3% user, 0.0% nice, 53.4% system, 0.4% interrupt, 0.0% idle
Mem: 58M Active, 18M Inact, 59M Wired, 152K Cache, 59M Buf, 350M Free
Swap:
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
10 root 171 ki31 0K 8K RUN 168.6H 33.98% idle
31442 root 76 0 43356K 14908K RUN 0:14 3.96% php
30124 root 76 0 43356K 15060K accept 0:27 2.98% php
55054 root 76 0 43356K 15060K ppwait 0:27 2.98% php
51320 root 76 0 43356K 14908K accept 0:16 2.98% php
11 root -32 - 0K 128K WAIT 26:28 0.00% {swi4: clock}
11 root -68 - 0K 128K WAIT 19:58 0.00% {irq18: em0 ath0+}
63361 nobody 44 0 3364K 1460K RUN 10:14 0.00% LCDd
The important thing here is ~10 minutes of CPU time in 21 hours.
Steve
Edit: Actually that looked like too much CPU time (though not by much). Something odd happened at 4:24am which may have used some cycles:
Jan 26 04:24:34 LCDd: Connect from host 127.0.0.1:17658 on socket 11
Jan 26 04:24:32 php: lcdproc: Connection to LCDd process lost ()
Jan 26 04:24:24 LCDd: sock_send: socket write error
Jan 26 04:24:24 LCDd: sock_send: socket write error
Jan 26 04:24:24 LCDd: sock_send: socket write error
Jan 26 04:24:24 LCDd: Client on socket 11 disconnected
Jan 26 04:24:24 LCDd: error: huh? Too much data received... quiet down!
It doesn't seem to have affected it though. It recovered from said event without issue.
-
Ok, so 24 hours is up running 0.53 LCDd and the old sdec driver and some interesting results are in!
First off, I have had no problem accessing the box during that time, and the lcdclient and LCDd processes have run solidly with only one instance of each.
Much more interesting is that twice during the 24hr period the logs show the error from the post above, 'too much data received'. It is clear that the process recovers and carries on with seemingly no other effects; however, it's also clear that during that 'event' the LCDd process uses far more CPU cycles.
After 24hrs:
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
10 root 171 ki31 0K 8K RUN 171.4H 98.00% idle
11 root -32 - 0K 128K WAIT 26:54 0.00% {swi4: clock}
63361 nobody 44 0 3364K 1460K nanslp 25:56 0.00% LCDd
My original estimate was that it should consume around 2 minutes in 24hrs.
So is it possible that either the newer driver or LCDd process is unable to recover from the 'too much data received' event?
One other thing I noted is that the LCDd process is run with nice level 0 which doesn't seem right.
Steve
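To put the TIME column figures above into perspective, here is a minimal sketch that converts a top TIME value to seconds and expresses it as a share of wall-clock time. The helper names are made up for illustration, and it only handles the minutes:seconds form (not values such as 168.6H).

```python
def cpu_seconds(top_time):
    """Convert a top TIME field in minutes:seconds form (e.g. '25:56') to seconds."""
    minutes, seconds = top_time.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_share(top_time, wall_hours):
    """Fraction of wall-clock time spent on CPU, given the run time in hours."""
    return cpu_seconds(top_time) / (wall_hours * 3600.0)


if __name__ == "__main__":
    # Figures taken from the two top listings in this thread.
    for label, top_time, hours in (("21hrs", "10:14", 21), ("24hrs", "25:56", 24)):
        print(label, cpu_seconds(top_time), "s", format(cpu_share(top_time, hours), ".2%"))
```

Taking the run times as roughly 21 and 24 hours, both readings work out to a small percentage of one CPU, but the second reading is noticeably higher than the first, which fits with the 'too much data received' events costing extra cycles.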
-
Hi, following Steve's experience, I am trying to "slow down" the panel a bit… for example I changed TitleSpeed to 5 (as it was in the configuration of the tarball package). This affects the screens that have "scrolling"... let's see how it goes, I will test it for some days...
Ciao,
Michele
-
Reading some documentation about this "LCDd: error: huh? Too much data received… quiet down" error, maybe I could add some delay (10 or 20 ms) between each command the client sends to LCDd...
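As a rough illustration of that idea, here is a minimal sketch of pacing the commands. It is written in Python purely for illustration (the pfSense client is a PHP script), and send_paced and its parameters are hypothetical names, not part of the package.

```python
import time


def send_paced(send, commands, delay_s=0.02, sleep=time.sleep):
    """Send each command followed by a newline, pausing between commands.

    `send` is any callable that accepts bytes (for example a connected
    socket's sendall). Returns the total number of bytes handed to `send`.
    """
    total = 0
    for i, command in enumerate(commands):
        if i > 0:
            sleep(delay_s)  # pause only between commands, not before the first
        payload = (command + "\n").encode("ascii")
        send(payload)
        total += len(payload)
    return total


if __name__ == "__main__":
    # Dry run that prints the payloads instead of talking to a real LCDd.
    send_paced(print, ["hello", "screen_add s1"], delay_s=0.02)
```

With a real connection, the `send` argument would be something like `sock.sendall` on a socket connected to LCDd. Whether a delay actually avoids the error is untested here; it only spreads the same amount of data over a longer time.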
-
This is certainly a great learning experience! :)
It seems that the 'too much data received' message only exists in 0.53.
It seems to be an error related to the amount of data sent rather than the speed. More than 7168B (perhaps bits?).
Subsequent versions attempt to read all the data into a buffer and process it. My guess is that the buffer is filled and it gets stuck in a loop but we're not seeing any of the warning messages for some reason (even though I have turned the logging level up to 5).
Even in 0.53:
} else if (nbytes > (MAXMSG - (MAXMSG / 8))) /* Very noisy client...*/
{
    sock_send_string(clientSocketMap->socket, "huh? Too much data received... quiet down!\n");
    report(RPT_WARNING, "%s: Too much data received on socket %d", __FUNCTION__, clientSocketMap->socket);
    return -1;
We are seeing the error message sent back to the client but not the warning report.
I'm not sure how internal 'sockets' work so I'm guessing here.
Steve
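For what it's worth, the 7168 figure matches the expression in that snippet if MAXMSG is 8192. That value is an assumption here; it is not stated anywhere in this thread. A minimal sketch of the arithmetic, with made-up helper names:

```python
MAXMSG = 8192  # assumed buffer size; chosen because it reproduces the 7168 figure


def noisy_threshold(maxmsg=MAXMSG):
    """Mirror the C expression MAXMSG - (MAXMSG / 8) using integer division."""
    return maxmsg - (maxmsg // 8)


def is_too_noisy(nbytes, maxmsg=MAXMSG):
    """True when a single read exceeds the threshold, as in the snippet's test."""
    return nbytes > noisy_threshold(maxmsg)


if __name__ == "__main__":
    print(noisy_threshold())  # 7168 when MAXMSG is 8192
```

The variable is called nbytes in the snippet, which suggests the limit is counted in bytes rather than bits.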
-
It seems to be an error related to the amount of data sent rather than the speed. More than 7168B (perhaps bits?).
Could be, but a little delay in sending the data could help in flushing the buffer… I don't know. Since, fortunately, I am now having the problem too, I have only changed the "scrolling" delay. If this change solves the problem I will post an update... but now that I can reproduce the error I also have some investigation to do...
Thanks,
Michele
-
Hi, I'm having a problem with the display running on the alix2d13 hardware.
U204FB-A1 20x4 Display
What is the driver for this LCD?
I use the hd44780 driver, on a usb port.
The system runs pfSense 2.0.1 i386 on a 4 GB CF card
U204FB-A1 20x4 Display (LCD2USB)(Controller hd44780)
I have another setup with a
Asus Hummibird AtomD510
4Gb Ram
250Gb HD
U204FB-A1 20x4 Display (LCD2USB)(Controller hd44780)
pfSense 2.0.1 64bit
I've no problem with this system.
-
Well I had no luck with the "test" sdeclcd.so driver. Hit 100% CPU after 10:35 uptime. Interestingly I watched it go from 72% at 10 hours to 100% 35 mins later.
I'm now going to install the same config as stephenw10 as well and try. stephenw would you mind reposting the tarball in this thread for ease of finding?
Hi, following Steve's experience, I am trying to "slow down" the panel a bit… for example I changed TitleSpeed to 5 (as it was in the configuration of the tarball package). This affects the screens that have "scrolling"... let's see how it goes, I will test it for some days...
Ciao,
Michele
My screens' default refresh interval is 5 seconds.