LCDProc 0.5.4-dev
-
Ok, so 24 hours is up running 0.53 LCDd and the old sdec driver and some interesting results are in!
First off, I have had no problem accessing the box during that time, and the lcdclient and LCDd processes have run solidly with only one instance of each.
Much more interestingly, twice during the 24hr period the logs show the error from the post above, 'too much data received'. It is clear that the process recovers and carries on with seemingly no other effects; however, it's also clear that during that 'event' the LCDd process uses far more CPU cycles.
After 24hrs:
  PID USERNAME PRI NICE  SIZE   RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31    0K    8K RUN    171.4H 98.00% idle
   11 root     -32    -    0K  128K WAIT    26:54  0.00% {swi4: clock}
63361 nobody    44    0 3364K 1460K nanslp  25:56  0.00% LCDd
My original estimate was that it should consume around 2 minutes in 24hrs.
So is it possible that either the newer driver or LCDd process is unable to recover from the 'too much data received' event?
One other thing I noted is that the LCDd process is run with nice level 0 which doesn't seem right.
Steve
-
Hi, following Steve's experience, I am trying to "slow down" the panel a bit… for example I changed TitleSpeed to 5 (as it was in the configuration of the tarball package). This affects the screens that have "scrolling"... let's see how it goes, I will test it for some days...
Ciao,
Michele
-
Reading some documentation about this "LCDd: error: huh? Too much data received… quiet down" error, maybe I could add a small delay (10 or 20 ms) between each command the client sends to LCDd...
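A minimal sketch of that idea, in Python rather than the package's PHP client: pause for a fixed interval after each command written to LCDd. The function name `send_throttled`, the 20 ms default and the shape of the `sock` argument are assumptions for illustration, not code from the lcdproc package.

```python
import time


def send_throttled(sock, commands, delay_s=0.02):
    """Send each command on its own line, pausing delay_s seconds after each one.

    `sock` is any object with a sendall(bytes) method, for example a connected
    socket.socket. The 20 ms default is the upper end of the delay suggested above.
    """
    sent = 0
    for command in commands:
        sock.sendall((command + "\n").encode("ascii"))
        sent += 1
        time.sleep(delay_s)  # give the server time to drain its input buffer
    return sent
```

With a real connection this could be called with `socket.create_connection(("127.0.0.1", 13666))` as `sock`, assuming LCDd is listening on its default port. Whether a delay helps at all depends on whether the problem is rate or volume, which is still an open question here.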
-
This is certainly a great learning experience! :)
It seems that the 'too much data received' message only exists in 0.53.
It seems to be an error related to the amount of data sent rather than the speed. More than 7168B (perhaps bits?).
Subsequent versions attempt to read all the data into a buffer and process it. My guess is that the buffer is filled and it gets stuck in a loop but we're not seeing any of the warning messages for some reason (even though I have turned the logging level up to 5).
Even in 0.53:
} else if (nbytes > (MAXMSG - (MAXMSG / 8))) /* Very noisy client...*/
{
    sock_send_string(clientSocketMap->socket, "huh? Too much data received... quiet down!\n");
    report(RPT_WARNING, "%s: Too much data received on socket %d", __FUNCTION__, clientSocketMap->socket);
    return -1;
We are seeing the error message sent back to the client but not the warning report.
I'm not sure how internal 'sockets' work so I'm guessing here.
Steve
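The 7168 figure falls out of the comparison in that excerpt if MAXMSG is 8192. That value is an assumption, since the define itself is not quoted in this thread. A small Python check of the arithmetic follows, with made-up helper names that mirror the C condition. The variable name nbytes suggests the limit is in bytes rather than bits.

```python
MAXMSG = 8192  # assumed value; chosen because it reproduces the 7168 quoted above


def noisy_threshold(maxmsg=MAXMSG):
    # Mirrors the C expression MAXMSG - (MAXMSG / 8) with integer division.
    return maxmsg - maxmsg // 8


def too_noisy(nbytes, maxmsg=MAXMSG):
    # Mirrors the C condition: nbytes > (MAXMSG - (MAXMSG / 8)).
    return nbytes > noisy_threshold(maxmsg)


print(noisy_threshold())  # 7168
```

So under that assumption a single read has to return more than 7168 bytes, seven eighths of the buffer, before the "quiet down" branch is taken.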
-
@stephenw10:
It seems to be an error related to the amount of data sent rather than the speed. More than 7168B (perhaps bits?).
Could be, but a small delay in sending the data could help flush the buffer… I don't know, but since (fortunately) I am now seeing the problem too, I only changed the "scrolling" delay. If this change solves the problem I will post an update... and since I can now reproduce the error, I also have some investigating to do...
Thanks,
Michele
-
Hi, I am having a problem with the display running on the alix2d13 hardware.
U204FB-A1 20x4 Display
What is the driver for this LCD?
I use the hd44780 driver, on a USB port.
The system runs pfsense 2.0.1 i386 on a 4 GB CF card
U204FB-A1 20x4 Display (LCD2USB)(Controller hd44780)
I have another setup with a
Asus Hummingbird Atom D510
4 GB RAM
250 GB HD
U204FB-A1 20x4 Display (LCD2USB)(Controller hd44780)
pfsense 2.0.1 64bit
I've no problem with this system.
-
Well I had no luck with the "test" sdeclcd.so driver. Hit 100% CPU after 10:35 uptime. Interestingly I watched it go from 72% at 10 hours to 100% 35 mins later.
I'm now going to install the same config as stephenw10 as well and try. stephenw would you mind reposting the tarball in this thread for ease of finding?
Hi, following Steve's experience, I am trying to "slow down" the panel a bit… for example I changed TitleSpeed to 5 (as it was in the configuration of the tarball package). This affects the screens that have "scrolling"... let's see how it goes, I will test it for some days...
Ciao,
Michele
My screens' default refresh interval is 5 seconds.
-
Here you go. Remove the .png extension.
We need to test either the new driver compiled against 0.53 or the old driver compiled against 0.55. I'm sure I have one of those here somewhere.
Bah! I have many files all named sdeclcd.so. ::)
Steve
-
Which one is this tarball compiled against? I have just completed installing the sdeclcd.so and LCDd from this tarball; all other files are unchanged from the -DEV v. 0.9 (lcdproc-0.5.5) package. Should know something in about 10 hours ;D
I will be happy to coordinate testing with everyone - Just let me know what version you're using and I will run another configuration…
-
Anyone with a compiler setup want to help test this EZIO-100/MTB134 driver? I found it online, but it appears abandoned - I'm not sure if it will work or not. I tried to get it to compile, but clearly pfSense isn't meant to be used for compiling.
Attachments are trailed with .png for attachment rules sake.
-
@tix:
Well I had no luck with the "test" sdeclcd.so driver. Hit 100% CPU after 10:35 uptime. Interestingly I watched it go from 72% at 10 hours to 100% 35 mins later.
…
My screens default refresh interval is 5 seconds.
Guys, I have 2 servers running pfSense: one with a 1 second refresh, which has NO PROBLEMS, and one with a 5 second refresh, where I get the problem. Both servers use the same panel (sureelect).
The client goes to "sleep" for the refresh interval in seconds multiplied by the number of screens available (I thought this was the best way not to waste resources, since every screen is shown once in that time).
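As a rough illustration of that sleep calculation (a sketch in Python, not the package's PHP code; the function name and the screen count of 6 are made up for the example):

```python
def client_sleep_seconds(refresh_s, screen_count):
    """Seconds the client sleeps between updates: refresh interval times screens."""
    return refresh_s * screen_count


for refresh in (1, 2, 5):
    print(refresh, client_sleep_seconds(refresh, 6))
```

With 6 screens, a 5 second refresh means one batch of updates every 30 seconds, against every 6 seconds at a 1 second refresh.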
Can you please ALL try a refresh of 1 second??
Thanks,
Michele
-
I think we are making progress for the sdeclcd driver. I installed the sdeclcd.so and LCDd versions provided by Steve and I'm happy to report that after 13 hours of uptime I still have a working LCD display and a responsive machine.
This may be short-lived, as I am seeing the usage of LCDd climb, though not as quickly as with the newer versions: after 13 hours, LCDd has run for 10:15 and is showing 0% CPU.
I'm going to stay with the current configuration until I reach 24 hours uptime or LCDd hits 100% before I change to a refresh interval of 1 sec as suggested by Michele.
I will post the status later when I get back home….. but it's looking better ;D
-
Here's something perhaps of note:
[2.0.1-RELEASE][root@pfsense.fire.box]/root(11): clog /var/log/system.log | grep huh
Jan 26 04:24:24 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 26 15:41:46 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 27 03:45:35 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 27 15:01:05 pfsense LCDd: error: huh? Too much data received... quiet down!
Because I was able to predict when it would happen, I could watch top, and found that even though the logs show the event taking only 10 seconds, LCDd is in fact stuck at 100% for 15 minutes before that.
That is with LCDd 0.53, old sdec driver, 0.8 package code and refresh set to 5 seconds.
Testing now as above but refresh set to 2 seconds. Can't set to 1 second with 0.53:
Jan 27 15:09:39 LCDd: Waittime should be at least 2 (seconds). Set to 2 seconds.
Steve
@tix: Are you seeing errors in the logs?
-
Steve,
looking at my secondary machine, I have the feeling that the problems are related to the "scrolling" feature of the panel.
In fact I sometimes see frozen screens where there is scrolling… I will keep an eye on it and try to see if that is the problem...
Ciao,
Michele
-
Steve, I get the same log entries, but they all occur at the same time, and the display continues to work, unlike with the newer code.
Jan 27 05:45:18 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 27 05:45:18 pfsense LCDd: Client on socket 11 disconnected
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:43 pfsense php: lcdproc: Connection to LCDd process lost ()
Jan 27 05:45:44 pfsense LCDd: Connect from host 127.0.0.1:8170 on socket 11
What's interesting to me is that this is right at the 10 hour uptime mark where the newer versions stopped working. I wonder if there is something time related causing this as anything newer than 0.53 version of LCDd breaks on my system after 10 hours?? I wouldn't think so but it's strange it was always around 10 hours before reverting…. weird...
-
Interesting that your box (X700?) takes a lot longer than 10 seconds to sort itself out in the log.
The 0.53 code just gives up and errors out, whereas newer versions include code to handle the extra data, so they keep trying.
Steve
-
@tix:
Well I had no luck with the "test" sdeclcd.so driver. Hit 100% CPU after 10:35 uptime. Interestingly I watched it go from 72% at 10 hours to 100% 35 mins later.
Ok, so leaving the process out of "realtime round robin", and leaving it with default priority had no effect.
Long shot: When running at 100%, try to "kill" LCDd with signal 6 (kill -6 <pid of LCDd>). This should give a memory image of the process (core dump). If you can make the core file available, I can try loading it into the debugger to see where the execution ended. The trick is that this needs to be a version of LCDd I have the code for, like V0.5.5, so the debugger can match the binary with the source. I have never done this, so this will probably lead nowhere…
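For reference, signal 6 is SIGABRT. Its default action is to terminate the process and, where core dumps are permitted, write a core image. A minimal Python sketch of the same step as `kill -6` is below. The helper name `abort_for_core` is hypothetical, and the PID still has to be looked up with top or ps.

```python
import os
import signal


def abort_for_core(pid):
    """Send SIGABRT (signal 6) to pid, the same as `kill -6 <pid>`."""
    os.kill(pid, signal.SIGABRT)
    return int(signal.SIGABRT)
```

Whether a core file actually appears, and where, depends on the system's core dump settings and on the user the process runs as.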
-
Could try compiling LCDd with the debug option enabled to get far more logging output.
Steve
-
@stephenw10:
Could try compiling LCDd with the debug option enabled to get far more logging output.
MyCommand = YourWish;
-
I will try using kill -6 tomorrow; for now I'm enjoying everything working on my x700. ;D
I'm still hung up on the idea of some kind of time issue. I see a problem every 10 hours. Here is the log from this morning and after running during the day today:
Jan 27 05:45:18 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 27 05:45:18 pfsense LCDd: Client on socket 11 disconnected
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:18 pfsense LCDd: sock_send: socket write error
Jan 27 05:45:43 pfsense php: lcdproc: Connection to LCDd process lost ()
Jan 27 05:45:44 pfsense LCDd: Connect from host 127.0.0.1:8170 on socket 11
...
Jan 27 15:48:23 pfsense LCDd: error: huh? Too much data received... quiet down!
Jan 27 15:48:23 pfsense LCDd: Client on socket 11 disconnected
Jan 27 15:48:23 pfsense LCDd: sock_send: socket write error
Jan 27 15:48:49 pfsense php: lcdproc: Connection to LCDd process lost ()
Jan 27 15:48:50 pfsense LCDd: Connect from host 127.0.0.1:8576 on socket 11
10 hours apart and the 05:45 was 10 hours of uptime!
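The spacing can be checked directly from the timestamps in that log. A short Python sketch follows. Syslog lines carry no year, so 2012 is assumed purely so the dates parse, and `gaps` is a made-up helper name.

```python
from datetime import datetime

# Timestamps of the two "Too much data received" lines in the log above.
EVENTS = ["Jan 27 05:45:18", "Jan 27 15:48:23"]


def gaps(stamps, year=2012):
    """Return the time between consecutive syslog-style timestamps.

    Syslog lines carry no year, so one is supplied (2012 is an assumption).
    """
    times = [datetime.strptime(f"{year} {s}", "%Y %b %d %H:%M:%S") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in gaps(EVENTS):
    print(gap)  # 10:03:05
```

Run over the four timestamps in Steve's earlier grep output, the same helper gives gaps of 11:17:22, 12:03:49 and 11:15:30, so the period is not exactly 10 hours on every box.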
As it stands, everything is working great (excluding the log entries) on v0.53 kernel module and v0.53 LCDd. The display continues to work with the default refresh of 5 secs and the webif and ssh connections are responsive. In fact, I would happily accept this level of functionality permanently. :)
But in the interest of perfection, I will apply the v0.9 package kernel mod and LCDd and when it stops responding on the webif after what I believe will be 10 hours of uptime, will kill it with the -6 option (instead of 15). The next step for me after that will be to use the debug-enabled LCDd and wait.