LCDProc 0.5.4-dev
-
Shot in the dark: is the client reading the responses back from LCDd? The release notes mentioned something about ignoring them causing unpleasant behavior …
Fmertz,
what do you mean? I just read the entire changelog and the release notes, but I didn't find anything about that. Can you give me some references?
EDIT: By the way, yes, the client reads the responses and logs any reported error (messages with huh?) to the pfSense log.
https://github.com/fmertz/sdeclcd/blob/master/BUGS
The documentation for responses is here: http://lcdproc.sourceforge.net/docs/lcdproc-0-5-5-dev.html#language-messages
It says: LCDd can send messages back to the client. These messages can be directly related to the last command, or generated for some other reason. Because messages can be generated at any moment, the client should read from the connection at regular intervals. A very simple client could simply ignore all received messages. Not reading the messages will cause trouble !
I read this to mean that LCDd could generate more than one response to a command, or even send text outside of the typical responses to commands. Does the PHP code account for this? Reading the code, there seems to be an assumption that only 1 response comes back, which may leave responses hanging and slowly fill the buffer. Just a thought.
-
Well… the client polls the data from LCDd, but maybe not enough. I am trying to see if I can do that better. In the meantime, THANKS for the suggestions, this looks like the right direction to me! ;)
-
It could REALLY be that… I am debugging in depth, and I found out that LCDd sends back MORE answers than the commands I send to it. So in the end I send:
widget_set scr_time text_wdgt 1 2 20 2 h 4 "2/7/2012 23:19"
widget_set scr_time text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_uptime text_wdgt 1 2 20 2 h 4 "17 days 9:18"
widget_set scr_uptime text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_system text_wdgt 1 2 20 2 h 4 "CPU 11%, Mem 56%"
widget_set scr_system text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_load text_wdgt 1 2 20 2 h 4 "0.06, 0.04, 0.01"
widget_set scr_load text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_states text_wdgt 1 2 20 2 h 4 "Cur/Max 4578/500000"
widget_set scr_states text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_ipsec text_wdgt 1 2 20 2 h 4 "IPSEC Disabled"
widget_set scr_ipsec text_summary 1 4 "01% 56% 4529 37%"
widget_set scr_traffic title_wdgt 1 1 "IN: 45.1 Kbps"
widget_set scr_traffic text_wdgt 1 2 "OUT: 2.1 Kbps"
widget_set scr_traffic text_summary 1 4 "01% 56% 4529 37%"
and it answers:
success success success success success success success success success success success success success success success
ignore scr_system
listen scr_load
ignore scr_load
listen scr_states
ignore scr_states
listen scr_ipsec
ignore scr_ipsec
listen scr_traffic
ignore scr_traffic
listen scr_time
ignore scr_time
listen scr_uptime
ignore scr_uptime
listen scr_system
ignore scr_system
listen scr_load
So if the ratio between writes and reads is 1:1, sooner or later the LCDd buffer fills up and LCDd hangs. Sorry, I found that code there and took for granted that the ratio was 1:1, but the client gets more answers from LCDd than the commands it sends. The client must "suck" in all those answers in order to keep LCDd stable.
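To make the idea concrete, here is a minimal sketch of "read everything that is pending" instead of one read per write. It is written in Python for illustration only, not in the package's PHP; the function name `drain_responses`, the timeout and the chunk size are assumptions, not the client's actual code.

```python
import socket


def drain_responses(sock, timeout=0.025, chunk_size=8000):
    """Read from sock until nothing arrives within `timeout` seconds.

    Returns the received text split into lines, so unsolicited messages
    such as "listen ..." / "ignore ..." are consumed together with the
    "success" replies. All parameter values are illustrative assumptions.
    """
    sock.settimeout(timeout)
    buffer = b""
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except socket.timeout:
            break  # nothing more pending right now
        if not chunk:
            break  # the peer closed the connection
        buffer += chunk
    return buffer.decode("ascii", errors="replace").splitlines()
```

Calling this once after each batch of commands empties the connection regardless of how many messages LCDd produced, so the number of reads no longer has to match the number of writes.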
I will test this and publish a new release ASAP!
Ciao,
Michele
-
Looks promising. :)
More excellent work.
Steve
-
If you all set only one screen with 0.5.3 and pkg 0.9.2, do you still have this problem?
Thanks,
Michele
I'm testing now, using only the "Interface Traffic" screen set to WAN. It doesn't scroll left/right but it obviously updates, so I hope the updating activity will tell us something.
Will post an update in about 12 hours.
-
I will test this and publish a new release ASAP!
Which also can be read the other way (negative test): if you remove the code to read the responses, does LCDd die quickly, with 100% CPU?
-
Hey,
yes… before, it was like this:
8 screens enabled
1 second refresh interval
LCDd was dying after about 2h.
With the full read of all the responses, and the same number of screens and refresh interval, it has been working for 20h and is still up and running.
This evening I will try to finalize a version and publish it...
Ciao,
Michele
-
Hi,
I just released a new version of the package. I feel that we are at a good point.
Here is the changelist:
- Improved the reception of data from LCDd. There is now a loop that runs until there is no more data to receive. Before that, a buffer overflow in LCDd was possible. The timeout of the receiving socket is 25ms;
- Optimized the number of commands sent to LCDd every cycle. Now only half as many are sent;
- Rewrote the error handling with better code;
- Increased to 3 the number of attempts the client makes to reconnect to LCDd in case of disconnection;
- Simplified the startup scripts. lcdproc_client.sh is no longer generated/run, since error handling is managed by the client;
- Changed the startup scripts to run both LCDd and the client as "nice" processes;
- Capped the wait time between client cycles at 5 seconds. It is still calculated as the refresh frequency * the number of activated screens, but now it's capped;
- Increased to 8000 chars the chunk of data received from LCDd;
- Improved the "service stop" script. It now loops until LCDd is definitely killed, and works even if LCDd is hung;
- Added a "welcome" string on the panel at LCDd startup;
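For anyone who wants to see two of these items spelled out, here is a rough sketch of the capped wait time and the limited reconnect attempts. It is in Python for illustration only, not the package's PHP, and the names `cycle_wait` and `connect_with_retries` are hypothetical.

```python
MAX_WAIT_SECONDS = 5      # cap taken from the changelist above
RECONNECT_ATTEMPTS = 3    # attempts taken from the changelist above


def cycle_wait(refresh_seconds, active_screens, cap=MAX_WAIT_SECONDS):
    """Wait between client cycles: refresh * screens, never above the cap."""
    return min(refresh_seconds * active_screens, cap)


def connect_with_retries(connect, attempts=RECONNECT_ATTEMPTS):
    """Call `connect` up to `attempts` times and return its result.

    `connect` is any callable that raises OSError on failure (an
    assumption made for this sketch); the last error is re-raised when
    every attempt fails.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
    raise last_error
```

With 8 screens and a 1 second refresh, `cycle_wait(1, 8)` gives 5 instead of 8, which is the effect of the cap.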
The new package is versioned 0.9.3, I just made a pull request in order to distribute it…
Please give some feedback (including driver, refresh rate, enabled screens and so on)!
Thanks,
Michele
-
Can't wait to give it a test drive :-)
-
Woot! ;D
-
Oops… I found a little mistake, so I cancelled the pull request. I will redo it tomorrow morning, I swear!
Sorry buddies...
-
Well in 24 hours of running, I've had no issues running with only this screen enabled with the "standard" .92 package.
-
Hi Tix,
yes, everything worked with only 1 screen, because in that case there were no "extra messages" (activate/ignore/etc.) from LCDd to cause a buffer overflow after some hours.
With the new version of the package we should not have this problem anymore…
Can you confirm it?
Thanks,
Michele
-
I have downloaded the new package and enabled the following screens:
-Uptime
-Load
-States
-Mbuf
-Interface Traffic(WAN)
None of these scroll left/right.
I installed the package and rebooted to have a clean start. I noticed that load seemed a little higher than usual on startup and took a little bit longer to "calm down" but I'm not entirely sure it's related to LCDproc.
I will post later today or tomorrow if there is anything to report as I'm hoping we have this corrected. :)
Thanks so much for the effort Michele!!!
-
So far so good.. i'm using the picolcd.so driver, 20x2 display
The following is enabled:
Time
Uptime
System
Disk
Load
States (Not sure if others noticed, it doesn't show the correct max, always 10000)
Interfaces (scrolls thru them)
Mbuf
Interface Traffic (WAN)
One day I'll swap to a different case so I can have a 20x4.
-
Testing now.
One thing I have noticed that I didn't expect is:
[2.0.1-RELEASE][root@pfsense.fire.box]/root(2): ps aux|grep lcd
root 4965 0.0 0.3 3656 1520 ?? I 8:27PM 0:00.01 /bin/sh /usr/local/etc/rc.d/lcdproc.sh start
root 6483 0.0 3.4 46428 17476 ?? SN 8:27PM 0:01.11 /usr/local/bin/php -f /usr/local/pkg/lcdproc_client.php
I didn't expect to see lcdproc.sh running, not that it's a problem.
Steve
Edit: I also saw that the version of the sdeclcd driver is the one with real time process priority set. Is that deliberate? That too should be no problem.
-
Hehe!
You're welcome Tix, let's see how long this version holds up… I haven't had any problems on 2 servers for almost 48h, but we need to share data before considering it stable.
On both machines I use the sureelec driver with a 1 second refresh, 8 screens on one and just 1 screen on the other. The other options are left at their defaults...
Ciao,
Michele
-
Hi Steve,
well, I removed the "lcdproc_client.sh" script, the one that managed the error counter for the client. lcdproc.sh is the one run to start/stop/restart the package. I honestly don't know why it is still running after it's launched to start the client, but it should not be doing anything.
For the sdeclcd driver, we need to ask fmertz whether he requested an update of the binary driver on files.pfsense.org; I guess not, if it's not the latest version. Anyway, after this message he will probably ask for it to be updated.
Ciao,
Michele
-
Hi Cino,
States (Not sure if other notice, it doesn't show the correct max, always 10000)
one day I'll swap to a different case so I can have a 20/4
On my x86 pfSense 2.0.1 it correctly shows 500000. I am afraid some characters are hidden, but with a 20x2 display that should not happen… :S
Looking at the code, it looks like it's able to read $config['system']['maximumstates']. What value did you set?
If you run the following commands in the PHP console from the command prompt (copy/paste both), what result do you get?
global $config;
echo($config['system']['maximumstates']);
Thanks,
Michele
-
Not sure what went wrong but I went to add some additional screens and upon clicking save, the log shows the stopping and restarting but then stops again 1 sec later. This happens each time I enable/disable any screen.
Log capture:
Feb 10 18:23:42 pfsense LCDd: Client on socket 11 disconnected
Feb 10 18:23:42 pfsense LCDd: sock_send: socket write error
Feb 10 18:23:44 pfsense LCDd: Server shutting down on SIGTERM
Feb 10 18:23:46 pfsense LCDd: LCDd version 0.5.3 starting
Feb 10 18:23:46 pfsense LCDd: Using Configuration File: /usr/local/etc/LCDd.conf
Feb 10 18:23:46 pfsense LCDd: Listening for queries on 127.0.0.1:13666
Feb 10 18:23:47 pfsense LCDd: Server shutting down on SIGTERM
Feb 10 18:23:58 pfsense php: lcdproc: Failed to connect to LCDd process Operation timed out (60)
Feb 10 18:24:09 pfsense php: lcdproc: Failed to connect to LCDd process Operation timed out (60)
Feb 10 18:24:20 pfsense php: lcdproc: Failed to connect to LCDd process Operation timed out (60)
Feb 10 18:24:31 pfsense php: lcdproc: Failed to connect to LCDd process Operation timed out (60)
Trying to restart via the GUI does nothing (which I think is correct functionality) but running the start script from the shell seems to start the process again.
[root@pfsense]/root(2): /bin/sh /usr/local/etc/rc.d/lcdproc.sh start
[root@pfsense]/root(6): ps aux | grep lcd
<no results>
[root@pfsense]/root(7): ps aux | grep LCD
root 25426 0.0 0.5 3524 1280 1 S+ 6:25PM 0:00.00 grep LCD
[root@pfsense]/root(8): clog /var/log/system.log | grep LCD
<snip>
Feb 10 18:31:25 pfsense LCDd: LCDd version 0.5.3 starting
Feb 10 18:31:25 pfsense LCDd: Using Configuration File: /usr/local/etc/LCDd.conf
Feb 10 18:31:25 pfsense LCDd: Listening for queries on 127.0.0.1:13666
Feb 10 18:31:27 pfsense LCDd: Connect from host 127.0.0.1:49169 on socket 6
So I'm now running these screens:
-Time
-Uptime
-Disk
-Load
-States
-Mbuf
-Interface Traffic (WAN)
and these settings:
-ComPort=/dev/lpt0
-Display Size=2x20 display
-Driver=Firebox SDEC
-Refresh=5 sec
-All other settings are "default"
Also, the load average is staying higher, between 0.60 and 1.4, whereas prior to this update it stayed below 0.80 (even when updating RRD). Again, I'm not sure it's related, but I don't think it's anything to worry about.
Will continue to monitor…