Barnyard2 100% CPU
-
Thanks for responding bmeeks.
I also have Snorby, but it runs on a separate machine. The MySQL instance resides on my pfSense box.
I thought there may be an issue with poorly performing SQL, but if that were the case I'd expect to see the mysqld process using high CPU as well. I may try to run a trace against MySQL and see what comes up.
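In case it helps anyone following along, the general query log can be toggled at runtime without restarting MySQL (the file path below is just an example, not my actual setup):

```sql
-- Enable the general query log on the fly (a dynamic variable since
-- MySQL 5.1; requires the SUPER privilege; the log path is an example)
SET GLOBAL general_log_file = '/var/log/mysql/barnyard_trace.log';
SET GLOBAL general_log = 'ON';
-- ...reproduce the problem, then turn it back off:
SET GLOBAL general_log = 'OFF';
```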
-
Well I've been running a trace of MySQL for quite a while now, and here's what I've found.
When I first started up Barnyard, it ran about 20 SQL statements. Each one of them took less than 500 ms, except for the last one, which took two seconds. After these statements completed, no other statements were run even though Barnyard continued to churn at 100%.
I think we can rule out a problem with bad SQL. Here's what was executed (redacted):
sa@localhost on snorby
set autocommit=0
SELECT vseq FROM schema
SELECT sid FROM sensor WHERE hostname = 'router.example.com:re0' AND interface = 're0' AND detail = 1 AND encoding = 0 AND filter IS NULL
SELECT last_cid FROM sensor WHERE sid = 1
SELECT MAX(cid) FROM data WHERE sid='1'
SELECT MAX(cid) FROM event WHERE sid='1'
SELECT MAX(cid) FROM icmphdr WHERE sid='1'
SELECT MAX(cid) FROM iphdr WHERE sid='1'
SELECT MAX(cid) FROM opt WHERE sid='1'
SELECT MAX(cid) FROM tcphdr WHERE sid='1'
SELECT MAX(cid) FROM udphdr WHERE sid='1'
BEGIN
UPDATE sensor SET last_cid = 1290365 WHERE sid = 1
COMMIT
SELECT last_cid FROM sensor WHERE sid = 1
SELECT sig_class_id, sig_class_name FROM sig_class ORDER BY sig_class_id ASC
BEGIN
COMMIT
SELECT sig_id, sig_sid, sig_gid,sig_rev, sig_class_id, sig_priority, sig_name FROM signature
-
Thanks for the SQL profiling. As I mentioned, I've found my CPU whirling away at 100% several times due to Barnyard2. That's why I finally just disabled Barnyard as an interim solution.
In my case it would usually be consuming only one core at 100%, but that was enough to run my on-chip temps well up towards 58-60 degrees Celsius. I don't know if Barnyard was getting hung up on reading something in the unified2 log files, or if it was having issues running SQL connections. Based on your profiling results, I'm starting to think it may be getting tripped up trying to read the unified log files on the firewall.
Bill
-
"In my case it would usually be consuming only one core at 100%, but that was enough to run my on-chip temps well up towards 58-60 degrees Celsius"
This is exactly my problem. One core at 100% with high CPU temps.
Like you I think I'm going to have to resort to stepping through the Barnyard2 code. I don't have high hopes but I may get lucky.
-
Can I ask what Barnyard2 is? I see it in my Snort install, but it seems to require a connection to a server of some sort before I can enable it. Can anyone explain it in layman's terms for me?
-
Sure. Barnyard2 takes your Snort logs and puts them into a database (usually MySQL). This gives you better control when querying the data, and it's also needed if you use a front-end for your logs such as Snorby (https://github.com/Snorby/snorby/wiki/What-and-Why-of-Snorby).
-
To expand on what @melvinlusk said: Snort can log using a special binary log file format called unified2. The binary format allows Snort to very quickly write a log message (such as an alert) and move on to processing the next packet. Barnyard2 is an application that reads the data from the binary alert log and writes it to a database, typically on a separate database server. Snorby is a very popular destination package consisting of a MySQL database and a web-based front-end for querying and displaying the logged data.
Barnyard2 can also pull the unified2 log data and send it to a few other types of downstream consumers on remote servers. Examples are remote syslog servers or a BRO IDS instance.
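For anyone curious what the wiring looks like, a bare-bones barnyard2.conf tying a unified2 input to a MySQL output is roughly the following. Treat it as a sketch: every path, hostname, and credential here is a made-up example, not taken from anyone's actual config in this thread.

```
# Minimal barnyard2.conf sketch (example values only)
config reference_file:      /usr/local/etc/snort/reference.config
config classification_file: /usr/local/etc/snort/classification.config
config gen_file:            /usr/local/etc/snort/gen-msg.map
config sid_file:            /usr/local/etc/snort/sid-msg.map
config hostname:  sensor1
config interface: em0
input unified2
output database: log, mysql, user=snorby password=example dbname=snorby host=192.0.2.10
```

On pfSense the package generates this file for you from the GUI settings, so you normally never edit it by hand.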
Bill
-
bmeeks, I forgot to ask… if you've disabled Barnyard2 on the pfSense box, are you using some other method to populate the Snorby database, or did you just give up on it altogether?
-
bmeeks, I forgot to ask… if you've disabled Barnyard2 on the pfSense box, are you using some other method to populate the Snorby database, or did you just give up on it altogether?
For the moment I've given up, but hopefully not forever. Since it's a home network, not that big of a deal. No state secrets to protect or anything… ;).
I do intend to dive into Barnyard2 at some point in the future. Probably not until either very late this year or early next year. Have another commitment at the moment taking up the majority of my time.
Bill
-
bmeeks, I keep seeing your name in the Snort changelog. Are you the same bmeeks?
-
bmeeks, I keep seeing your name in the Snort changelog. Are you the same bmeeks?
Yes, I have been the Snort and Suricata package maintainer for about the last two years.
Bill
-
I have been experiencing the same thing with the 100% CPU usage and have noticed the following regarding Suricata/barnyard2 package:
I have suricata running on 'em3', and when I look at the output of ps I see:
root 69996 0.0 21.6 1335832 897400 - Ss 7:15PM 4:09.46 /usr/local/bin/suricata -i em3_vlan99 -D -c /usr/pbi/suricata-amd64/etc/suricata/suricata_41174_em3_vlan99/suricata.yaml --pidfile /var/run/suricata_em3_vlan9941174.pid
Along with
root 74929 100.0 27.3 2015868 1133720 - Rs 7:15PM 100:39.99 /usr/local/bin/barnyard2 -r 41174 -f unified2.alert --pid-path /var/run --nolock-pidfile -c /usr/pbi/suricata-amd64/etc/suricata/suricata_41174_em3_vlan99/barnyard2.conf -d /var/log/suricata/suricata_em3
In the 'barnyard2' line, it looks like the **-d** flag is specifying an incorrect directory. **/var/log/suricata/suricata_em3** does _not_ exist, but **/var/log/suricata/suricata_em3_vlan9941174/** _does_ exist. I tried killing barnyard and re-running it manually from the command line with the 'proper' directory; unfortunately that does not appear to fix the problem. I am curious whether others are running into the same issue (barnyard2 being pointed at an incorrect directory). Additionally, I am seeing the same SQL profile/log:
2015-12-04T21:05:14.523347Z 10 Query set autocommit=0
2015-12-04T21:06:16.004574Z 10 Query SELECT vseq FROM schema
2015-12-04T21:06:16.006574Z 10 Query SELECT sid FROM sensor WHERE hostname = 'REMOVED' AND interface = 'em3_vlan99' AND detail = 1 AND encoding = 0 AND filter IS NULL
2015-12-04T21:06:16.011023Z 10 Query SELECT last_cid FROM sensor WHERE sid = 12
2015-12-04T21:06:16.012564Z 10 Query SELECT MAX(cid) FROM data WHERE sid='12'
2015-12-04T21:06:16.014615Z 10 Query SELECT MAX(cid) FROM event WHERE sid='12'
2015-12-04T21:06:16.016576Z 10 Query SELECT MAX(cid) FROM icmphdr WHERE sid='12'
2015-12-04T21:06:16.017935Z 10 Query SELECT MAX(cid) FROM iphdr WHERE sid='12'
2015-12-04T21:06:16.019342Z 10 Query SELECT MAX(cid) FROM opt WHERE sid='12'
2015-12-04T21:06:16.020543Z 10 Query SELECT MAX(cid) FROM tcphdr WHERE sid='12'
2015-12-04T21:06:16.021598Z 10 Query SELECT MAX(cid) FROM udphdr WHERE sid='12'
2015-12-04T21:06:16.022552Z 10 Query BEGIN
2015-12-04T21:06:16.023476Z 10 Query UPDATE sensor SET last_cid = 256742 WHERE sid = 12
2015-12-04T21:06:16.025027Z 10 Query COMMIT
2015-12-04T21:06:16.034645Z 10 Query SELECT last_cid FROM sensor WHERE sid = 12
2015-12-04T21:06:16.037875Z 10 Query SELECT sig_class_id, sig_class_name FROM sig_class ORDER BY sig_class_id ASC
2015-12-04T21:06:16.041287Z 10 Query BEGIN
2015-12-04T21:06:16.042224Z 10 Query COMMIT
2015-12-04T21:06:16.043407Z 10 Query SELECT sig_id, sig_sid, sig_gid,sig_rev, sig_class_id, sig_priority, sig_name FROM signature
When I run a count(*) from signature I am getting an impressive amount of data back:
mysql> select count(*) from signature;
+----------+
| count(*) |
+----------+
|   693521 |
+----------+
1 row in set (0.11 sec)
I am not sure if this is unique to my environment, but I suspect barnyard is not capable of handling that number of signatures… going to attempt to do some more testing. Just my $0.02
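If the row count looks inflated, one thing worth checking (just a guess on my part) is whether the table holds duplicate copies of the same signature. The standard Snort/Barnyard2 schema identifies a signature by its (sig_sid, sig_gid, sig_rev) tuple, so something like this should surface duplicates:

```sql
-- Count duplicate (sig_sid, sig_gid, sig_rev) tuples in the signature table
SELECT sig_sid, sig_gid, sig_rev, COUNT(*) AS copies
FROM signature
GROUP BY sig_sid, sig_gid, sig_rev
HAVING COUNT(*) > 1
ORDER BY copies DESC
LIMIT 20;
```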
-
I have been experiencing the same thing with the 100% CPU usage … it looks like the **-d** flag is specifying an incorrect directory. **/var/log/suricata/suricata_em3** does _not_ exist, but **/var/log/suricata/suricata_em3_vlan9941174/** _does_ exist. … I suspect barnyard is not capable of handling that amount of signatures…
The directory name is getting truncated when displayed by the "ps" command, but the entire path is actually all there; it's just a display truncation issue with the command. Barnyard2 does seem to have some database issues, IMHO. It has been problematic with the Signature Reference table for a while now.
Bill
-
The directory name is getting truncated when displayed by the "ps" command, but the entire path is actually all there.
Best to use ps -wax to see the wide format.
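By default ps clips long command lines at the terminal width; doubling the -w flag removes the limit entirely, which is handy for the long barnyard2/suricata argument lists discussed above (flag behavior here is as I understand FreeBSD/pfSense ps; most other systems accept the same options):

```shell
# Default output may clip long command lines at the terminal width:
ps -ax -o pid,args | head -n 5
# Doubling -w (-ww) removes the width limit so the full
# argument list of every process is visible:
ps -ww -ax -o pid,args | head -n 5
```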
-
When I run this SQL, it takes less than two seconds to return data:
SELECT sig_id, sig_sid, sig_gid,sig_rev, sig_class_id, sig_priority, sig_name FROM signature
The row count for my signature table is 461,236. I'm a DBA during my day job, so I'm not convinced that this is the problem….unless it was having trouble pulling unique rows from the signature table and having to do index scans, but I'd expect to see this in the trace.
I can probably test this, what populates the signature table? Are the number of rows determined by the rulesets?
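In the meantime, I can at least check how MySQL executes that bulk signature pull. This is standard MySQL, nothing Barnyard2-specific; a table scan here would be expected since the query has no WHERE clause:

```sql
-- Show the execution plan for the full signature pull that Barnyard2 issues
EXPLAIN SELECT sig_id, sig_sid, sig_gid, sig_rev, sig_class_id,
               sig_priority, sig_name
FROM signature;
```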
-
I don't know the Barnyard2 code very well, but I do recall that it seemed to get tripped up if the order of references for a given SID changes. For example, in a rules update it may be that a new reference was added for a given SID. If the reference was added to the end of the old list, then things are OK. But if the new reference was inserted into the old list and thus the original order of references in the rule changed, that seems to trip up Barnyard2 when it is calculating a primary key for the SID Reference table. At least that's what I remember off the top of my head without going back and diving into the code again.
The scenario above would kick off the "duplicate primary key" error and cause a Barnyard2 shutdown. Something in the same bug could be at the root of the 100% CPU utilization churning as well.
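For anyone who wants to inspect what their database currently holds, a query along these lines shows the stored reference ordering for one signature, so you can compare it before and after a rules update to spot reordering. Table and column names follow the standard Snort/Barnyard2 schema; the sig_sid value is just an example, so treat this as a sketch:

```sql
-- Inspect the stored reference ordering for one signature (2019401 is an
-- example SID); a changed ref_seq ordering after a rules update is the
-- scenario that can trip up Barnyard2's primary-key calculation
SELECT s.sig_sid, r.ref_seq, rs.ref_system_name, rf.ref_tag
FROM sig_reference r
JOIN signature s        ON s.sig_id = r.sig_id
JOIN reference rf       ON rf.ref_id = r.ref_id
JOIN reference_system rs ON rs.ref_system_id = rf.ref_system_id
WHERE s.sig_sid = 2019401
ORDER BY r.ref_seq;
```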
Bill
-
Tonight I'm going to kick off a more thorough trace to see if it's having problems retrieving the rows you're talking about. I'll keep everybody informed.
-
Today I let Barnyard run for a solid 6 hours, and it churned at 100% the entire time. I ran a MySQL trace for the entire duration, and this is all that came through. The longest duration query was just under 4 seconds.
/usr/local/libexec/mysqld, Version: 5.1.73-log (FreeBSD port: mysql51-server-5.1.73_2). started with:
Tcp port: 0 Unix socket: (null)
Time Id Command Argument
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000055 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373088;
set autocommit=0;
Time: 151217 11:26:18
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000032 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373178;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.001034 Lock_time: 0.000955 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
SELECT vseq FROM schema;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000014 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000922 Lock_time: 0.000717 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
SELECT sid FROM sensor WHERE hostname = 'router.Example.com:re0' AND interface = 're0' AND detail = 1 AND encoding = 0 AND filter IS NULL;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000144 Lock_time: 0.000052 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
SELECT last_cid FROM sensor WHERE sid = 1;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000013 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373178;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.308946 Lock_time: 0.308691 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373178;
SELECT MAX(cid) FROM data WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373178;
administrator command: Ping;
Time: 151217 11:26:19
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.813862 Lock_time: 0.813663 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
SELECT MAX(cid) FROM event WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.001157 Lock_time: 0.001100 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
SELECT MAX(cid) FROM icmphdr WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.398204 Lock_time: 0.398012 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
SELECT MAX(cid) FROM iphdr WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373179;
administrator command: Ping;
Time: 151217 11:26:20
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.242931 Lock_time: 0.242716 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
SELECT MAX(cid) FROM opt WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000010 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.363021 Lock_time: 0.362832 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
SELECT MAX(cid) FROM tcphdr WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.049907 Lock_time: 0.049723 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
SELECT MAX(cid) FROM udphdr WHERE sid='1';
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000011 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000024 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
BEGIN;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000010 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000170 Lock_time: 0.000036 Rows_sent: 0 Rows_examined: 1
SET timestamp=1450373180;
UPDATE sensor SET last_cid = 1310188 WHERE sid = 1;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000008 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 1
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000611 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
COMMIT;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000008 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000098 Lock_time: 0.000022 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373180;
SELECT last_cid FROM sensor WHERE sid = 1;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000486 Lock_time: 0.000382 Rows_sent: 38 Rows_examined: 38
SET timestamp=1450373180;
SELECT sig_class_id, sig_class_name FROM sig_class ORDER BY sig_class_id ASC;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 38 Rows_examined: 38
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000008 Lock_time: 0.000000 Rows_sent: 38 Rows_examined: 38
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000016 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
BEGIN;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000013 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
COMMIT;
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 0.000009 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
SET timestamp=1450373180;
administrator command: Ping;
Time: 151217 11:26:24
User@Host: sa[sa] @ localhost [127.0.0.1]
Query_time: 3.906478 Lock_time: 0.000023 Rows_sent: 466133 Rows_examined: 466133
SET timestamp=1450373184;
SELECT sig_id, sig_sid, sig_gid,sig_rev, sig_class_id, sig_priority, sig_name FROM signature;
-
In case anyone is interested, Suricata is available with pfSense release 2.2.6. Barnyard2 doesn't have the 100% CPU problem when used with Suricata instead of Snort.
-
I still have this error with Suricata and the new version of pfSense.