Snap 27 Oct 10:31:57 CDT - broken IPSEC status

gerdesj

I was hoping to give some feedback on this https://redmine.pfsense.org/issues/5149 (memory leak(s) in strongswan) and so applied this:

2.2.5-DEVELOPMENT (amd64)
built on Tue Oct 27 10:31:57 CDT 2015

Unfortunately Status -> IPSEC only shows some (not all) details on phase 1 and no P2 details. The SAD and SPD tabs seem OK and tunnels are running. The dashboard widget is completely broken:


Warning: Cannot use a scalar value as an array in /usr/local/www/widgets/widgets/ipsec.widget.php on line 60 
Warning: Cannot use a scalar value as an array in /usr/local/www/widgets/widgets/ipsec.widget.php on line 61 
Warning: Cannot use a scalar value as an array in /usr/local/www/widgets/widgets/ipsec.widget.php on line 62

The tunnels that really are down do seem to display correctly, though.
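For anyone wondering what that warning means: PHP emits "Cannot use a scalar value as an array" when code writes to an index of a variable that holds a scalar rather than an array. One way that can happen is a status parser handing back a non-array value after it fails to read its input. Below is a minimal Python sketch of the defensive pattern, in which the parser always returns a list. The `<status>`/`<tunnel>` element names and the `parse_tunnels` helper are hypothetical and are not taken from the pfSense widget code.

```python
import xml.etree.ElementTree as ET


def parse_tunnels(status_xml):
    """Return one dict per <tunnel> element, or [] if the XML is unusable.

    The element and attribute names are hypothetical, for illustration only.
    """
    try:
        root = ET.fromstring(status_xml)
    except ET.ParseError:
        # Invalid or empty XML: return an empty list so callers can still
        # loop over or index the result without special-casing failure.
        return []
    return [dict(tunnel.attrib) for tunnel in root.iter("tunnel")]
```

With this shape, a display layer fed broken status data would show "no tunnels" instead of tripping over a scalar.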

cmb

The status XML is invalid after the change to vstr. We're trying with builtin instead for other reasons, will re-test afterwards.

That only impacts the status output display; functionally it should be fine if you want to keep running it. 'ipsec statusall' will show the proper status if you want to check it manually in the meantime.
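If you would rather script that manual check than read the output by eye, here is a rough Python sketch that counts established phase 1 and installed phase 2 SAs per connection. It assumes the usual strongSwan 5.x `statusall` line layout shown in the comments; the sample text, the connection name `con1000`, and the helper names are illustrative assumptions, and the exact format can differ between versions.

```python
import re
import subprocess

# Assumed line layout (typical strongSwan 5.x output, may vary by version):
#   con1000[3]: ESTABLISHED 2 hours ago, 192.0.2.1[...]...198.51.100.7[...]
#   con1000{5}:  INSTALLED, TUNNEL, ESP SPIs: c1a2b3c4_i c5d6e7f8_o
IKE_SA = re.compile(r"^\s*(\S+)\[\d+\]:\s+ESTABLISHED\b", re.MULTILINE)
CHILD_SA = re.compile(r"^\s*(\S+)\{\d+\}:\s+INSTALLED\b", re.MULTILINE)

# Illustrative sample only, not captured from a real box.
SAMPLE = """\
Security Associations (1 up, 0 connecting):
     con1000[3]: ESTABLISHED 2 hours ago, 192.0.2.1[192.0.2.1]...198.51.100.7[198.51.100.7]
     con1000{5}:  INSTALLED, TUNNEL, ESP SPIs: c1a2b3c4_i c5d6e7f8_o
     con1000{6}:  INSTALLED, TUNNEL, ESP SPIs: c9a8b7c6_i c5d4e3f2_o
"""


def summarize(status_text):
    """Map each connection name to its count of P1 and P2 SAs."""
    summary = {}
    for name in IKE_SA.findall(status_text):
        summary.setdefault(name, {"p1": 0, "p2": 0})["p1"] += 1
    for name in CHILD_SA.findall(status_text):
        summary.setdefault(name, {"p1": 0, "p2": 0})["p2"] += 1
    return summary


def read_statusall():
    """Run `ipsec statusall` and return its text output (usually needs root)."""
    result = subprocess.run(
        ["ipsec", "statusall"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a live box you would call `summarize(read_statusall())`; `summarize(SAMPLE)` exercises the parsing without touching the daemon.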

gerdesj

@cmb:

The status XML is invalid after the change to vstr. We're trying with builtin instead for other reasons, will re-test afterwards.

That only impacts the status output display; functionally it should be fine if you want to keep running it. 'ipsec statusall' will show the proper status if you want to check it manually in the meantime.

I thought it was something like that. I'm keen to see if the vstr change fixes the ~six days uptime (51 IPSEC P1s) I get at the moment so am sticking with it. Icinga does the real monitoring around here so a screwy report in pfSense is no real problem. Just checked your list of open issues - there's only one left! Attached is a current memory RRD graph, which is already looking a lot happier.

Great work on what is a really tricky bug. I have gone through the upstream one that was logged and the comment stream reflects very well on the pfSense team's diplomacy skills. Recalcitrant upstream devs might like to reflect on the fact that pfSense is used rather a lot and in some very large deployments - your userbase stresses the networking components that many others might merely tickle …

Thanks.

[Attachment: pf1-memory.png]

jwt

It would have fixed it, but the hacks that <someone> put in the SMP plugin don't work well with the implementation via vstr.

gerdesj

@jwt:

It would have fixed it, but the hacks that <someone> put in the SMP plugin don't work well with the implementation via vstr.

I wouldn't dream of running through the commits to find out who <someone> might be. I'll take stable functionality over pretty reports any day. The classic engineering approach seems to be at work here: bodge in a solution first, paper over the cracks later 8) Sorry, I mean find the root cause and develop a solution via a series of progressively better iterations.

djamp42

@cmb:

The status XML is invalid after the change to vstr. We're trying with builtin instead for other reasons, will re-test afterwards.

That only impacts the status output display; functionally it should be fine if you want to keep running it. 'ipsec statusall' will show the proper status if you want to check it manually in the meantime.

If someone can let me know when you're pretty much done with the IPsec stuff, I can test this on my box.

cmb

If you have a situation where the memory leak is a major problem, latest 2.2.5 is significantly better there and only breaks the GUI status pieces. Otherwise, yeah, we'll follow up here when the status problem's fixed too; it's being worked on.

jwt

@Jon:

@jwt:

It would have fixed it, but the hacks that <someone> put in the SMP plugin don't work well with the implementation via vstr.

I wouldn't dream of running through the commits to find out who <someone> might be. I'll take stable functionality over pretty reports any day. The classic engineering approach seems to be at work here: bodge in a solution first, paper over the cracks later 8) Sorry, I mean find the root cause and develop a solution via a series of progressively better iterations.

Given that we completely eliminated the former bodge, you're left to choose from only one of those two eventualities.

gerdesj

@jwt:

@Jon:

@jwt:

It would have fixed it, but the hacks that <someone> put in the SMP plugin don't work well with the implementation via vstr.

I wouldn't dream of running through the commits to find out who <someone> might be. I'll take stable functionality over pretty reports any day. The classic engineering approach seems to be at work here: bodge in a solution first, paper over the cracks later 8) Sorry, I mean find the root cause and develop a solution via a series of progressively better iterations.

Given that we completely eliminated the former bodge, you're left to choose from only one of those two eventualities.

… and for that I am extremely grateful: our office pfSense box's memory RRD graph has flatlined nicely rather than suffering from severe arrhythmia and regular crashes.

eri--

@jwt:

@Jon:

@jwt:

It would have fixed it, but the hacks that <someone> put in the SMP plugin don't work well with the implementation via vstr.

I wouldn't dream of running through the commits to find out who <someone> might be. I'll take stable functionality over pretty reports any day. The classic engineering approach seems to be at work here: bodge in a solution first, paper over the cracks later 8) Sorry, I mean find the root cause and develop a solution via a series of progressively better iterations.

Given that we completely eliminated the former bodge, you're left to choose from only one of those two eventualities.

Hahaha, this thread was brought to my attention!

1. You cannot find someone, since the logs have been erased.
2. Blaming someone/anyone out of ignorance is the next level of ignorance.
3. Making someone/anyone guilty because a bug cannot be fixed is total incompetence and shows immaturity.
4. I feel bad that cmb allows someone to be talked about like this and says not a word; this is completely unprofessional.

jwt

Ermal,

I had not named you, but since you outed yourself, I will respond.

I wasn't blaming you for the memory leak; that's due to an interaction between how strongswan uses its "built-in" printf extensions and the implementation of same in FreeBSD's libc.

Moving the printf extensions from "builtin" (libc) to vstr stopped (nearly all of) the leak, but, due to the way the strongswan plugin system is architected, the SMP interface is not compatible with the vstr library.

Several times I tried to get you to replace the SMP interface with VICI, and each time you abjectly refused. This despite the demonstrated need, because SMP had been deprecated in favor of VICI, and the technical debt incurred in maintaining a set of custom patches to a port.

When we finally undertook the work to replace the SMP plugin, it took less than two days to reach a full solution. In the process, we reduced the technical debt of the project, because we now need fewer custom patches to the Strongswan port in FreeBSD.

The "logs" have not been erased as you accuse. We put the formerly discrete patches (and patches on patches) on a branch in a copy of the FreeBSD 'src' and 'ports' trees.

In closing, a lot of the work you did here was good. It was really your poor attitude, tendency to 'go missing' for extended periods and repeated instances of involving yourself in situations that presented a clear conflict of interest that catalyzed your dismissal.