Stopped at m_copydata+0x38: movl 0xc(%esi),%eax

_igor_

I'm still getting these crashes very frequently. Is there any news on this, or anything I can do to help out? I can still provide backtraces (bt) for lots of these crashes.
None of the recent updates have solved this problem.
What I can tell is that the crashes occur when a lot of data is copied. It starts with a good connection when the box is freshly booted, then traffic slows down, and finally the box crashes. I can reproduce it with big downloads/uploads. It depends on the initial speed: the crashes come earlier when the initial speed is higher.

eri--

The best way would be to install a developer kernel and get a core dump.

_igor_

How do I do that?

I've been looking around for that, but I'm completely lost and need help. Do you mean "build from scratch"? Compile it?

cmb

When you get a panic, you should get a debug prompt. Run bt at that prompt and paste the output here (or take a picture of the screen(s) if you don't have a serial console).

_igor_

Oh, I posted a file with lots of backtraces here: http://forum.pfsense.org/index.php/topic,23119.msg121124.html#msg121124

Oh, shame on me. On a fresh install I can choose the kernel. Is it possible to install the dev kernel manually, without a reinstall?

jimp

All of the kernels are in /kernels/ on a full install, so IIRC you can just do:

~~cp /kernels/kernel_Dev.gz /boot/kernel/kernel.gz~~

And then reboot.

EDIT: Don't do that. See here: http://doc.pfsense.org/index.php/Switching_Kernels

_igor_

Ouch! This did not work (cp /kernels/kernel_Dev.gz /boot/kernel/kernel.gz). After that the system didn't boot.
So I untarred kernel_Dev.gz to /boot, which worked. The system booted again and is now flooding syslog with messages. I assume that is expected.

What happens in case of crash? Is there a file deposited in some place? ermal mentioned a core dump. Will there be a /coredump file afterwards that I can send in?

Thanks very much for your help!

wallabybob

@_igor_:

What happens in case of crash? Is there a file deposited in some place?

Unfortunately, getting a crash dump is not as easy as it could be because the pfSense kits are missing a crucial piece.

In FreeBSD a crash file can be written to the swap file. On reboot, the savecore utility can be invoked to copy the crash file from the swap file (and release the swap file space for use) to the file system, conventionally to /var/crash. But the savecore utility is not in the pfSense kit. It should be possible to use a savecore copied from a FreeBSD 8.0 system.

The pfSense kernels don't have local symbols, so it's not possible to call the static function doadump() to write the dump file. Hence the kernel needs to be tweaked to NOT enter the debugger on a panic. The kernel also needs to be told where to write the dump file.

The following commands show how to find the name of the swap file, set the dump device to the swap file and set the kernel to write the crash file on a panic rather than enter the debugger. Note that the name of your swap file may be different from the name of my swap file.

swapctl -l

Device: 1024-blocks Used:
/dev/ad0s1b 266240 0

dumpon /dev/ad0s1b

sysctl debug.debugger_on_panic=0

debug.debugger_on_panic: 1 -> 0
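The steps above could be wrapped in a small shell sketch like the one below. The function names first_swap_device and print_dump_setup are made up for illustration and are not part of pfSense or FreeBSD. The sketch only parses `swapctl -l` output and prints the two commands, so nothing is changed until you run them yourself.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of pfSense/FreeBSD).

# first_swap_device: read `swapctl -l` output on stdin and print the first
# field that looks like a device path (starts with /dev/).
first_swap_device() {
    awk '$1 ~ /^\/dev\// { print $1; exit }'
}

# print_dump_setup DEVICE: print (do not run) the two commands that point
# the kernel at the dump device and disable the debugger on panic.
print_dump_setup() {
    printf 'dumpon %s\n' "$1"
    printf 'sysctl debug.debugger_on_panic=0\n'
}
```

On the box itself this might be used as `print_dump_setup "$(swapctl -l | first_swap_device)"`, and the printed commands checked before running them.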

My FreeBSD system is currently dead so I can't easily check this. It appears the path to the savecore utility is normally /sbin/savecore so you should copy /sbin/savecore from a FreeBSD 8.0 system to your pfSense box.

After a crash dump is written and your system is rebooted you should give the command

savecore /var/crash /dev/ad0s1b

to save the crash file to the file system. (You should use your actual swap file where I have typed /dev/ad0s1b.) Crash dumps are generally pretty large, so it's usually worthwhile compressing them with gzip or equivalent.
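A minimal sketch of the compression step, assuming savecore names its dump files vmcore.0, vmcore.1 and so on in the target directory (check what actually appears in /var/crash on your box). compress_dumps is a hypothetical helper name.

```shell
#!/bin/sh
# compress_dumps DIR: gzip every vmcore.* file in DIR that is not already
# compressed. Assumes dump files are named vmcore.N; adjust if yours differ.
compress_dumps() {
    for f in "$1"/vmcore.*; do
        case "$f" in
            *.gz) continue ;;   # already compressed, skip
        esac
        if [ -f "$f" ]; then    # skips the literal pattern when nothing matches
            gzip "$f"
        fi
    done
}
```

After a reboot this could follow the savecore command, for example `compress_dumps /var/crash`.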

The pfSense kit doesn't include the kgdb utility for crash dump analysis, but this won't be of concern if you are passing the crash dump off to someone else for analysis.

wallabybob

One other thing about crash dumps. If you submit a crash dump file, it is also useful to submit the accompanying kernel file from /var/crash OR clearly identify the pfSense kit you used OR both. The standard kernel debugging tool (kgdb) gets symbols from the kernel file, not the crash dump.

This experience illustrates why both are needed: Your panic reports state an access violation occurred at the instruction movl 0xc(%esi),%eax at m_copydata+0x38. On the kernel I'm currently running, m_copydata+0x38 is the second byte of a two-byte instruction. Perhaps your kernel and mine were built with different options OR the source code is different. Without knowing what kernel you are using, it can become quite a challenge to match the machine instructions with the source code.

_igor_

OK, I have now had crashes, but there is no dump file anywhere. I do have a bt; it's attached.

In /var/crash I have only one file: minfree
Its content is 2048. Nothing more.

@wallabybob: I don't understand anything of what you are trying to tell me. It's Chinese to me; I'm not a programmer. Sorry for that.

[pfSense Crash.txt](/public/imported_attachments/1/pfSense Crash.txt)

_igor_

Next crash happened! The last output of the box:

Memory modified after free 0xc6007800(2048) val=b00c0de @ 0xc6007800
Memory modified after free 0xc67bd000(2048) val=b00c0de @ 0xc67bd000
panic: m_copydata, length > size of mbuf chain
cpuid = 0
KDB: stack backtrace:

The used snap:
Sun Mar 14 08:41:38 EDT 2010 sullrich@FreeBSD_8.0_pfSense_2.0-snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_Dev.8 i386

wallabybob

@_igor_:

@wallabybob: I don't understand anything of what you are trying to tell me. It's Chinese to me,

It doesn't help that I use roman characters? :)

Sorry, my excuse is that it's not easy to write instructions for someone when you don't know their level of technical expertise in the topic.

Let's try again. I'll assume you want to configure your pfSense box to write crash dump files.

Do you have access to a FreeBSD 8.0 system? Can you copy /sbin/savecore from the FreeBSD 8.0 system to /sbin/savecore on the pfSense system?

If your answer to both questions is "Yes" please copy /sbin/savecore to your pfSense system.

Regardless of your answers to the above two questions please use the command shell on your pfSense box to configure the pfSense kernel to write crash dumps as follows:

Find out the swap file location so we can tell the kernel where to write the crash dump. The shell command "swapctl -l" will display the location of the swap file. Here's an example of using swapctl on my system:

swapctl -l

Device: 1024-blocks Used:
/dev/ad0s1b 266240 0

This tells me that /dev/ad0s1b is the swap file on my system. The swap file may be at a different location on your system. Now tell pfSense where it should write its dump file:

dumpon /dev/ad0s1b

Now issue the following shell command to modify a kernel variable so that it writes a dump file when a panic occurs:

sysctl debug.debugger_on_panic=0

If you successfully give the above three shell commands, your system will write crash dump information to the system swap file when a panic occurs. (The crash dump is not written to the regular file system because a panic generally means something in the kernel is seriously messed up, and the panic code can't be certain that the kernel's file system information is intact. The swap file is a part of the hard drive that is known to be available for use.) The panic won't enter the debugger, so you won't be able to enter debugger commands. The pfSense system should reboot automatically after writing the crash dump information.

Can you make more sense of that?

wallabybob

@_igor_:

Next crash happened! The last output of the box:

Memory modified after free 0xc6007800(2048) val=b00c0de @ 0xc6007800
Memory modified after free 0xc67bd000(2048) val=b00c0de @ 0xc67bd000
panic: m_copydata, length > size of mbuf chain
cpuid = 0
KDB: stack backtrace:

If the pfSense kernel is built with the appropriate options, "heap checking" is enabled. When heap checking is enabled, the system fills a block of heap memory with a known value when the code frees the block because it has finished using it. Then, when code requests a block of heap memory for temporary use (for example, as a receive buffer for a network card), the heap checker verifies that the block is still filled with the value written there when it was deallocated. If not, a message like

Memory modified after free 0xc6007800(2048) val=b00c0de @ 0xc6007800

is output to the console. When I last looked at the FreeBSD heap checking it wrote the value 0xdeadc0de to blocks of memory that had been freed. The message suggests that the upper 16 bits of the referenced location were unexpectedly modified.
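The arithmetic behind that last remark can be checked with a short shell sketch. It XORs a word against the 0xdeadc0de fill pattern mentioned above, so any bit set in the result is a bit that changed. changed_bits is a hypothetical helper name, and I am assuming the log's val=b00c0de is a 32-bit word printed without its leading zero (0x0b00c0de).

```shell
#!/bin/sh
# Fill pattern named in the post above.
JUNK=0xdeadc0de

# changed_bits VALUE: print, as 8 hex digits, the bits in which a 32-bit
# word differs from the fill pattern. All zeros means the word is untouched.
changed_bits() {
    printf '0x%08x\n' $(( $1 ^ $JUNK ))
}
```

Running `changed_bits 0x0b00c0de` prints 0xd5ad0000: the lower 16 bits still match the pattern and only the upper 16 bits differ.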

"Memory modified after free" indicates a serious programming error in that something (might be hardware) is modifying memory that the system doesn't think is owned by the modifying entity. These errors can be hard to track down.

_igor_

The "memory modified after …" messages appear as soon as any traffic passes in or out on WAN.
With no traffic, there are no such messages.

_igor_

Is there any news? Even with yesterday's snapshot I have frequent crashes! There is no change in behaviour.
The connection gets slower and slower until the box crashes, many times a day when there is a lot of traffic.

jimp

Have you ruled out hardware problems?

I haven't heard of anyone else having such crashes.

ttlinna

@_igor_:

Is there any news? Even with yesterday's snapshot I have frequent crashes! There is no change in behaviour.
The connection gets slower and slower until the box crashes, many times a day when there is a lot of traffic.

I second this. I've tried many different snapshots and many different types of hardware (Alix 2D3, Soekris 5501, several different PCs) and the result is always the same.

The system keeps running for about a day (low traffic) and then crashes. I've not been able to grab error logs, but as far as I've seen, they are quite similar to the ones posted here on the forum.

This bug is REALLY annoying and prevents the use of newer (since BETA-stage) snapshots in testing environments. In my opinion, finding a solution for this should be one of the highest-priority issues.