Please help to debug a network connection issue
-
@ady2 said in Please help to debug a network connection issue:
2024-09-30T11:00:12.377806+00:00 t30 kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
That message means that the 'electrical' connection was lost. The NIC could no longer 'see' the NIC on the other side of the cable.
When the link goes down, the IP is lost, or more accurately: the 'DHCP lease' is lost (== gateway, network, DNS, etc.). As if you had ripped out the cable.
Again :
NIC Link is Down
Was this a mechanical or an electrical reason?
Or was it the system itself that took the NIC down because no traffic was going in or out, because, for example, the switch on the other side has issues?
A solution would be: change the NIC in the "Ubuntu-server", get another cable, and change the switch or switch port on the other side, and you've excluded all hardware issues.
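To see how often and for how long the link drops, the kernel log can be scanned for these messages. This is only a minimal sketch: the regular expression is modelled on the one log line quoted above, and the "Link is Up" sample line is made up for illustration, so adjust both to what your own log actually contains.

```python
import re
from datetime import datetime

# Pattern modelled on the e1000e line quoted in this thread.
LINK_RE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) kernel: e1000e \S+ (?P<iface>\S+): "
    r"NIC Link is (?P<state>Up|Down)"
)


def link_events(lines):
    """Return (timestamp, interface, state) for each link-state line."""
    events = []
    for line in lines:
        match = LINK_RE.match(line)
        if match:
            ts = datetime.fromisoformat(match.group("ts"))
            events.append((ts, match.group("iface"), match.group("state")))
    return events


sample = [
    # Line taken from the thread.
    "2024-09-30T11:00:12.377806+00:00 t30 kernel: e1000e 0000:00:1f.6 "
    "enp0s31f6: NIC Link is Down",
    # Hypothetical follow-up line, invented for this example only.
    "2024-09-30T11:00:15.500000+00:00 t30 kernel: e1000e 0000:00:1f.6 "
    "enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx",
]

events = link_events(sample)
for ts, iface, state in events:
    print(ts.isoformat(), iface, state)
```

Comparing the Down and Up timestamps with the moments the other side was rebooted or unplugged tells you whether a drop was expected or not.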
-
Yup it shows it actually lost link. If that wasn't you it's a problem.
Can you assign a different NIC to it in pfSense? At the server end?
Or, yes, try putting a switch in between them as a test.
-
@Gertjan said in Please help to debug a network connection issue:
@ady2 said in Please help to debug a network connection issue:
2024-09-30T11:00:12.377806+00:00 t30 kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
That message means that the 'electrical' connection was lost. The NIC could no longer 'see' the NIC on the other side of the cable.
When the link goes down, the IP is lost, or more accurately: the 'DHCP lease' is lost (== gateway, network, DNS, etc.). As if you had ripped out the cable.
Again :
NIC Link is Down
Was this a mechanical or an electrical reason?
Or was it the system itself that took the NIC down because no traffic was going in or out, because, for example, the switch on the other side has issues?
A solution would be: change the NIC in the "Ubuntu-server", get another cable, and change the switch or switch port on the other side, and you've excluded all hardware issues.
@stephenw10
The NIC on the Ubuntu-server machine is on the motherboard and is connected with an Ethernet cable directly to a NIC port on the pfSense computer.
Regarding mechanical or electrical, I don't know; the connection is lost without any intervention and the cable is still connected. I have checked, and the Ubuntu-server doesn't seem to have any sleep or other power-saving option enabled by default. I don't know whether a lack of traffic should cause the NIC link to go down, but I think it should not for a server that is waiting for clients to connect. Maybe I'm wrong.
I also forgot to mention that I have changed to a different port on the pfSense computer (it has a NIC card with 4 ports), but it didn't help. I changed the cable once and it happened again after ~2 days, so I changed it again and will see if that helps.
-
@stephenw10 said in Please help to debug a network connection issue:
Yup it shows it actually lost link. If that wasn't you it's a problem.
Can you assign a different NIC to it in pfSense? At the server end?
Or, yes, try putting a switch in between them as a test.
Yes, I tried this (forgot to mention it) by assigning a different port on the pfSense NIC card, as it has 4 ports, but it didn't help.
Why do you think adding a switch could help, as that means adding an additional piece of hardware into the chain? I will look into adding a switch between the Ubuntu-server computer and pfSense.
-
What I don't understand is why I could ping from pfSense (source address automatically selected) to the ubuntu-server, and from the ubuntu-server computer to the pfSense IP, when the
NIC Link is Down
If the DHCP IP address lease is lost, doesn't that also mean the ping to that address shouldn't work?
-
Plugging the PC directly into pfSense will work, but be aware that if the PC restarts, shuts down, or turns on, pfSense sees that as an interface going down/up and restarts packages.
Which interface is enp0s31f6?
-
@SteveITS said in Please help to debug a network connection issue:
Plugging the PC directly into pfSense will work, but be aware that if the PC restarts, shuts down, or turns on, pfSense sees that as an interface going down/up and restarts packages.
Which interface is enp0s31f6?
@SteveITS
The enp0s31f6 interface, as well as the logs I posted, are from the ubuntu-server computer.
I don't quite understand what the negative consequences are of having a computer connected directly to pfSense instead of going through a switch. In my particular case the ubuntu-server is the only computer connected to that interface, as a form of compartmentalization.
-
When directly connected, when pfSense goes down (reboot), you see this on your Ubuntu server:
@ady2 said in Please help to debug a network connection issue:
2024-09-30T11:00:12.377806+00:00 t30 kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
and in that case there is no issue: when you shut down the network (pfSense), the network (== the Ubuntu interface) goes down with it.
And the other way around: when Ubuntu shuts down, the connected LAN interface on pfSense will be taken down.
-
@Gertjan said in Please help to debug a network connection issue:
When directly connected, when pfSense goes down (reboot), you see this on your Ubuntu server:
@ady2 said in Please help to debug a network connection issue:
2024-09-30T11:00:12.377806+00:00 t30 kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Down
and in that case there is no issue: when you shut down the network (pfSense), the network (== the Ubuntu interface) goes down with it.
And the other way around: when Ubuntu shuts down, the connected LAN interface on pfSense will be taken down.
@Gertjan
Good point.
The time of the
NIC Link is Down
message matches when my pfSense restarted, so that is expected.
The problem looks like it is in the ubuntu-server: the NIC is Up a few seconds after going Down (much faster than my pfSense reboot time), and the network never comes back after that.
I could create a small bash script to restart the network on the ubuntu-server as a workaround.
But who can explain why ping works from pfSense to the ubuntu-server and vice versa if the network is not working? Is that a glitch? I have always trusted ping as a way to test a network connection, but this is the first time that ping works and the network does not.
-
It's normal that the pfSense NIC comes up pretty fast, as it activates as soon as the driver is loaded and has initialized the hardware.
The thing that will take some time, and you can see this very clearly when you follow the pfSense boot process, is that the DHCP server process on any given LAN-type interface is activated somewhat later.
Meanwhile, as soon as the interface comes up on the Ubuntu side, it will kick off a DHCP client process, which will start requesting a DHCP lease.
If there is no answer yet, it will add a small delay and request again; if there is still no answer, it will double the delay and request again.
And so on.
This means that even if it takes 30 seconds or a minute, or even more, the DHCP client will get a lease.
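The retry pattern described above can be sketched in a few lines. This is a simplified simulation, not how any particular DHCP client is implemented: the 4 second starting delay and 64 second cap follow the values suggested in RFC 2131, the random jitter that real clients add is left out, and `server_ready_at` is a made-up parameter standing for the moment the DHCP server on pfSense starts answering.

```python
def backoff_delays(initial=4.0, maximum=64.0, attempts=6):
    """Yield the wait before each retry, doubling until the cap is reached."""
    delay = initial
    for _ in range(attempts):
        yield delay
        delay = min(delay * 2, maximum)


def wait_for_lease(server_ready_at, **kwargs):
    """Simulate a client that retries until the server answers.

    server_ready_at: hypothetical number of seconds after link-up at which
    the DHCP server begins to answer requests.
    Returns the elapsed time of the first answered request, or None if the
    client ran out of attempts.
    """
    elapsed = 0.0
    if elapsed >= server_ready_at:
        return elapsed  # the very first request is answered
    for delay in backoff_delays(**kwargs):
        elapsed += delay
        if elapsed >= server_ready_at:
            return elapsed
    return None


print(list(backoff_delays()))
print(wait_for_lease(45.0))
```

With these assumed numbers, a server that only starts answering 45 seconds after the link came up is reached by the retry sent at the 60 second mark.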
This concept runs on billions of devices ... every day.
@ady2 said in Please help to debug a network connection issue:
But who could explain why the ping is working from pfSense to ubuntu-server and vice-versa if the network is not working? Is that a glitch? I always trusted ping as a way to test network connection, but this is the first time when ping is working and network is not.
'ping' needs the IP network to be up, as ARP needs to work.
The devices should have an IP set up on both sides, static or DHCP.
Next time you see this situation, run a global packet capture on your Ubuntu device, and you should see the ICMP packets coming in.
-
If the link is actually down then ping cannot work.
So either it wasn't down and that log is incorrect or the pings you were seeing were misleading, like something else replying perhaps.
Putting a switch in between two devices like that as a test allows one side only to lose link without affecting the other one. Thus if one device has a problem you can find out which one.
If it's a link negotiation issue it may also negate the problem, which is also useful troubleshooting info.
But here it looks like that log was caused by rebooting pfSense?
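One way to cross-check the log against what the kernel reports right now on the Ubuntu side is to read the interface state from sysfs. A minimal sketch, assuming a Linux host where /sys/class/net/<interface>/operstate and /sys/class/net/<interface>/carrier are available; the `sysfs_root` parameter is only there so the function can be pointed at a test directory.

```python
from pathlib import Path


def link_state(iface, sysfs_root="/sys/class/net"):
    """Return (operstate, carrier) for iface as reported by the kernel.

    carrier is True or False, or None when the file cannot be read, which
    happens while the interface is administratively down.
    """
    base = Path(sysfs_root) / iface
    operstate = (base / "operstate").read_text().strip()
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        carrier = None
    return operstate, carrier


try:
    print(link_state("enp0s31f6"))
except FileNotFoundError:
    print("interface enp0s31f6 not found on this machine")
```

If this reports the link as up with carrier present while the pings from the laptop still fail, the physical link is fine and the problem is at the IP level.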
-
@stephenw10 said in Please help to debug a network connection issue:
If the link is actually down then ping cannot work.
So either it wasn't down and that log is incorrect or the pings you were seeing were misleading, like something else replying perhaps.
Putting a switch in between two devices like that as a test allows one side only to lose link without affecting the other one. Thus if one device has a problem you can find out which one.
If it's a link negotiation issue it may also negate the problem, which is also useful troubleshooting info.
But here it looks like that log was caused by rebooting pfSense?
I don't know what happens here. After pfSense restarts (checked today by restarting the pfSense computer), ping and ssh from my laptop (which is on a different interface than the ubuntu-server) to the ubuntu-server computer stop working until I restart the network on the ubuntu-server computer. At the same time, ping from the ubuntu-server to pfSense, and from pfSense (source address automatically selected) to the ubuntu-server, works, but when I select in pfSense the interface my laptop is on as the source, the ping doesn't work.
After restarting the ubuntu-server network, everything works again as expected.
I added a switch between the ubuntu-server and pfSense, and now restarting pfSense no longer affects ping and ssh (after the pfSense reboot finishes).
The issue is solved and I really appreciate all the help I received. I did not know about the difference between a direct connection and using a switch between a client and pfSense. Learned something new.
Thanks
-
That sounds like the server is blocking those pings from outside its subnet.
You can confirm that by running a pcap on the interface connected to the server in pfSense whilst pinging from the laptop.