Where to start with network troubleshooting
Where to start troubleshooting depends on the situation.
For example:
If you can SSH into a server but can’t connect to your database, it is a good idea to start at the application layer.
Whereas if your local machine can’t connect to the internet at all, it is a good idea to start at the physical layer.
1. Physical
2. Data-Link
3. Network
4. Transport
5. Application
We will use the TCP/IP five-layer model to troubleshoot networking issues, and it is best to work through it layer by layer.
This model is used here rather than the seven-layer OSI model because the Session and Presentation layers play a less active role in the functioning of a network than the other OSI layers.
https://www.redhat.com/sysadmin/beginners-guide-network-troubleshooting-linux
1. Physical
IP command
The ip command is a more powerful alternative to the long-standing ifconfig.
The ip command is organized around two layers of the network stack, Layer 2 (data-link layer) and Layer 3 (IP layer), but we can also use it to check whether a network interface is simply disabled.
While ifconfig mostly displays or modifies the interfaces of a system, the ip command is capable of the following:
- Displaying or modifying interface properties.
- Adding or removing ARP cache entries, and creating static ARP entries for a host.
- Displaying the MAC addresses associated with all interfaces.
- Displaying and modifying kernel routing tables.
Displaying all Network Interfaces
This shows all the interfaces, whether enabled or disabled:
> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 3c:15:c2:cc:fb:a6 brd ff:ff:ff:ff:ff:ff
inet 192.168.20.12/24 brd 192.168.20.255 scope global dynamic noprefixroute wlp3s0
valid_lft 75898sec preferred_lft 75898sec
inet6 fe80::c05b:fa6d:8b11:3d29/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:69:84:08:08 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:69ff:fe84:808/64 scope link
valid_lft forever preferred_lft forever
enable/disable network interfaces
You can enable or disable a network interface with “ip link set <network interface>” followed by the “up” or “down” argument.
> sudo ip link set wlp3s0 down
> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
link/ether 3c:15:c2:cc:fb:a6 brd ff:ff:ff:ff:ff:ff
inet 192.168.20.12/24 brd 192.168.20.255 scope global dynamic noprefixroute wlp3s0
valid_lft 75168sec preferred_lft 75168sec
3: docker0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:69:84:08:08 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
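We can also use the commands below to check that the interface is enabled and that its cable is connected. In the output, LOWER_UP means a physical link is detected, while NO-CARRIER means the interface is enabled but has no link (for example, an unplugged cable).
> ip link show : check whether the network interface is just disabled
> sudo ip link set eth0 up : bring the required network interface up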
> ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
link/ether 52:54:00:82:d6:6e brd ff:ff:ff:ff:ff:ff
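The flags inside the angle brackets are what tell a disabled interface apart from an unplugged one. As a minimal sketch (Python is used only for illustration, and `link_status` is a hypothetical helper, not part of iproute2), each interface header line can be classified like this, assuming the usual meaning of the UP, LOWER_UP and NO-CARRIER flags:

```python
import re


def link_status(header_line):
    """Classify one interface header line from `ip link show` or `ip a`.

    Returns a (name, status) tuple. The status labels are illustrative
    names chosen for this sketch, not values printed by the ip command.
    """
    match = re.match(r"\d+:\s+([^:@]+)[^<]*<([^>]*)>", header_line)
    if match is None:
        raise ValueError("not an interface header line")
    name = match.group(1)
    flags = set(match.group(2).split(","))
    if "UP" not in flags:
        return name, "admin-down"   # disabled, e.g. by `ip link set <if> down`
    if "NO-CARRIER" in flags:
        return name, "no-carrier"   # enabled, but no physical link detected
    if "LOWER_UP" in flags:
        return name, "up"           # enabled and physical link present
    return name, "unknown"


if __name__ == "__main__":
    line = "2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN"
    print(link_status(line))
```

For the eth0 line above, the result is `("eth0", "admin-down")`, which points to `ip link set eth0 up` rather than to a cabling problem.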
2. Data-Link
Check for MAC address filtering or local IP address conflicts.
The most relevant Layer 2 protocol for most sysadmins is the Address Resolution Protocol (ARP), which maps Layer 3 IP addresses to Layer 2 Ethernet MAC addresses.
> ip neighbor show : display the ARP (neighbor) table
> ip neighbor show
192.168.122.1 dev eth0 lladdr 52:54:00:11:23:84 REACHABLE
Linux caches each ARP entry for a period of time. If the device behind an IP address is replaced (for example, a new default gateway), its MAC address usually changes, so you may not be able to send traffic to your default gateway until the stale ARP entry times out. For critical systems, this delay is undesirable. Luckily, you can manually delete an ARP entry, which forces a new ARP discovery process:
> ip neighbor show
192.168.122.170 dev eth0 lladdr 52:54:00:04:2c:5d REACHABLE
192.168.122.1 dev eth0 lladdr 52:54:00:11:23:84 REACHABLE
> ip neighbor delete 192.168.122.170 dev eth0
> ip neighbor show
192.168.122.1 dev eth0 lladdr 52:54:00:11:23:84 REACHABLE
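When the neighbor table is long, it can help to turn the output into structured data. The following is a minimal sketch, assuming the line layout shown above (`<ip> dev <interface> lladdr <mac> <state>`). `parse_neighbors` is a hypothetical helper written for this example:

```python
def parse_neighbors(output):
    """Parse the text printed by `ip neighbor show` into a list of dicts."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not look like an entry.
        if len(fields) < 4 or fields[1] != "dev":
            continue
        entry = {
            "ip": fields[0],
            "dev": fields[2],
            "mac": None,          # stays None when no MAC has been resolved
            "state": fields[-1],  # e.g. REACHABLE, STALE, FAILED
        }
        if "lladdr" in fields:
            entry["mac"] = fields[fields.index("lladdr") + 1]
        entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = (
        "192.168.122.170 dev eth0 lladdr 52:54:00:04:2c:5d REACHABLE\n"
        "192.168.122.1 dev eth0 lladdr 52:54:00:11:23:84 REACHABLE\n"
    )
    for entry in parse_neighbors(sample):
        print(entry["ip"], "->", entry["mac"], entry["state"])
```

Entries in a state such as FAILED or INCOMPLETE have no resolved MAC address, so they are a reasonable first thing to look for when a host on the local network is unreachable.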
3. Network
This layer covers network addressing and routing issues.
A successful “ping” to a host beyond the default gateway shows that IP addressing and routing to that host are working.
> ping countpool.tesaluna.com
PING countpool.tesaluna.com (104.21.75.240) 56(84) bytes of data.
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=1 ttl=53 time=19.0 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=2 ttl=53 time=17.1 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=3 ttl=53 time=18.2 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=4 ttl=53 time=23.5 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=5 ttl=53 time=16.3 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=6 ttl=53 time=26.6 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=7 ttl=53 time=18.5 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=8 ttl=53 time=16.7 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=9 ttl=53 time=21.5 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=10 ttl=53 time=23.5 ms
64 bytes from countpool.tesaluna.com (104.21.75.240): icmp_seq=11 ttl=53 time=15.4 ms
^C
--- countpool.tesaluna.com ping statistics ---
11 packets transmitted, 11 received, 0% packet loss, time 10013ms
rtt min/avg/max/mdev = 15.354/19.652/26.606/3.437 ms
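The two summary lines are the part worth checking in a script. The sketch below pulls the packet counts and average round-trip time out of ping output, assuming the Linux-style summary format shown above. `parse_ping_summary` is a hypothetical helper, not a standard tool:

```python
import re


def parse_ping_summary(output):
    """Extract packet counts, loss and average RTT from ping's summary lines."""
    counts = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", output
    )
    if counts is None:
        raise ValueError("no ping statistics found")
    sent, received = int(counts.group(1)), int(counts.group(2))
    summary = {
        "sent": sent,
        "received": received,
        # Recomputed from the counts rather than read from the text.
        "loss_percent": 100.0 * (sent - received) / sent if sent else 0.0,
        "rtt_avg_ms": None,  # absent when no reply was received
    }
    # The values after "=" are min/avg/max/mdev; keep the second one.
    rtt = re.search(r"= [\d.]+/([\d.]+)/[\d.]+/[\d.]+ ms", output)
    if rtt:
        summary["rtt_avg_ms"] = float(rtt.group(1))
    return summary


if __name__ == "__main__":
    sample = (
        "11 packets transmitted, 11 received, 0% packet loss, time 10013ms\n"
        "rtt min/avg/max/mdev = 15.354/19.652/26.606/3.437 ms\n"
    )
    print(parse_ping_summary(sample))
```

Non-zero loss, or an average round-trip time far above the usual value for the same host, suggests a problem somewhere along the route rather than on the local machine.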
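“Traceroute” lets you discover the path between two nodes and gives you information about each hop along the way, using the TTL (time to live) field at the IP level. The TTL field is decremented by 1 by each router that forwards the packet. When it reaches 0, the router discards the packet and sends back an ICMP Time Exceeded message, so sending probes with TTL 1, 2, 3 and so on reveals each hop in turn.
For each hop, traceroute sends 3 probe packets by default.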
Each line:
- You can see the number of the hop
- You can see the IP address of the device at each hop, and a hostname if there is one
- You can see the round-trip time for each packet
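The TTL mechanism is easier to follow as a small simulation. The sketch below sends no packets. It only models a path as a list of routers and shows which node would answer a probe for each starting TTL. `simulate_traceroute` and the example addresses passed to it are illustrative assumptions:

```python
def simulate_traceroute(routers, destination, max_hops=30):
    """Model which node answers a probe for each starting TTL value.

    `routers` lists the routers between us and `destination`, in order.
    Returns a list of (hop_number, responder) tuples.
    """
    hops = []
    for initial_ttl in range(1, max_hops + 1):
        ttl = initial_ttl
        responder = destination  # delivered, unless the TTL expires first
        for router in routers:
            ttl -= 1             # each forwarding router decrements TTL by 1
            if ttl == 0:
                responder = router  # TTL expired here: this router replies
                break
        hops.append((initial_ttl, responder))
        if responder == destination:
            break                # destination reached, stop probing
    return hops


if __name__ == "__main__":
    path = ["192.168.20.1", "100.68.128.1"]
    for hop, node in simulate_traceroute(path, "172.67.183.166"):
        print(hop, node)
```

With two routers on the path, the probe with TTL 1 is answered by the first router, TTL 2 by the second, and TTL 3 reaches the destination, which matches the one-line-per-hop layout of real traceroute output.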
On Linux and macOS, traceroute sends UDP probes by default
> traceroute countpool.tesaluna.com
traceroute to countpool.tesaluna.com (172.67.183.166), 30 hops max, 60 byte packets
1 NF18ACV.Home (192.168.20.1) 4.201 ms 4.100 ms 4.190 ms
2 100.68.128.1 (100.68.128.1) 6.929 ms 9.969 ms 8.159 ms
3 * * *
4 * * *
5 180.150.0.245 (180.150.0.245) 12.539 ms 12.515 ms 12.492 ms
6 be11-3999.pfl1.vdc03.mel.aussiebb.net (180.150.0.242) 13.657 ms 12.428 ms 13.944 ms
7 159.196.252.17 (159.196.252.17) 23.660 ms 18.620 ms 19.729 ms
8 HundredGigE0-0-0-16.core1.vdc01.syd.aussiebb.net (202.142.143.82) 20.811 ms 20.744 ms 20.705 ms
9 be1.core2.vdc01.syd.aussiebb.net (180.150.0.157) 20.664 ms 20.638 ms 20.619 ms
10 as13335.sydney.megaport.com (103.26.68.78) 26.584 ms 19.515 ms 26.323 ms
11 172.67.183.166 (172.67.183.166) 18.046 ms 17.816 ms 18.297 ms
> sudo tcpdump -i wlp3s0 src 192.168.20.12 and dst countpool.tesaluna.com
[sudo] password for neiltesaluna:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp3s0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:43:37.060469 IP Neil-MacBookPro > countpool.tesaluna.com.33434: UDP, length 32
14:43:37.060615 IP Neil-MacBookPro > countpool.tesaluna.com.33435: UDP, length 32
...
The probes can instead be sent as ICMP echo requests using the -I flag.
> traceroute countpool.tesaluna.com -I
traceroute to countpool.tesaluna.com (104.21.75.240), 30 hops max, 60 byte packets
1 NF18ACV.Home (192.168.20.1) 2.301 ms 2.791 ms 5.178 ms
2 100.68.128.1 (100.68.128.1) 8.640 ms 9.785 ms 10.569 ms
3 * * *
4 * * *
5 180.150.0.245 (180.150.0.245) 10.524 ms 10.520 ms 11.498 ms
6 be11-3999.pfl1.vdc03.mel.aussiebb.net (180.150.0.242) 11.491 ms 8.760 ms 8.664 ms
7 159.196.252.19 (159.196.252.19) 16.685 ms 15.582 ms 17.231 ms
8 159.196.252.24 (159.196.252.24) 16.287 ms 16.283 ms 16.288 ms
9 HundredGigE0-0-0-4.core1.yourdc-haw.adl.aussiebb.net (180.150.1.137) 17.193 ms 17.188 ms 18.344 ms
10 be2.core1.yourdc-ed.adl.aussiebb.net (180.150.2.41) 18.338 ms 15.356 ms 15.017 ms
11 be1.core2.yourdc-ed.adl.aussiebb.net (180.150.2.53) 15.544 ms 16.186 ms 16.826 ms
12 as13335.adl.edgeix.net.au (103.136.101.10) 16.159 ms 16.143 ms 16.141 ms
13 104.21.75.240 (104.21.75.240) 16.888 ms 17.324 ms 15.378 ms
> sudo tcpdump -i wlp3s0 src 192.168.20.12 and dst countpool.tesaluna.com
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp3s0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:45:58.477313 IP Neil-MacBookPro > 104.21.75.240: ICMP echo request, id 3, seq 1, length 40
14:45:58.477644 IP Neil-MacBookPro > 104.21.75.240: ICMP echo request, id 3, seq 2, length 40
14:45:58.477697 IP Neil-MacBookPro > 104.21.75.240: ICMP echo request, id 3, seq 3, length 40
14:45:58.477709 IP Neil-MacBookPro > 104.21.75.240: ICMP echo request, id 3, seq 4, length 40
14:45:58.477719 IP Neil-MacBookPro > 104.21.75.240: ICMP echo request, id 3, seq 5, length 40
...
4. Transport
Check for closed or blocked TCP/UDP ports. Firewalls also commonly block traffic at this layer.
We can use the netcat (nc) command to troubleshoot at the transport layer. It requires two mandatory arguments, a host and a port.
These flags let us use netcat just to find the status of the port:
- z: zero input/output mode (check whether the port accepts connections without sending any data)
- v: verbose
> nc -v -z countpool.tesaluna.com 80
Connection to countpool.tesaluna.com 80 port [tcp/http] succeeded!
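The same kind of check can be scripted when netcat is not installed. This is a minimal sketch using Python's standard `socket` module. `tcp_port_open` is a hypothetical helper, and the 3 second timeout is an arbitrary choice:

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    Like `nc -z`, this only completes the TCP handshake and then closes
    the connection without sending any application data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    print(tcp_port_open("countpool.tesaluna.com", 80))
```

A result of False does not say why the connection failed. A refusal usually returns immediately, while a firewall that silently drops packets tends to show up as a wait for the full timeout.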
To check which ports are open on a Linux machine, use:
> sudo netstat -tunapl
- t: TCP
- u: UDP
- l: shows listening sockets
- p: shows the PID and program name that each socket belongs to
- n: shows numerical addresses instead of host names etc
- a: shows all sockets, both listening and non-listening
Just remember: “TUNA APPLE”
5. Application
The application layer is where all the client-server and service-related applications, such as SMTP, HTTP, POP3, FTP, are used. It is also the layer where you will often find DNS-related issues.
We can use “nslookup” for DNS-related issues. Another useful tool is “tcpdump”, which captures and filters TCP/IP packets for analysis.
nslookup
This tool displays which server was used to perform the request and the resolution result.
If we type nslookup followed by a hostname, we can see the returned address records (A for IPv4 and AAAA for IPv6).
> nslookup countpool.tesaluna.com
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: countpool.tesaluna.com
Address: 172.67.183.166
Name: countpool.tesaluna.com
Address: 104.21.75.240
Name: countpool.tesaluna.com
Address: 2606:4700:3030::ac43:b7a6
Name: countpool.tesaluna.com
Address: 2606:4700:3034::6815:4bf0
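A similar lookup can be made from a script. The sketch below uses Python's standard `socket.getaddrinfo`, which goes through the system resolver (including /etc/hosts) rather than querying a DNS server directly the way nslookup does, so the two can disagree. `resolve` is a hypothetical helper name:

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver gives."""
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name could not be resolved
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for address in resolve("countpool.tesaluna.com"):
        print(address)
```

If this returns an empty list while nslookup against a public server such as 8.8.8.8 succeeds, the local resolver configuration is a likely suspect.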
We can also go into its interactive mode to set additional options by just entering:
> nslookup
We can then type the following commands:
- server: shows the default name server currently in use.
- server <‘address to use instead of the default name server, eg, 8.8.8.8’>: name resolution queries will be made using the specified server instead of the default name server.
- set type=<‘resource record type, eg MX or AAAA’>: lets you see records of that type associated with the host
- set debug: allows you to see even more information with your query.
Other Network commands
tcpdump
We can use the tcpdump command to capture and analyze the packets sent and received over a network interface.
> sudo tcpdump -i wlp3s0 src 192.168.20.12 and dst countpool.tesaluna.com
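- i: chooses which network interface we want to observe
- src: filter that matches packets coming from the given source IP address or hostname
- dst: filter that matches packets going to the given destination IP address or hostname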
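The interface option and the filter parts combine in a fixed order: options first, then the filter expression joined with `and`. As a rough sketch of that structure (`tcpdump_command` is a hypothetical helper that only builds the argument list and does not run tcpdump):

```python
import shlex


def tcpdump_command(interface, src=None, dst=None):
    """Build the argument list for a tcpdump capture on one interface."""
    filters = []
    if src:
        filters.append(f"src {src}")
    if dst:
        filters.append(f"dst {dst}")
    argv = ["sudo", "tcpdump", "-i", interface]
    if filters:
        # Both conditions must match, so the parts are joined with "and".
        argv += " and ".join(filters).split()
    return argv


if __name__ == "__main__":
    argv = tcpdump_command(
        "wlp3s0", src="192.168.20.12", dst="countpool.tesaluna.com"
    )
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` unchanged. Leaving out `src` or `dst` widens the capture to all traffic in that direction on the interface.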