Traceroute is a fantastically popular network-troubleshooting tool, second in popularity only to Ping. And it’s easy to see why: traceroute quickly shows you what network devices your traffic is going through to reach a destination, and it gives an indication of the performance of each of those devices. Theoretically, you can quickly tell where your traffic stops, where it slows down, and what devices are important for this connectivity to work. It’s both helpful and intuitive, aligning with our mental model of how networks work. It just makes sense.
However, if you’re a regular traceroute user, you know that, sadly, this picture doesn’t always match reality. Traceroute often either doesn’t work, or it shows you results that don’t make sense. The truth is that traceroute, which was invented in 1987, has not kept pace with the changes in networking. Today’s networks have far more stringent security, consistently provide redundancy, and are built using complex hardware architectures that break or confuse traceroute.
Understanding these limitations is the key to deriving correct conclusions from the data traceroute presents. Here are three key things you need to consider:
Instead, traceroute found a clever way to use the loop prevention mechanism built into IPv4 to derive its results. Here’s an example of how that works: Imagine you send a letter to your neighbor, and your neighbor marks it Return to Sender. If you then mark it Return to Sender as well, the letter is stuck in an infinite loop between you and your neighbor.
The loop prevention mechanism stamps each letter with an expiry (time to live, or TTL): a tally of how many more times it can be passed from one address (read “router”) to the next. The tally typically starts at 64, 128, or 255, depending on the sender’s operating system. Each time a router forwards the letter, it reduces the tally by 1. When the tally reaches zero, the router that counted it down to 0 discards the letter and sends a message back to the source saying “this letter expired.”
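To make the tally concrete, here is a minimal simulation of how traceroute can exploit it: send probes with a TTL of 1, then 2, then 3, and note which device reports each expiry. This is a sketch, not a real network tool. The router names, the `PATH` list, and both function names are hypothetical, and no packets are sent.

```python
from typing import List

# Hypothetical path from source to destination; the names are illustrative only.
PATH: List[str] = ["router-a", "router-b", "router-c", "destination"]


def send_probe(path: List[str], ttl: int) -> str:
    """Return the name of the device that answers a probe sent with this TTL."""
    for hop in path[:-1]:
        ttl -= 1  # each router reduces the tally by 1
        if ttl == 0:
            return hop  # this router reports "this letter expired"
    return path[-1]  # the tally never reached zero, so the destination answers


def trace(path: List[str], max_ttl: int = 30) -> List[str]:
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    answers = []
    for ttl in range(1, max_ttl + 1):
        responder = send_probe(path, ttl)
        answers.append(responder)
        if responder == path[-1]:
            break  # the destination itself replied, so the trace is complete
    return answers


if __name__ == "__main__":
    for hop_number, name in enumerate(trace(PATH), start=1):
        print(hop_number, name)
```

Each probe reveals exactly one device: the one where the tally ran out. That is why traceroute needs one round of probes per hop, and why a router that declines to send the “expired” message shows up as a gap in the output.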
Ultimately, this is a work-around, and like most work-arounds, at some point you’ll come unstuck as the situation gets more complex!
As the internet has grown more popular and been used for ecommerce, it has become more of a target, and security mechanisms have had to be put in place to protect it. Today, the smart security policy is to allow only what you need and block the rest. Traceroute is not required for users to access their applications, so it is often not allowed through firewalls.
If you do try to allow traceroute through firewalls, it can be a challenge. The problem is that traceroute implementations vary significantly from one operating system to another. Some implementations use UDP. Some use ICMP. In rare cases, TCP is used. On top of that, some implementations use many different ports. So, as you can imagine, getting traceroute to work reliably can be messy.
As a result, traceroute often does not make it through firewalls. The essence of the problem is that traceroute does not look like end-user traffic and is not needed for users to stay connected with applications.
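The sketch below illustrates why a single firewall rule rarely covers every traceroute. It models a few probe styles and checks them against a hypothetical allow list. The `Probe` class, the `allowed` helper, and the rule set are invented for illustration. The probe details reflect common defaults (UDP to ports counting up from 33434 on Unix-like systems, ICMP echo for Windows tracert, TCP to port 80 for TCP-based variants), but your tools may be configured differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Probe:
    tool: str
    protocol: str
    dst_port: Optional[int]  # None for ICMP, which has no ports


# Illustrative probes based on common defaults; treat these as assumptions.
PROBES: List[Probe] = [
    Probe("Unix-style traceroute, first probe", "udp", 33434),
    Probe("Unix-style traceroute, later probe", "udp", 33440),
    Probe("Windows tracert", "icmp", None),
    Probe("TCP-based traceroute", "tcp", 80),
]

Rule = Tuple[str, Optional[int]]  # (protocol, destination port)


def allowed(probe: Probe, rules: Set[Rule]) -> bool:
    """Hypothetical allow-list firewall: pass only exact protocol/port matches."""
    return (probe.protocol, probe.dst_port) in rules


# A policy that allows only web traffic, as an "allow what you need" example.
WEB_ONLY: Set[Rule] = {("tcp", 80), ("tcp", 443)}

if __name__ == "__main__":
    for probe in PROBES:
        verdict = "passes" if allowed(probe, WEB_ONLY) else "blocked"
        print(f"{probe.tool}: {verdict}")
```

Under this web-only policy, only the TCP-based probe gets through, because it is the only one that resembles end-user traffic. A real firewall would also need to let the “expired” replies back in, which this sketch leaves out.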
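Transit over the internet is often multi-path: routers spread traffic across several equivalent paths to maximize resource usage and increase redundancy. This presents a distinct problem for traceroute, which assumes a single path. Its probes can be sent down different paths, so the list of hops it reports may not correspond to any one path your traffic actually takes.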
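Here is a small simulation of one way this can go wrong. It assumes a load balancer that picks a path based on the destination port, and a classic traceroute that uses a different destination port for every probe. The two paths, the device names, and the trivial modulo “hash” are all hypothetical stand-ins. Real load balancers hash more fields and behave in more varied ways.

```python
from typing import List

# Two hypothetical equal-cost paths between the same source and destination.
PATHS: List[List[str]] = [
    ["edge", "core-1a", "core-2a", "destination"],
    ["edge", "core-1b", "core-2b", "destination"],
]
BASE_PORT = 33434  # common starting port for UDP-based traceroute


def pick_path(dst_port: int) -> List[str]:
    """Stand-in for a per-flow load balancer: a trivial hash on the port."""
    return PATHS[dst_port % len(PATHS)]


def responder(dst_port: int, ttl: int) -> str:
    """Return the device that answers a probe with this port and TTL."""
    path = pick_path(dst_port)
    return path[min(ttl, len(path)) - 1]


def classic_trace() -> List[str]:
    """Use a new destination port for every probe."""
    return [responder(BASE_PORT + ttl - 1, ttl) for ttl in range(1, 5)]


def fixed_flow_trace() -> List[str]:
    """Keep the destination port constant so every probe hashes the same way."""
    return [responder(BASE_PORT, ttl) for ttl in range(1, 5)]


if __name__ == "__main__":
    print("classic:   ", classic_trace())
    print("fixed flow:", fixed_flow_trace())
```

In this model, the classic trace reports core-1b followed by core-2a, two devices that sit on different paths, so the result describes a route that does not exist. Holding the flow identifiers constant, as some traceroute variants do, yields one real path, but it still shows only one of the paths your traffic might take.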
Acknowledging and understanding the limitations of traceroute can help you analyze results and come up with the best conclusions you can in today’s networks. Traceroute remains one of the most popular network troubleshooting tools. If you’re responsible for a network, it’s crucial that you understand it well.
The best thing about traceroute is how well it fits our mental model of a network. At SolarWinds, we’ve spent a lot of time understanding and providing solutions for the many problems with traceroute. We wanted to create something that would provide a clear, intuitive picture of the health of network connectivity from a source to a destination, but would also work in modern networks. We’ve done just that with NetPath™.
The NetPath feature uses a proprietary discovery method to identify network paths in relatively simple or very complex multi-path networks and then visually displays performance details regardless of the location: on-premises, in hybrid networks, and in the cloud.
To understand more about the issues with traceroute and how to work with them, download our free whitepaper, The Shortfalls of Traceroute in Modern Multi-Path Networks.
NetPath takes your support from reactive to proactive monitoring. Try NetPath for free now.
© 2018 SolarWinds MSP UK Ltd. All rights reserved.