title: Jittery about jitter?
by: Scott Bradner
We are told in just about every venue that the Internet needs all sorts of QoS mechanisms to make it useful. Some recent real-world experiments seriously call that idea into question. The experiments do not indicate that today's Internet is already 5-9s reliable (the mythical reliability of the phone system) over long periods of time, but they did show 4-9s over the test period and 100% (infinite 9s?) reliability for one particular week.
The tests were done by Steve Casner, Cengiz Alaettinoglu and Chia-Chee Kuan of Packet Design and were reported at the North American Network Operators' Group (NANOG) meeting back in May. The presentation can be found at http://www.nanog.org/mtg-0105/casner.html. (Additional disclaimer: I am on the Packet Design technical advisory board.)
The test was not an easy one. A 1 Mbps stream of traffic was sent between test hosts installed in an Internet service provider's (ISP's) points of presence in San Francisco and Washington, DC. The data stream went through the ISP's routers and operational backbone links. It consisted of random-length packets (between 64 and 1500 bytes long) sent at random intervals (with a 6-millisecond mean interval). The tests were run for 15 periods of 5-7 days each. A timestamp was included in each packet, and the latency was measured with 20-microsecond accuracy.
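As a rough sanity check on that traffic profile, the sketch below generates packets the way the paragraph describes and works out the average bit rate. The article does not say how the sizes and gaps were distributed, so the uniform packet sizes and exponentially distributed gaps are assumptions, and the name simulate_stream is made up for illustration.

```python
import random

MEAN_GAP_S = 0.006              # 6-millisecond mean interval, from the article
MIN_BYTES, MAX_BYTES = 64, 1500  # packet size range, from the article


def simulate_stream(num_packets, seed=0):
    """Return the average bit rate (bits per second) of a synthetic stream.

    Assumptions (not stated in the article): sizes are uniform over the
    range and gaps between packets are exponentially distributed.
    """
    rng = random.Random(seed)
    total_bits = 0
    total_time_s = 0.0
    for _ in range(num_packets):
        total_bits += rng.randint(MIN_BYTES, MAX_BYTES) * 8
        total_time_s += rng.expovariate(1.0 / MEAN_GAP_S)
    return total_bits / total_time_s


if __name__ == "__main__":
    rate = simulate_stream(200_000)
    print(f"average rate: {rate / 1e6:.2f} Mbps")
```

Under these assumptions the mean packet is 782 bytes, so the expected rate is 782 x 8 bits every 6 milliseconds, or about 1.04 Mbps, which squares with the 1 Mbps figure.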
The observed jitter over the entire test was less than 1 millisecond for 99.99% of the packets. The availability was the same: 99.99%. Even with this level of reliability, a further improvement was made by changing things so that the ARP table entries in the routers did not time out. With this change, and an absence of fibertropic backhoes, 69 million packets were sent over a week with zero packets lost and jitter below 700 microseconds for 100% of the packets.
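For readers who want to run the same kind of tally on their own timestamped traces, here is a minimal sketch. The article does not spell out how jitter was defined, so this assumes jitter means each packet's one-way delay above the smallest delay seen in the trace. The function name and the sample numbers are hypothetical.

```python
def jitter_fraction_below(latencies_s, threshold_s):
    """Fraction of packets whose jitter is below threshold_s.

    Assumption: jitter is a packet's one-way latency minus the minimum
    latency observed in the trace. All values are in seconds.
    """
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    base = min(latencies_s)
    within = sum(1 for d in latencies_s if d - base < threshold_s)
    return within / len(latencies_s)


if __name__ == "__main__":
    # Made-up one-way latencies; the last packet hit a 15 ms delay spike.
    sample = [0.0300, 0.0302, 0.0305, 0.0301, 0.0450]
    print(jitter_fraction_below(sample, 0.001))
```

In this toy trace four of the five packets stay within 1 millisecond of the minimum delay. The real test would need that fraction to reach 0.9999.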
There were a few funnies observed during the tests where things went very strange indeed, with latencies of multiple seconds for a few hundred seconds. These were few and probably were the result of routing loops caused by link failures. The tests reported at NANOG were over a single IP hop between just two routers (though it was multiple ATM hops underneath). More recent tests have been done over multiple IP hops (i.e., through more than two routers) with comparable results.
What do these tests mean? For one thing they mean that, at least on an ISP backbone, IP networks are already easily reliable enough for interactive voice traffic without any QoS mechanisms. These tests did not include customer tail circuits or customer networks, which can often be overloaded, so they are not tests of end-to-end Internet connections. But the test results do indicate that customers with uncongested links to an overprovisioned ISP (most of the big ones) will get very high quality voice transport without having to pay extra for QoS. They may get hit for a while when an ISP link fails, but many people may put up with 0.01% downtime for a no-extra-cost service. This bodes quite well for inter-site IP-based PBX connections and is not good news for the die-hard 'the Internet needs circuits' folk.
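To put the 9s in perspective, the arithmetic is simple enough to sketch. The function name is made up for illustration, and a 365-day year is assumed.

```python
WEEK_S = 7 * 24 * 3600    # seconds in a week
YEAR_S = 365 * 24 * 3600  # seconds in a 365-day year (assumption)


def downtime_seconds(nines, period_s):
    """Downtime allowed in period_s at the given number of 9s.

    Four 9s means 99.99% availability, i.e. 0.01% downtime.
    """
    unavailability = 10.0 ** (-nines)
    return period_s * unavailability


if __name__ == "__main__":
    print(f"4-9s, one week: {downtime_seconds(4, WEEK_S):.1f} seconds")
    print(f"4-9s, one year: {downtime_seconds(4, YEAR_S) / 60:.1f} minutes")
    print(f"5-9s, one year: {downtime_seconds(5, YEAR_S) / 60:.1f} minutes")
```

So 4-9s allows about a minute of outage per week, or a bit under an hour per year, while the mythical 5-9s allows a little over 5 minutes per year.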
disclaimer: Whatever Harvard is, it is not a no-extra-cost service and the above observation is my own.