
Looking for Comments: Internet Measurements Landscape

[Editor’s Note: We are cross-posting this blog post from Leslie Daigle at TechArk related to our ongoing joint work on Measuring Internet Health.]

I’d like to pick your brain for just a little bit…

The Internet has been measured and analyzed since the first connection was made between networks. Because measurement activities are shaped by everything from the intentions of the people taking the measurements to the vantage point of network and service operators, many different approaches and activities are underway today, and together they make up the landscape of “Internet measurement activities”. NOMA is tackling one corner of the landscape, but understanding its value depends on an awareness of the bigger picture.

To that end, we’re drafting a paper that aims to provide an overview of existing Internet measurement activities, approaches and challenges, to build out a map of that measurement landscape. It is aimed at the general reader with an interest in the topic, including policy makers, measurement experts wishing to position their work in the landscape of such activities, and network operators seeking to understand available tools, services and practices with regard to measuring the Internet from their network’s perspective.

With this post, and subsequent ones over the coming weeks, I’m putting out draft text and asking if you would kindly share your thoughts on any errors or omissions, or even just general insights.

Comments to measuring<at>techark.org would be most appreciated.

First up for discussion — a review of the challenges that are faced when people start talking about “let’s measure <some aspect of> the Internet”.

1 Issues – Internet data measurement

Some particular challenges need to be addressed when reviewing data collected from the Internet, or before devising any kind of measurement of it.

1.1 What are the “endpoints”?

On the user’s end, does the measurement start from the user’s desktop computer, or the CPE [1]? While the smarts for the measurement may be running on the user’s desktop, the reality is that the home network (between the desktop and the CPE) may factor negatively into any measurements. For example, a service provider might provide a network that delivers 75 Mbps of data to the CPE, but the user’s desktop may be connected to their home network by an old Ethernet cable with a top speed of 10 Mbps. If the user runs a speed test from their desktop, they can’t see anything faster than the 10 Mbps that wire delivers.
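As a rough sketch of that reasoning, the ceiling a desktop speed test can report is the capacity of the slowest link it crosses. The two capacities below come from the example above; the link names and the `bottleneck` helper are illustrative assumptions, and real throughput also depends on things this ignores, such as protocol overhead and competing traffic.

```python
# Link capacities, in Mbps, along the path a desktop speed test crosses.
# The 75 and 10 figures are from the example in the text; the link names
# are illustrative only.
path_mbps = {
    "access network to CPE": 75,
    "home Ethernet cable (CPE to desktop)": 10,
}


def bottleneck(links):
    """Return (name, capacity) of the slowest link on the path."""
    name = min(links, key=links.get)
    return name, links[name]


slowest_link, ceiling_mbps = bottleneck(path_mbps)
print(f"Speed test ceiling: {ceiling_mbps} Mbps, set by the {slowest_link}")
```

Run from the CPE instead of the desktop, the same test would drop the home cable from the path and the ceiling would become the 75 Mbps access link.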

Similarly, providers are often interested in ensuring that “their network” is well-connected to popular sites, such as Facebook and YouTube. From a routing perspective, “their network” means routers and other network boxes that might be spread far and wide geographically, and have little to do with the “last mile” connection to the customer’s premises. An ISP may have great connections to popular services, but if the customer is connected to the ISP by over-subscribed shared links, old copper, or other low grade links, the endpoint is not going to see any advantage from that connectivity.

The same is true when talking about IPv4 and IPv6 connectivity – an ISP may support IPv6 in its core, but it takes a lot of work to update the hardware closest to the customers to ensure that each customer has IPv6 connectivity to their CPE. Then, what happens within the home network determines whether or not the desktop can actually connect to anything over IPv6.

On the server end, analogously, does the measurement reach a particular box on the network, or just one of several real or virtual servers that may be supporting a given service? For example, there is no single computer that “runs the Google website”. Sometimes service instances can be distinguished by differing IP addresses, but even a single IP may support a large server farm behind the edge of the service network.

On the one hand, the user only cares about what they experience – which is everything from their desktop to the server providing the responses to their Internet activities. On the other hand, being able to break down performance by some logical “neighbourhoods” helps: separating out the home network performance from the performance within the access network, and subsequent hops to the network service.

1.2 What is “near”?

From Buenos Aires, Argentina to Cape Town, South Africa is 4,276 mi (6,881 km) across the globe. However, that’s not how Internet traffic flows from Buenos Aires to Cape Town. Virtually (and, quite possibly, literally) all routes out of Buenos Aires to Cape Town go through Miami, US. To be quite clear, the distance from Buenos Aires to Miami is 4,405 mi (7,089 km) – already longer than the distance between the two endpoint cities – and then the distance from Miami to Cape Town is an additional 7,650 mi (12,312 km) [2].

That makes Seattle, US, which is 2,732 mi (4,397 km) from Miami, closer to Buenos Aires in the network than Cape Town is, although that is not at all obvious from looking at a geographical map.
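To make the arithmetic concrete, here is a minimal sketch that adds up the leg distances quoted above. It assumes, purely for illustration, that traffic follows straight legs between the named cities; real paths follow cables and intermediate nodes and are longer still (see [2]). The `path_km` helper and the table names are hypothetical.

```python
# Leg and direct distances in km, as quoted in the text above.
LEG_KM = {
    ("Buenos Aires", "Miami"): 7089,
    ("Miami", "Cape Town"): 12312,
    ("Miami", "Seattle"): 4397,
}
DIRECT_KM = 6881  # Buenos Aires to Cape Town, across the globe


def path_km(hops):
    """Sum the leg lengths along an ordered list of cities."""
    return sum(LEG_KM[(a, b)] for a, b in zip(hops, hops[1:]))


to_cape_town = path_km(["Buenos Aires", "Miami", "Cape Town"])
to_seattle = path_km(["Buenos Aires", "Miami", "Seattle"])

print(f"Buenos Aires to Cape Town via Miami: {to_cape_town} km")  # 19401 km
print(f"Buenos Aires to Seattle via Miami:   {to_seattle} km")    # 11486 km
print(f"Detour factor to Cape Town: {to_cape_town / DIRECT_KM:.1f}x")
```

On these figures, the routed path to Cape Town is nearly three times the direct distance, while Seattle sits roughly 8,000 km closer along the network path.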

1.3 What is a “fixed point” on the Internet?

At a logical level, “the Google server” is a fixed point in the Internet. However, given the discussion of endpoints, above, it should be clear that there is no single Google server, or one single “Google fixed point”. The same is true of other major global services. For some end users, Google and Amazon services may be “close” to each other, and for other end users that may not be true. The difference stems from the fact that Google and Amazon each lay out their CDN and replica servers in ways that make sense for their own business, and not based on any global Internet service grid.

A “polestar” endpoint is one that is well known and fixed in the network – at a single IP address that is not anycast from multiple locations. This describes few major services today (anything popular is hosted by a CDN). Some NTP [3] servers, as general Internet infrastructure, fall into that category. Of course, services that are built out for the purpose of looking through the network towards fixed points can establish their own polestars.

1.4 Span and scope of measurements

With the variations outlined above, another challenge in setting up Internet measurements is ensuring appropriate span or scope of the measurements. For networks under your administrative control, you can manage and account for different factors, and you can install active or passive gatherers at any and all points as necessary. That gives you confidence in the measurements within your own network, but it doesn’t help address the variability of any measurements that reach outside it (e.g., toward a “polestar” server). It also doesn’t necessarily give information that is readily compared outside the scope of your own network.

To get global span, it is necessary to have some kind of reach into and/or through other networks, and diversity is important. The approaches discussed below outline how that has been addressed in projects to date.

[1] CPE is “customer premises equipment”; the box that connects to your ISP’s access network.

[2] To make matters worse, most routes actually go from Miami to some other network node, in places such as Colorado, US or Paris, France, before connecting to Cape Town.

[3] Network Time Protocol – see https://tools.ietf.org/html/rfc5905

Okay — we’re listening! Comments to measuring<at>techark.org would be most appreciated.



Measuring Internet Health – Introducing NOMA

We are interested in undertaking and supporting activities that help maintain a healthy Internet. To that end, we are pleased to announce a new partnership. The Internet Society is working with Thinking Cat Enterprises on the TechArk Network Operator Measurement Activity (NOMA) project to unlock more operator-based measurements that tell us something about the health of the Internet. The first target of NOMA is to develop an access network’s perspective on IPv6 performance as one indicator of the Internet’s health.

An unexpected benefit of World IPv6 Launch in 2012 has been an ongoing look into the state of IPv6 deployment in access networks around the globe, as measured by participating websites. One of the guiding principles was that those making a public commitment to IPv6 deployment had to deliver, and it had to be publicly verified. We established a dashboard that indicated whether or not the website was delivering IPv6 on its homepage. Those websites in turn measured the extent to which the access networks were delivering IPv6.

Once the websites had the ability to assess the amount of IPv6 traffic they were seeing, they agreed that they would continue doing this over time. So, once a month since World IPv6 Launch, we have published a set of measurements of the networks that have registered themselves in our database. This is an important reference for people around the globe interested in the state of IPv6 deployment. (You can see those measurements here, and thanks very much to Google, Facebook, Yahoo!, Akamai, and LinkedIn for continuing to contribute to those measurements each month).

These measurements are not without limitations. Each of the measurers has a different view of the Internet based on geography and the demographics of their user populations. Each of the measurers has a slightly different way of measuring IPv6 traffic (you can see the details of what they report). Yet they are broadly in agreement that this is a good rough estimate of what IPv6 traffic looks like on the Internet.

A bigger limitation is that they are from the point of view of websites. They see roughly how much traffic comes from any particular access network; this tends to show what percentage of each network is enabled with IPv6. If one were to look at IPv6 traffic from the access network’s point of view, the amount of IPv6 traffic might look quite different. That’s because traffic gets concentrated on a few very popular websites that tend to transfer a lot of bits. So for an individual user who has IPv6 enabled in an access network, the percentage of traffic that uses IPv6 may be quite high because of IPv6-enabled video consumption (YouTube, Netflix, Facebook, etc.).
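A small worked example shows why the two viewpoints can diverge. Every site name and byte count below is invented for illustration, and the sketch assumes an IPv6-enabled user always reaches an IPv6-enabled site over IPv6, which real clients do not guarantee.

```python
# Hypothetical monthly traffic for one IPv6-enabled subscriber.
# Every site name and byte count here is made up for illustration.
traffic = [
    # (destination, gigabytes transferred, reachable over IPv6?)
    ("video-site", 120, True),
    ("social-site", 20, True),
    ("news-site", 5, False),
    ("shop-site", 4, False),
    ("bank-site", 1, False),
]

total_gb = sum(gb for _, gb, _ in traffic)
ipv6_gb = sum(gb for _, gb, v6 in traffic if v6)

# Share of destinations that support IPv6 (each site counts once).
site_share = sum(1 for _, _, v6 in traffic if v6) / len(traffic)
# Share of bytes carried over IPv6 (each site weighted by its volume).
byte_share = ipv6_gb / total_gb

print(f"IPv6-enabled destinations: {site_share:.0%}")
print(f"Traffic carried over IPv6: {byte_share:.0%}")
```

With these made-up numbers, only two of five destinations support IPv6, yet most of the user's bytes travel over it, because the heavy video and social traffic dominates the total.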

Websites are often reluctant to reveal data due to concerns about business and privacy issues. When the websites agreed to reveal this through a neutral third party (the Internet Society), we were able to get a collective view of what is going on with IPv6. The websites reveal the raw data to us, we combine it to produce a collective view, and then make it available to the public.

NOMA aims to do the same thing, but from an operator’s point of view. Collecting a set of views from various network operators starts to tell us something about what is going on in the Internet with a perspective that is otherwise hard to achieve. NOMA aims to find a way to unlock those views and produce measurements that are:

1. Operator-based. Operators are in a unique position with respect to knowledge about what is going on in their networks. Deriving measurements from operators is key to having a real view on what is happening in the Internet as a whole.

2. Bottom-up. Because it is operator-based, it is necessarily bottom-up. Operators use the data they have to help them operate their networks better, so the data always starts with identifying what they need to know about how their networks operate.

3. Collaborative. Operators of all sorts of networks – websites, enterprises, or consumer ISPs – have legitimate concerns about revealing this kind of information, but doing something collaboratively can help both alleviate concerns about acting alone, and create a cooperative environment for revealing what is important to help individual networks operate better.

The Internet Society is interested in the availability of data about the state of the health of the Internet. In 2012, we published the Internet Invariants describing some key, unchanging characteristics that make the Internet what it is. We asserted there that the Internet is accessible and measurable. NOMA will highlight the reality of that by unlocking data that operators have about the operations of their own networks to build up a collective view of the Internet’s health. We also asserted that collaboration is needed not only between devices but also between the people who operate these networks. This project intends to demonstrate that collaboration in the same way that the World IPv6 Launch website participants did when they shared data to make the Internet better: getting operators together to produce data for IPv6-related measurements helped to create a self-sustaining community of operators working together to address shared deployment challenges.

I hope you will visit the new web page about NOMA and follow along here on the Internet Technology Matters blog as we get this effort underway.