
Measurement Lab: How Do We Know If the Internet Is Open?

An open Internet is the foundation of access and innovation, where users can go where they want, when they want, without discrimination. But how do we know if the Internet is truly open? As individuals, our experience of Internet performance is mediated by our physical location, infrastructure, government, and Internet service providers. Yet we are largely blind to how our Internet is impacted by these systems. Without that knowledge, innovation stalls, disparity of access grows, and people become isolated from this critical piece of global infrastructure.

Measurement Lab (M-Lab), a fiscally sponsored project of Code for Science & Society, is a consortium of research, industry, and public interest partners focused on fostering, collecting, and publishing open Internet performance data. M-Lab was founded in 2008 to build a global platform designed to enable anyone to measure their Internet service using open source tools. Over ten years later, M-Lab collects over 2 million measurements per day worldwide and has become a trusted source of open data and tools to gather and understand Internet infrastructure from the consumer perspective. Cities and municipal governments; national regulators and government agencies; academics and researchers; ISPs, network operators, and companies; civil society and advocacy organizations; and the general public are using tests, tools, and data developed or supported by M-Lab.

M-Lab’s core foundation is built on the values of transparency, openness, and true open science practice. Originally built to support the academic Internet measurement research community, M-Lab requires test code running on our infrastructure to be open source, and that tests’ measurement methodologies be openly documented and available for scientific collaboration and scrutiny. All of the data collected by tests running on M-Lab’s global measurement platform must be published under an open license. Anyone can review and improve the underlying methodologies and assumptions on which M-Lab’s platform, tools, and data rely. We believe that this radical transparency and openness to review are key to producing good science and paramount to trusted measurement.

Upgrading the M-Lab Platform

Managing and maintaining a healthy measurement platform service is no small task, and is the core mission of the team at M-Lab. Ten years ago we built M-Lab as a “fork” of the PlanetLab platform, which at the time provided the best available open source virtualization and distributed server management system to support the needs of networking researchers around the world.

Our team is now in the final stages of updating the M-Lab platform stack to the latest technologies for managing virtual computing and distributed infrastructure. Kubernetes-managed Docker containers are the base layer of M-Lab 2.0, and open a new universe of possibilities for the M-Lab community. While our previous system architecture supported a limited number of “experiments,” or tests, the new Kubernetes- and Docker-based M-Lab removes that limit, and opens up the platform for new collaborating researchers and developers.

New Experiments

The first of what we hope will be many new M-Lab tests will be WeHe, also known as Differentiation Detector. Developed by researchers at Northeastern University and UMass Amherst, WeHe is a mobile network test for iOS and Android devices that assesses whether your Internet traffic from popular apps and services is being slowed or throttled by your carrier.

As a part of the platform upgrade, the M-Lab team, in coordination with external developers, has rewritten our speed and latency test, the Network Diagnostic Tool (NDT). NDT measures the capacity of your connection using a single TCP stream, following the IETF’s RFC 3148. The M-Lab team is very excited about the new version of NDT because it now runs on the standard secure web port, works with the latest TCP congestion control algorithm (BBR), and effectively implements a Model-Based Metrics (MBM) approach to measuring the end-to-end performance of your connection. The new version of NDT is in testing now and will be deployed to production along with the Kubernetes/Docker platform upgrade in Q3 2019.

Expanding the Platform Footprint 

M-Lab has been able to place servers in about 130 locations around the world, through the generous support of research and education networks, government agencies, open Internet exchanges (IXPs), and transit network operators. But we aren’t everywhere. M-Lab is currently seeking new partnerships and supporters to enable us to expand in Central and South America, Southeast Asia, and Africa. People in these regions can still use M-Lab tests, and the resulting data is still a relevant assessment of their experience online, but having more servers and more diverse connectivity in these areas will increase the fidelity of our data. Better geographic coverage across the topology of the Internet will help do that. M-Lab is seeking relationships with open IXPs, transit providers, and donors to support this expansion, and to diversify the partners who contribute to M-Lab’s core infrastructure. We frame these relationships as partnerships for a reason – M-Lab is able to better measure the Internet, and our partners receive value from the data generated through their donation or investment in M-Lab.

Community Engagement

Maintaining a healthy measurement platform service is M-Lab’s primary mission – enabling data on global Internet health to be collected and shared in the public domain – but equally important is our goal to make these data accessible and useful. Communities around the world want to know how the Internet is serving them. The M-Lab team supports a wide variety of organizations in doing just that (e.g., policy makers, grassroots advocates, cities and municipalities, regulatory and other government agencies, developers, as well as the academic research community). This is a huge job and we recognize that we really can’t do it alone. To truly scale support globally, M-Lab envisions a data stewards program that builds the data analysis skills and capacity of regional and local advocates and champions. This isn’t something we’re doing yet, but it’s one of the next steps in M-Lab’s outreach and support plan.

After 10 years of measuring the Internet, M-Lab is looking toward the future and the communities we support and serve. If you are an advocate, Internet measurement researcher, or test developer interested in growing data science skills with network measurement data – or if you’d like to learn more about how M-Lab can support you – please let us know!


Looking for Comments: Internet Measurements Landscape

[Editor’s Note: We are cross-posting this blog post from Leslie Daigle at TechArk related to our ongoing joint work on Measuring Internet Health.]

I’d like to pick your brain for just a little bit…

The Internet has been measured and analyzed since the first connection was made between networks. Since measurement activities are shaped by everything from the intentions of the people taking the measurements to the vantage point of network and service operators, there are many different approaches and activities that are underway today that make up the landscape of “Internet measurement activities”. NOMA is tackling one corner of the landscape, but understanding its value is dependent on an awareness of the bigger picture.

To that end, we’re drafting a paper that aims to provide an overview of existing Internet measurement activities, approaches, and challenges, to build out a map of that measurement landscape. It is aimed at the general reader with an interest in the topic, including policy makers, measurement experts wishing to position their work in the landscape of such activities, and network operators seeking to understand available tools, services and practices with regard to measuring the Internet from their network’s perspective.

With this post, and subsequent ones over the coming weeks, I’m putting out draft text and asking if you would kindly share your thoughts on any errors or omissions, or even just general insights.

Comments to measuring<at>techark.org would be most appreciated.

First up for discussion — a review of the challenges that are faced when people start talking about “let’s measure <some aspect of> the Internet”.

1 Issues – Internet data measurement

There are some particular challenges that need to be addressed when reviewing data collected from, or before forming any kind of measurement of, the Internet.

1.1 What are the “endpoints”?

On the user’s end, does the measurement start from the user’s desktop computer, or the CPE [1]? While the smarts for the measurement may be running on the user’s desktop, the reality is that the home network (between the desktop and the CPE) may factor negatively into any measurements. For example, a service provider might provide the network to deliver 75Mbps of data to the CPE, but the user’s desktop may be connected to their home network by an old Ethernet cable – top speed 10Mbps. If the user runs a speed test from their desktop, they can’t see anything faster than the 10Mbps wire delivers.
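As a rough sketch of that reasoning (Python, with a made-up function name; the two capacities are the figures from the example above), the speed a desktop test can observe is capped by the slowest segment between the desktop and the far end:

```python
def end_to_end_mbps(segments):
    """Return (bottleneck segment name, achievable Mbps).

    `segments` maps a segment name to its capacity in Mbps. The achievable
    end-to-end rate is capped by the slowest segment on the path.
    """
    bottleneck = min(segments, key=segments.get)
    return bottleneck, segments[bottleneck]


# Hypothetical labels; the capacities come from the example in the text.
path = {
    "ISP access network to CPE": 75,
    "home Ethernet cable (CPE to desktop)": 10,
}

name, mbps = end_to_end_mbps(path)
print(f"A desktop speed test sees at most {mbps} Mbps, limited by: {name}")
```

Running the same calculation with only the ISP segment would report 75Mbps, which is why it matters whether a measurement starts at the desktop or at the CPE.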

Similarly, providers are often interested in ensuring that “their network” is well-connected to popular sites, such as Facebook and YouTube. From a routing perspective, “their network” means routers and other network boxes that might be spread far and wide geographically, and have little to do with the “last mile” connection to the customer’s premises. An ISP may have great connections to popular services, but if the customer is connected to the ISP by over-subscribed shared links, old copper, or other low grade links, the endpoint is not going to see advantage from that connectivity.

The same is true when talking about IPv4 and IPv6 connectivity – an ISP may support IPv6 in its core, but it takes a lot of work to update the hardware closest to the customers to ensure that each customer has IPv6 connectivity to their CPE. Then, what happens within the home network determines whether or not the desktop can actually connect to anything over IPv6.

On the server end, analogously, does the measurement reach a particular box on the network, or just one of several real or virtual servers that may be supporting a given service? For example, there is no single computer that “runs the Google website”. Sometimes service instances can be distinguished by differing IP addresses, but even a single IP may support a large server farm behind the edge of the service network.

On the one hand, the user only cares about what they experience – which is everything from their desktop to the server providing the responses to their Internet activities. On the other hand, being able to break down performance by some logical “neighbourhoods” helps: separating out the home network performance from the performance within the access network, and subsequent hops to the network service.

1.2 What is “near”?

The distance from Buenos Aires, Argentina to Cape Town, South Africa is 4,276 mi (6,881 km) across the globe. However, that’s not how Internet traffic flows from Buenos Aires to Cape Town. Virtually (and, quite possibly, literally) all routes out of Buenos Aires to Cape Town go through Miami, US. To be quite clear, the distance from Buenos Aires to Miami is 4,405 mi (7,089 km) – already longer than the distance between the two endpoint cities – and then the distance from Miami to Cape Town is an additional 7,650 mi (12,312 km) [2].

That makes Seattle, US (2,732 mi (4,397 km) from Miami) closer to Buenos Aires in the network than Cape Town is, although that is not at all obvious from looking at a geographical map.
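To make the comparison concrete, here is a small sketch that adds up the legs quoted above. The function names are made up for illustration, and the figure of roughly 200 km per millisecond for light in optical fibre is an outside approximation rather than something measured here. Real routes add further hops (see [2]) and equipment delay, so treat the delay as a floor, not a prediction.

```python
# Distances in km, as quoted in the text above.
DISTANCE_KM = {
    ("Buenos Aires", "Cape Town"): 6881,
    ("Buenos Aires", "Miami"): 7089,
    ("Miami", "Cape Town"): 12312,
    ("Miami", "Seattle"): 4397,
}

# Assumption (not from the text): light in fibre covers about 200 km per ms.
FIBRE_KM_PER_MS = 200.0


def leg_km(a, b):
    """Distance of a single leg, looked up in either direction."""
    if (a, b) in DISTANCE_KM:
        return DISTANCE_KM[(a, b)]
    return DISTANCE_KM[(b, a)]


def path_km(hops):
    """Total distance along a path given as a list of city names."""
    return sum(leg_km(a, b) for a, b in zip(hops, hops[1:]))


def min_one_way_ms(hops):
    """Lower bound on one-way propagation delay along the path."""
    return path_km(hops) / FIBRE_KM_PER_MS


routes = {
    "Cape Town, as the crow flies": ["Buenos Aires", "Cape Town"],
    "Cape Town, via Miami": ["Buenos Aires", "Miami", "Cape Town"],
    "Seattle, via Miami": ["Buenos Aires", "Miami", "Seattle"],
}

for label, hops in routes.items():
    print(f"{label}: {path_km(hops):,} km, at least {min_one_way_ms(hops):.1f} ms one way")
```

Via Miami, the path to Cape Town comes to 19,401 km against 11,486 km to Seattle, even though the direct distance to Cape Town is only 6,881 km.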

1.3 What is a “fixed point” on the Internet?

At a logical level, “the Google server” is a fixed point in the Internet. However, given the discussion of endpoints, above, it should be clear that there is no single Google server, or one single “Google fixed point”. The same is true of other major global services. For some end users, Google and Amazon services may be “close” to each other, and for other end users that may not be true. The difference stems from the fact that Google and Amazon each necessarily lay out their service CDN/duplication servers in ways that make sense to their own business, and not based on any global Internet service grid.

A “polestar” endpoint is one that is well known and fixed in the network – at a single IP address that is not anycast from multiple vantage points. This describes few major services today (anything popular is hosted by a CDN). Some NTP [3] servers, as general Internet infrastructure, fall into that category. Of course, services that are built out for the purpose of looking through the network towards fixed points can establish their own polestars.

1.4 Span and scope of measurements

With the variations outlined above, another challenge in setting up Internet measurements is ensuring appropriate span or scope of the measurements. For networks under your administrative control, you can manage and account for different factors, and you can install active or passive gatherers at any and all points as necessary. That gives you confidence in the measurements within your own network, but it doesn’t help address the variability of any measurements that reach outside it (e.g., toward a “polestar” server). It also doesn’t necessarily give information that is readily compared outside the scope of your own network.

To get global span, it is necessary to have some kind of reach into and/or through other networks, and diversity is important. The approaches discussed below outline how that has been addressed in projects to date.

[1] CPE is “customer premises equipment”; the box that connects to your ISP’s access network.

[2] To make matters worse, most routes actually go from Miami to some other network node, in places such as Colorado, US or Paris, France, before connecting to Cape Town.

[3] Network Time Protocol – see https://tools.ietf.org/html/rfc5905

Okay — we’re listening! Comments to measuring<at>techark.org would be most appreciated.



Submissions for the RAIM Workshop on Internet Measurements Close Friday!

If you’re working on some aspect of Internet measurement, in either academia or industry, you have until this Friday (7 August) to get your submission in to the organising committee for the Research and Applications of Internet Measurements (RAIM) workshop in Yokohama, Japan.

This one-day workshop will bring together researchers and practitioners who are advancing the state of the art in measuring networked systems to share measurement-based insights into operational networks, to improve the use of research in protocol design decisions, and to suggest relevant new topics for research.

The workshop is being organised by the Internet Research Task Force (IRTF) and the Internet Society (ISOC) in cooperation with ACM SIGCOMM to increase collaboration between industry and academia in the field of networking. The meeting will take advantage of the proximity of two important networking events: the 2015 ACM SIGCOMM Internet Measurement Conference (IMC) in Tokyo and the 94th Internet Engineering Task Force (IETF) meeting in Yokohama.

More information about why and how to participate is available from the workshop webpage. The submission deadline is Friday, 7 August 2015. Note that for academics a copy of a previously published academic paper related to the topics of the workshop is sufficient.

I hope to see you in Yokohama at RAIM 2015!