Focus Areas:    Antitrust | Big Data | internet

Debunking Elizabeth Warren’s Claim That “More Than 70% of All Internet Traffic Goes through Google or Facebook”

Truth on the Market View Original

In March of this year, Elizabeth Warren announced her proposal to break up Big Tech in a blog post on Medium. She tried to paint the tech giants as dominant players crushing their smaller competitors and strangling the open internet. This line in particular stood out: “More than 70% of all Internet traffic goes through sites owned or operated by Google or Facebook.

This statistic immediately struck me as outlandish, but I knew I would need to do some digging to fact check it. After seeing the claim repeated in a recent profile of the Open Markets Institute — “Google and Facebook control websites that receive 70 percent of all internet traffic” — I decided to track down the original source for this surprising finding.

Warren’s blog post links to a November 2017 Newsweek article — “Who Controls the Internet? Facebook and Google Dominance Could Cause the ‘Death of the Web’” — written by Anthony Cuthbertson. The piece is even more alarmist than Warren’s blog post: “Facebook and Google now have direct influence over nearly three quarters of all internet traffic, prompting warnings that the end of a free and open web is imminent.

The Newsweek article, in turn, cites an October 2017 blog post by André Staltz, an open source freelancer, on his personal website titled “The Web began dying in 2014, here’s how”. His takeaway is equally dire: “It looks like nothing changed since 2014, but GOOG and FB now have direct influence over 70%+ of internet traffic.” Staltz claims the blog post took “months of research to write”, but the headline statistic is merely aggregated from a December 2015 blog post by Parse.ly, a web analytics and content optimization software company.

Source: André Staltz

The Parse.ly article — “Facebook Continues to Beat Google in Sending Traffic to Top Publishers” — is about external referrals (i.e., outside links) to publisher sites (not total internet traffic) and says the “data set used for this study included around 400 publisher domains.” This is not even a random sample much less a comprehensive measure of total internet traffic. Here’s how they summarize their results: “Today, Facebook remains a top referring site to the publishers in Parse.ly’s network, claiming 39 percent of referral traffic versus Google’s share of 34 percent.”

Source: Parse.ly

So, using the sources provided by the respective authors, the claim from Elizabeth Warren that “more than 70% of all Internet traffic goes through sites owned or operated by Google or Facebook” can be more accurately rewritten as “more than 70 percent of external links to 400 publishers come from sites owned or operated by Google and Facebook.” When framed that way, it’s much less conclusive (and much less scary).

But what’s the real statistic for total internet traffic? This is a surprisingly difficult question to answer, because there is no single way to measure it: Are we talking about share of users, or user-minutes, of bits, or total visits, or unique visits, or referrals? According to Wikipedia, “Common measurements of traffic are total volume, in units of multiples of the byte, or as transmission rates in bytes per certain time units.”

One of the more comprehensive efforts to answer this question is undertaken annually by Sandvine. The networking equipment company uses its vast installed footprint of equipment across the internet to generate statistics on connections, upstream traffic, downstream traffic, and total internet traffic (summarized in the table below). This dataset covers both browser-based and app-based internet traffic, which is crucial for capturing the full picture of internet user behavior.

Source: Sandvine

Looking at two categories of traffic analyzed by Sandvine — downstream traffic and overall traffic — gives lie to the narrative pushed by Warren and others. As you can see in the chart below, HTTP media streaming — a category for smaller streaming services that Sandvine has not yet tracked individually — represented 12.8% of global downstream traffic and Netflix accounted for 12.6%. According to Sandvine, “the aggregate volume of the long tail is actually greater than the largest of the short-tail providers.” So much for the open internet being smothered by the tech giants.

Source: Sandvine

As for Google and Facebook? The report found that Google-operated sites receive 12.00 percent of total internet traffic while Facebook-controlled sites receive 7.79 percent. In other words, less than 20 percent of all Internet traffic goes through sites owned or operated by Google or Facebook. While this statistic may be less eye-popping than the one trumpeted by Warren and other antitrust activists, it does have the virtue of being true.

Source: Sandvine