Any Two Pages on the Web Are Connected By 19 Clicks or Less
There are more than 14 billion pages on the web, but they are linked by hyperconnected nodes, like Hollywood actors connected through Kevin Bacon
Note: After publishing this article, it came to our attention that Barabási originally made this finding in 1999, and it was merely referenced in the recent publication. We regret the error.
No one knows for sure how many individual pages are on the web, but right now, it’s estimated that there are more than 14 billion. Recently, though, Hungarian physicist Albert-László Barabási discovered something surprising about this massive number: Like actors in Hollywood connected by Kevin Bacon, these pages are closely linked, and from any one of them you can navigate to any other in 19 clicks or less.
Barabási’s findings, noted yesterday in Philosophical Transactions of the Royal Society (Correction: initially made way back in 1999), involved a simulated model of the web that he created to better understand its structure. He discovered that of the roughly 1 trillion web documents in existence (the aforementioned 14 billion-plus pages, along with every image, video or other file hosted on every single one of them), the vast majority are poorly connected, linked to perhaps just a few other pages or documents.
Distributed across the entire web, though, is a minority of pages (search engines, indexes and aggregators) that are very highly connected and can be used to move from one area of the web to another. These nodes serve as the “Kevin Bacons” of the web, allowing users to navigate from most areas to most others in 19 clicks or less.
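To make the role of these hubs concrete, here is a minimal Python sketch that counts clicks with a breadth-first search. It is an illustration only, not Barabási’s model: the ten-page chain, the page names and the single "index" hub are all hypothetical.

```python
from collections import deque


def click_distance(links, start, goal):
    """Return the fewest clicks from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt == goal:
                return clicks + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None


# Hypothetical toy web: ten poorly connected pages, each linking only to the next.
chain = {f"page{i}": [f"page{i + 1}"] for i in range(9)}
chain["page9"] = []

# The same pages plus one hub ("index") that every page links to
# and that links back to every page.
with_hub = {page: targets + ["index"] for page, targets in chain.items()}
with_hub["index"] = list(chain)

print(click_distance(chain, "page0", "page9"))     # 9
print(click_distance(with_hub, "page0", "page9"))  # 2
```

In this toy case, adding one well-connected page cuts the trip from nine clicks to two, which is the same effect the hubs have on the real web at a much larger scale.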
Barabási credits this “small world” of the web to human nature—the fact that we tend to group into communities, whether in real life or the virtual world. The pages of the web aren’t linked randomly, he says: They’re organized in an interconnected hierarchy of organizational themes, including region, country and subject area.
Interestingly, this means that no matter how large the web grows, the same interconnectedness will rule. Barabási analyzed the network looking at a variety of levels—examining anywhere from a tiny slice to the full 1 trillion documents—and found that regardless of scale, the same 19-click-or-less rule applied.
This arrangement, though, reveals cybersecurity risks. Barabási writes that knocking out a relatively small number of the crucial nodes that connect the web could isolate various pages and make it impossible to move from one to another. Of course, these vital nodes are among the most robustly protected parts of the web, but the findings still underline the significance of a few key pages.
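The fragility described here can be sketched with a toy example in Python. Everything in it is an assumption made for illustration: three tiny communities whose pages link only to each other and to a single hypothetical "hub" page.

```python
def reachable(links, start):
    """Return the set of pages that can be reached from start by following links."""
    seen = {start}
    stack = [start]
    while stack:
        page = stack.pop()
        for nxt in links.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def without(links, removed):
    """Return a copy of the link map with one page and all links to it removed."""
    return {
        page: [t for t in targets if t != removed]
        for page, targets in links.items()
        if page != removed
    }


# Hypothetical toy web: three communities joined only through one hub.
web = {
    "hub": ["a1", "b1", "c1"],
    "a1": ["a2", "hub"], "a2": ["a1"],
    "b1": ["b2", "hub"], "b2": ["b1"],
    "c1": ["c2", "hub"], "c2": ["c1"],
}

print(len(reachable(web, "a2")))                     # 7
print(sorted(reachable(without(web, "hub"), "a2")))  # ['a1', 'a2']
```

With the hub in place, every page is reachable from "a2". Once the hub is removed, "a2" can reach only its own community.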
To get an idea of what this interconnected massive network actually looks like, head over to the Opte Project, an endeavor started by Barrett Lyon in 2003 to create publicly available visualizations of the web. In the map above, for example, red lines represent links between web pages in Asia, green for Europe, the Middle East and Africa, blue for North America, yellow for Latin America and white for unknown IP addresses. Although the most recent visualization is several years old, Lyon reports that he’s currently working on a new version of the project that will be released soon.