arXiv:cond-mat/0407339AbstractReferencesReviewsResources
Why Mapping the Internet is Hard
Aaron Clauset, Cristopher Moore
Published 2004-07-13Version 1
Despite great effort spent measuring topological features of large networks like the Internet, it was recently argued that sampling based on taking paths through the network (e.g., traceroutes) introduces a fundamental bias in the observed degree distribution. We examine this bias analytically and experimentally. For classic random graphs with mean degree c, we show analytically that traceroute sampling gives an observed degree distribution P(k) ~ 1/k for k < c, even though the underlying degree distribution is Poisson. For graphs whose degree distributions have power-law tails P(k) ~ k^-alpha, the accuracy of traceroute sampling is highly sensitive to the population of low-degree vertices. In particular, when the graph has a large excess (i.e., many more edges than vertices), traceroute sampling can significantly misestimate alpha.