Why isn't URL X included in Netograph?

There are several reasons why a given URL might not be included in Netograph.

  • Netograph only monitors links passing through Reddit, Hacker News, Delicious, Pinboard, and Digg.
  • Netograph only surveys actual webpages. If a URL points directly to an image or some other document type, it's skipped.
  • I apply some throttling to avoid probing a domain too often. For instance, a very large fraction of links that pass through Reddit are links to Reddit itself, imgur.com, or quickmeme.com. Without throttling, I would probe these domains thousands of times a day, which is pointless.
  • A range of "weird" behaviours can cause Netograph to skip a URL. These include timeouts, oversized documents, too many HTTP redirects, and so forth.
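
To make the throttling and page-type rules above more concrete, here is a minimal Python sketch. It is not Netograph's actual code: the class name, the function name, the per-domain limit, and the one-day window are all assumptions chosen for illustration, since this FAQ does not state the real values.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical values; the FAQ gives no actual limits.
MAX_PROBES_PER_DOMAIN = 5
WINDOW_SECONDS = 24 * 60 * 60


class DomainThrottle:
    """Allow at most `limit` probes per hostname within a sliding window."""

    def __init__(self, limit=MAX_PROBES_PER_DOMAIN, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._probes = defaultdict(list)  # hostname -> timestamps of recent probes

    def allow(self, url, now):
        """Return True and record the probe if the URL's host is under its limit."""
        host = (urlparse(url).hostname or "").lower()
        # Keep only the probes that are still inside the window.
        recent = [t for t in self._probes[host] if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._probes[host] = recent
        return allowed


def is_webpage(content_type):
    """True if a Content-Type header value looks like an HTML page."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")
```

With a throttle like this, the first few links to imgur.com in a day would be probed and the rest skipped, while a link to a rarely seen domain would still get through. Note that the sketch keys on the full hostname, so i.imgur.com and imgur.com are counted separately; a real implementation might group them.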

Why isn't netograph.com itself included in Netograph?

Indexing netograph.com with Netograph would cause a catastrophic rupture in the space-time continuum. Rest assured that Netograph sets no persistent state, and doesn't load resources from any third party.

What limitations should I be aware of?

Pages can make more requests and store more persistent state when users click, hover or otherwise interact with page elements. There's simply no way to generate this type of input exhaustively without a real human at the keyboard, so any "extra" activity of this kind does not show up on Netograph.

You should also be aware that Netograph doesn't track persistent state set by plugins other than Flash. For instance, Microsoft Silverlight local storage can be used for nefarious purposes in much the same way as Flash local storage.