Bill French of MyST Technology Partners is an insightful guy with a rare combination of business acumen, 'top of the heap' Information Architecture skill and programming experience. He’s a thought leader on how Web 2.0, Blogs, RSS and syndicated content formats will and do apply to the Enterprise. It's relevant to mention that MyST provides Intel, Verisign, Borland and many other brand-name technology companies with their Web 2.0 architecture. I believe there are companies stepping forward with some reliable 'Blog Feed Subscription Tracking' solutions, and I had a discussion with Bill on the subject, as follows:
Q: "Correct me if I’m wrong, but it may be a while before Blog Feed Subscription tracking can be perfected for the Blogger or Enterprise Blogger?"
BF: "The web (really, the HTTP protocol) was never designed to be measured. Web 2.0 tosses a new curve ball to that reality. It may be a very long while."
Q: "I was speaking with a prominent Web 2.0 Executive and he expressed pride in their world-class caching* techniques. It wasn’t relevant to the conversation at hand, but I thought to myself, 'it’s where accurate feed tracking breaks down.' If the Feed or Newsreader Co. caches, they’re undermining the Content Management Software or Blog Software Cos.' ability to track Blog Feed Subscriptions accurately for their Bloggers."
BF: "Caching*-forward content has been around a long time; Akamai perfected the model in the late ’90s, and it now happens in almost every application. It’s necessary for many reasons (performance is a big one), and it will not likely end."
Q: "So if they grab & cache a Blog Post or Feed once and then redistribute to many, can there be an accurate tracking path between Blog & individual?"
BF: "Bloglines, MyYahoo, and pretty much everyone does this. Everything about syndication undermines tracking. If you have 300 unique IP addresses requesting a feed, that doesn’t mean you have 300 subscribers. One IP address could represent a proxy server for 70,000 people inside a firewall. Or one IP address could be a NewsGator server with 3500 subscribers behind it. Furthermore, one subscriber might have 30 different IP addresses in a single day. So all measures (that we’ve come to know for Web 1.0) are pretty much useless."
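To make Bill's arithmetic concrete, here is a minimal Python sketch of my own (not something Bill or MyST supplied). It compares the number of unique IP addresses a feed server would see with the number of people actually reading. The IP addresses, the reader labels and the helper names `unique_ip_count` and `true_reader_count` are all hypothetical, and the "readers behind each request" data is exactly what a real server log cannot see.

```python
# Each hit is (client_ip, set_of_reader_ids_served_by_that_request).
# The reader sets are illustrative assumptions: a real access log
# records only the IP address, never who sits behind it.

def unique_ip_count(hits):
    """What a Web 1.0 style log analysis would report."""
    return len({ip for ip, _ in hits})

def true_reader_count(hits):
    """The number of distinct people actually reached."""
    readers = set()
    for _, reader_ids in hits:
        readers.update(reader_ids)
    return len(readers)

hits = []

# One corporate proxy fronting 70,000 people inside a firewall.
hits.append(("203.0.113.10", {("corp", n) for n in range(70000)}))

# One aggregator server with 3,500 subscribers behind it.
hits.append(("198.51.100.7", {("aggregator", n) for n in range(3500)}))

# One person whose IP address changes 30 times in a day.
for n in range(1, 31):
    hits.append((f"192.0.2.{n}", {("roaming", 0)}))

print("unique IPs:  ", unique_ip_count(hits))
print("true readers:", true_reader_count(hits))
```

In this made-up scenario the two counts diverge in both directions at once: the proxy and the aggregator hide thousands of readers behind a single address, while the roaming reader inflates the IP count.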
Q: "That said, is it sensible for the Blogger to offer a ‘summary only’ feed whereby, at least, they can track the page views of those who click through to their Blog website in order to read the entire post?"
BF: "This is an often-debated point, and (in my view) it should be decided based on the content consumer’s requirements, not the producer’s requirements."
Q: "Tracking of ‘subscriber to Blog’ connections and therefore 'Return on Investment' metrics for a sponsor may not ever be a completely accurate science."
BF: "Correct."
* Cache: In computer science, a cache (pronounced kăsh) is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive (usually in terms of access time) to fetch or compute relative to reading the cache. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than refetching or recomputing the original data, so that the average access time is lower.
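For readers who prefer code to definitions, the following is a rough Python sketch of the caching behavior described in the footnote, and of why it hides subscribers from the blog that publishes the feed. The class names `OriginBlog` and `CachingAggregator` and the placeholder feed text are my own assumptions for illustration; no real newsreader's internals are being described.

```python
class OriginBlog:
    """Stands in for the blog server; counts the requests it can observe."""

    def __init__(self):
        self.hits = 0

    def fetch_feed(self):
        self.hits += 1
        return "<rss>placeholder feed</rss>"


class CachingAggregator:
    """Fetches the feed once, then serves every subscriber from its copy."""

    def __init__(self, origin):
        self.origin = origin
        self._cached_feed = None
        self.served = 0

    def serve(self, subscriber_id):
        # Only the first request reaches the origin; later ones read the cache.
        if self._cached_feed is None:
            self._cached_feed = self.origin.fetch_feed()
        self.served += 1
        return self._cached_feed


blog = OriginBlog()
aggregator = CachingAggregator(blog)

for subscriber_id in range(3500):
    aggregator.serve(subscriber_id)

print("subscribers served by aggregator:", aggregator.served)
print("requests seen by the blog:       ", blog.hits)
```

The aggregator knows it served 3,500 people, but the blog's own logs record a single request. A real aggregator would also refresh its copy periodically, which this sketch leaves out.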