6
votes

My records show a particular page of my web site was visited 609 times between July 2 and November 15.

Google Analytics reports only 238 page views during that time.

I can't explain this discrepancy.

For Google Analytics to track a page view event, the client browser must have JavaScript enabled and be able to access Google's servers. I doubt 60% of my visitors have either disabled JavaScript or firewalled outbound traffic to Google's tracking servers.

Do you have any explanation?

More Info

My application simply puts a record into a database as it serves up a page.

It doesn't do anything to distinguish a bot viewer from a human.
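
For illustration, the logging amounts to something like this (a simplified sketch; the "PageViews" table and "SiteDb" connection-string names here are made up):

// Simplified sketch of the server-side counter (hypothetical
// "PageViews" table and "SiteDb" connection string). It runs for
// every request the server handles, human or bot.
private void LogPageView()
{
    string connStr = System.Configuration.ConfigurationManager
        .ConnectionStrings["SiteDb"].ConnectionString;
    using (var conn = new System.Data.SqlClient.SqlConnection(connStr))
    using (var cmd = new System.Data.SqlClient.SqlCommand(
        "INSERT INTO PageViews (Url, UserAgent, ViewedAt) VALUES (@url, @ua, @at)", conn))
    {
        cmd.Parameters.AddWithValue("@url", Request.RawUrl);
        cmd.Parameters.AddWithValue("@ua", Request.UserAgent ?? "");
        cmd.Parameters.AddWithValue("@at", System.DateTime.UtcNow);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}

Since this runs on the server for every request, nothing stops a crawler from being counted.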


9 Answers

11
votes

The disparity is almost certainly from crawlers. It's not unheard-of for crawler traffic to be 10x user traffic.

That said, there's a really easy way to validate what's going on: add an ASPX page which emits an uncacheable, 1x1-pixel clear GIF image (aka a "web bug"), and include an IMG tag referencing that image on every page of your site (e.g. in a header or footer). Then parse your logs for hits to that image, looking at a query-string parameter on the image call (e.g. "referer=") so you'll know the actual URL of the pageview.

Since crawlers and other bots don't pull images (well, Google Images will, but not images sized at 1x1 pixel in the IMG tag!), you'll get a much more accurate count of pageviews. Behind the scenes, most analytics software (including Google Analytics) uses a similar approach, except that it uses JavaScript to build the image URL and make the image request dynamically. If you use Fiddler to watch the HTTP requests made on a site that uses Google Analytics, you'll see a 1px GIF returned from www.google-analytics.com.

The numbers won't line up exactly (for example, users who quickly cancel a navigation via the back button may have downloaded one image but not the other) but you should see roughly comparable results. If you don't, then chances are you don't have Google Analytics set up correctly on all your pages.

Here's a code sample illustrating the technique.

In your header (note the random number to prevent caching):

<img src="PageviewImage.aspx?rand=<%=new System.Random().NextDouble( )%>&referer=<%=Request.UrlReferrer==null ? "" : Server.HtmlEncode(Request.UrlReferrer.ToString()) %>"
  width="0" height="0" hspace="0" vspace="0" border="0" alt="pageview check">

The image generator, PageviewImage.aspx:

private void Page_Load(object sender, System.EventArgs e)
{
    // Mark the response uncacheable as a belt-and-braces measure
    // alongside the random query-string parameter in the IMG tag.
    Response.Cache.SetCacheability(System.Web.HttpCacheability.NoCache);

    // Serve the transparent 1x1 GIF.
    Response.ContentType = "image/gif";
    string filepath = Server.MapPath("~/images/clear.gif");
    Response.WriteFile(filepath);
}
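
To turn the logged image requests into counts, a quick parse along these lines works (a rough sketch; it assumes W3C-format IIS logs with cs-uri-stem as the fifth space-separated field, and the log path is made up, so adjust both to your configuration):

using System;
using System.IO;

class CountPageviews
{
    static void Main()
    {
        // Count hits to the tracking image in an IIS W3C-format log.
        int pageviews = 0;
        foreach (string line in File.ReadAllLines(@"C:\inetpub\logs\ex091115.log"))
        {
            if (line.StartsWith("#")) continue;   // skip header directives
            string[] fields = line.Split(' ');
            if (fields.Length > 4 && fields[4].EndsWith(
                    "PageviewImage.aspx", StringComparison.OrdinalIgnoreCase))
            {
                pageviews++;   // one hit to the web bug
            }
        }
        Console.WriteLine("Tracked pageviews: " + pageviews);
    }
}

From there you could also pull the "referer" value out of the cs-uri-query field to break the count down per page.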

BTW, if you need the image file itself, do a Save As from here.

This is of course not a substitute for a "real" analytics system like Google's, but if you just want to cross-check, the approach above should work OK.

4
votes

Could the rest of the page views be from crawlers - either Googlebot or others?

2
votes

Are you looking at unique page views in Analytics and total page views in your logs?

1
votes

Probably crawlers. Our website was being hit every couple of hours by robots.
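
One quick way to gauge how much of your own count is bot traffic is to check the User-Agent strings you've logged; a rough sketch (the marker list below is only a small sample, and no such list is exhaustive):

// Rough heuristic: flag a logged hit as a bot by User-Agent substring.
static bool LooksLikeBot(string userAgent)
{
    if (string.IsNullOrEmpty(userAgent))
        return true; // real browsers nearly always send a User-Agent
    string ua = userAgent.ToLowerInvariant();
    string[] markers = { "bot", "crawler", "spider", "slurp" };
    foreach (string m in markers)
        if (ua.Contains(m)) return true;
    return false;
}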

1
votes

Are you positive the site is working properly in all browsers? I've seen analytics thrown off by pages that fail to render properly in Firefox but work fine in IE, and vice versa.

1
votes

Maybe your server-side tracker records every hit, even repeated hits from the same IP address (the same visitor hitting the page twice).
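
If so, you can approximate uniques from your own data by collapsing repeat hits from the same IP inside a time window; a rough sketch (analytics packages sessionize visits in more sophisticated ways than this):

using System;
using System.Collections.Generic;
using System.Linq;

static class UniqueCounter
{
    // Collapse hits from the same IP within a 30-minute window into one
    // "unique" view; each hit is an (IP address, timestamp) pair.
    public static int CountUniqueViews(IEnumerable<KeyValuePair<string, DateTime>> hits)
    {
        int unique = 0;
        var lastSeen = new Dictionary<string, DateTime>();
        foreach (var hit in hits.OrderBy(h => h.Value))   // chronological order
        {
            DateTime last;
            if (!lastSeen.TryGetValue(hit.Key, out last) ||
                hit.Value - last > TimeSpan.FromMinutes(30))
            {
                unique++;   // first hit from this IP, or outside the window
            }
            lastSeen[hit.Key] = hit.Value;
        }
        return unique;
    }
}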

1
votes

It's not that unlikely: many visitors have JavaScript turned off, or have the CustomizeGoogle Firefox extension installed.

1
votes

Given the time stamp of the last comment, I thought I'd leave an update here: Google recently announced a browser add-on that lets users opt out of Google Analytics on the client side, meaning that if you don't want website owners to track your movements, you can effectively become invisible on sites measured by Google Analytics. This could further offset your data points. In a separate thread, I suggested running two web analytics tools (there are many free ones to choose from) and measuring them against each other.

1
votes

Justin's answer is very good. I would just add this as a comment but I'm lacking powerpoints :P

One thing to keep in mind, too, when comparing analytics systems, is that there's always some discrepancy to be expected:

The methodology of page tagging with JavaScript in order to collect visit data has now been well established over the past 8 years or so. Given a best-practice deployment of Google Analytics, Nielsen SiteCensus or Yahoo Web Analytics, high-level metrics remain comparable. That is, they can be expected to lie within 10-20% of each other. [link]