What Experts Are Saying About Website Visitor Statistics

If you run a website, you probably check its visitor statistics reports now and then. Most site owners and managers look at where their traffic comes from, what position the site holds on search engines for the terms they care about, and so on.

Your site is seen by far fewer people than you think

Yet what if I told you that a large percentage of that traffic data is fake or, more precisely, not generated by a human being?

This week, I invited Jorge Enrique Aguayo, an SEO specialist at InTechCenter, to explain to us in greater detail how our website visitor statistics reports do not show what they should, why, and what we can do if we want to have more realistic numbers.

Without further ado, here is what Jorge has to say on the matter:

How web counters work

Web counters work by adding up the number of times a piece of counting code is executed. There are currently two ways to do this: either the code is written in JavaScript and a hit is counted every time a browser executes it, or the page calls an image (usually tiny or hidden, so it does not alter the design) and a hit is counted every time that image is loaded from the web server.

The mechanism is simple: every time something calls the counter code, a database adds one. Then the software prepares statistics pages so that the manager can read reports and get an idea of how many people are actually visiting the site.
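The counting step described above can be sketched in a few lines of JavaScript. This is only an illustration, not how any particular counter service is implemented: `createHitCounter`, `recordHit` and `report` are hypothetical names, and an in-memory `Map` stands in for the database.

```javascript
// Minimal sketch of a hit counter. An in-memory Map stands in for the
// database; every name here is hypothetical.
function createHitCounter() {
  const hitsByPage = new Map();

  return {
    // Called once per request for the counting script or tracking image.
    recordHit(pagePath) {
      hitsByPage.set(pagePath, (hitsByPage.get(pagePath) || 0) + 1);
    },
    // Builds the data behind a simple statistics report.
    report() {
      let total = 0;
      for (const count of hitsByPage.values()) total += count;
      return { total, pages: Object.fromEntries(hitsByPage) };
    },
  };
}

const counter = createHitCounter();
counter.recordHit("/");
counter.recordHit("/about");
counter.recordHit("/");
console.log(counter.report());
```

Notice that nothing in `recordHit` asks who or what made the request: every call adds one, whatever its origin.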

All web counter applications are based on the call counting described above and, to generate more reliable results, often include both methods in the same code. On one hand, they use the JavaScript approach; on the other, they include a “noscript” tag with the image call as a fallback in case JavaScript is not available.

Here is an example extracted from StatCounter.com:

<!-- Start of StatCounter Code for Default Guide -->

<script type="text/javascript">
var sc_project=XXXXXXX;
var sc_invisible=1;
var sc_security="XXXXXXX";
var scJsHost = (("https:" == document.location.protocol) ? "https://secure." : "http://www.");
document.write("<sc"+"ript type='text/javascript' src='" + scJsHost+ "statcounter.com/counter/counter.js'></"+"script>");
</script>

<noscript><div class="statcounter"><a title="hits counter" href="http://statcounter.com/" target="_blank"><img class="statcounter" src="http://c.statcounter.com/XXXXXXX/0/XXXXXXX/1/" alt="hits counter"></a></div></noscript>

<!-- End of StatCounter Code for Default Guide -->

See how there are both “script” and “noscript” sections in the code? That is how these counter codes are implemented in almost every case.

Where do fake visits come from?

Although the counting mechanism is simple and effective, and should work just as described above, it rests on a flawed assumption: that every request comes from a human being sitting in front of a computer or other device.

The most disturbing fact, however, is that around sixty percent (yes, 60%) of web traffic comes from robots, and I am afraid those requests are being counted as visits too.

Let me prove it

A claim this bold should always be backed up with facts, so let me prove it in two different ways.

Last year Google Analytics added an option to view your website visitor statistics with bot traffic removed from the final count. If you use Google Analytics, test it yourself by clicking on Admin > View Settings.

Check the box that says “Exclude all hits from known bots and spiders” and watch how your traffic goes down over the following days. You may be surprised by the change. A few of you could even get depressed.
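The idea behind an “exclude known bots” filter can be sketched as follows. This is a rough illustration, not how Google Analytics works internally: the pattern list is a tiny made-up sample, `isLikelyBot` and `splitHits` are hypothetical names, and treating an empty user agent as a bot is an assumption of this sketch.

```javascript
// Illustrative only: a tiny sample pattern list, not the list that
// Google Analytics or any other tool actually uses.
const BOT_PATTERNS = [/bot/i, /spider/i, /crawl/i, /httrack/i];

function isLikelyBot(userAgent) {
  // Assumption: a request with no user agent is treated as a bot.
  if (!userAgent) return true;
  return BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
}

// Splits a list of user-agent strings into bot and human hit counts.
function splitHits(userAgents) {
  const bots = userAgents.filter(isLikelyBot).length;
  return { total: userAgents.length, bots, humans: userAgents.length - bots };
}

const sampleHits = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0",
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  "HTTrack 3.0x",
  "",
];
console.log(splitHits(sampleHits));
```

A filter like this only catches bots that identify themselves honestly, which is why such options speak of “known” bots.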

The other way requires some technical knowledge: blocking bots in your website's configuration files (robots.txt, .htaccess or both), so that your visitor statistics are as realistic as possible.

See how my own stats went down once I implemented the change on my website. It was shocking to say the least:

[Image: stats-after-blocking-bots]

How to block bots to get more realistic statistics

If you are like me, and knowing that bots are messing up your statistics bugs you, there is something you can do. As mentioned before, the change requires some technical knowledge, but it is not rocket science either.

What bots are we interested in blocking? Mainly those that…

  • …are used to clone or download a whole website, like HTTrack
  • …get onto your site just looking for e-mail addresses to harvest (spammers love these ones)
  • …get onto your site to try finding vulnerabilities a cracker may exploit

…and many others! The list can get long, as a matter of fact.

How do you block them? Basically, you list them in your site's robots.txt file. Keep in mind that robots.txt is only a request: well-behaved bots honor it, while others simply ignore it. For a more aggressive approach, consider blocking them in the .htaccess file too.

In a robots.txt file, you add these two lines for every bot, with no blank line between them:

User-agent: {name of the bot here}
Disallow: /
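If your list of bots is long, writing those entries by hand gets tedious. Here is a minimal sketch of a generator, assuming you already have the bot names in an array. `buildRobotsTxt` is a hypothetical helper and `ExampleHarvester` is a made-up bot name used only for illustration.

```javascript
// Builds robots.txt entries that ask each named bot to stay away from the
// whole site. Entries are separated by a blank line.
function buildRobotsTxt(botNames) {
  return (
    botNames
      .map((name) => `User-agent: ${name}\nDisallow: /`)
      .join("\n\n") + "\n"
  );
}

// "ExampleHarvester" is a made-up name; replace it with real entries
// from your own list.
console.log(buildRobotsTxt(["HTTrack", "ExampleHarvester"]));
```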

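For a .htaccess file, the instruction is a little different. Add one RewriteCond %{HTTP_USER_AGENT} line for every bot you want to block. Every line except the last one ends with [NC,OR]; the last one ends with just [NC], because a trailing OR would make the rule apply to every visitor:

<IfModule mod_rewrite.c>

RewriteEngine on

# One condition per bot; every condition except the last carries OR.
RewriteCond %{HTTP_USER_AGENT} ^{name of the first bot} [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^{name of the last bot} [NC]

# Answer matching requests with 403 Forbidden.
RewriteRule .* - [F]

</IfModule>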
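The same block can be generated from a list of names. The sketch below is one possible way to do it, not a definitive implementation: `escapeRegex` and `buildHtaccessBlock` are hypothetical helpers, and `Bad.Bot` is a made-up name. It puts the OR flag on every condition except the last one, since a trailing OR on the final condition would make the rule match every request. It also assumes bot names contain no spaces, and it matches the name anywhere in the user agent rather than only at the start, which is a choice made for this sketch.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds a mod_rewrite block that answers 403 Forbidden to the given bots.
// Assumption: bot names contain no whitespace.
function buildHtaccessBlock(botNames) {
  if (botNames.length === 0) return "";
  const conditions = botNames.map((name, index) => {
    const isLast = index === botNames.length - 1;
    // Only the last condition omits OR.
    const flags = isLast ? "[NC]" : "[NC,OR]";
    return `RewriteCond %{HTTP_USER_AGENT} ${escapeRegex(name)} ${flags}`;
  });
  return [
    "<IfModule mod_rewrite.c>",
    "RewriteEngine On",
    ...conditions,
    "RewriteRule .* - [F]",
    "</IfModule>",
  ].join("\n");
}

// "Bad.Bot" is a made-up name used to show the escaping.
console.log(buildHtaccessBlock(["HTTrack", "Bad.Bot"]));
```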

Conclusion and what to do next

Where can you get a list of bots to block? There are many lists available on the web. Try to find a comprehensive one and implement it.

If you would like to save time, feel free to contact me and I will be glad to send you the list I currently use for my projects.

That way I can also check how many people actually read this post. 😉

Has this article enlightened you about real website visitor statistics?

If you have any questions or suggestions, please use the comments below. 🙂

Featured image:

Website Visitor Statistics

Image courtesy of Stuart Miles at FreeDigitalPhotos.net

 
