This page is about what happens when somebody connects to your web site, and what statistics you can and can't calculate. There is a lot of confusion about this. It's not helped by statistics programs which claim to calculate things which cannot really be calculated, only estimated, with varying degrees of accuracy. The simple fact is that certain data which we are used to knowing for traditional print and even broadcast media are simply not available on the web.
I should say that these ideas did not originate with me. In particular, I can recommend four excellent articles about this subject: Interpreting WWW Statistics by Doug Linder; Making Sense of Web Usage Statistics by Dana Noonan; Getting Real about Usage Statistics by Tim Stehle; and, the most negative of all, Why Web Usage Statistics are (Worse Than) Meaningless by Jeff Goldberg.
So, what do you know when I visit your site? First, I make one request for your front page. You know the date and time of the request and which page I asked for (of course), and the internet address of my computer (my host). I also usually tell you which page referred me to your site, and the make and model of my browser. I do not tell you my user name or my e-mail address.
Next, I look at the page (or rather my browser does) to see if it's got any graphics on it. If so, and if I've got image loading turned on in my browser, I make a separate connection to retrieve each of these graphics. I never log into your site: I just make a sequence of requests, one for each new file I want to download. The referring page for each of these graphics is your front page. Maybe there are 10 graphics on your front page. Then so far I've made 11 requests to your server.
After that, I go and visit some of your other pages, making a new request for each page and graphic that I want. Finally, I follow a link out of your site. You never know about that at all. I just connect to the next site without telling you.
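To make this concrete, here is a minimal sketch of pulling those pieces of information out of one line of a server log. It assumes the log is written in the widely used "combined" layout, which this page does not specify; the sample line, the host name and the names LOG_PATTERN and parse_line are all invented for illustration.

```python
import re

# Assumed "combined" log layout; real servers can be configured differently.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return the logged fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Invented sample line: one request for the front page.
sample = ('host1.example.ac.uk - - [12/Mar/1998:10:15:32 +0000] '
          '"GET /index.html HTTP/1.0" 200 4523 '
          '"http://www.example.com/links.html" "Mozilla/4.0"')

entry = parse_line(sample)
# What the server knows: host, time, page, referrer, browser.
print(entry["host"], entry["time"], entry["path"])
print(entry["referrer"], entry["agent"])
```

Notice what is absent: there is no field for a user name or e-mail address, and nothing at all is recorded when I leave your site.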
The other sort of cache is on a larger scale. I'm in the UK. Because the link across the Atlantic is sometimes very congested, we've set up a national cache. (Many individual ISPs do the same thing.) I can set my browser to get your pages from the national cache instead of directly from you. If anyone else in the country has used the cache to look at your pages recently, the cache will have saved them, and will give them out to me without ever telling you about it. So hundreds of people could read your pages, even though you'd only sent them out once. Also, if the page I wanted wasn't already stored in the cache, the cache would ask for it from you on my behalf. This would mean that the request appeared to come from the cache, rather than from me. If several people did this, you would think that only one host was accessing your site, rather than lots of different ones.
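The effect of a shared cache on your logs can be sketched with a toy model. This is only an illustration under simplifying assumptions (one page, a cache that never expires anything), and every host name in it is invented.

```python
def simulate(readers, cache_host="cache.example.net"):
    """Return (log, number_of_readers) for one page under a simple cache model.

    `readers` is a list of (host, via_cache) pairs, one per person.
    """
    log = []          # (host, path) entries the origin server would record
    cached = False    # whether the shared cache already holds the page
    for host, via_cache in readers:
        if not via_cache:
            # Direct request: the server sees the reader's own host.
            log.append((host, "/index.html"))
        elif not cached:
            # Cache miss: the cache fetches the page on the reader's behalf,
            # so the server sees the cache's address, not the reader's.
            log.append((cache_host, "/index.html"))
            cached = True
        # Cache hit: the server is never contacted, so nothing is logged.
    return log, len(readers)

# 100 people reading through the cache, plus one connecting directly.
readers = [("pc%d.example.ac.uk" % i, True) for i in range(100)]
readers.append(("direct.example.org", False))

log, people = simulate(readers)
hosts = {host for host, _ in log}
print(people, "readers,", len(log), "logged requests,", len(hosts), "distinct hosts")
```

In this model 101 people read the page, but the server's log shows only 2 requests from 2 hosts.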
You can also know what people told you their browsers were, and what the referring pages were. You should be aware, though, that many browsers lie deliberately about what sort of browser they are, or even let users configure the browser name. Also, some browsers send incorrect referrers, telling you the last page that the user was on even if they weren't referred by that page.
Successful requests
Average successful requests per day
Successful requests for pages
Average successful requests for pages per day
Failed requests
Redirected requests
Distinct files requested
Distinct hosts served
Data Transferred
Average data transferred per day
Monthly Report
Daily Report
Hourly Report
Domain Report
Directory Report
File Type Report
Status Code Report
Request Report
The general summary
This is a good summary to look at and provides some nice numbers. The first number listed is the total since the logfile was created; that date is listed above. The number in parentheses is the amount for the last week.
Successful requests is the total number of hits to your web site. A hit occurs any time someone downloads a file from your site: each HTML file counts as a hit when someone reads it, and so does each graphic, sound, animation or other file that they see. For example, take a page made up of an HTML file, a logo and a picture, with a link to a downloadable file. Just viewing the page is 3 hits (the HTML file, the logo and the picture), and clicking on the link to download the file is another hit. As you can see, this isn't always a very useful number for tracking visitors, but it will give you a general idea of traffic to your site.
Average successful requests per day is the same as above, but averaged per day.
Successful requests for pages is the total number of HTML pages that have been requested. It does not include graphics, sounds, etc. This is a good number to look at to see whether people are actually viewing a large amount of the site. The higher the number here, the more of the site has been viewed, or else many people have viewed just the home page. This is expanded upon in the Request Report at the bottom of the web stats.
Average successful requests for pages per day is the same as above, but averaged per day.
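The difference between hits and page requests can be worked through on a tiny invented sample. This is a sketch, not how the statistics program itself is implemented: the file names are made up, and both the rule for what counts as a page (PAGE_SUFFIXES) and the rule for what counts as successful (is_success) are assumptions.

```python
from datetime import date

# Invented sample: (day, path, status). One visitor views a page with a logo
# and a picture, downloads a linked file the next day, and someone else
# follows a dead link.
requests = [
    (date(1998, 3, 12), "/index.html", 200),
    (date(1998, 3, 12), "/logo.gif", 200),
    (date(1998, 3, 12), "/picture.jpg", 200),
    (date(1998, 3, 13), "/files/report.zip", 200),
    (date(1998, 3, 13), "/old-page.html", 404),
]

PAGE_SUFFIXES = (".html", ".htm", "/")   # assumption: what counts as a page

def is_success(status):
    # Assumption: 2xx responses, plus 304 (not modified), count as successful.
    return 200 <= status < 300 or status == 304

successful = [r for r in requests if is_success(r[2])]
pages = [r for r in successful if r[1].endswith(PAGE_SUFFIXES)]
all_days = [r[0] for r in requests]
days = (max(all_days) - min(all_days)).days + 1   # days covered by the log

print("Successful requests:", len(successful))
print("Average successful requests per day:", len(successful) / days)
print("Successful requests for pages:", len(pages))
print("Average successful requests for pages per day:", len(pages) / days)
```

Here one page view produces 4 successful requests but only 1 page request, which is why the page figure is usually the better guide to how much of the site is being read.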
Failed requests occur if something is wrong with part of your web site, or if people try to go to a part of the site that isn't there any more, for example by following a link from somewhere else to a page that you have removed. They also occur because search engines look for an optional instruction file called robots.txt, which may not exist.
Redirected requests indicate that the user was directed to a different file instead of the one requested. The most common cause is that the user has incorrectly requested a directory name without the trailing slash. The server replies with a redirection and the user then makes a second connection to get the correct document (although usually the browser does it automatically without the user's intervention or knowledge). The other common cause of redirected requests is their use in "click-thru" advertising banners.
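The trailing-slash case can be sketched as a toy server decision. The directory names and the respond function are hypothetical, and the use of status 301 is an assumption (some servers answer with 302 instead).

```python
# Hypothetical set of directories on the server.
DIRECTORIES = {"/docs", "/images"}

def respond(path):
    """Return (status, location) for a request, redirecting bare directory names."""
    if path in DIRECTORIES:
        # Directory requested without the trailing slash: send a redirection
        # pointing at the same name with the slash added.
        return 301, path + "/"
    return 200, None

# First connection: the user asks for the directory without the slash.
first_status, location = respond("/docs")
# Second connection: the browser follows the redirection automatically.
second_status, _ = respond(location)

print(first_status, location, second_status)
```

One visit to the directory therefore shows up as two requests: one redirected and one successful.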
Distinct files requested is the number of files with distinct names that have been served. This number won't change much unless you change the file names in your web site.
Every computer on the internet has a unique address. Distinct hosts served counts the number of unique addresses that have been to your site. This is only a rough guide to the number of people who have seen your web site, because several people can share one computer or one cache, and one person can use several computers. At the very least it correlates with the number of computers that have been to your web site.
Data transferred is the amount of data sent to the browser from your web site. This number will grow as people view more of the site, or download files from the site.
Average data transferred per day is the same as above, but averaged per day.
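These last few figures are simple to compute once the log has been read. Here is a minimal sketch on invented data; the hosts, paths and byte counts are made up, and the two-day period is an assumption.

```python
# Invented sample: (host, path, bytes_sent) for successful requests.
served = [
    ("host1.example.ac.uk", "/index.html", 4523),
    ("host1.example.ac.uk", "/logo.gif", 1200),
    ("cache.example.net", "/index.html", 4523),
    ("host2.example.org", "/files/report.zip", 250000),
]
days_covered = 2   # assumed length of the period the log covers

distinct_files = {path for _, path, _ in served}   # each file name once
distinct_hosts = {host for host, _, _ in served}   # each address once
total_bytes = sum(size for _, _, size in served)

print("Distinct files requested:", len(distinct_files))
print("Distinct hosts served:", len(distinct_hosts))
print("Data transferred:", total_bytes, "bytes")
print("Average data transferred per day:", total_bytes / days_covered, "bytes")
```

Note that the host cache.example.net counts once however many people are reading through it.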
Graphical statistics
The Monthly Report graphs the usage per month.
The Daily Report graphs the usage per day. This is good for seeing whether people visit your site during the week or at the weekend, which tells you something about their surfing habits.
The Hourly Report graphs the usage per hour. The counts keep increasing each day as new requests are added. It lets you see at what time of day the most users visit you.
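The grouping behind these graphs can be sketched as follows. The timestamps are invented, the date layout is an assumption about how the log is written, and the text bars only stand in for the real graphs.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Invented request times, in an assumed day/month/year:time layout.
stamps = [
    "12/Mar/1998:09:15:32",
    "12/Mar/1998:09:47:10",
    "12/Mar/1998:14:02:55",
    "14/Mar/1998:09:30:00",
]
times = [datetime.strptime(s, "%d/%b/%Y:%H:%M:%S") for s in stamps]

by_month = Counter((t.year, t.month) for t in times)
by_day = Counter(t.date() for t in times)
by_hour = Counter(t.hour for t in times)   # summed over all days in the log

for day in sorted(by_day):
    print(day.isoformat(), DAY_NAMES[day.weekday()], "#" * by_day[day])
for hour in sorted(by_hour):
    print("%02d:00" % hour, "#" * by_hour[hour])
```

Because the hourly counts are summed over every day in the log, they can only grow as the log gets longer.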
File and directory reports
The Domain Report is not used by our server.
The Directory Report shows which directories in your web site have the most traffic.
The File Type Report shows which file types have the most traffic.
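As a rough sketch, both of these reports amount to grouping the requested paths. The paths are invented, and the grouping rules (top-level directory only, file type taken from the extension) and the bracketed labels are assumptions rather than the statistics program's actual behaviour.

```python
from collections import Counter
import posixpath

# Invented requested paths.
paths = ["/index.html", "/images/logo.gif", "/images/picture.jpg",
         "/docs/guide.html", "/docs/", "/files/report.zip"]

def top_directory(path):
    """Return the top-level directory of a path, e.g. '/images/'."""
    parts = path.split("/")
    # "/index.html" splits into ['', 'index.html'], so it lives in the root.
    return "/" + parts[1] + "/" if len(parts) > 2 else "[root directory]"

def file_type(path):
    """Return the lower-cased extension, or a label for directories."""
    if path.endswith("/"):
        return "[directories]"
    extension = posixpath.splitext(path)[1]
    return extension.lower() or "[no extension]"

by_directory = Counter(top_directory(p) for p in paths)
by_type = Counter(file_type(p) for p in paths)

for name, count in by_directory.most_common():
    print(count, name)
for name, count in by_type.most_common():
    print(count, name)
```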
The Status Code Report shows the status codes that are returned to the browser by the server.
This is getting into the details of how the HTTP protocol works. Here is a general overview of what the codes mean, in number order:
1xx: informational
2xx: successful
3xx: redirection
4xx: client error (for example, a page that isn't there)
5xx: server error
For a more complete overview of HTTP and these codes, see RFC 2068; the status codes are in section 10.
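Since the first digit of a status code gives its general class, a report like this can be summarised with a few lines of code. This is a sketch on invented data; the list of codes is made up and the class names follow the headings in section 10 of RFC 2068.

```python
from collections import Counter

# First digit of the code -> general class.
CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

# Invented status codes, one per logged request.
statuses = [200, 200, 304, 301, 404, 404, 500, 200]

by_code = Counter(statuses)
by_class = Counter(CLASSES[code // 100] for code in statuses)

for code in sorted(by_code):
    print(code, CLASSES[code // 100], by_code[code])
```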
The Request Report shows you which files on your site are the most popular ones. It is very useful for seeing where users are going in your site and what they are looking at.