
Statistics Frequently Asked Questions

Stats / Webalizer Information

  1. What is the difference between 'HITS' and 'FILES'?
  2. Hits
  3. Files
  4. Pages
  5. Sites
  6. Visits
  7. KBytes
  8. Top Entry and Exit Pages
  9. Why don't the daily visit totals add up to the monthly total?

Server Errors and What They Mean

  1. HTTP Error 304
  2. HTTP Error 403
  3. HTTP Error 404
  4. Analyzing 404 Errors
  5. 404 Errors - Summary

Stats / Webalizer Information

What is the difference between 'HITS' and 'FILES'?

HITS is the total number of HTTP requests that the server received during the reporting period. Any request made to the server is considered a hit. FILES is the number of hits that actually resulted in something being sent back to the user, such as an HTML page or image. 'Total Files' and '200 - OK' totals should be the same. If you add up the totals in the 'Hits by Response Code' section, it should be the same as the 'Total Hits' figure.
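As a rough illustration of the relationship described above, here is a minimal Python sketch that counts hits, files and hits by response code. It assumes the log uses the Common Log Format and that a 'file' is a hit with status 200; the sample lines, the `LOG_LINE` pattern and the `summarize` function are made up for this example and are not Webalizer's own code.

```python
import re
from collections import Counter

# Assumed Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

def summarize(lines):
    """Return (hits, files, hits_by_code) for an iterable of log lines."""
    by_code = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # only valid log lines are counted as hits
        by_code[int(m.group(4))] += 1
    hits = sum(by_code.values())   # every valid request is a hit
    files = by_code[200]           # assumption: files == '200 - OK' responses
    return hits, files, by_code

# Hypothetical sample data
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 304 -',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209',
]
hits, files, by_code = summarize(SAMPLE)
```

With this sample, three requests are hits but only the first one counts as a file, and the per-code totals add up to the hits total.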

For a description of what each of the numbers in the output means, see the entries that follow.


Hits

Any request made to the server that is logged is considered a 'hit'. The requests can be for anything: HTML pages, graphic images, audio files, CGI scripts, etc. Each valid line in the server log is counted as a hit. This number represents the total number of requests that were made to the server during the specified report period.


Files

Some requests made to the server require that the server then send something back to the requesting client, such as an HTML page or graphic image. When this happens, it is considered a 'file' and the files total is incremented. The relationship between 'hits' and 'files' can be thought of as 'incoming requests' and 'outgoing responses'.


Pages

Pages are, well, pages! Generally, any HTML document, or anything that generates an HTML document, is considered a page. This number counts only the 'pages' requested and does not include the other items that go into a document, such as graphic images and audio clips. What actually constitutes a 'page' can vary from server to server. The default action is to treat anything with the extension '.htm', '.html' or '.cgi' as a page. Many sites define other extensions, such as '.phtml', '.php3' and '.pl', as pages as well. Some people consider this number to be the number of 'pure' hits... I'm not sure if I totally agree with that viewpoint. Some other programs (and people :) refer to this as 'Pageviews'.
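The extension rule can be sketched in a few lines of Python. This only covers the default extensions listed above; the function name `is_page` and the sample URLs are hypothetical, and a real configuration may treat other URLs as pages too.

```python
import posixpath
from urllib.parse import urlsplit

# Default page extensions as described above
DEFAULT_PAGE_EXTENSIONS = ('.htm', '.html', '.cgi')

def is_page(url, extensions=DEFAULT_PAGE_EXTENSIONS):
    """Return True if the URL path ends in one of the page extensions."""
    path = urlsplit(url).path                    # drop any ?query part
    ext = posixpath.splitext(path)[1].lower()    # e.g. '.html'
    return ext in extensions

# A site that also treats PHP3 files as pages (hypothetical configuration)
site_extensions = DEFAULT_PAGE_EXTENSIONS + ('.php3',)
```

For example, `is_page('/index.html?x=1')` is true, `is_page('/images/logo.gif')` is false, and `is_page('/shop.php3')` is only true when called with `site_extensions`.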


Sites

Each request made to the server comes from a 'site', which can be referenced by a name or, ultimately, an IP address. The 'sites' number shows how many unique IP addresses made requests to the server during the reporting time period. This DOES NOT mean the number of unique individual users (real people) that visited, which is impossible to determine using just logs and the HTTP protocol (however, this number might be about as close as you will get).
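In other words, the 'sites' figure is a count of distinct addresses, not people. A minimal Python sketch, assuming the requests have already been reduced to (IP address, URL) pairs; `count_sites` and the sample data are made up for illustration:

```python
def count_sites(requests):
    """Count the unique IP addresses in an iterable of (ip, url) pairs."""
    return len({ip for ip, _url in requests})

# Hypothetical sample: three requests, but only two distinct addresses
sample_requests = [
    ('192.0.2.1', '/index.html'),
    ('192.0.2.1', '/about.html'),
    ('192.0.2.2', '/index.html'),
]
sites = count_sites(sample_requests)
```

Several people sharing one address (for instance, behind the same proxy) would still count as a single site here, which is why the number cannot be read as a head count.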


Visits

Whenever a request is made to the server from a given IP address (site), the amount of time since the previous request from that address (if any) is calculated. If the time difference is greater than a pre-configured 'visit timeout' value, or the address has never made a request before, the request is considered a 'new visit', and this total is incremented (both for the site, and the IP address). The default timeout value is 30 minutes (it can be changed), so if a user visits your site at 1:00 in the afternoon and then returns at 3:00, two visits would be registered.

Note: In the 'Top Sites' table, the visits total should be discounted on 'Grouped' records, and thought of as the "minimum number of visits" that came from that grouping instead.

Note: Visits only occur on PageType requests, that is, on any request whose URL is one of the 'page' types defined with the PageType option.

Due to the limitations of the HTTP protocol, log rotations and other factors, this number should not be taken as absolutely accurate; rather, it should be considered a pretty close "guess".
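The timeout rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual implementation: it assumes the input holds only page-type requests, sorted by time, as (IP address, timestamp) pairs, and the names `count_visits` and `VISIT_TIMEOUT` are invented for the example.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # default timeout described above

def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Return {ip: visits} for time-sorted (ip, datetime) page requests."""
    last_seen = {}
    visits = {}
    for ip, when in page_requests:
        previous = last_seen.get(ip)
        # New visit: first request from this address, or the gap exceeds the timeout
        if previous is None or when - previous > timeout:
            visits[ip] = visits.get(ip, 0) + 1
        last_seen[ip] = when
    return visits

# Hypothetical sample: one address at 1:00pm, 1:10pm and 3:00pm
sample = [
    ('192.0.2.1', datetime(2000, 10, 10, 13, 0)),
    ('192.0.2.1', datetime(2000, 10, 10, 13, 10)),
    ('192.0.2.1', datetime(2000, 10, 10, 15, 0)),
]
visits = count_visits(sample)
```

The 1:10 request falls inside the 30 minute window and continues the first visit; the 3:00 request starts a second one, so the address ends up with two visits.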


KBytes

The KBytes (kilobytes) value shows the amount of data, in KB, that was sent out by the server during the specified reporting period. This value is generated directly from the log file, so it is up to the web server to produce accurate numbers in the logs (some web servers do stupid things when it comes to reporting the number of bytes). In general, this should be a fairly accurate representation of the amount of outgoing traffic the server had, regardless of the web server's reporting quirks.

Note: A kilobyte is 1024 bytes, not 1000 :)
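As a small worked example of the arithmetic, the Python sketch below adds up the bytes field of each log entry and divides by 1024. It assumes the byte counts arrive as strings and that a '-' (as some servers log when nothing was sent) counts as zero; `total_kbytes` is a made-up name for this example.

```python
def total_kbytes(byte_fields):
    """Sum the logged byte counts and convert to kilobytes (1 KB = 1024 bytes)."""
    total = 0
    for field in byte_fields:
        if field.isdigit():      # skip '-' and anything else that is not a number
            total += int(field)
    return total / 1024

# Hypothetical byte fields from three log lines
kbytes = total_kbytes(['1024', '-', '512'])
```

Here 1536 bytes were sent in total, which works out to 1.5 KB (it would be 1.536 if a kilobyte were 1000 bytes).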


Top Entry and Exit Pages

The Top Entry and Exit tables give a rough estimate of which URLs are used to enter your site and which pages are viewed last. Because of limitations in the HTTP protocol, log rotations, etc., these numbers should be considered a good "rough guess" of the actual numbers; however, they will give a good indication of the overall trend in where users come into, and exit, your site.


Why don't the daily visit totals add up to the monthly total?

You cannot add up the daily visit totals and compare them to the monthly total, because they are different reporting periods. For example, if someone visits your site at 11:45pm and stays until 12:15am, the monthly total would show one visit, while the daily totals will show two (one for each day).
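The 11:45pm example can be worked through in Python. This sketch assumes a 30 minute visit timeout and assumes that each day's total is counted as if that day were its own reporting period; the function and the timestamps are hypothetical and only meant to show why the sums differ.

```python
from datetime import datetime, timedelta
from itertools import groupby

TIMEOUT = timedelta(minutes=30)  # assumed visit timeout

def count_visits(times, timeout=TIMEOUT):
    """Count visits in a time-sorted list of page requests from one address."""
    visits = 0
    previous = None
    for when in times:
        if previous is None or when - previous > timeout:
            visits += 1
        previous = when
    return visits

# One visitor browsing from 11:45pm until 12:15am
requests = [
    datetime(2000, 10, 10, 23, 45),
    datetime(2000, 10, 10, 23, 55),
    datetime(2000, 10, 11, 0, 5),
    datetime(2000, 10, 11, 0, 15),
]

# Monthly period: all requests are seen together
monthly_total = count_visits(requests)

# Daily periods: each day is counted on its own
daily_totals = {
    day: count_visits(list(group))
    for day, group in groupby(requests, key=lambda t: t.date())
}
```

The monthly count sees one continuous visit, while each of the two days registers its own visit, so the daily totals sum to two.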


Server Errors and What They Mean

HTTP error 304 - Not modified

This status code is specifically defined in the HTTP protocol. It does not really indicate an error as such, but rather indicates that the resource for the requested URL has not changed since it was last accessed or cached. The 304 status code should only be returned if allowed by the client (e.g. your Web browser). The client specifies this in the HTTP data stream sent to your Web server, e.g. via an If-Modified-Since header in the request.

Systems that cache or index Web resources (such as search engines) often use the 304 response to determine whether the information they previously gathered for a particular URL is now out-of-date.
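The decision a server makes for such a conditional request can be sketched in Python. This is a simplified illustration that only looks at the If-Modified-Since header; the function name `status_for` and the dates are made up, and a real Web server considers other headers as well.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def status_for(request_headers, last_modified):
    """Return 304 if the client's copy is still current, otherwise 200."""
    header = request_headers.get('If-Modified-Since')
    if header is None:
        return 200  # the client did not ask for a conditional response
    try:
        since = parsedate_to_datetime(header)
        unchanged = last_modified <= since
    except (TypeError, ValueError):
        return 200  # unreadable or unusable date: send the full response
    return 304 if unchanged else 200

# Hypothetical resource, last changed at noon GMT on 10 October 2000
last_modified = datetime(2000, 10, 10, 12, 0, tzinfo=timezone.utc)

fresh = status_for({'If-Modified-Since': 'Tue, 10 Oct 2000 12:00:00 GMT'}, last_modified)
stale = status_for({'If-Modified-Since': 'Mon, 09 Oct 2000 12:00:00 GMT'}, last_modified)
plain = status_for({}, last_modified)
```

A client whose copy dates from the last change gets a 304, while a client with an older copy, or one that sent no If-Modified-Since header at all, gets the full 200 response.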


HTTP error 403

This error is specifically defined in the HTTP protocol. Your Web server thinks that the HTTP data stream sent by the client (e.g. your Web browser) was correct, but access to the resource identified by the URL is forbidden for some reason.

This indicates a fundamental access problem, which may be difficult to resolve because the HTTP protocol allows the Web server to give this response without providing any reason at all. So the 403 error is equivalent to a blanket 'NO' by your Web server - with no further discussion allowed.

By far the most common reason for this error is that directory browsing is forbidden for the Web site. Most Web sites want you to navigate using the URLs in the Web pages for that site. They do not often allow you to browse the file directory structure of the site.

A URL that points at a directory rather than a page will typically fail with a 403 error saying "Directory browsing failed - access forbidden". This is true for most Web sites on the Internet - the Web server has "Allow directory browsing" set OFF.


HTTP error 404


This error is specifically defined in the HTTP protocol. Your Web server thinks that the HTTP data stream sent by the client (e.g. your Web browser) was correct, but simply cannot provide access to the resource specified by your URL. This is equivalent to the 'return to sender - address unknown' response for conventional postal mail services.

This error is easily shown in a Web browser if you try a URL with a valid domain name but an invalid page, e.g. http://www.ibm.com/ggggggg.html.


Analyzing 404 errors

For top level URLs (such as www.isp.com), the first possibility is that the request for your site URL has been directed to a Web server that thinks it never had any pages for your Web site. This is possible if DNS entries are fundamentally corrupt, or if your Web server has corrupt internal records. The second possibility is that the Web server once hosted the Web site, but no longer does so and cannot or will not provide a redirection to another computer which now hosts the site. If your site is completely dead - now effectively nowhere to be found on the Internet - then the 404 message makes sense. However, if your site has recently moved, a 404 message may also be triggered. This is also a DNS issue, because the old Web server should no longer be accessed at all - as soon as global DNS entries are updated, only your new Web server should be accessed.

For low-level URLs (such as www.isp.com/products/list.html), this error can indicate a broken link. You can see this easily by trying the URL in a Web browser. Most browsers give a very clear '404 - Not Found' message.
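Broken links can also be spotted from the server log itself. The Python sketch below lists the URLs that returned 404, most frequent first. It assumes the log has the request in quotes followed by the status code, as in the Common Log Format; the `REQUEST` pattern, the `not_found_urls` function and the sample lines are made up for this example.

```python
import re
from collections import Counter

# Assumed layout: ... "METHOD /url PROTOCOL" status bytes
REQUEST = re.compile(r'"(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3}) ')

def not_found_urls(lines):
    """Return [(url, count), ...] for requests answered with 404."""
    counts = Counter()
    for line in lines:
        m = REQUEST.search(line)
        if m and m.group(2) == '404':
            counts[m.group(1)] += 1
    return counts.most_common()

# Hypothetical sample data
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /products/list.html HTTP/1.0" 404 209',
    '192.0.2.2 - - [10/Oct/2000:13:57:10 -0700] "GET /products/list.html HTTP/1.0" 404 209',
    '192.0.2.2 - - [10/Oct/2000:13:58:02 -0700] "GET /old.html HTTP/1.0" 404 209',
    '192.0.2.3 - - [10/Oct/2000:13:59:44 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
broken = not_found_urls(SAMPLE)
```

URLs near the top of the resulting list are good candidates for checking which of your pages still link to them.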


404 errors - summary

Provided that your Web site is still to be found somewhere on the Internet, 404 errors should be rare. For top level URLs, they typically occur only when there is some change to how your site is hosted and accessed, and even these typically disappear within a week or two once the Internet catches up with the changes you have made. For low-level URLs, the solution is almost always to fix your Web pages so that the broken hypertext link is corrected.


eclecTechs™
35 State Street
Northampton, MA 01060

413.584.8600
800.471.4638
Fax 413.584.0562

info@eclectechs.com
www.eclectechs.com

All Contents Copyright © eclecTechs™, All Rights Reserved