Apache Live Log (ALiveLog) – Monitoring apache log files

February 5th, 2009

Existing utilities for monitoring apache logs didn’t suit me, so I decided to create a tool for monitoring apache log files remotely via a browser. The aim was to create a very small application that just does the job: show the last visits to a website and nothing more. The tool is called ALiveLog (Apache Live Log); you can monitor multiple websites hosted on the same server through a single ALiveLog installation. Regarding security, the script performs only read operations (reading the last N entries from access.log files). The server side script is written in Perl, but it can easily be ported to PHP as well.

alivelog.zip 140 KB / Version: 0.9; Date: February 4, 2009

Technically speaking, there are two approaches to analyze website statistics:

  1. Use the apache access.log (AWStats)
  2. Add a code snippet (mostly JavaScript) to all your html pages (AXS, Google Analytics, Woopra, StatCounter)

The second approach is very flexible: you don’t have to install any additional software on your server, a plug&play solution. However, it has a drawback: there is no guarantee that all page views are tracked, due to disabled JS or network problems (the visitor’s connection to the server that hosts the tracking facility may be poor). Furthermore, only pages that contain the tracking code appear in the stats, i.e. only HTML files. Ordinary files like images, zip archives and pdf documents are not counted. And because search engine crawlers don’t run JavaScript code, their activity is omitted too.

Only the access.log will give you a 100% accurate picture of how visitors are browsing your site. I recommend AWStats: a free log analysis tool that generates nice reports out of your access.log. However, AWStats is focused on general usage reports per day/month/year and doesn’t offer a handy way to see the recent activity on your site.

Don’t forget to use the combined log format. Verify that your httpd.conf file has the following line:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

And then specify that you want to use it:

CustomLog logs/access_log combined
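With the combined format enabled, each hit produces a log line like this one (all values below are made up; 192.0.2.x is a documentation-only IP range):

```
192.0.2.10 - - [04/Feb/2009:15:32:11 +0200] "GET /index.html HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0"
```

The last two quoted fields, Referer and User-Agent, are exactly what ALiveLog relies on to show referrers and to detect bots.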

But sometimes I need a real-time report that shows the last 10/20 visits to my website. Moreover, if I have a couple of websites hosted on the same server, I need to easily switch from one website to another.

Such a long prologue finally brings us to the core. I’ve written a small utility that consists of a server side Perl script and a frontend JS application that communicates with the script via AJAX. The script fetches the last entries of an apache access.log, groups entries from the same visitor (by IP) and attaches GeoLocation information to the data. Hits from search engine bots (for example: Google, Yahoo, MSN) are also displayed, but in gray, to visually differentiate them from normal human visits. A visitor session is colored in red if at least one hit was not successful (for example, the server returned 404 Not Found or 500 Internal Server Error). The package also contains jquery_131.js (the jQuery library v1.3.1) and an Install.txt file.
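Conceptually, the grouping step is the same thing you can do by hand with standard tools. Here is a rough shell equivalent of what the script does, minus the GeoIP lookup (the sample log file and its entries are hypothetical; IPs are from the 192.0.2.0/24 documentation range):

```shell
# Create a small access.log with hypothetical combined-format entries.
cat > /tmp/alivelog_sample.log <<'EOF'
192.0.2.10 - - [04/Feb/2009:15:32:11 +0200] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
192.0.2.10 - - [04/Feb/2009:15:32:12 +0200] "GET /style.css HTTP/1.1" 200 880 "-" "Mozilla/5.0"
192.0.2.77 - - [04/Feb/2009:15:33:05 +0200] "GET /robots.txt HTTP/1.1" 404 209 "-" "Googlebot/2.1"
EOF

# Take the last N entries and group them by client IP,
# busiest visitor first (ALiveLog then decorates each group
# with GeoLocation and bot/error highlighting).
tail -n 500 /tmp/alivelog_sample.log | awk '{print $1}' | sort | uniq -c | sort -rn
```

The real script reads the file backwards (via File::ReadBackwards) so it never has to scan a huge log from the start.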

Features

  • monitor multiple websites;
  • specify which log entries to process. You can show: a) the last 10/20 visits (unique IPs); b) hits logged in the last 5/10/30 minutes (1/3/12 hour intervals are also provided); c) the last 50/100/500 entries in the log file;
  • ignore files of a given type (by extension) that are smaller than a predefined size;
  • highlight visits with errors: if a visit contains an entry with a 404 or 500 error code, it will be highlighted in red;
  • highlight bot access: visits of spiders are colored in gray. Spiders are identified by their User-Agent string;
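For the error highlighting, the relevant piece of information is the status code, which is field 9 of a whitespace-split combined-format entry. A sketch of the check in shell (the sample entries are hypothetical):

```shell
# Two hypothetical combined-format entries, one successful, one failed.
cat > /tmp/alivelog_errors.log <<'EOF'
192.0.2.10 - - [04/Feb/2009:15:32:11 +0200] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
192.0.2.77 - - [04/Feb/2009:15:33:05 +0200] "GET /old.html HTTP/1.1" 404 209 "-" "Mozilla/5.0"
EOF

# Print the IP, request path and status of every 4xx/5xx hit;
# a visit containing any such hit gets the red highlight.
awk '$9 ~ /^[45][0-9][0-9]$/ {print $1, $7, $9}' /tmp/alivelog_errors.log
```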

Screenshots

ALiveLog: Find out how a visitor browsed your site

ALiveLog: screen1

ALiveLog: screen2

ALiveLog: screen3

Discussion

The script doesn’t have a login/password feature yet, so you should use .htaccess to restrict access to it.
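A minimal .htaccess sketch for HTTP Basic authentication (the realm name and the path to the password file are placeholders you’d adapt):

```
AuthType Basic
AuthName "ALiveLog"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

The .htpasswd file itself can be created with the htpasswd utility that ships with apache.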

The Geo::IP perl module is needed if you want to see the visitor’s location (country/city); notice that you should also install the GeoIP C library. After that, download the GeoLite City database and decompress it to /usr/local/share/GeoIP/GeoLiteCity.dat.

Finally, you may need to install two small and handy perl modules:

> cpan install JSON
> cpan install File::ReadBackwards

In an attempt to minimise the number of files used, I wanted to concatenate all country flags into one big image, and I found the montage utility, which is part of the ImageMagick package.

Here is how I managed to obtain one big file from 247 small country flags:

> montage "*.gif" -tile 1x247 -geometry 16x16+0+5 all_countries.gif

Each country should have a corresponding declaration in the CSS file, for example:


span.gb {width: 16px; height: 16px; background: url('flags/all_countries.gif') 0px -2033px no-repeat; display: block;}
span.us {width: 16px; height: 16px; background: url('flags/all_countries.gif') 0px -5959px no-repeat; display: block;}

The problem is that after concatenation, the image had a white background instead of a transparent one: on row hover, the flags were surrounded by some white space. In any case, I think the number of countries appearing in the recent log hits will be small, so one big image would not significantly improve the loading speed of the frontend application.

While writing this post I remembered that I should check the app in Internet Explorer (I’m using Firefox and, occasionally, Google Chrome). Hmm, I was very disappointed by IE: when the number of rows exceeds the grid capacity, the grid expands below the footer and you can’t see the bottom rows. Also, the window that appears when clicking on the number of entries is not displayed correctly. I’ll try to dig a little; maybe I’ll find the correct CSS solution to this problem. At least I was happy to find out that it works fine in Google Chrome.

  1. April 18th, 2009 at 00:17

    Hello.

    I have to give you praise for this wonderful tool. I’ve been watching the traffic on my little website and I am excited about it. However, I noticed that some things act differently (or do not act at all): web robots are never labeled with a different color and name as they are in your screenshots, and the web browser and referrer information is never shown. Any idea?
    (Alivelog runs on apache 2 server on Ubuntu intrepid.)

    Greetings.

  2. admin
    April 19th, 2009 at 08:58

    Hi Lambda,
    Your apache logs don’t include referrer information; as stated in the post, “Don’t forget to use the combined log format. Verify that your httpd.conf file has the following line:” … notice the LogFormat and CustomLog apache directives.
    Regards

  3. April 19th, 2009 at 14:17

    Hello,

    Thank you for your quick response. You’re right, I forgot to remove a comment before the line in httpd.conf.
    Now all works fine. Thank you for your help.

    Greetings.

  4. andi
    June 25th, 2009 at 00:25

    your script has a bug.

    if i browse one of the logged hosts from my local area network (and there is an entry with a local ip, e.g. 192.168.100.123) … your script doesn’t show anything!
    i’ll try to fix that on my own – it should be difficult because i cannot code perl… but you should fix it too – it cost me hours to find that.
    anyway – thanks for your simple and nice script.
