AWStats tip: creating static pages (and why it’s a good idea)

AWStats is probably the most popular free statistics package for self-hosted sites (if we don’t count “external” ones such as Google Analytics), and, as any decent Unix sysadmin probably knows, there are several ways of configuring it.

One of them is to make the CGI accessible on the web and have it analyze the logs and generate the statistics on demand. I don’t think many people use it that way, though: not only is it the slowest method, but it could theoretically be used for DoS attacks. Yes, you could put it somewhere private (and it’s probably still a good idea to do so, no matter what method you use), either by using a web server that isn’t publicly accessible or by adding authentication. Still, the only real advantage of this method is being sure you have the absolutely most recent stats, and stats from, say, 5 minutes ago or less are, in most cases, more than good enough.

In my experience, most people use an intermediate method: the CGI is still accessible, but it isn’t capable of analyzing logs; it just generates the stats page. The logs themselves are analyzed by the same CGI file, but run from a local crontab.

And this is what I had been using until today. Yes, much like in the case of the “cd back” trick, I had been using AWStats for years… and only today did I switch to using fully static pages. A few more of these and someday I may have to turn in my geek card. 🙂

It’s pretty easy to configure AWStats this way. Here’s my old crontab line:

*/5 * * * * /usr/local/cgi-bin/awstats.pl -config=winterdrake.com -update >/dev/null 2>&1

And here’s my new one (if it word wraps, it’s supposed to be a single line):

*/5 * * * * /usr/local/bin/awstats_buildstaticpages.pl -config=winterdrake.com -update -awstatsprog=/usr/local/cgi-bin/awstats.pl -dir=/var/www/htdocs/awstats
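For that crontab line to work, the builder script and the output directory have to be in place first. Here is a minimal sketch of that one-time setup, wrapped in a hypothetical helper function (install_static_builder is my name for it, not something AWStats ships). The location of the AWStats tools directory depends on where you unpacked AWStats, so it is passed in as a parameter rather than guessed.

```shell
#!/bin/sh
# One-time setup sketch. All three paths are parameters; the tools directory
# in particular depends on where AWStats was unpacked (an assumption).
install_static_builder() {
    tools_dir=$1   # the AWStats "tools" directory
    bin_dir=$2     # where the builder script should live
    out_dir=$3     # where the static HTML pages will be written

    # Create both target directories, then copy the script and make it executable.
    mkdir -p "$bin_dir" "$out_dir" &&
    cp "$tools_dir/awstats_buildstaticpages.pl" "$bin_dir/" &&
    chmod 755 "$bin_dir/awstats_buildstaticpages.pl"
}
```

With the paths used in this post, the call would look something like: install_static_builder /path/to/awstats/tools /usr/local/bin /var/www/htdocs/awstats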

Before that, I had to put awstats_buildstaticpages.pl (included in the AWStats /tools directory) in the /usr/local/bin directory (you may prefer it somewhere else, of course), and create the /var/www/htdocs/awstats directory so that the static files could be put there. Now they’re accessible at https://myserver/awstats/awstats.winterdrake.com.html and look exactly the same as if I accessed the CGI directly (which I can still do, in order to see yearly reports, for instance, though I do that very rarely). But let’s do a little benchmarking, shall we?

CGI version:
Requests per second: 2.33 [#/sec] (mean)

Static version:
Requests per second: 4557.69 [#/sec] (mean)
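To put those two figures in more familiar terms, here is a small sketch that converts them into time per request and a speedup factor. The helper name compare_rps is made up for illustration, and the latency numbers assume the requests were issued one at a time (with concurrent requests, 1/rate is only an average across all of them).

```shell
#!/bin/sh
# Turn two "Requests per second" figures into per-request latency and a ratio.
compare_rps() {
    awk -v cgi="$1" -v static="$2" 'BEGIN {
        printf "CGI:     %.0f ms per request\n", 1000 / cgi
        printf "Static:  %.2f ms per request\n", 1000 / static
        printf "Speedup: about %.0fx\n", static / cgi
    }'
}

compare_rps 2.33 4557.69
# CGI:     429 ms per request
# Static:  0.22 ms per request
# Speedup: about 1956x
```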

Now, you may be thinking: “yes, the speed is in a completely different order of magnitude, but I don’t look at my stats all the time, and they’re private, so nobody else does… isn’t taking half a second good enough?” Yes, that’s true… but getting rid of limits is always a good thing, because you can then do so much more. Suppose you don’t have half a dozen sites on that server, but a thousand, with statistics for all of them? Suppose you want to use the AWStats stats to generate other stats? For instance, I’m currently using MRTG to plot a graph of Google and Bing referrals, using the AWStats-generated static pages as input. (Here, another advantage becomes obvious: I can now do this through a trivial combination of grep and awk on a static HTML file.) In both these examples (and I’m sure there are many more), having stats accessible almost instantly and taking up virtually no processing power at all is obviously a Very Good Thing™.
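To give an idea of what such a grep-and-awk combination might look like, here is a rough sketch. It is not the exact script I use, and the HTML row layout it expects is an assumption made for illustration: look at the source of your own generated page and adjust the pattern and field number to match. The function names referral_hits and mrtg_referrals are likewise made up.

```shell
#!/bin/sh
# ASSUMPTION: each search engine appears on one line of the static page as
#   <tr><td class="aws">- Google</td><td>1,234</td>...</tr>
# with the count in the cell right after the name. Real pages may differ.
referral_hits() {
    engine=$1
    page=$2
    grep -i -e "- $engine" "$page" |
        awk -F'</td><td[^>]*>' '
            NR == 1 { gsub(/[^0-9]/, "", $2); hits = $2 }  # keep digits only
            END     { print hits + 0 }                      # 0 if no row matched
        '
}

# MRTG expects four lines from an external script:
# two values, an uptime string and a target name.
mrtg_referrals() {
    referral_hits Google "$1"
    referral_hits Bing "$1"
    echo "n/a"
    echo "Search engine referrals"
}
```

A script that ends by calling mrtg_referrals /var/www/htdocs/awstats/awstats.winterdrake.com.html could then be used as an external MRTG target. Since it only reads a small static file, running it every few minutes costs next to nothing.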

Other advantages: you can move your stats to a (virtual) server that only serves static files, since that’s what they’ll be. Alternatively, if you had CGI processing enabled just for AWStats, you can now simply turn it off on your web server, improving its security.