Web Traffic Statistics
2025-08-09 19:28 +0000
Intro
I used to use WordPress with the Jetpack plugin, which gave me statistics. I like knowing which types of posts are popular and which aren’t. Note: my instructions on fixing a Subaru Tumble Generator Valve (TGV) are by far the most popular post I have ever made, or ever will. So I guess instructions that don’t already exist elsewhere are something that desperate people with dysfunctional daily drivers are going to seek out.
In any case, I disliked how incredibly needy Jetpack was. There were updates every few days. It was most certainly data mining and sending my info to god knows who.
I finally decided to use GoAccess. It’s not quite as feature-rich as Jetpack, but it works on my self-hosted system: it simply reads my Nginx logs and generates a static page showing my stats. It has lots of options, too many to list, but I really don’t want to keep up with a database or have something running constantly. I want something that runs a single process once a day and is then done.
Here’s what it looks like in the end.
What am I trying to do here?
- Install GoAccess and test it out
- Point it to my Nginx logs
- Have it generate a static html file each night
- Serve that html file as part of my website (no SCPing the .html file and opening it locally like a neanderthal)
- Protect the site, so that only I can view it
Applicability
I’m using Hugo to create a static site, with Nginx as my reverse proxy. GoAccess can read a number of other log formats, though. I’m not well versed in other systems, so you will have to read the GoAccess documentation to see if this is applicable to you.
Installation
I want this to run on the same machine where Nginx is already running, so that it has easy access to the logs. You can install elsewhere and work a solution to copy files over as needed, but this will be easier, trust me.
So I’m running this on my Nginx LXC machine, not the machine with Hugo. On Debian, this package is available in the repository with no extra work.
apt install goaccess
Get your logs straight
First make sure you have logging enabled in your Nginx config file for the site in question. Mine is /etc/nginx/sites-available/fitib.us.conf
We’re looking for
error_log /var/log/nginx/fitib.us.error.log;
access_log /var/log/nginx/fitib.us.access.log;
Obviously, name these whatever you need to for your site. Or name them after my site, and remain supremely confused when it comes to managing your stuff later.
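A quick way to confirm the logging directives are really in the config is to grep for them. This is only a sketch: show_log_directives is a made-up helper name, and the path is my config file from above, so swap in your own.

```shell
# Hypothetical helper: list the logging directives found in an Nginx config file.
show_log_directives() {
    # Match only active directives; commented-out lines start with # and are skipped.
    grep -nE '^[[:space:]]*(access_log|error_log)[[:space:]]' "$1"
}

conf=/etc/nginx/sites-available/fitib.us.conf
if [ -r "$conf" ]; then
    show_log_directives "$conf" || echo "no logging directives found in $conf"
else
    echo "$conf is not readable from here"
fi
```

If nothing shows up, add the two lines, then run nginx -t and reload Nginx before moving on.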
I have seen some people complaining about GoAccess only showing the last 7 days of logs, etc. This is not a GoAccess problem; it’s a Linux logging problem. Logrotate is the utility that manages the logs, deciding how often they’re rotated, what the naming convention is, whether they are zipped, when to delete them, etc.
Go modify /etc/logrotate.d/nginx to change these settings.
I want logs rotated weekly. I’m not worried about disk space and want to be able to parse them easily later (yes, I know I can zcat and pipe that into whatever I need. But I’m stupid and lazy sometimes), so no zipping. I’m retaining them for one year, then deleting. Here’s what that looks like. Only a few things were modified from the default.
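/var/log/nginx/*.log {
    # rotate once a week and keep 52 rotations (about a year)
    weekly
    missingok
    rotate 52
    # name rotated files with a date suffix instead of .1, .2, ...
    dateext
    # compression left off so old logs stay plain text
    #compress
    #delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
            run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
    endscript
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
    endscript
}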
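Before waiting a week to find out whether the edit took, logrotate can do a dry run. A minimal sketch, assuming the config lives at the Debian path above; dry_run is just a name I picked, and the guard is there so the snippet does nothing harmful on a machine without logrotate.

```shell
conf=/etc/logrotate.d/nginx

# Hypothetical wrapper around logrotate's debug mode.
dry_run() {
    if command -v logrotate >/dev/null 2>&1 && [ -r "$conf" ]; then
        # -d is debug mode: it reports what would be rotated and changes nothing.
        logrotate -d "$conf" 2>&1
    else
        echo "logrotate or $conf not found on this machine; nothing to check"
    fi
}

# Only the first lines are interesting: they show how the config was parsed.
dry_run | head -n 20
```

Syntax errors in the file show up in this output, which beats discovering them when the logs quietly stop rotating.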
Test it out
The GoAccess documentation has some examples. For now I’m just running it on a single file to see that it works. We just feed it a log file as input, specify the output file, and tell it the log format so it understands the timestamps. That’s it!
goaccess /var/log/nginx/fitib.us.access.log -o ./teststats.html --log-format=COMBINED
I’m doing this headless, so I’ll have to quickly SCP the file onto my laptop in order to take a look at it.
Here’s what it looks like initially, with just the partial week’s worth of logs contained in the newest file.
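If the numbers in the report look off, a rough cross-check straight from the log is possible with awk. This is a sketch, not part of GoAccess: the three log lines are made up (documentation-range IPs, not real traffic) and follow Nginx’s default combined layout, where field 7 is the request path.

```shell
# Made-up sample lines in Nginx's default "combined" layout (not real traffic).
sample=$(cat <<'EOF'
203.0.113.7 - - [09/Aug/2025:19:28:01 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
203.0.113.7 - - [09/Aug/2025:19:28:05 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
198.51.100.23 - - [09/Aug/2025:19:30:12 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
EOF
)

# Count requests per path, most requested first.
hits=$(printf '%s\n' "$sample" | awk '{print $7}' | sort | uniq -c | sort -rn)
echo "$hits"
```

On the server, the same pipeline can read the real file instead of the sample: awk '{print $7}' /var/log/nginx/fitib.us.access.log | sort | uniq -c | sort -rn | head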
Get it to run on its own
I’m just making a basic script and running it with cron. If it fails, it fails. I’m using a wildcard to feed it all of the access logs while avoiding the error logs.
nano ~/goaccess-update.sh
#!/bin/bash
goaccess /var/log/nginx/fitib.us.access.lo* -o /var/www/html/fitib-stats.html --log-format=COMBINED
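To see what that wildcard actually picks up, here is a throwaway demo. The file names are assumptions: they mimic what logrotate’s dateext option produces (a -YYYYMMDD suffix), with dates I made up, in a temp directory rather than the real log folder.

```shell
# Temp directory standing in for /var/log/nginx.
demo=$(mktemp -d)
touch "$demo/fitib.us.access.log" \
      "$demo/fitib.us.access.log-20250727" \
      "$demo/fitib.us.access.log-20250803" \
      "$demo/fitib.us.error.log" \
      "$demo/fitib.us.error.log-20250803"

# Same pattern as the script: current and rotated access logs match, error logs do not.
matched=$(cd "$demo" && ls fitib.us.access.lo*)
echo "$matched"

rm -rf "$demo"
```

The pattern would also match compressed .gz files if compression were switched on, which GoAccess can’t read directly from disk, so it only works as-is because zipping is off.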
Then use crontab to make it run on your schedule. I’m not going to walk through this; I’ve already covered it, as have many others. Don’t be stupid like me initially and forget to make it executable with chmod +x ~/goaccess-update.sh. Cron will get mad, and after a few days you’ll start to wonder what you forgot…
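For anyone who skipped those walkthroughs, a midnight entry can be sketched like this. add_stats_job is a made-up helper, and it assumes the script sits in the home directory of whichever user owns the crontab. The snippet only previews the result; nothing gets installed.

```shell
# Hypothetical helper: read an existing crontab on stdin and print it back
# with exactly one midnight entry for the stats script appended.
add_stats_job() {
    # Drop any earlier copy of the entry so re-running does not duplicate it.
    grep -vF 'goaccess-update.sh' || true
    # Minute 0, hour 0, every day. Cron's shell expands $HOME when the job runs.
    echo '0 0 * * * "$HOME/goaccess-update.sh"'
}

# Preview the resulting crontab without installing anything.
{ crontab -l 2>/dev/null || true; } | add_stats_job
```

To actually install it, pipe the preview into crontab - (that reads the new crontab from stdin). For hourly runs, the schedule would be 0 * * * * instead.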
Make it accessible
Remember earlier when I said it would be easier to just install the program on the same machine that was running Nginx? Well here’s why.
I want this to be included as a separate location of my site. So I’m just going to take the static html file on this machine, and serve it as a static file with Nginx.
Get back into the Nginx config file for my blog, /etc/nginx/sites-available/fitib.us.conf, and let’s add a few lines. It is a location block created within the main HTTPS server block (the HTTP-to-HTTPS redirect is already set up). You can see the root at the top of the server block stays the same:
server {
    root /var/www/html;
    server_name fitib.us;

    # Main website hosted on separate Apache2 static backend
    location / {
        proxy_pass http://main_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $server_name;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Statistics, generated by goaccess on this machine
    location /stats.html {
        alias /var/www/html/fitib-stats.html;
        # Only my internal networks may view the report
        allow 192.168.0.0/24;
        allow 10.10.0.0/16;
        deny all;
    }

    listen [::]:443 ssl ipv6only=on;
    listen 443 ssl;
    ssl_certificate ________/fullchain.pem;
    ssl_certificate_key _______/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
...
So it will be located at fitib.us/stats.html. It’s only available to my internal networks, and everything from outside is blocked. After editing, check the config with nginx -t and reload Nginx so the new block takes effect. The restriction probably isn’t truly necessary, but the report does show visitors’ IP addresses and the resources bots are trying to access. Anything not already blocked by the router firewall, fail2ban, etc. still makes it through to this Nginx instance, so the report shows all of the SQL/JavaScript/etc. attacks people and bots are trying, and I don’t know that I want to advertise what’s making it through to the last line of defense.
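To make the allow/deny lines concrete: Nginx checks them in order and the first match wins, so anything outside the two networks falls through to deny all. The sketch below redoes that check in plain shell arithmetic. It is an illustration only, not how Nginx implements it, and all three function names are made up.

```shell
# Convert a dotted IPv4 address to a single integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/PREFIX: succeed if IP falls inside the network.
in_cidr() {
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Same rules as the location block: two allowed networks, everything else denied.
allowed() {
    in_cidr "$1" 192.168.0.0/24 || in_cidr "$1" 10.10.0.0/16
}

for ip in 192.168.0.17 192.168.1.17 10.10.200.3 203.0.113.9; do
    if allowed "$ip"; then echo "$ip allow"; else echo "$ip deny"; fi
done
```

Note that 192.168.1.17 is denied: a /24 only covers 192.168.0.x, which is easy to forget when the home network grows a second subnet.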
Final thoughts
I’m only running this script once a day, at midnight. I think I’ll change it in the future to run hourly. Sometimes it’s nice to check in the middle of the day if things aren’t loading the way I expect, and have an indicator that something is going on.
I don’t get that much traffic. The logs in their entirety are ~17MB for 2 months’ worth. If I had a bigger site, I would probably let logrotate gzip the older logs, and pipe the zcat output into goaccess.
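A rough sketch of how that zcat route could work, using a temp directory with fake log lines instead of the real logs. The file names and dates are assumptions modeled on what compress plus dateext would produce. The -f flag makes zcat pass uncompressed files through unchanged, so one command covers both the current log and the zipped ones.

```shell
# Throwaway demo directory standing in for /var/log/nginx.
demo=$(mktemp -d)
printf 'old line 1\nold line 2\n' | gzip > "$demo/fitib.us.access.log-20250803.gz"
printf 'new line 1\n' > "$demo/fitib.us.access.log"

# zcat -f decompresses the .gz file and passes the plain file through as-is.
total=$(zcat -f "$demo"/fitib.us.access.log* | wc -l)
echo "lines available to goaccess: $total"

# On the real server the same stream would be piped into goaccess on stdin:
# zcat -f /var/log/nginx/fitib.us.access.log* | goaccess - -o /var/www/html/fitib-stats.html --log-format=COMBINED

rm -rf "$demo"
```

The trailing - tells goaccess to read the log from the pipe instead of a file.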
I still miss the ability to flip through stats at a high level in Jetpack, then drill down into individual months/weeks/days. But I still refuse to run a database (required for other statistics tools) and deal with all the crap that comes along with that. I may look into the option of creating separate .html files for the week/month/etc.
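One untested idea for the per-week pages: since logrotate already cuts one file per week, each rotated file could get its own report. This is only a sketch under that assumption. report_name is a made-up helper, the output names are invented, and the extra pages would still need their own location block (or a looser one) before Nginx serves them.

```shell
# Hypothetical helper: turn a rotated log path into a per-week report filename.
report_name() {
    # ${1##*.log-} strips everything up to the date suffix added by dateext.
    echo "fitib-stats-${1##*.log-}.html"
}

for log in /var/log/nginx/fitib.us.access.log-*; do
    [ -e "$log" ] || continue                       # no rotated logs: skip the literal pattern
    command -v goaccess >/dev/null 2>&1 || break    # nothing to do without goaccess
    goaccess "$log" -o "/var/www/html/$(report_name "$log")" --log-format=COMBINED
done
```

The date in each name is the day the file was rotated, so a report named for 20250803 would cover the week leading up to that date.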