One of my resolutions for 2020 was to relaunch my anti-junk mail website. I'm pretty sure it was on my list of resolutions for 2019 as well. Unfortunately, I haven't found the time to work on the project. As it doesn't look like I'll have much spare time this year either, I decided to create a minimal, read-only copy of the site.
wget to the rescue
The first step was to clone the Drupal install and create a minimal version of the website. There was a time when I was working pretty much full time on the site. It grew quite large, which is why it became too much to maintain. I didn't want to keep all the content. Much of it had become dated, and junk mail historians can always find the site on the Wayback Machine (there's also a copy on the UK Web Archive).
Stripping the site down to just the home page and the guide to stamping out junk mail was fairly straightforward. I just needed to make sure to remove all links to other pages, such as the old news pages – a single link to that section would pull the entire news section into the copy.
I then used wget to create a read-only copy of the site. As the site was on my localhost, with nothing being downloaded from external domains, the following command worked just fine:
$ wget -mpck --html-extension http://localhost/stopjunkmail.org.uk
I probably didn't need the --html-extension option. The downside of using the option is that all the URLs now have the extension .html. That's bad for SEO and I'm pretty sure a simple directive in a .htaccess file would have prevented the need for the extension. Frankly, though, I don't care about SEO. The website is a minimal archive, and I'm not interested in visitor numbers. In fact, I'm keen to get fewer visitors – it helps keep my hosting costs low.
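For reference, the bundled short flags in the wget command above can be spelled out with their long names. The sketch below is only an illustration: the function name mirror_site and the DRY_RUN switch are made up for this example, and it uses --adjust-extension, which, as far as I know, is the name newer wget releases (1.12 and later) use for --html-extension.

```shell
# Long-option equivalents of the short flags in -mpck:
#   -m  --mirror           recursive download with timestamping, unlimited depth
#   -p  --page-requisites  also fetch the images, CSS and scripts each page needs
#   -c  --continue         resume partially downloaded files
#   -k  --convert-links    rewrite links so the copy can be browsed locally
# --adjust-extension is the newer name for --html-extension.

# mirror_site is a hypothetical helper; "$1" is the URL to mirror.
# With DRY_RUN set to a non-empty value it prints the command instead of running it.
mirror_site() {
    ${DRY_RUN:+echo} wget --mirror --page-requisites --continue --convert-links \
        --adjust-extension "$1"
}

# Example: print the command without downloading anything.
# DRY_RUN=1 mirror_site http://localhost/stopjunkmail.org.uk
```

Leaving DRY_RUN unset runs the actual download, which should behave the same as the one-liner.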
To handle the inevitable flood of 404 errors I simply added a custom error document to the virtual host:
ErrorDocument 404 /404.html
Any request for a non-existent page is now answered with /404.html. On the page I apologise profusely for the error.
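A quick way to check the error page is to ask for a page that doesn't exist and look at the status code. When ErrorDocument points at a local path, Apache serves that page in place and keeps the 404 status, so the check should report 404 rather than a 30x redirect code. The helper name http_status and the example path are made up for this sketch.

```shell
# http_status is a hypothetical helper that prints only the HTTP status code for a URL.
http_status() {
    # -s hides the progress meter, -o /dev/null discards the response body,
    # -w '%{http_code}\n' prints the numeric status code followed by a newline.
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Example with a made-up path on the local copy:
# http_status http://localhost/stopjunkmail.org.uk/no-such-page
```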
I now need to keep an eye on the VPS. It's a cheap 'n cheerful Hetzner VPS with just one CPU core and 2GB of RAM. It would be great if it can cope as the cost is less than €3 per month, but I'm expecting it will need more juice.
The future of Stop Junk Mail
The ultimate goal is to replace the archived website with a new site. I might use Pelican, as I'm fairly familiar with it now. Honestly though, this feels like a retirement project – and I'm still quite a way off from my retirement.