archive your slice of the web

Many use and love archive.org. A service that roams the public internet and archives whatever it finds. It even creates timelines of websites so you can dive right into history.

Have a piece of history right here:

You can have something similar hosted in your own environment. There are numerous open source projects dedicated to this archival purpose. One of them is ArchiveBox.

ArchiveBox takes a list of website URLs you want to archive, and creates a local, static, browsable HTML clone of the content from those websites (it saves HTML, JS, media files, PDFs, images and more).

I’ve done my set-up of ArchiveBox with the provided Dockerfile. Every once in a while it will start the docker container and check my Pocket feed for any new bookmarks. If found it will then archive those bookmarks.

As the HTML as well as PDF and Screenshot is saved this is extremely useful for later look-ups and even full-text search indexing.