Replacing Goodreads the Indieweb way

As a young boy, I enjoyed reading books.

I remember a bunch of youth books I inherited from my dad. Two by Max Lundgren stand out: 11 mann til angrep (Åshöjdens BK), about a local football team, and Gutten med gullbuksene (Pojken med guldbyxorna), about a boy who could pull an endless supply of coins from his trouser pocket.

There were many more.

But I don't remember them, nor do I know where the physical books ended up. (Probably in an incinerator when my parents last moved.) But I would've liked to at least have a list of all of these books that I read and enjoyed way back when. For posterity and reminiscing.

Tracking my reading

That's why, when I picked the habit of reading back up again in 2012, I signed up for Goodreads. If you're in the habit of reading books, you probably know this website.

Goodreads is a bit of an everything website for people who like to read. It's a social network and place to discover new books. A place to follow and interact with authors and a place to share your opinions on the books you read. And much, much more.

For me it was a place to track my reading. So that I could see what I had read when, and perhaps jot down a few words about what I thought about it at the time.

Goodreads is also owned by Amazon. And since, in recent years, I have become more conscious about sharing data with the tech oligopolies, that seemed like reason enough to find an alternative to Goodreads. Kinda pointless to keep your Kindle in airplane mode indefinitely to avoid sending usage data to Amazon if you're logging it all on Goodreads anyways, don't you think?

Sharing my reading history on my website

Having recently discovered the Indieweb movement, this was an entirely obvious thing to incorporate into my own website. So I set about putting it together. In July last year, it went live. I wrote:

Very happy with how I was able to put this together. I'm going to do a full post on how everything's set up once I have added all the data.

That was a bit of a lie. I was happy that I had been able to put something together — and I quite liked the way it looked. I just hated the way it worked. My site was running on Wordpress at the time, and I cobbled this functionality together by combining a bunch of custom post types, custom fields and custom templates. Adding and updating data was a bit of a hassle that involved adding a new "post" for each book.

In the meantime, I was doing all of my actual reading tracking in a simple spreadsheet. I just updated the site whenever I could be bothered to add the data.

Wouldn't it be nice if my website just updated whenever I updated my spreadsheet? Thinking about how that might work in conjunction with my Wordpress setup gave me headaches. All the while, the backlog of data for my newly created reading log remained missing.

A few months later, things changed. Instead of spending my time backfilling my reading log, I decided to make a static site generator to replace Wordpress. I began by trying to replicate the reading log functionality I had created with Wordpress. That is to say, making a new "post" (of type book) for each book in my reading log.

Luckily, I quickly came to my senses and scrapped this approach. Instead, I realised that this new setup laid the groundwork for making the reading log the way I actually wanted to do it. That would require a bit of thinking and coding, however, so I left it out of the initial version.

Towards the end of January, I felt ready to embark on another small project. It was time to get my reading history back online. Only this time, I would do it right.

Thinking through the new setup

Premise: I track my data in a spreadsheet.

Challenge: Display the data from my spreadsheet in a reading log section of my website.

Can't be too hard, can it? All I have to do is create some kind of script that checks the spreadsheet and generates an HTML document from it based on a fairly simple set of rules. And then I need to figure out a way to track whether an entry has been handled, or if it's been changed since last handled and ...

Without knowing it, I had stumbled upon the point Henrik later made in his post Reduce new problems to known solutions:

The problem is to determine what needs to be changed. That single word that was changed may influence the output in several ways. Let’s say you change the title of a page, then not only that page needs to change but every other page which links to that page, and maybe a “last updated” timestamp in the site footer as well.

Keeping track of what has changed is hard!

"Surely there must be a simpler way to solve this?" I thought. I felt like a genius when I eventually landed on the obvious conclusion to just regenerate the entire reading log every time. "Brilliant!" I thought, having finally reasoned my way into the solution that I've complained several times didn't make any sense to me.

This was, quite honestly, the hardest part of the entire process.

In line with the philosophy I used to create my static site generator, I wanted to keep it as simple as possible. That meant no separate pages for each book, nor any of the redundant archive pages like pages for each author, series and genre.

It's nifty to be able to go to logs/reading/author/brandon_sanderson/ and get a list of all books I've read by Brandon Sanderson or logs/reading/genre/science_fiction/ to see all science fiction books I've read. But it adds a fair bit of extra complexity to the code. And with my main goal being to just create a list of the books I've read and keep the code simple and maintainable, I chose to omit all of these niceties.1

Instead, I doubled down on simplicity:

  • Everything on a single page
  • Split into four main sections based on book status in my spreadsheet (reading, read, want to read, did not finish)
  • Group read books by year read
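
The grouping logic boils down to very little code. Here's a rough sketch of the idea; the dictionary keys and exact status labels are simplified stand-ins, not my actual spreadsheet columns:

from collections import defaultdict

def group_books(books):
    # Split the books into the four main sections based on their status
    sections = defaultdict(list)
    for book in books:
        sections[book["status"]].append(book)

    # Group the finished books by the year they were read, newest year first
    read_by_year = defaultdict(list)
    for book in sections["read"]:
        read_by_year[book["year_read"]].append(book)

    return sections, dict(sorted(read_by_year.items(), reverse=True))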

Considering cover images and pagination

I make it a priority to keep this site lightweight.

I don't rely on any kind of frameworks and write all the HTML and CSS by hand. As a result, most pages are just a double-digit number of kilobytes in size. When I do share images, I try to downsize and compress them to reasonable sizes. My definition of reasonable in this context: preserving just enough image quality to get the point across.

Displaying potentially hundreds of book cover images on a single page is not the most efficient approach. So, for the first iteration I dropped cover images altogether. I didn't like it. There's something about a book that doesn't come across at all when only reading the title versus seeing the book cover. Especially if it's a book you're familiar with. At least for me, the cover sparks an instant and visceral recognition.

And — let's be honest here — this section of the website will be used mostly by myself to peruse my reading history from time to time. To scratch that itch properly I need cover images.

The relevant question, then, became "how low can I go?" in terms of size and compression. After some experimenting, I decided that a stamp-like image width of 150 pixels and very high compression was enough to preserve the desired effect. A bit of blur and smudge isn't a problem. I don't care about polish, just about conveying the "character" of each book through its cover and maintaining that element of recognition upon seeing the image.
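
The downsizing itself can be done with something like Pillow. This is only a sketch of the general approach, not my exact pipeline, and the quality value is indicative:

from PIL import Image

def shrink_cover(source_path, target_path, width=150, quality=30):
    # Resize to a stamp-like width and save as a heavily compressed JPEG
    with Image.open(source_path) as img:
        height = round(img.height * width / img.width)
        img.resize((width, height)).convert("RGB").save(
            target_path, "JPEG", quality=quality, optimize=True
        )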

This let me keep each book cover image under 10 kilobytes. Coupled with setting the loading="lazy" attribute on all the images, so that they don't load before they are shown on screen, it felt like a decent compromise. The result is an initial page load of 131 kilobytes, increasing to 1.11 megabytes if you scroll through the entire page with the 142 books currently listed.
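
In the generated markup each cover is just a plain img tag with lazy loading enabled. Roughly like this, with the alt text and dimensions as illustrative examples rather than my exact output:

def cover_img_tag(src, title, width=150, height=225):
    # Lazy loading keeps off-screen covers from being fetched on initial page load
    return (
        f'<img src="{src}" alt="Cover of {title}" '
        f'width="{width}" height="{height}" loading="lazy">'
    )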

That means I can probably add five times as many books to my reading history before coming close to what the average website with a few lines of text weighs these days. What a time to be alive!

It still feels a bit on the fat side, though. At the same time, dropping pagination keeps the code for generating the page as simple as possible. I also like having all books listed on a single page, because it lets me use cmd/ctrl + F to search my entire reading log for a particular book or author.

Setting up the script

My static site generator runs on my home server. It's just an old, always-on MacBook Pro running with the screen deactivated and the lid closed. The site generator is set up as a single executable script that runs all the various modules: generating new content pages, updating the front page, archive pages and feeds, and so on. Another script monitors the content directory and triggers the site generator executable whenever a new file appears.
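
I won't go through the monitoring script in detail here, but conceptually it is little more than this sketch (using the watchdog package purely for illustration; the paths are made up):

import subprocess
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class NewContentHandler(FileSystemEventHandler):
    def on_created(self, event):
        # A new file in the content directory triggers a full run of the site generator
        if not event.is_directory:
            subprocess.run(["python3", "/path/to/site_generator.py"])

observer = Observer()
observer.schedule(NewContentHandler(), "/path/to/content", recursive=True)
observer.start()

try:
    while True:
        time.sleep(60)
finally:
    observer.stop()
    observer.join()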

Given my eureka moment mentioned above, solving this seemed simple enough. Set up a separate script to generate the reading log. Don't worry about checking whether or not the reading log has actually changed, just call it whenever the main executable runs. That's easily done by adding two lines of code to the main executable.

First, I pull in the main function of the reading log module:

from readinglog_processing import process_reading_log

Then, add it to the end of the loop where I call the various functions to update everything after processing new content:

generate_posts_archive(output_dir, settings)
generate_notes_archive(output_dir, settings)
generate_front_page(output_dir, settings)
generate_feeds(output_dir, settings)
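# New: regenerate the reading log on every run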
process_reading_log(output_dir, settings)

Of course, I might go days or weeks or even months without adding new content to this website. Yet I would still want my reading section to remain current. To solve this without adding any unnecessary complexity whatsoever, I decided to simply add a daily scheduled run to my monitoring script. So at 22:30 CET each night the site generator runs and thus updates my reading section.
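
The scheduled run is nothing more than a clock check. Here's a simplified sketch of the idea, not my exact monitoring code:

import subprocess
import time
from datetime import datetime

last_run_date = None

while True:
    now = datetime.now()
    # Once per day, on the first check at or after 22:30, run the site generator
    if (now.hour, now.minute) >= (22, 30) and last_run_date != now.date():
        subprocess.run(["python3", "/path/to/site_generator.py"])
        last_run_date = now.date()
    time.sleep(60)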

The actual reading log generation script is quite straightforward. As I can't write code, I rely on LLMs to help me translate the logic from plain English to Python. Meaning it is probably all just spaghetti code. But it works.

I use the ezodf package to parse the data from the spreadsheet. (Because I do, of course, use LibreOffice and Open Document Format (ODF) to handle and store my data.)
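
Reading the data is a matter of opening the document, picking the right sheet and turning each row into a dictionary keyed by the header row. Roughly like this (the file name and column layout are simplified stand-ins):

import ezodf

def read_books(path="reading_log.ods"):
    doc = ezodf.opendoc(path)
    sheet = doc.sheets[0]
    rows = iter(sheet.rows())
    headers = [cell.value for cell in next(rows)]  # first row holds the column names
    books = []
    for row in rows:
        values = [cell.value for cell in row]
        if any(values):  # skip empty rows
            books.append(dict(zip(headers, values)))
    return books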

Afterwards I pull in the styling and create the page title, before the magic happens in the main function. Here the templates are loaded, the books grouped and sorted, and the content generated. In line with the logic for the other parts of my static site generator, I then replace a placeholder in the template file with the generated content before combining everything into the finished HTML document.
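
The final assembly is nothing fancier than a string replacement. Something like this, where the placeholder name and paths are illustrative rather than the exact ones I use:

def write_reading_log(template_path, output_path, content):
    # Drop the generated book list into the page template and write the finished document
    with open(template_path, encoding="utf-8") as f:
        template = f.read()

    html = template.replace("<!-- READING_LOG -->", content)

    with open(output_path, "w", encoding="utf-8") as f:
        f.write(html)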

Design

I'll readily admit that it's not the prettiest page on the web. But, to toot my own horn a little, I think it looks better than my reading history on Goodreads. The general idea was to use a card-style layout, with a card for each book. I then wanted to style each card to make it resemble the cover of a book. The result, well... Insert your favourite "nailed it" meme here!

Nevertheless, it was an interesting exercise in working with CSS flexbox. It took a bit of playing around to get it to work the way I wanted on various screen sizes. I landed on an approach that keeps each card a fixed width and goes from three-card rows to two-card rows as the viewport shrinks. When there's only room for one card per line (504 pixels), the behaviour changes and each card fills the entire width.

In the end, I was quite happy with how this turned out. I might clean up the styling a bit in the future, for the cards themselves as well as the table of contents links. But for now, it does the job.

Wrapping up

If you haven't yet, hop on over to the Reading section to check out the finished product.

A year after I started working on this in Wordpress, I'm very happy to have a way of displaying my reading history that works exactly the way I want. It suits my needs and preferences, both in how it works and how it looks. And I have full control if I want to make any changes to the setup in the future.


  1. At some point in the future I might explore how I can make use of HTML tables and their "sortable" and "searchable" properties and some simple JavaScript to make it easier to explore the data. I don't exactly know what that might look like, and this is more than anything a note to my future self to look into that.