How This Site Is Built

24 January 2021

Someone sent me an email after an article I wrote was posted on Reddit. The email had a question about how this site was built. “Pure HTML that looks so good with so little CSS is a great skill indicator,” said the reader.

I hate to disappoint that reader, but I’m going to do it. There’s definitely no web development skill here.

This site is put together with cat(1), awk(1) and a bit of shell script called by make(1). Why? For a longer answer, see the bottom of this article. For a short answer, well, if “no dependencies” isn’t interesting for you, then don’t waste your time on the walkthrough of this “static blog generator” hackjob ;)

Just Gimme The Numbers

Delete all generated files:

rm 20*/*/*.html index.html

Then make them all again:

time make `echo 20*/*/*.md | sed 's/\.md/\.html/g'` index.html
0.37 real         0.18 user         0.22 sys

Essentially instant if it’s all generated:

time make `echo 20*/*/*.md | sed 's/\.md/\.html/g'` index.html
0.01 real         0.00 user         0.00 sys

I use lowdown instead of markdown, which saves time by not having to start up the Perl interpreter for every single article.

Hierarchical filesystem

I’m a big fan of Plan 9, so I often think about stuff in terms of file trees. I put each article into a tree, as follows:

$year/$month/$name

For example, for 2020:

2020/10/got.md
2020/10/jump.md
2020/10/nice-guy.md
2020/11/open-minded.md
2020/11/quitting-computer-hoarding.md
2020/12/escape-to-the-tropics.md
2020/12/xmas-panic-buyers.md

We have a bunch of articles not written in HTML.

One Article: `cat` and Markdown

An article is written in markdown. No fancy markdown extensions.

Each article follows a strict format.

The first line must be a top-level header, i.e. an <h1> (# in markdown).
The second line is a date in the format day month year, such as 24 January 2021.

The rest of the markdown file can be… whatever. See a source file of this site for an example.
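The two-line format is easy to check mechanically. Here is a minimal sketch, assuming the date is written exactly as in the example above (one or two digits, a capitalised month name, a four-digit year). The function name `check_article` is made up for illustration and is not part of this site's build.

```shell
#!/bin/sh
# check_article FILE: succeed only if FILE follows the two-line header format.
# Hypothetical helper for illustration; not part of the site's Makefile.
check_article() {
    file=$1
    # Line 1 must be a top-level markdown header: "# Some Title"
    sed -n '1p' "$file" | grep -q '^# ..*' || return 1
    # Line 2 must look like "24 January 2021": day, month name, 4-digit year
    sed -n '2p' "$file" | grep -Eq '^[0-9]{1,2} [A-Z][a-z]+ [0-9]{4}$' || return 1
    return 0
}
```

Something like `check_article 2021/01/site-build.md || echo "bad header"` would flag a file before it reaches the rest of the pipeline.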

How do we turn markdown into HTML? The thing I like about John Gruber’s original markdown is that it’s distributed as a good old fashioned UNIX tool which reads from standard input and writes to standard output. So to turn site-build.md into HTML, we just run:

markdown 2021/01/site-build.md

And HTML is written to standard output.

To write it to an HTML file, just do some redirection:

markdown 2021/01/site-build.md > 2021/01/site-build.html

But the output isn’t really valid HTML. We need to concatenate markdown(1)’s output with some <html> and <body> tags, at least. There’s a tool made for this very purpose, but it’s not really used like this so much anymore: cat(1).

This site uses 2 files: head and footer which contain the opening and closing html and body tags.

Here’s head:

<html>
    <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>olowe.co</title>
    <link rel="stylesheet" href="/style.css">
    </head>
    <body>
    <nav>
        <a href="/index.html">Home</a> |
        <a href="/who.html">Who</a> |
        <a href="/projects.html">Projects</a> |
        <a href="mailto:o@olowe.co">Contact</a>
    </nav>
    <hr />

And footer:

<i>Copyright 2010-2020 Oliver Lowe. All rights reserved.</i>
</body>
</html>

The command we run to make this article is:

markdown 2021/01/site-build.md | cat head /dev/stdin footer > 2021/01/site-build.html

That is, we put together the contents of head, standard input (from markdown), then footer.
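To see the concatenation on its own, here is a small sketch with `printf` standing in for markdown's output. The function name `wrap_page` is hypothetical. It uses `-`, which cat(1) treats as standard input, in place of /dev/stdin.

```shell
#!/bin/sh
# wrap_page HEAD FOOTER: print HEAD, then standard input, then FOOTER.
# Hypothetical helper; "-" is how cat names standard input, so this
# should behave like "cat head /dev/stdin footer".
wrap_page() {
    cat "$1" - "$2"
}

# Usage sketch, with printf standing in for markdown:
#   printf '<h1>Hello</h1>\n' | wrap_page head footer > hello.html
```

The order of the arguments to cat is the order of the output, so the piped HTML lands between the two files.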

More Articles: `make`

The command line is the same for all articles. The recipe is:

markdown $source | cat head /dev/stdin footer > $output

There’s a program that was designed to perform operations to “make” files like this. It’s another tool, like cat(1), whose popular usage does not really demonstrate its original purpose: make(1).

Here’s the rule from Makefile which tells make(1) how to build articles:

.SUFFIXES: .md .html
.md.html: head footer
    markdown $< | cat head /dev/stdin footer > $@

The syntax is cryptic at first. First, we tell make to be aware of two new file suffixes: .md and .html.

Then we provide an inference rule. We’re telling make how to turn any file with the .md suffix to one with a .html suffix (with dependencies head and footer). The line of shell script is what we had before, but with some special tokens:

$< is the input file
$@ is the output file.

With this rule, I can run the following command:

make 2021/01/site-build.html

and our original command from above will be run. When I write a new article, say some-story.md, all I need to do is run make again to build the HTML:

make 2021/01/some-story.html

make already knows how to take some-story.md and turn it into some-story.html from the inference rule we wrote.

What if I edit an article in the future and want to rebuild the HTML? make looks at the last modified times of both the markdown and HTML files. If the markdown file’s modified time is later than the HTML, then it will rebuild the HTML. If not, it does nothing, and prints a message instead:

make: `2020/12/escape-to-the-tropics.html' is up to date.
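The timestamp check can be modelled in a few lines of shell. This is a simplified sketch, not how make is implemented: it compares only one source against one target, and it relies on the `-nt` ("newer than") test, which bash, dash and ksh support but which is not strictly POSIX. The name `needs_rebuild` is made up for illustration.

```shell
#!/bin/sh
# needs_rebuild SOURCE TARGET: succeed if TARGET is missing or older than SOURCE.
# A simplified model of the check make performs; -nt is a widely
# supported test(1) extension, assumed to be available here.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Usage sketch:
#   needs_rebuild 2021/01/site-build.md 2021/01/site-build.html && echo "rebuild"
```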

To build more than one article, I can run make with a few more arguments:

make 2021/01/site-build.html 2020/12/escape-to-the-tropics.html

It’s safe to keep running this command to rebuild the articles.

I’ve written over 130 pieces over the years, so I’m not going to keep running a big long command all the time. That’s where the shell comes in.

All Articles: The Shell

All articles in the tree are matched by the following shell glob rule:

20*/*/*.md

What I want is a list of all the output files to pass to make. I can get this with a bit of sed by replacing the .md suffix with .html:

echo 20*/*/*.md | sed 's/\.md/\.html/g'
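One caveat with that sed expression: it replaces .md wherever it appears in a name, not only at the end, so a hypothetical file called notes.mdx.md would come out as notes.htmlx.html. None of the article names here trip over that. If it ever mattered, a sketch of an alternative is to strip the suffix per file with the shell's `${f%.md}` expansion. The function name `html_targets` is made up for illustration.

```shell
#!/bin/sh
# html_targets FILE...: print one .html target per .md argument.
# Hypothetical alternative to the sed pipeline; ${f%.md} removes the
# suffix only when it is at the very end of the name.
html_targets() {
    for f in "$@"; do
        printf '%s\n' "${f%.md}.html"
    done
}

# Usage sketch:
#   make $(html_targets 20*/*/*.md)
```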

To build all the articles:

make `echo 20*/*/*.md | sed 's/\.md/\.html/g'`

For a good reference on make, see OpenBSD’s manual page make(1).

We’ve got our articles. Now to build a homepage.

Homepage

I wanted index.html to be a simple list of every article. Doing this was a bit trickier than writing a little shell script. Somehow I decided to use awk(1).

index.awk is a little awk script which does some very…(!) brittle parsing (or really just tokenizing) of article files, and writes HTML to standard output.

Given the file 2020/12/escape-to-the-tropics.md, it will do the following:

Read the title from the first line
Read the date, and from it the year, from the second line
Print a nice big heading of the current year
Print an HTML link to the article, followed by the date
Print a line break, ready for the next file

Example output:

<h1>2020</h1>
<a href="2020/12/escape-to-the-tropics.html">Escape To The Tropics in Cairns, Queensland, Australia</a> 30 December 2020
<br>

We do the usual wrapping of the output in head and footer to make it a real webpage:

awk -f index.awk 20*/*/*.md | cat head /dev/stdin footer > index.html
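For a feel of what such a script involves, here is a rough sketch of the steps listed above. It is a stand-in written for illustration, not the real index.awk. It assumes every file follows the strict two-line format, that the year heading is printed only when the year changes from one file to the next, and that titles contain nothing that needs HTML escaping. The name `build_index` is hypothetical.

```shell
#!/bin/sh
# build_index FILE...: print a year heading and a link for each article.
# Hypothetical stand-in for index.awk; assumes line 1 is "# Title" and
# line 2 is "day month year". Output order follows argument order.
build_index() {
    awk '
        FNR == 1 { title = substr($0, 3) }       # drop the leading "# "
        FNR == 2 {
            date = $0
            year = $NF                           # last word of the date
            if (year != last_year) {             # year changed: print a heading
                printf "<h1>%s</h1>\n", year
                last_year = year
            }
            href = FILENAME
            sub(/\.md$/, ".html", href)          # link to the generated page
            printf "<a href=\"%s\">%s</a> %s\n<br>\n", href, title, date
        }
    ' "$@"
}

# Usage sketch:
#   build_index 20*/*/*.md | cat head /dev/stdin footer > index.html
```

FNR is awk's per-file line counter, which is what makes "first line" and "second line" work across many input files.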

See index.awk and Makefile.

Why?

My blogging platform experience goes like this:

2002: Angelfire. I was just a kid writing HTML.
2007: WordPress. When I tried to export articles from this years ago I had no idea what a database was so I just gave up and deleted the sites.
2011: Tumblr. Got an XML export from this.
2014: Octopress/Jekyll. Saved some files from this, but I forgot how to put it back together again. I need to install ruby 1.9.3 and rake and … huh?
2020: Hugo. Less to install, but more documentation for another framework. Will Hugo go stale like Octopress?

Eventually I just wanted the most stupid, good enough solution using plain text files and no fancy frameworks. Pipes, text streams and awk will probably last another 20 years. I actually use Hugo for srcbeat. For my personal, life legacy stuff, I don’t want to rely on a framework like last time. And the time before that… And the time before that…

Bugs

The title of every page is “olowe.co”. It would be nice to include the title of the article.
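One possible fix, sketched under some assumptions: pull the title from the article's first line and substitute it into head with sed before concatenating. The function name `head_with_title` and the "Title - olowe.co" layout are made up for illustration, and the sketch breaks if a title contains `&`, `|` or a backslash, since those are special in the sed replacement.

```shell
#!/bin/sh
# head_with_title HEAD ARTICLE: print HEAD with <title> set from the
# first line of ARTICLE. Hypothetical helper; assumes the title holds
# no "&", "|" or backslash, which are special in the sed replacement.
head_with_title() {
    title=$(sed -n '1s/^# //p' "$2")
    sed "s|<title>.*</title>|<title>$title - olowe.co</title>|" "$1"
}

# Usage sketch:
#   head_with_title head 2021/01/site-build.md > head.tmp
```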

Within a given month, the articles are not in correct chronological order, because index.awk does not parse or sort the input it gets. It receives stuff in lexicographic order from the shell.

I could drop the Markdown dependency entirely and just write articles in plain HTML with <p>, <h2>, <code> tags. But Markdown contains barely any markup (get it?); the articles are perfectly readable as plain text files. If the whole software world blew up, I’d rather have plain text than HTML.

No RSS or Atom or JSON feed. I’ve thought about writing a little Go program which parses the files and spits out a feed from a template file.