gx – Build a Blog Website by Pandoc

I have built my blog with Hugo, Zola, and Jekyll before, but I ran into problems from time to time. For example,

Above all, all these SSGs require me to attach front matter to every page. That is very inconvenient if I just need … a page. For example, I write my notes in Markdown files. I want to build a local website to render them so that I can read the math expressions more easily. Of course, my notes have no front matter, as they are just notes.

I have wanted to build an SSG myself for a long time. I thought I could do that with pandoc, which is very powerful and even supports non-Markdown formats such as Org and LaTeX. The only barrier was that I didn’t know how to generate an Atom feed. However, after reading its definition, I found it was much easier than I thought.

So I migrated my posts this week. As I now generate the feed myself, old posts may show up again as new items in RSS readers. But I think nobody subscribes to my website at present, so it should be okay.

Web Pages

pandoc has its default template, which is good enough for me. Besides, I didn’t want to increase the migration workload, so I didn’t build my own template during the migration. What I did was adapt my CSS file to the output of pandoc.

pandoc \
    -s \
    path/to/markdown \
    -o path/to/html \
    --template path/to/template

The command to generate web pages is simple. --template is not necessary if the default template is used. There are some good flags to customize the default template:

Build

The website can be built easily with a Makefile. Read https://makefiletutorial.com/ if you are not familiar with Make.

I keep my posts in posts/, the generated files in public/, and other files in static/. These file paths can be defined as below. The URL of each post is /posts/some-slug-here/.

MD_FILES := $(shell find posts posts.md -name '*.md')
CONVERTED_HTML_FILES := $(patsubst %.md, public/%/index.html, $(MD_FILES))
STATIC_FILES := $(shell find static -type f)
PUBLIC_STATIC_FILES := $(patsubst static/%, public/%, $(STATIC_FILES))

Note that /posts/ comes from posts.md instead of posts/index.md. This keeps the Makefile rules simpler.
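To make the mapping concrete, here is a small shell sketch that mirrors what the patsubst call does for a single path. The function name md_to_html is made up for illustration; the real work is still done by Make.

```shell
# Illustrative only: mirrors $(patsubst %.md, public/%/index.html, ...)
# for one path. md_to_html is a hypothetical helper, not part of the build.
md_to_html() {
    # Strip the .md suffix and place the page in its own directory.
    printf 'public/%s/index.html\n' "${1%.md}"
}

md_to_html posts.md                 # public/posts/index.html
md_to_html posts/some-slug-here.md  # public/posts/some-slug-here/index.html
```

Because every page becomes an index.html inside its own directory, the URL can end in a slash without any server-side rewriting.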

PANDOC := pandoc --toc --include-before-body templates/navbar.html --css /css/main.css

.DEFAULT_GOAL := all
all: $(PUBLIC_STATIC_FILES) $(CONVERTED_HTML_FILES) public/index.html

public/%: static/%
    mkdir -p $(dir $@)
    cp $< $@

public/%/index.html: %.md templates/navbar.html
    mkdir -p $(dir $@)
    $(PANDOC) -s -o $@ $<

public/index.html: index.md templates/navbar.html
    mkdir -p $(dir $@)
    $(PANDOC) -s -o $@ index.md

Unfortunately, a dedicated rule for public/index.html is necessary, as it does not fit the pattern public/%/index.html.

Local Server

It is helpful to preview the rendered pages locally. I use Python to do that, as it is installed on almost every machine nowadays.

make -j
python3 -m http.server -d ./public

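The only problem is that the pages are not rebuilt after I change my files. A trick is to use watch to rebuild automatically, and live.js to refresh the pages in the browser. The script tag below goes into templates/livejs.html, which the Makefile includes only when INCLUDE_LIVEJS is set.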

<script src="https://livejs.com/live.js"></script>

ifdef INCLUDE_LIVEJS
    PANDOC += --include-in-header=templates/livejs.html
endif

After running the following command, my pages are rebuilt automatically, and any page I have open in my browser refreshes itself.

watch -n 1 -- make INCLUDE_LIVEJS=1 -j

Atom Feed

The feed generation is the troublesome part. I would recommend doing it with packages like gorilla/feeds or python-feedgen. But I don’t want to pull in lots of dependencies, so I decided to generate it with bash.

I need to get the metadata defined in the Markdown files. I don’t want to parse Markdown files myself, so I decided to let pandoc do that. I created a file named templates/meta.json with a single line of content: $meta-json$. pandoc then returns the metadata as JSON.

One big problem is that I don’t know how to parse the files in parallel in a bash script. My solution is to cache the parsed results first and generate the feed later. Though it creates lots of temporary files, the performance is much better than iterating over my posts in a bash script.

META_JSONS := $(patsubst %.md, metajsons/%.json, $(MD_FILES))
HTML_CONTENT_FILES := $(patsubst %.md, genfeed/content/%.html, $(MD_FILES))

metajsons/%.json: %.md
    mkdir -p $(dir $@)
    pandoc $< --template=templates/meta.json > $@

genfeed/content/%.html: %.md
    mkdir -p $(dir $@)
    pandoc -t html $< | jq -Rr @html > $@

Then I only need to read the cached data when generating the feed, which is much easier. The script is here. Luckily, the publication dates of my posts are included in the filenames, so sorting them is not a big issue for me.

all: some_other_output_files public/atom.xml

public/atom.xml: bin/atomfeed.sh $(META_JSONS) $(HTML_CONTENT_FILES)
    mkdir -p public
    bin/atomfeed.sh > public/atom.xml
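As a starting point, the per-entry part of such a script could look roughly like the sketch below. It is not the actual bin/atomfeed.sh. It assumes filenames of the form posts/YYYY-MM-DD-slug.md, plain-text titles, and a placeholder SITE_URL, and all function names are made up. The content file is expected to be HTML-escaped already, as the genfeed/content rule above produces.

```shell
# Hypothetical sketch, not the real bin/atomfeed.sh.
# Assumptions: posts/YYYY-MM-DD-slug.md filenames, plain-text titles,
# and SITE_URL as a placeholder for the site's base URL.
SITE_URL="${SITE_URL:-https://example.com}"

# Escape the XML special characters of a plain-text value.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# posts/2024-05-01-hello.md -> 2024-05-01T00:00:00Z
post_updated() {
    # The first 10 characters of the file name are taken as the date.
    printf '%.10sT00:00:00Z\n' "$(basename "$1" .md)"
}

# posts/2024-05-01-hello.md -> $SITE_URL/posts/2024-05-01-hello/
post_url() {
    printf '%s/%s/\n' "$SITE_URL" "${1%.md}"
}

# Print one <entry>.
# $1 = Markdown path, $2 = title, $3 = file with the HTML-escaped body.
atom_entry() {
    printf '<entry>\n'
    printf '  <title>%s</title>\n' "$(xml_escape "$2")"
    printf '  <link href="%s"/>\n' "$(post_url "$1")"
    printf '  <id>%s</id>\n' "$(post_url "$1")"
    printf '  <updated>%s</updated>\n' "$(post_updated "$1")"
    printf '  <content type="html">'
    cat "$3"
    printf '</content>\n'
    printf '</entry>\n'
}
```

A full script would wrap these entries in a feed element with its own title, id, updated, and author, read each title from the cached JSON with something like jq -r .title, and pass the list of posts through sort -r so that the newest comes first.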

The same approach can be used to generate the index of /posts if you don’t want to write it manually.
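As a rough sketch of that idea, the function below turns a list of Markdown paths into a Markdown list of links, newest first. It assumes date-prefixed filenames so that a reverse sort gives the right order, and it uses the file name as the link text; a real version would probably read the titles from the cached metadata instead. posts_index is a hypothetical name.

```shell
# Hypothetical helper: read Markdown paths on stdin and print a Markdown
# list of links, newest first (assumes YYYY-MM-DD-prefixed file names).
posts_index() {
    LC_ALL=C sort -r | while IFS= read -r md; do
        slug=${md%.md}
        printf '* [%s](/%s/)\n' "$(basename "$slug")" "$slug"
    done
}

# Example (assumes a posts/ directory exists):
# find posts -name '*.md' | posts_index
```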

Gain and Loss

The loss? I spent a lot of time on it. It would have been much easier to keep using Zola or Hugo. But it may have been worthwhile: I learnt a lot about Makefiles during the migration, and I learnt the definition of atom.xml.

I lost some features that other SSGs provide. For example, Hugo and Zola allow static files to live in the same directories as the Markdown files, and my system doesn’t support that at present. The same goes for tags and categories. But I don’t think such features are that necessary, as I don’t write posts that often. A simpler system suits me better.

The gain is actually trivial. I won’t recommend anyone to do this, as building such a system costs time, especially one with lots of features. It just works for me, as I want an SSG that does not require front matter, and I don’t need lots of features.

The generator is very simple, built from jq, bash, pandoc, and make. They are all well maintained. If I want better features someday, I can rewrite any one of them instead of reading the documentation of new generators to check compatibility.