anirudh@Shadow:~$ cat I built an SSG that beat hugo.md

I built an SSG that beat hugo

I spend my time working on low-level systems and distributed architecture in Rust. When it came to building my own site, I wanted to go full systems with it. The answer was a Static Site Generator. And I wanted it to be easy-to-use, minimal and fast. Really fast. That project became Palya.

Date: 19/04/2025

What is Palya?

Palya is a minimal static site generator (you can check out the lore behind the name here). You give it a directory of Markdown files and a folder of Jinja-style templates, and it spits out a complete HTML site. No Node.js, no Python runtime, no config file required if you don’t want one. Just a single binary you download and run.

It’s not trying to be the next Hugo or Zola. It’s trying to be fast, simple, and easy to understand!

How It Was Built

The commit history tells the full story, but the development followed a clear, iterative arc: from a naive parser to a fully parallelized engine.

Phase 1: The Minimum Viable Generator

I started with the simplest possible premise: read a single Markdown file, render it via a template, and write the HTML output. The initial prototype was built in a day using pulldown-cmark for parsing and Tera for templating. It worked, but it wasn’t fast enough.

Phase 2: Scaling Up & Swapping Engines

As the requirements grew, so did the architecture. I swapped Tera out for MiniJinja to chase better rendering performance. I integrated walkdir to recursively process complex content hierarchies and added essential features like pretty URLs, draft support, and taxonomy generation.
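To make "pretty URLs" concrete, here is a minimal std-only sketch of one common scheme: each post becomes a directory with an index.html inside it, so the URL has no .html suffix. The function name pretty_output_path and the exact mapping are assumptions for illustration; Palya's real rules may differ.

```rust
use std::path::{Path, PathBuf};

/// Map a content-relative Markdown path to a "pretty" output path.
/// (Hypothetical helper; the mapping below is an assumed convention.)
///
/// `blog/my-first-post.md` becomes `blog/my-first-post/index.html`,
/// while an `index.md` stays in its own directory as `index.html`.
fn pretty_output_path(rel: &Path) -> PathBuf {
    let stem = rel.file_stem().and_then(|s| s.to_str()).unwrap_or("index");
    let parent = rel.parent().unwrap_or_else(|| Path::new(""));
    if stem == "index" {
        parent.join("index.html")
    } else {
        parent.join(stem).join("index.html")
    }
}
```

With this layout a web server can serve /blog/my-first-post/ directly from the directory's index.html.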

Phase 3: The Incremental Cache

This was the turning point for performance. I implemented an incremental build system that hashes each source file and stores the state in a BuildCache. On subsequent runs, Palya skips parsing and rendering entirely if a file’s hash remains unchanged. At a scale of 7,500 posts, this caching strategy drops the rebuild time of a single modified file from ~2 seconds down to ~400ms.
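The idea can be sketched in a few lines. This is not Palya's actual code: the BuildCache fields, the needs_rebuild method, and the choice of std's DefaultHasher are all assumptions made so the example stays self-contained, and persisting the cache between runs is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache layout: source path -> hash of the contents last built.
#[derive(Default)]
struct BuildCache {
    hashes: HashMap<String, u64>,
}

/// Hash the raw file contents. DefaultHasher is used only to avoid external
/// crates; its output is not guaranteed stable across Rust releases, so a
/// cache written to disk would want a hash with a fixed specification.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

impl BuildCache {
    /// Record the current hash and report whether the file must be rebuilt.
    /// A file is rebuilt when it is new or its hash differs from last time.
    fn needs_rebuild(&mut self, path: &str, contents: &[u8]) -> bool {
        let new_hash = content_hash(contents);
        match self.hashes.insert(path.to_string(), new_hash) {
            Some(old_hash) => old_hash != new_hash,
            None => true,
        }
    }
}
```

A build loop would call needs_rebuild for each source file and skip parsing and rendering whenever it returns false.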

Phase 4: Zero-Bloat Features

With the core engine stable, I focused on adding functionality without sacrificing client-side performance. I integrated syntect for build-time syntax highlighting, eliminating the need for heavy JavaScript libraries. Finally, I built a pre-compiled search.json index that pairs with minisearch.js to provide full-text, client-side search without a backend.
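As a rough illustration of a build-time search index, the sketch below serializes a list of documents to a JSON array that a client-side library could load. The SearchDoc fields (id, title, url, body) and the hand-rolled JSON writer are assumptions; the post does not describe the real search.json schema, and a real implementation would more likely use a JSON library.

```rust
/// One searchable document. Field names are assumed for illustration.
struct SearchDoc {
    id: usize,
    title: String,
    url: String,
    body: String,
}

/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \u00XX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialize all documents into a single JSON array.
fn to_search_json(docs: &[SearchDoc]) -> String {
    let items: Vec<String> = docs
        .iter()
        .map(|d| {
            format!(
                "{{\"id\":{},\"title\":\"{}\",\"url\":\"{}\",\"body\":\"{}\"}}",
                d.id,
                json_escape(&d.title),
                json_escape(&d.url),
                json_escape(&d.body)
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```

Because the file is produced once at build time, the browser only has to fetch it and hand it to the search library; no server-side query endpoint is involved.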

The Parallelism Advantage

Concurrency was baked in early and is intentionally straightforward: it relies on rayon’s par_iter to process the independent content files. The benchmarks show this scaling effectively, with 8.893s of user time condensed into a 2.134s wall clock time. That works out to roughly four cores busy on average, so the work really is being spread across the hardware rather than running on a single core.
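The shape of that parallel step can be sketched without any dependencies. Note the swap: this example uses std::thread::scope with fixed-size chunks instead of rayon, purely so it compiles on its own; render and render_all are hypothetical names, and render is a trivial stand-in for the real parse-and-render work.

```rust
use std::thread;

/// Stand-in for "parse and render one file" (hypothetical).
fn render(markdown: &str) -> String {
    format!("<p>{}</p>", markdown.trim())
}

/// Render every input on up to `workers` threads, preserving input order.
fn render_all(inputs: &[String], workers: usize) -> Vec<String> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_size = ((inputs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn one scoped thread per chunk. Collecting the handles first
        // ensures every thread is started before any of them is joined.
        let handles: Vec<_> = inputs
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().map(|m| render(m)).collect::<Vec<_>>()))
            .collect();
        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

With rayon the same step collapses to roughly `inputs.par_iter().map(|m| render(m)).collect::<Vec<_>>()`, and rayon's work stealing balances uneven files better than the static chunks used here.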

Features

Parallel builds using Rayon — utilizes all CPU cores, not just one
Jinja-style templates via MiniJinja with full .j2 template support, including {% extends %} inheritance
Incremental builds — only re-renders files that have actually changed
Content collections — blog/, notes/, projects/, etc. are automatically detected and given appropriate template context variables
Tag pages — aggregated automatically from frontmatter, no config needed
Syntax highlighting — via syntect at build time, no client-side JS (you can choose the theme too btw ;)
Client-side search — build-time search.json index with minisearch.js
Single static binary — no runtime dependencies
Optional config — palya.toml is entirely optional for minimal setups
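The tag pages item in the list above boils down to grouping posts by the tags in their frontmatter. A minimal sketch, assuming a simplified Post struct and a hypothetical collect_tags function (neither is taken from Palya's source):

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed post; only the fields needed here.
struct Post {
    title: String,
    tags: Vec<String>,
}

/// Group post titles by tag. A BTreeMap keeps the tags sorted, so the
/// generated tag pages come out in a deterministic order.
fn collect_tags(posts: &[Post]) -> BTreeMap<String, Vec<String>> {
    let mut by_tag: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for post in posts {
        for tag in &post.tags {
            by_tag
                .entry(tag.clone())
                .or_default()
                .push(post.title.clone());
        }
    }
    by_tag
}
```

Each entry in the resulting map would then be rendered as one tag page listing its posts.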

Usage

Download the binary from the releases page and make it executable:

chmod +x palya

Build a site:

./palya \
  --input path/to/my_site \
  --output path/to/dist

Or just run the binary from inside the my_site directory (your site, lol)!

A typical site layout looks like this:

my_site/
├── content/
│   ├── blog/
│   │   └── my-first-post.md
│   └── notes/
│       └── rust-ownership.md
├── static/
│   └── style.css
├── templates/
│   ├── base.j2
│   ├── post.j2
│   └── index.j2
└── palya.toml

Frontmatter

---
title: My First Post
date: 2026-01-15
tags:
  - rust
  - webdev
draft: false
---

Your content here.
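For a sense of how a file like this can be taken apart, here is a deliberately small std-only sketch that splits the frontmatter from the body and reads simple key: value lines. The function names split_frontmatter and scalar are hypothetical, and a real generator would hand the frontmatter text to a proper YAML parser rather than scanning lines.

```rust
/// Split a document into (frontmatter, body) when it opens with a `---` line.
/// Returns None if there is no opening or closing delimiter. This simple
/// version expects Unix newlines and at least one line of frontmatter.
fn split_frontmatter(src: &str) -> Option<(&str, &str)> {
    const CLOSE: &str = "\n---\n";
    let rest = src.strip_prefix("---\n")?;
    let end = rest.find(CLOSE)?;
    Some((&rest[..end], &rest[end + CLOSE.len()..]))
}

/// Read a top-level `key: value` scalar from the frontmatter text.
/// List items such as `  - rust` contain no colon and are skipped.
fn scalar<'a>(front: &'a str, key: &str) -> Option<&'a str> {
    front.lines().find_map(|line| {
        let (k, v) = line.split_once(':')?;
        (k == key).then(|| v.trim())
    })
}
```

A draft check could then be as simple as comparing `scalar(front, "draft")` against `Some("true")` before rendering.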

Templates

Templates use Jinja2 syntax. The context variable available in a template depends on which collection the content file lives in:

Directory	Context variable
`blog/`, `notes/`, `tutorials/`, etc.	`post`
`projects/`	`project`
`pages/`	`page`
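The table above amounts to a lookup on the first directory of the content path. A minimal sketch, assuming a hypothetical context_variable function and assuming that anything outside projects/ and pages/ falls back to post:

```rust
use std::path::Path;

/// Pick the template context variable name from a content-relative path.
/// The fallback to "post" for every other collection is an assumption
/// based on the "etc." in the table.
fn context_variable(rel: &Path) -> &'static str {
    let collection = rel
        .components()
        .next()
        .and_then(|c| c.as_os_str().to_str());
    match collection {
        Some("projects") => "project",
        Some("pages") => "page",
        _ => "post",
    }
}
```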

A basic post template:

{% extends "base.j2" %}
{% block content %}
<article>
  <h1>{{ post.frontmatter.title }}</h1>
  <p>{{ post.frontmatter.date }}</p>
  {{ post.content | safe }}
</article>
{% endblock %}

Configuration

palya.toml is optional. If present:

title       = "My Site"
description = "A blog about things"
base_url    = "https://mysite.com"
author      = "Your Name"

All values are available as {{ site.title }}, {{ site.description }}, etc. in templates.

Benchmarks

I wanted to know how Palya actually performs against Hugo, which is the standard for “fast static site generator”. But I also wanted to do this honestly, as Hugo does more work per file due to all the features it supports. So, to make the benchmarks fair, I disabled everything in Hugo that Palya doesn’t do.

Methodology

The benchmark corpus is generated by create.py with a fixed random seed (random.seed(42)) for reproducibility. It generates 7500 Markdown posts distributed across three content collections (blog/, notes/, tutorials/), with two tags each, randomized dates, and realistic body content including code blocks.

Both tools use the same three collections and the same 7500 posts. Hugo’s hugo.toml has disableKinds = ["RSS", "sitemap", "robotsTXT"] set — these are outputs Hugo generates by default that Palya doesn’t have an equivalent for. Disabling them levels the playing field so both tools are only generating post pages, section index pages, and tag pages.

The benchmark itself is run with hyperfine, 1 warmup run and 10 timed runs, with the output directory deleted before each run to force a full cold build every time. Results were verified: Palya produced 7520 HTML files and Hugo produced 7527. The 7-file difference is Hugo generating a handful of extra pages (a root 404, taxonomy index, etc.) that have no meaningful effect on timing at this scale.

Hardware: Apple M-series, macOS Darwin 25.1.0. Hugo v0.154.4+extended, Palya 0.4.0.

Results

Benchmark 1: palya --input ssg_benchmark/palya --output ssg_benchmark/palya/dist
  Time (mean ± σ):      2.134 s ±  0.065 s
  Range (min … max):    2.049 s …  2.234 s    10 runs

Benchmark 2: hugo --source ssg_benchmark/hugo --destination ssg_benchmark/hugo/public
  Time (mean ± σ):      2.986 s ±  0.038 s
  Range (min … max):    2.910 s …  3.035 s    10 runs

Summary: palya ran 1.40 ± 0.05 times faster than hugo

Palya builds 7500 posts in 2.13 seconds. Hugo does it in 2.99 seconds. That’s a 1.40x speedup.

I’ll be honest — my first run of this benchmark before fixing the methodology showed 1.92x. That number felt great and it was wrong. Hugo was generating RSS feeds and sitemaps on top of the post pages; Palya wasn’t. The 1.40x you see here is the real number, with both tools doing the same job.

The user time is also worth looking at: Palya uses 8.893s of CPU across all cores for a 2.134s wall clock. Hugo uses 13.692s of CPU for 2.986s. Palya is both faster on the wall clock and doing about 35% less total CPU work per build! The win is genuine: it comes from doing less work, not just from throwing more cores at the same amount of work.
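The ratios behind those statements are easy to check. The sketch below only uses the four timings quoted in this post; the constant and function names are made up for the example, and system time is ignored because it is not reported here.

```rust
/// Timings quoted in the post, in seconds.
const PALYA_WALL: f64 = 2.134;
const PALYA_USER: f64 = 8.893;
const HUGO_WALL: f64 = 2.986;
const HUGO_USER: f64 = 13.692;

/// Wall-clock speedup of Palya over Hugo.
fn speedup() -> f64 {
    HUGO_WALL / PALYA_WALL
}

/// Average number of cores kept busy: user CPU time divided by wall time.
fn busy_cores(user: f64, wall: f64) -> f64 {
    user / wall
}

/// Palya's total user CPU time as a fraction of Hugo's.
fn cpu_ratio() -> f64 {
    PALYA_USER / HUGO_USER
}
```

Evaluated, these give a speedup of about 1.40, roughly 4.2 busy cores for Palya versus 4.6 for Hugo, and a CPU ratio near 0.65. On these numbers both tools parallelize to a similar degree, and Palya's lead comes mostly from needing about 35% less CPU time overall.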

What’s Next

A few things I want to add:

Zola benchmark — I scaffolded Zola support in the benchmark tooling but haven’t run it yet. Zola is also Rust-based so it’s the more interesting comparison.
Watch mode — palya --watch that rebuilds on file change, useful for local development
More speed!

If you want to try it or run the benchmarks yourself, everything is on GitHub. The benchmarks/ directory has create.py and bench.sh. Set your Palya binary path, run bench.sh, and it handles the rest.