Exploring headless CMS and front-end SSG for blogs

Dec 30 2023

I remember when I first heard of headless CMS, I thought to myself: isn't that just a database with an interface for inputting data? I understood that headless means the front-end is not tied to the data itself, and that there are fancy terms like framework-agnostic to describe it. In contrast, a traditional CMS like WordPress bundles everything and manages it for you. So if I separate the data part from the presentation layer, wouldn't that just leave me with a database?

Well, that is pretty much the case, but the real-life model is far more complex than I had initially perceived.

The Data Model

Let's take a look at how the data model works:

In this model, a server stores some data and responds to users' requests. You can't get any simpler than this. This is the model that fits when individuals self-host their website and the data resides in the same location as the server. (I used to keep my PC on 24 hours a day to run pm2 and serve my vanilla JS website, but that's a story from the past.) However, most of us do not host our websites on our own computers. Instead, we mostly use third-party services like AWS, Vercel, DigitalOcean, etc. that provide the computing power to handle user requests. Now, these solutions work great. Their machines perform better, are more reliable and scale more easily, but not all platforms come with the ability to store the data we need, so we have to relocate the data somewhere else...

... which brings us to this second diagram where the data lives outside of the server. The server gets the code for your project, compiles and builds it to HTML, and serves the files when users request them (this is called Static Site Generation or SSG). Or, the server gets the code, builds and compiles it to some JavaScript files (or another language), then runs these files to generate HTML when users request a page (SSR, ISR, ... these are for another blog post). The data can be fetched during the build step or during the generation of HTML, depending on the implementation.
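
To make the difference concrete, here is a minimal sketch in TypeScript. The names (`Post`, `renderPost`, `buildStaticSite`, `handleRequest`) are hypothetical and not taken from any framework; the sketch only shows when the HTML is produced, not how a real framework does it.

```typescript
// Hypothetical content shape; a real project would get this from a CMS or database.
interface Post {
  slug: string;
  title: string;
  body: string;
}

// Turn one post into an HTML string (deliberately simplistic, no escaping).
function renderPost(post: Post): string {
  return `<article><h1>${post.title}</h1><p>${post.body}</p></article>`;
}

// SSG: render every page once at build time and keep the results as static files.
function buildStaticSite(posts: Post[]): Map<string, string> {
  const files = new Map<string, string>();
  for (const post of posts) {
    files.set(`/${post.slug}/index.html`, renderPost(post));
  }
  return files;
}

// SSR: render a page on demand, each time a user requests it.
function handleRequest(slug: string, posts: Post[]): string | undefined {
  const post = posts.find((p) => p.slug === slug);
  return post ? renderPost(post) : undefined;
}

const posts: Post[] = [{ slug: "hello", title: "Hello", body: "First post" }];
const staticFiles = buildStaticSite(posts); // done once, at build time
const liveHtml = handleRequest("hello", posts); // done per request
```

Both paths end up with the same HTML here; what changes is whether the rendering cost is paid once per build or once per request.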

But where does the server get the code? Cloudflare Pages allows you to drag and drop files from your computer directly to their cloud server, but it is slow and not automatic. The most popular solution for small projects is to use GitHub repositories to store the files.

In this model, whenever the developers push code to GitHub, it automatically triggers the server to fetch and build the code (if there is an integration). It is still a very simplified diagram that only describes basic use cases, but we will work with this.

Headless CMS

A headless CMS technically does not replace the database component. It is connected to it. And as I will introduce later, some headless CMS use a database in their own cloud, some require you to provide your own database and some use GitHub itself as the database.

API-first Headless CMS

API-first headless CMS refers to a CMS that provides API endpoints for the data. Users can query the data from anywhere through different means such as HTTP requests, GraphQL queries, etc. Some have their own database for storage while others require you to provide your own database solution. If the CMS does not have a database, then there will be an additional component in the data model.

Whenever authors edit the content, the CMS makes the relevant updates to the database, and the next time the server fetches data from the CMS, the updated version of the data is returned.
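
As a rough illustration of querying an API-first CMS, the sketch below builds a GraphQL request and reads the posts out of the response. The endpoint URL, the `posts` query and its fields are all made up for the example; each CMS defines its own schema and authentication.

```typescript
// Hypothetical post shape and GraphQL query; real field names depend on your CMS schema.
interface PostSummary {
  title: string;
  slug: string;
}

const POSTS_QUERY = `query Posts($first: Int!) { posts(first: $first) { title slug } }`;

// Build the options for a standard GraphQL-over-HTTP POST request.
function buildGraphQLRequest(query: string, variables: Record<string, unknown>) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

// Pull the posts out of a GraphQL response of the form { data: { posts: [...] } }.
function extractPosts(payload: { data?: { posts?: PostSummary[] } }): PostSummary[] {
  return payload.data?.posts ?? [];
}

// Usage at build time (or per request), assuming a placeholder endpoint:
// const res = await fetch("https://example.com/graphql", buildGraphQLRequest(POSTS_QUERY, { first: 10 }));
// const posts = extractPosts(await res.json());
```

Because the front-end only depends on the shape of the response, the same code works whether it runs during a static build or on every request.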

Git-based Headless CMS

Technically, GitHub itself can also act as a database (well, it is more like static storage). It can store files and lets users upload and edit content. So why not use GitHub as the source of data instead of a dedicated database? This approach is called a Git-based headless CMS, where the content is stored directly in GitHub, usually alongside the project code so only one repository is needed.

This approach is usually free since you do not need another database, but the downside is that file size is limited by GitHub's restrictions (<100 MB per file) and it is difficult to query the data from outside your project since there are no API endpoints (unless you set them up yourself, which is a lot of work). You also need to embed the dashboard inside your website, so you will need to implement authentication if needed.
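
In a git-based setup, a post is typically just a Markdown file in the repository with its metadata in a front matter block at the top. The parser below is a deliberately simplified sketch that only handles flat `key: value` pairs; a real project would normally rely on an existing front matter library and a full YAML parser. The file content and field names are invented for the example.

```typescript
interface ParsedPost {
  data: Record<string, string>;
  content: string;
}

// Split a Markdown file into its front matter fields and its body.
// Only supports flat "key: value" lines between two "---" delimiters.
function parseFrontMatter(source: string): ParsedPost {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No front matter block: treat the whole file as content.
    return { data: {}, content: source };
  }
  const data: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines that are not key: value
    data[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return { data, content: match[2] };
}

// Hypothetical file as it might be stored in the repository.
const file = ["---", "title: Hello world", "date: 2023-12-30", "---", "Post body in **Markdown**."].join("\n");
const post = parseFrontMatter(file);
```

The site's build step reads files like this straight from the repository, which is why no separate database or API is involved.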

What about the UX

There are many other factors in addition to the storage, for example the user experience when editing content, authentication, ease of integration, collaboration, internationalization, etc. All of them could be equally important and could affect your decision as to which headless CMS you choose. I will mainly discuss schema definition and content editing here.

Schema Definition

A headless CMS needs to know what kind of data you will be inputting, and it requires you to define it yourself. For example, for authors, you might have a string for the name, an image for the profile pic and a rich text field for the bio. Although image and rich text are not primitive types in SQL and NoSQL databases, the CMS provides an abstraction for these data types and interacts with the database in a way the database understands, so you do not need to worry about it.

Schemas (sometimes called models) are usually defined from the dashboard of the headless CMS.
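
Conceptually, a schema is just a list of named fields with types. The sketch below models the author example in TypeScript with a made-up schema format and a small validator; it does not follow the syntax of any particular CMS.

```typescript
// Field types a hypothetical CMS might offer on top of database primitives.
type FieldType = "string" | "image" | "richText";

interface FieldDefinition {
  name: string;
  type: FieldType;
  required: boolean;
}

interface Schema {
  name: string;
  fields: FieldDefinition[];
}

// The author example from the text: a name, a profile picture and a bio.
const authorSchema: Schema = {
  name: "author",
  fields: [
    { name: "name", type: "string", required: true },
    { name: "profilePic", type: "image", required: false },
    { name: "bio", type: "richText", required: false },
  ],
};

// Return the names of required fields that are missing from an entry.
function missingRequiredFields(schema: Schema, entry: Record<string, unknown>): string[] {
  return schema.fields
    .filter((field) => field.required && entry[field.name] == null)
    .map((field) => field.name);
}
```

A CMS does roughly this kind of check when an author saves an entry, and then decides how each field type is actually stored.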

Overview of the field types in Strapi

You can add field types that are pre-defined by the CMS to a document (or a collection; the naming varies according to the CMS). Some CMS also allow you to define components which can be re-used in other components or documents.

While most headless CMS without a database, as well as git-based CMS, require you to host the server on your own, they still provide a UI for editing the schemas.

Sanity and Decap are the only CMS I have found so far that do not provide a UI for editing schemas, so only people with the technical knowledge will be able to update them (but Sanity is still a good option for several reasons).

Content Editing

Content editing is arguably the most important part of the user experience and all the CMS I have worked with do a great job at providing a clean and easy-to-use interface.

WordPress UI for adding a post
Hygraph UI for editing a post

The UI displays all the documents/categories and allows users to edit the content of individual slots according to the schemas. There are also other functionalities that vary across different headless CMS. Most provide different stages of content editing: Draft and Published content. Draft content allows for review to ensure content quality before publishing. Some even go as far as content versioning to allow precise control over the content. Most CMS also include a page to view all the uploaded assets, which include images, files, etc.
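
On the front-end side, the draft/published split usually shows up as a filter. A minimal sketch, assuming each document carries a hypothetical `status` field (the actual field name and API differ between CMS):

```typescript
type Status = "draft" | "published";

interface CmsDocument {
  title: string;
  status: Status;
}

// Production builds show only published content; a preview build can include drafts.
function visibleDocuments(docs: CmsDocument[], includeDrafts: boolean): CmsDocument[] {
  return includeDrafts ? docs : docs.filter((doc) => doc.status === "published");
}

const docs: CmsDocument[] = [
  { title: "Finished post", status: "published" },
  { title: "Work in progress", status: "draft" },
];
const live = visibleDocuments(docs, false); // what visitors see
const preview = visibleDocuments(docs, true); // what editors see while reviewing
```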

Full-Site editing / Preview Mode

There is another kind of visual editing called full-site editing, where you can preview the changes live as you edit the content.

StoryBlok's live preview editing mode

I have not worked with WordPress a lot, but I would say it feels more like a half-site edit than a full-site edit. You can preview the post in live mode, but you can only preview the content part instead of the whole site (so no headers, nav bar, footer, etc.).

In all cases, the general functionalities provided stay relatively the same but the details come down to individual CMS.

Headless CMS comparison

There are a lot, like a lot, of headless CMS options. Here is a screenshot of all the headless CMS listed in Astro's docs.

I will try to categorize them in the following order:

  • no free tier
  • API/Git-based
  • Cloud/Self-hosted (API-first)

No Free Tier

I will be honest, as a hobbyist developer and someone who has spent almost all of their life savings on college, I am the type of person who would create multiple accounts just to enjoy more free trials. So any solution without a free tier is a no-go for me.

Here is the list of CMS without a free tier:

Here are a few CMS that provide a free tier but with a low bandwidth allowance (<100GB per month). I am excluding these because we usually need to store images and even videos, which use up a lot of bandwidth pretty quickly no matter how well we optimize them.

We will exclude the above CMS in the discussion below.

Git-based

As mentioned above, there are a lot of restrictions when you opt for a git-based CMS. However, it is still suitable for small projects (like blogs). Note that you should not use a git-based CMS when sensitive information is involved, to avoid security issues.

  • It is easy to integrate into one of the routes of the website, but there is no visual schema editor, the features are limited (no lifecycle, difficult to add custom fields) and you are vendor-locked into using Netlify to deploy.
  • FrontMatter is an interesting choice as it is only a VS Code plugin. You get a dashboard directly from the plugin and you manage your schemas by updating the JSON files directly. It integrates with different frameworks and provides content reviews through the framework routing. However, it also means that you are almost writing raw markdown content inside the code editor, and collaboration is not possible.
  • I have not personally used Tina before, but it seems to be a comprehensive CMS that provides full-site edit and content versioning. However, it is complicated to set up and learn, which is why I have not tried it yet.
  • Keystatic is a minimal git-based CMS (in beta) that is extremely easy to use. However, it also means you are trading off features, as Keystatic lacks a lot of the features other CMS provide. Also, it seems like anyone can edit the content online as long as they have a GitHub account.

Self-hosted API CMS

Before I introduce the CMS, note that self-hosted solutions do not come with a database and usually require a separate project folder, which makes them hard to manage and, imo, overkill for just a blog website.

  • Strapi is one of the most starred headless CMS on GitHub but, in my honest opinion, it does not seem well polished. Before Strapi v4, you did not have a real WYSIWYG editor for markdown content. You needed to switch panels to view the result, which is inconvenient. They added a rich content block in v4 which looks a bit like this.
  • The formatting is limited and there are no docs on how to extend the functionality. The deal-breaker for me is that the Strapi docs specifically discourage the use of an external database and all hosting solutions are paid (except AWS, which is notoriously hard to set up). Strapi feels like it is trying to bridge the gap between a minimal CMS and a full-scale solution like WordPress with its plugins, and I would really love to try using it if the above problems are resolved in the future.

I have not tried either of the CMS above because they are complicated to set up. Statamic uses a PHP Laravel backend and Directus uses Docker containers, neither of which I am familiar with.

Cloud Based API CMS

There are a lot of cloud-based API CMS on the market, each with its own unique set of features and ecosystem.

  • Sanity, Hygraph and Caisy all provide a similar set of features with content curating, publishing, schema declaration and asset management. Hygraph has the best built-in rich text editor:
Hygraph rich text editor
  • But Sanity allows you to extend the default text editor pretty easily as well.
Sanity's extended rich text editor
  • Caisy's rich text editor is sub-par with no code-block, underline, etc. support:
Caisy rich text editor
  • The downside of the flexibility and extensibility of Sanity is that you cannot edit the schema directly in the dashboard; you have to edit it in code instead. The dashboard is also cumbersome to navigate on laptop monitors.
  • I would highly recommend Hygraph for creating blogs. It is super easy to use and the docs are some of the best I have seen. However, if you need more functionality for the rich text, Sanity is the way to go. Prismic and PreprCMS are two other similar options but with a much more minimal rich text input, and I do not recommend them for blogs.
  • All 3 options are very powerful and feel a bit overkill for a simple blog. They all provide live preview and full-site edit (except WordPress), but StoryBlok's default rich text editor has the most features out of all 3.
StoryBlok rich text editor
  • Contentful and WordPress, on the other hand, provide a lot of plugins that you can find in their marketplaces.

Front-end

This is a very long blog post, but we have finally reached the front-end part. Now, I am aiming for a zero (or minimal) JavaScript website, which already excludes most of the meta-frameworks like Next.js, SolidStart, Nuxt or SvelteKit since they ship the framework bundle even with static site generation. Meta-frameworks do not just generate the static HTML files, but also all the JavaScript needed for routing and DOM manipulation. Moreover, unlike the pages router, the Next.js app router does not even generate static HTML files for dynamic routes. It caches the fetched data as static files.

So the options I have now are static site generators. I have taken a look at Gatsby, Eleventy (11ty), Hugo and Astro. Eleventy feels like it lacks a lot of features when compared with Astro. It supports many templating engines, but that seems less compelling than Astro, which you can integrate with other frameworks. Gatsby tries to group all the data into a "data mesh" layer which can be queried through GraphQL. I like the idea of grouping all the data together and separating the logic, but unless you have multiple CMS sources, I do not think it is really necessary.

I ended up choosing Astro over Hugo since I started web development by learning React, and the JSX-like syntax of Astro seems more familiar than the Hugo syntax. Astro is also more versatile than Hugo since you can integrate any framework at any time if you want. (This exact site is made with Astro and Sanity. I added view transitions and prefetching, so a tiny bit of JavaScript is loaded.)

Hugo Syntax

Qwik SSG

Qwik is a meta-framework, but with a drastically different approach when it comes to Server-Side Rendering (SSR). It does not require any hydration and by default ships zero JavaScript unless necessary. The extreme code-splitting and lazy-loading is what sets Qwik apart from other frameworks. For example, other frameworks technically do not need to ship any JavaScript for static sites, but the JavaScript files are linked in the HTML files anyway because they do not have a mechanism to separate them.

You can check out my blogs over here, which are made with Qwik instead of Astro, and you can see that absolutely no JavaScript is loaded.

Conclusion

In the end, a CMS is just a tool that helps you manage your content. There are a lot of CMS and headless CMS options on the market, and it is really hard to determine whether one satisfies your needs unless you dive deeply into the docs and implementation. And to be frank, if you are working on a solo blog project, you might be better off writing raw markdown or using libraries like Markdoc. The same goes for SSG: although I am aiming for zero JavaScript in this blog, a site built with a framework that ships JavaScript can load about as fast as one that does not (provided that the website is cacheable), so the choice is yours.

Last updated: Jan 16 2024
