Best practices for automatically generated sites

Let us suppose that I want to use Jekyll to build a static site that changes from time to time. The idea is to have a program that automatically generates some posts and then builds and uploads the site.

Let us suppose that I have the program. I would like to know whether it is considered good practice to include its code in the directories of the site (something like bin or _bin; see, for example, the al-folio theme (GitHub - alshedivat/al-folio: A beautiful, simple, clean, and responsive Jekyll theme for academics), which has a bin directory for deployment scripts).
Or, alternatively, is it better to keep the generating program as an external entity?

For me, the first approach would be nice because others could then build a site similar to mine, but I’m not sure if this is a good idea.

Are there any recommendations on this topic?

To answer your question better, we will need more detail. I will refer to the following quote:

What would be the program that automatically generates posts? For example, are you thinking about some AI tool that will create content for you based on a topic?

As for uploading content to the site, you would typically keep your code in a git repository. Some APIs and tools can connect to your repo on, say, GitHub or GitLab to update your code, create posts, etc. Of course, you can also create workflows (like GitHub Actions or GitLab Runners).

For example, check out my Agile in Action podcast site, and you will see featured images and guest images for each post. I upload the guest photos, but GitHub Actions resize the images and generate the featured image.

With that context in mind, could you provide some more examples of the use case you are looking for?

What would be the program that automatically generates posts? For example, are you thinking about some AI tool that will create content for you based on a topic?

That’s the idea: some program which generates some posts (all the site content is new each time). The idea could be an AI generated site, but it can also be an aggregator (a post contains links to other places). My use case is a personal aggregator.

But the question is: should I put the code inside the Jekyll site repo, or would it be better practice to have two repos (one for the Jekyll installation and another for the content generator)?

Hello @fernand0
This is an interesting scenario.

Jekyll “sites” (entirety of source files) are usually personal / private / intended to serve a very specific need and therefore, they are not meant to be consumed as templates by “others”. Consequently, there are no standardized “best practices” for Jekyll sites.

Coming back to your specific scenario, directory names starting with any of ., _, ~, or # are ignored by Jekyll unless it is configured to forcibly read such a directory.
If your program is intended to be consumed by more than one repository (regardless of repository owner), then it's always best to have the “program” in a dedicated repository.
The Jekyll site (source repos) would then connect to the program using GitHub Actions or similar services that allow being triggered at predetermined intervals using a “cron” setting.
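
As a rough illustration of the naming rule above, here is a minimal Python sketch. It is only a simplified approximation of Jekyll's default behaviour, not Jekyll's own code, and the helper name is_skipped_by_default is made up for this example. It assumes that names listed under the include setting in _config.yml override the rule.

```python
# Simplified approximation (an assumption, not Jekyll's actual code) of the
# default rule: entries whose names start with ".", "_", "~" or "#" are
# skipped unless they are listed under `include` in _config.yml.
SPECIAL_PREFIXES = (".", "_", "~", "#")


def is_skipped_by_default(name: str, include: tuple = ()) -> bool:
    """Return True if an entry with this name would be skipped under the rule above."""
    if name in include:
        # Explicitly included entries are read even with a special prefix.
        return False
    return name.startswith(SPECIAL_PREFIXES)


for candidate in ("bin", "_bin", ".github"):
    print(candidate, is_skipped_by_default(candidate))
```

Under this rule a generator kept in _bin would stay out of the built site, while a plain bin directory would be copied to the output unless it is listed under exclude in _config.yml.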

I agree with @ashmaroli that Jekyll was not specifically built for these scenarios and even though you can create themes and custom plugins, those tools are all in service of compiling your code into a static website.

Let’s take a look at how Jekyll works to create a post by hand:

  1. Go to (or create) Jekyll’s _posts folder and create a file
  2. The file looks something like this
    /_posts/2022-11-04-my-first-post.md
---
title: My First Post
date: 2022-11-04
layout: post
---
Hello world!
  3. Whether manually or automatically (like with GitHub Pages), Jekyll builds your site, reads the YAML front matter between the ---'s, and generates a static HTML web page that displays your content, Hello world!
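
The manual steps above could be automated along these lines. This is a minimal Python sketch, not a definitive implementation: the function names slugify and build_post are hypothetical, and it assumes a Markdown post with a .md extension and a title that needs no YAML quoting.

```python
import datetime
import re


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_post(title: str, day: datetime.date, body: str) -> tuple:
    """Return (filename, file contents) for a Jekyll post.

    Assumption: the title contains no characters (such as ":") that
    would need quoting in YAML.
    """
    filename = f"{day.isoformat()}-{slugify(title)}.md"
    front_matter = "\n".join([
        "---",
        f"title: {title}",
        f"date: {day.isoformat()}",
        "layout: post",
        "---",
    ])
    return filename, f"{front_matter}\n{body}\n"


filename, text = build_post("My First Post", datetime.date(2022, 11, 4), "Hello world!")
print(filename)
print(text)
```

The returned text would be written to a file with that name inside the _posts folder, after which the next Jekyll build picks it up like any hand-written post.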

In your case, you want some solution to automatically create that content for you. Specifically, you want to aggregate your content, which I suppose might mean you want to collect posts you made on social media, Medium, or whatever other sites you contribute to, is that correct?

Since Jekyll is not designed for this scenario, I do agree with @ashmaroli that you would want to create a separate repo(s) to do the aggregation for you.

I am willing to bet someone has already written code that would get you where you want to go in terms of, say, getting your posts from Medium (perhaps that is something as simple as a feed reader?).

At a high level, this is how you would create such a solution on GitHub (since that is what I know best):

  1. Create a Jekyll repo called my-website
  2. Create a repo called create-jekyll-content. Since you want the create repo to talk to the my-website repo, I believe the most common way to do that is with a token stored as a secret
  3. Create one (or many) GitHub Actions in the create repo. Those actions can run Docker containers, which then run your custom code, which technically can contain any code you want. That code can (with more secrets) read APIs from other websites. I mentioned Medium earlier, so you can use their API to pull your latest post.
    Note: GitHub Actions can run on a schedule (specifically, a cron job, as @ashmaroli points out), you can execute them manually at your whim, or they can be triggered by various GitHub events.
  4. Your code would take the content, create a filename based on Jekyll’s required post filename and content structure
  5. Your code would then copy the file from the create repo into the my-website repo, specifically into the /_posts folder. If you are using standard GitHub Pages, the website will update automatically and your new content will be on your website in a matter of minutes
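
Steps 4 and 5 above could look roughly like the following Python sketch for a personal link aggregator. Everything here is an assumption for illustration: the function names render_links_post and write_post, the "-links" filename suffix, and the idea that the workflow has already checked out the my-website repo into a local directory. Fetching the links, committing, and pushing are left to the workflow and are not shown.

```python
import datetime
import json
import tempfile
from pathlib import Path


def render_links_post(day: datetime.date, links: list) -> str:
    """Build one aggregator post: front matter plus a Markdown list of links."""
    lines = [
        "---",
        # json.dumps produces a double-quoted string, which is also valid YAML.
        f"title: {json.dumps('Links for ' + day.isoformat())}",
        f"date: {day.isoformat()}",
        "layout: post",
        "---",
    ]
    lines += [f"- [{text}]({url})" for text, url in links]
    return "\n".join(lines) + "\n"


def write_post(site_dir: Path, day: datetime.date, links: list) -> Path:
    """Write the post into <site_dir>/_posts, creating the folder if needed."""
    posts_dir = site_dir / "_posts"
    posts_dir.mkdir(parents=True, exist_ok=True)
    target = posts_dir / f"{day.isoformat()}-links.md"
    target.write_text(render_links_post(day, links), encoding="utf-8")
    return target


# Demo against a temporary directory standing in for a checkout of my-website.
with tempfile.TemporaryDirectory() as tmp:
    path = write_post(
        Path(tmp),
        datetime.date(2022, 11, 4),
        [("Jekyll", "https://jekyllrb.com/")],
    )
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```

Keeping the rendering separate from the file writing means the generator can be tested without a Jekyll installation, which fits the two-repo layout described above.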

By the way, there is a whole GitHub marketplace full of Actions other people built, so even if you want to build your own, you might want to play with those to see how they work. For example, I found this action that will copy files from one repo to another.

The nice thing about GitHub Actions (as of this writing) is that they are free for everyone, but there are limitations, so as long as your code is not bulky and does not take up too much processing time, you should be good.

Hope this helps!

Thanks for the tips and ideas. I do not know if this would be considered spam but I could provide some more details and a proof of concept.

Everything together (the ‘program’ inside the ‘site’ repo):

Pros:

  • You can package the whole solution, so you can be sure that changes always stay local. You can also share it with others.

Cons:

  • Some of the pros are not completely true. You may depend on external libraries and so on, so the packaging is not complete.
  • Your code may end up too tightly coupled to the thing you are publishing.

Two separate repos (the ‘site’ and the ‘program’):

Pros:

  • It is nice to have different things separated (one thing is a publishing framework and the other is just code to generate posts).
  • You can change either of them without much trouble because they are separated (you can always make a tailored suit; it is up to you).

Cons:

  • More complicated to install: download two repos, configure them, …
  • More complicated to maintain.

Who knows. At the moment they are separated and, maybe, they should stay that way.

Assuming you go the GitHub Actions route, it won’t really matter. You can just create folders, name them whatever you want, and put your code in there. So long as you don’t use folder names that conflict with GitHub, Jekyll, or Ruby, you are good to go.

If it were me, I would probably write the code in the same repo until I got it working, so I could easily run Jekyll, test the code, and make real-time updates to my code and Jekyll at the same time. However, I would probably pull it out and put it into a new repo for long-term maintenance.

But that’s me, and everyone has repo opinions, so my response would always be to go with what feels right for you.

One other note: if you write your code in Jekyll’s native language, Ruby, there is a version requirement. Maybe you write code in one version and then Jekyll gets a new feature you can’t live without, but it requires a Ruby bump that renders your code unusable until fixed. In that case, you have two code bases with differing platform requirements, so that would get messy.