Include all files in a specific directory, even if the filenames start with underscores

I’m trying to include documentation that’s generated by a third-party tool in a Jekyll site. This tool outputs a deep directory structure, and sometimes files in this structure have names starting with underscores - which Jekyll ignores by default. I can get the files to show up in the output by adding them to includes individually, but it’s impractical to list every one of these files for a large project - doubly so since I want to be able to deploy changes to the documentation via a CI workflow.

Is there any way to specify an entire folder and all its subdirectories and files via include? I’ve given common glob patterns a shot, e.g. include: ["docs/**/*"], but they don’t appear to be accepted, only single paths from the root of the project.

One work-around might be to independently copy the docs directory to _site before or after building Jekyll and use the keep_files setting to prevent Jekyll from deleting the docs.

@chuckhoupt that assumes the files are output HTML files that belong in _site.

My guess is that these are markdown files that still have to be processed with Jekyll and then end up in site - or in this case getting ignored.

If all these output files are flat HTML and don’t need Jekyll processing, then you do add this as an empty file in the repo root.

.nojekyll

Example

Then when GitHub Pages serves the site, it will serve all your pages as static content without processing with Jekyll.

If you aren’t using GH Pages, this makes no difference. But you can just avoid Jekyll in the case.

If your site is a hybrid of 3rd party generated HTML and manually written Jekyll markdown, you could add your top level generated dir to exclude of Jekyll config so Jekyll ignores it, then use cp in the shell to copy to the _site dir.

I don’t know if Jekyll supports anything to handle what you want.

Perhaps you can do replacement of underscore with empty string or letter. I don’t know if a dash will work.

You can use a shell script to recursively rename under score dirs and files. And then also recursively find links in HTML to underscore files and replace those.

So

  1. Third party tool
  2. Replacement
  3. Jekyll build.

Thanks for the ideas @chuckhoupt and @MichaelCurrin!

I am indeed in a Github Pages environment. I was under the impression that _site isn’t something I can modify at all under Pages, and testing @chuckhoupt’s idea seems to confirm this - keep_files: [docs] works when testing with jekyll s locally, but not when pushed to Github. Does Pages use a different destination directory that I’m not aware of, or is it just impossible to use keep_files with Pages?

@MichaelCurrin My site is a hybrid - mostly Jekyll-processed files, with only one subdirectory that contains purely tool-generated HTML/CSS/etc. I can’t use .nojekyll because that trick seems to only apply to the entire repo, not just subdirectories. Your suggestion of manually cping the docs into _site doesn’t work with Github Pages, since I can’t customize the build process for Pages deployments.


I have come up with something of a solution, however: Since I’m using Github Actions to generate everything in the docs directory and commit to the gh-pages branch automatically, I’ve added a step to that build process that lists all the newly-generated files whose names start with underscores, and adds them to include in _config.yml individually. It’s a bit of a hack, but it works something like this:

# Working directory is the root of the Pages site

# Generate documentation in the `docs` subdirectory of the site
npx typedoc ../project-source/index.ts --out docs --readme none

# Add all generated files that start with underscores to `include`
UNDERSCORE_FILES=(docs/**/_*)
for file in "$UNDERSCORE_FILES[@]"
do
  # This line assumes that `includes` is at the end of the config
  echo "- '$file'" >> _config.yml
done

This is just a rough example. My actual use case is building docs for new tags into dedicated subdirectories - e.g. a new tag v2.3.0 would generate docs for that version of the project and push them to the docs/v2.3.0 directory. Because this doesn’t involve overwriting old versions, I don’t have to worry about diplucate entries in the includes list, but a more robust solution might be to use a proper YAML utility to rewrite the config.

Anyway, I think I’m gonna submit a feature request on the Jekyll repo for supporting glob patterns in includes and similar config options - it seems to me like that would make automated use cases significantly simpler, especially when working with Github Actions and Pages together.

1 Like

I suppose you could create a branch called jekyll-pages where the Jekyll-based doc would live. A Github-Action could build the jekyll-pages branch with Jekyll, and then build the TypeDoc docs from master (or various tags), and finally merge them together and push them to gh-pages (with .nojekyll).

In this scenario, the gh-pages branch is only used for automatic publishing of static HTML. All human commits would be made to jekyll-pages or master, etc.

Oh yes. I thought of config built dynamically to include and I see you arrived at it.

If it helps, you can make to configs.

One standard one.
And one dynamic one which only has one field and is unversioned.

Then in GH Actions.

jekyll build --config=_config.yml,_dynamic_config.yml

Note no space after comma.

Though your current approach might be fine. As long as your main config has includes key at the very bottom or the shell script adds it.

Seems like this issue is still relevant. In case anyone else struggles, here’s something I figured out today:

In Jekyll 3.9 (the version used by GitHub Pages), the include: setting needs to refer to directory names, not including the parent path. What I mean is, if you have a docs/_foo folder, your _config.yml needs to have include: [_foo], not include: [docs/_foo], for Jekyll to correctly include the files.

I believe this will change in Jekyll 4, but that’s how things work today.