Trouble with hooks for a custom backlink generator

I’m building a Jekyll site at https://github.com/steinea/garden, using Maxime Vaillancourt’s Digital Garden Jekyll Template. It’s got a cool backlinks generator plugin that I have successfully tweaked to work with my site config. However, the links generated by a Liquid {% for %} loop are not recognized by the plugin.

I’ve done some reading of the Jekyll documentation, and it appears that this is due to the sequence of the Jekyll rendering process. Maxime’s plugin is a generator which, according to the docs:

run[s] after Jekyll has made an inventory of the existing content, and before the site is generated

This means that the backlinks generator runs before the rendering process, so the links generated by the for loop do not yet exist when the backlink generator runs.
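To make the ordering problem concrete, here is a minimal plain-Ruby sketch. It does not use Jekyll at all: the template string, the note URLs and the helper names (`expand_loop`, `mentions_url?`) are made up for illustration, and the loop is expanded by hand where Jekyll's renderer would do it on a real site. A generator only sees the raw source, where the loop is still Liquid markup, so a substring check for a note's URL finds nothing:

```ruby
# Hypothetical template source: roughly what a generator sees before rendering.
RAW_CONTENT = <<~LIQUID
  {% for note in site.notes %}
  <a href="{{ note.url }}">{{ note.title }}</a>
  {% endfor %}
LIQUID

# Stand-in for the rendering step: expand the loop by hand into real links.
def expand_loop(urls)
  urls.map { |url| %(<a href="#{url}">#{url}</a>) }.join("\n")
end

# The same kind of substring check the backlinks plugin performs.
def mentions_url?(content, url)
  content.include?(url)
end

rendered = expand_loop(['/alpha/', '/beta/'])

puts mentions_url?(RAW_CONTENT, '/alpha/') # false: the link does not exist yet
puts mentions_url?(rendered, '/alpha/')    # true: the loop has been expanded
```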

I found the documentation for Jekyll’s hooks, but they’re not the easiest to understand. These seem to be the way to change the point in the rendering process at which plugin code runs, but I haven’t had any success with the variations I’ve tried.

So far, I’ve tried wrapping Maxime’s plugin in a few different variations of the example from the documentation. I think this is what it should be, but it’s still not working:

# BEGIN HOOK

Jekyll::Hooks.register :documents, :post_convert do |docs|

# BEGIN GENERATOR

  # frozen_string_literal: true
  class BidirectionalLinksGenerator < Jekyll::Generator
    def generate(site)
      graph_nodes = []
      graph_edges = []

      all_posts = site.collections['posts'].docs
      all_pages = site.collections['pages'].docs

      all_docs = all_posts + all_pages

      link_extension = !!site.config["use_html_extension"] ? '.html' : ''

      # Identify note backlinks and add them to each note
      all_docs.each do |current_note|
        # Nodes: Jekyll
        notes_linking_to_current_note = all_docs.filter do |e|
          e.url != current_note.url && e.content.include?(current_note.url)
        end

        # Nodes: Graph
        graph_nodes << {
          id: note_id_from_note(current_note),
          path: "#{site.baseurl}#{current_note.url}#{link_extension}",
          label: current_note.data['title'],
        } unless current_note.path.include?('_notes/index.html')

        # Edges: Jekyll
        current_note.data['backlinks'] = notes_linking_to_current_note

        # Edges: Graph
        notes_linking_to_current_note.each do |n|
          graph_edges << {
            source: note_id_from_note(n),
            target: note_id_from_note(current_note),
          }
        end
      end

      File.write('_includes/notes_graph.json', JSON.dump({
        edges: graph_edges,
        nodes: graph_nodes,
      }))
    end

    def note_id_from_note(note)
      note.data['title'].bytes.join
    end
  end
# END GENERATOR

end
# END HOOK

I’ve tried some different registers and some different events, but none of the combinations I’ve tried have successfully run the generator after Jekyll has iterated through the for loops on my site and before Jekyll has rendered my site pages. Help?

If I add Jekyll::Hooks.register :documents, :pre_render do |docs| in the plugin file, like so…

# frozen_string_literal: true
class BidirectionalLinksGenerator < Jekyll::Generator
  def generate(site)
    Jekyll::Hooks.register :documents, :pre_render do |docs|
...
end

… the plugin will generate backlinks, but it seems to be capturing too many backlinks, i.e., it’s going to a second layer of depth (e.g. PAGE 1 is linked by PAGE 2, but then PAGE 2 is linked by PAGE 3, and a backlink for PAGE 3 is showing up on PAGE 1, which isn’t correct).

If I use the :post_convert or :post_render events, I get the same result with backlinks, but I also get connections in the notes_graph.json. However, the connections in the graph are also too many, possibly re-linking the backlinks?

So, simply adding the hook register at the top of the plugin is not the correct solution.
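One thing worth keeping in mind here, as far as I understand the hooks: a block registered on a per-document event is called once for every document, so a block that scans all documents and rewrites the graph file repeats that whole job for each document. Registering the hook inside generate may also add one more copy of the block every time generate runs. The sketch below is plain Ruby with a tiny stand-in registry (`TinyHooks` and `count_full_scans` are hypothetical names, not Jekyll's implementation) that just counts how often the block runs:

```ruby
# Tiny stand-in for a hook registry (NOT Jekyll's actual implementation).
class TinyHooks
  def initialize
    @callbacks = []
  end

  # Remember a block to run later.
  def register(&block)
    @callbacks << block
  end

  # Fire every registered block for one document.
  def trigger(doc)
    @callbacks.each { |callback| callback.call(doc) }
  end
end

# Count how many times a "scan everything" block would run.
def count_full_scans(docs, registrations: 1)
  hooks = TinyHooks.new
  scans = 0
  registrations.times do
    hooks.register { |_doc| scans += 1 } # each call stands for one full scan of all docs
  end
  docs.each { |doc| hooks.trigger(doc) }
  scans
end

puts count_full_scans(%w[a b c])                   # 3: once per document
puts count_full_scans(%w[a b c], registrations: 2) # 6: registered twice
```

This does not by itself explain the second-level backlinks, but it does mean the results depend on how far along the other documents are each time the block fires.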

Maybe you should use the priority attribute. Take a look at this answer.

Thanks for the idea @george-gca ! I looked into this, but unfortunately priority only orders plugins relative to each other; there doesn’t seem to be a way to use priority to change where a Jekyll::Generator runs in the build process.

However! I don’t know what changed from when I posted my update, but I was tinkering and inexplicably the graph has started working using the :post_convert event! The full plugin looks like this:

# frozen_string_literal: true
class BidirectionalLinksGenerator < Jekyll::Generator
  def generate(site)
    Jekyll::Hooks.register :documents, :post_convert do |docs|
      graph_nodes = []
      graph_edges = []

      all_posts = site.collections['posts'].docs
      all_pages = site.collections['pages'].docs

      all_docs = all_posts + all_pages

      link_extension = !!site.config["use_html_extension"] ? '.html' : ''

      # Identify note backlinks and add them to each note
      all_docs.each do |current_note|
        # Nodes: Jekyll
        notes_linking_to_current_note = all_docs.filter do |e|
          e.url != current_note.url && e.content.include?(current_note.url)
        end

        # Nodes: Graph
        graph_nodes << {
          id: note_id_from_note(current_note),
          path: "#{site.baseurl}#{current_note.url}#{link_extension}",
          label: current_note.data['title'],
        } unless current_note.path.include?('_notes/index.html')

        # Edges: Jekyll
        current_note.data['backlinks'] = notes_linking_to_current_note

        # Edges: Graph
        notes_linking_to_current_note.each do |n|
          graph_edges << {
            source: note_id_from_note(n),
            target: note_id_from_note(current_note),
          }
        end
      end

      File.write('_includes/notes_graph.json', JSON.dump({
        edges: graph_edges,
        nodes: graph_nodes,
      }))
    end

    def note_id_from_note(note)
      note.data['title'].bytes.join
    end
  end
end

So :post_convert has the plugin run after the renderer has processed the Liquid in a document and converted it to HTML, but before the document has been placed in its layout. Then, just making sure the plugin body is properly nested between the Jekyll::Hooks.register line and its closing end does the job.

Strangely, this is not working for posts in my _posts collection, only for pages in my _pages collection. Something about the date in the filename, which Jekyll requires for posts, is breaking the plugin… The post will successfully be rendered to the site, but no backlinks will be generated for the post, and notes graph will be broken, with all the paths disappearing…

This is weird. What if you force it for documents and posts, even though documents should include these? Like

Jekyll::Hooks.register([:posts, :documents], :post_convert) do |docs|

Hmm, unfortunately this doesn’t work; it has the same effect as

Jekyll::Hooks.register :documents, :post_convert do |docs|

So I’ve continued tinkering with this, and decided to change a bunch of things…

I had configured pages and posts as collections in my _config.yml, like so:

# Collections
collections_dir: collections
collections:
  posts:
    permalink: /:year/:month/:day/:title/
  pages:
    output: true
    permalink: /:title/

I’ve done this for years with Jekyll sites I have built, but I started wondering if maybe I was causing some sort of hook collision with the Jekyll defaults site.posts and site.pages. So I got rid of the collections and just reconfigured my permalinks with defaults instead, like so:

# Defaults
defaults:
  - scope:
      path: ''
      type: pages
    values:
      permalink: /:title/
  - scope:
      path: ''
      type: posts
    values:
      permalink: /:year/:month/:day/:title/

Then, I needed to change how I called pages and posts in the plugin. I was originally calling them as collections, like so:

all_posts = site.collections['posts'].docs
all_pages = site.collections['pages'].docs

So I changed these two lines to:

all_posts = site.posts.docs
all_pages = site.pages

At this point, I started getting errors on jekyll build. For some reason, jekyll no longer liked this part of the plugin:

def note_id_from_note(note)
  note.data['title'].bytes.join
end

It would throw this error:

/site/_plugins/bidirectional_links_generator.rb:47:in `note_id_from_note': undefined method `bytes' for 404:Integer (NoMethodError)

page.data['title'].bytes.join\r
                   ^^^^^^
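In case it helps anyone hitting the same error: the message says the title is the Integer 404, which is what YAML produces for an unquoted `title: 404` in front matter (I'm assuming that is how the 404 page is titled). A minimal sketch of the conversion, in plain Ruby with hypothetical helper names (`parse_title`, `note_id_from_title`):

```ruby
require 'yaml'

# An unquoted number in front matter is parsed as an Integer, not a String.
def parse_title(front_matter)
  YAML.safe_load(front_matter)['title']
end

# Same idea as note_id_from_note, but tolerant of non-String titles.
# to_s also turns a missing title (nil) into an empty string instead of raising.
def note_id_from_title(title)
  title.to_s.bytes.join
end

puts parse_title('title: 404').class   # Integer
puts parse_title('title: "404"').class # String
puts note_id_from_title(404)           # 524852
puts note_id_from_title('404')         # 524852
```

Applied to the plugin, that would mean calling to_s on note.data['title'] before bytes. Since site.pages can include pages with no title at all, the nil case may matter too.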

I tried a whole bunch of tweaks with no luck, mostly just generated other related errors. So I went back to the drawing board and tried to start the plugin from scratch. I also searched around for other solutions, and found this thread from 2021: Mention other pages linking to current page. The OP was also trying to get Maxime’s plugin to work, just for backlinks. TerminalAddict did a bit of a refactor of the plugin and made it much more compact, removing a bunch of the extra functionality necessary for Maxime’s template but not required to generate backlinks. I copied this approach and produced the following:

class BidirectionalLinksGenerator < Jekyll::Generator
  def generate(site)
    Jekyll::Hooks.register :pages, :post_convert do |page|
      all_posts = site.posts.docs
      all_pages = site.pages

      all_docs = all_posts + all_pages

      all_docs.each do |current_note|
        notes_linking_to_current_note = all_docs.filter do |e|
          e.content.include?(current_note.url)
        end
        current_note.data['backlinks'] = notes_linking_to_current_note
      end
    end
  end
end
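A note on the matching in this compact version: `include?` is a plain substring test, so a page whose URL is `/` matches almost any content, and `/tools/` also matches inside `/garden/tools/`. The compact version also drops the `e.url != current_note.url` guard from the original, so a page that mentions its own URL counts as its own backlink. Below is a minimal sketch of a stricter check that compares whole href values instead. It is plain Ruby, the helper names (`linked_urls`, `links_to?`) and sample HTML are hypothetical, and the regex is a simplification rather than a real HTML parser:

```ruby
# Pull the href targets out of rendered HTML.
# Simplified: only handles double-quoted href attributes.
def linked_urls(html)
  html.scan(/href="([^"]*)"/).flatten
end

# True only when some link points at exactly this URL.
def links_to?(html, url)
  linked_urls(html).include?(url)
end

html = '<p>See <a href="/garden/tools/">Tools</a>.</p>'

puts html.include?('/')                 # true: substring match, probably unwanted
puts html.include?('/tools/')           # true: matches inside /garden/tools/
puts links_to?(html, '/')               # false
puts links_to?(html, '/tools/')         # false
puts links_to?(html, '/garden/tools/')  # true
```

Whether this accounts for any of the extra or missing backlinks on a given site is something to verify; it only rules out one source of false matches.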

And it almost works… It does generate backlinks, but so far only for pages. Since I’m no longer using collections, I can no longer use the documents hook point, like above in this thread:

Jekyll::Hooks.register :documents, :post_convert do |docs|

If I use the pages hook point, I can get some* backlinks:

Jekyll::Hooks.register :pages, :post_convert do |page|

But if I change this to use the posts hook point, like so:

Jekyll::Hooks.register :posts, :post_convert do |post|

I have not been able to get a backlink to a test post to work so far, whether hard-coded in HTML or generated from a Liquid loop.

I have tried combining these, like the example provided by @george-gca above, like so:

Jekyll::Hooks.register([:posts, :pages], :post_convert) do |page|

Page backlinks continue to work with this syntax, but I think the do |page| needs to be modified to handle both |page| and |post|, and I can’t figure out the syntax for this…
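On the block parameter: in Ruby the name between the pipes is just a local variable, and as far as I can tell a hook block is called once per object with whichever page or post triggered it. So a single parameter should be enough, whatever it is called. The plain-Ruby sketch below uses stand-in `Page` and `Post` structs and a hypothetical `each_triggered` helper (none of this is Jekyll's API) to show one block handling both kinds of object:

```ruby
# Stand-ins for the two kinds of object a hook might pass in.
Page = Struct.new(:url)
Post = Struct.new(:url, :date)

# Stand-in for a hook firing: call the block once per object.
def each_triggered(objects)
  objects.map { |object| yield object }
end

items = [Page.new('/about/'), Post.new('/2024/01/01/hello/', '2024-01-01')]

# One parameter; `item` could just as well be named `page` or `post`.
labels = each_triggered(items) do |item|
  "#{item.class}: #{item.url}"
end

puts labels
```

If the block needs to treat posts differently, it can check the object's class or which methods it responds to inside the block.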

Now, even more confusingly, about the some* note above… In the earlier iterations of the plugin posted above, some Liquid loops would not generate any backlinks, while others would consistently generate a backlink for each listed link. Now all Liquid loops on my site are generating some backlinks, but not consistently one for each listed link. The plugin seems to be skipping iterations of the for loop at random, and I have no idea why. So I find myself at another wall.