How to iterate all pages in custom folders in jekyll

sandeepv · May 6, 2022, 5:04am

Hi team. I have set up a GitHub Pages website, and my project structure is:

repository root folder
Docs folder
test folder
file.md
inner-test folder
file1.md
file2.md
file3.md
sample folder
file4.md
file5.md
innersample folder
file6.md
test2 folder
file7.md
home.md
index.md
_config.yml

Now I want to iterate through all the folders and get page.name, page.content, and page.url for each page so that I can implement search. Can you please help?

I tried the code below, but it is not working.

{% for post in site.docs %}
"{{ post.url | slugify }}": {
"title": "{{ post.name | xml_escape }}",
"category": "{{ post.category | xml_escape }}",
"content": {{ post.content | strip_html | strip_newlines | jsonify }},
"url": "{{ post.url | xml_escape }}"
}
{% unless forloop.last %},{% endunless %}
{% endfor %}

I have also included my test repo URL here. Please help.
testrepo/docs at main · vanamsandeep/testrepo (github.com)

BillRaymond · May 6, 2022, 6:51pm

There are two primary document collections in Jekyll: posts and pages. Custom collections are a third, if you use them.

Pages and posts may contain their own front matter, so I suggest you treat each one separately.

List of all posts

Read the Jekyll array containing a list of all the posts:

{% assign posts = site.posts %}
{% for post in posts %}
    title: {{post.title}}
{% endfor %}

List of all pages

Read the Jekyll array containing a list of all the pages:

{% assign pages = site.pages | where_exp: 'page', 'page.title' %}
{% for page in pages %}
    title: {{page.title}}
{% endfor %}

Notice the where_exp filter in the list of pages. Some pages have no title in their front matter, and the filter removes those untitled pages from the listing. If you want all pages, even those without a title, the code looks like this instead:

{% assign pages = site.pages %}
{% for page in pages %}
    title: {{page.title}}
{% endfor %}

More options

I recommend you check out the list of built-in Jekyll site variables. You can learn about the different arrays you can access that Jekyll automatically collects for you. There may be other document types (like HTML docs) that you want to include as well.

sandeepv · May 7, 2022, 2:49am

Thank you, it worked. I have one more issue: how can I remove the details below from page.content using filters in the loop above?
The content contains <img src tags. How can I remove them using filters?
I tried "content": {{ page.content | markdownify | strip_html | markdownify | jsonify }}, but it did not work.
I also tried "content": {{ page.content | strip_html | strip_newlines | jsonify }}
I tried multiple remove and replace filters, but none of them worked.

Screen shot is:

BillRaymond · May 7, 2022, 7:21pm

Please share the original content you are working with.

sandeepv · May 7, 2022, 7:24pm

I have hundreds of Markdown files. One sample Markdown file is:

---
permalink: "/"
---


# SEARCH :mag: 
You can <b>[Search](ww.example.com)</b> with a combination of keywords. For documentation results, there is always a Markdown file listed (*.md).

# Test

HTML code for iterating:

---
layout: search
---
<form action="/search.html" method="get">
    <label for="search-box">Search</label>
    <input type="text" id="search-box" name="query">
    <input type="submit" value="search">
</form>

<ul id="search-results"></ul>


<script>
 window.store = {
    {% for page in site.pages %}
      "{{ page.url | slugify }}": {
        "title": "{{ page.name | xml_escape }}",
        "content": "{{ page.content | strip_html | strip_newlines | replace: '"', '\"'  }}",
        "url": "{{ page.url | xml_escape }}"
      }
      {% unless forloop.last %},{% endunless %}
    {% endfor %}

  };
</script>

<script src="/js/lunr.min.js"></script>
<script src="/js/search.js"></script>
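For context, here is a minimal sketch of how a script might consume a store of this shape. It is an assumption for illustration only: the real search.js is not shown in this thread, the sample entries are invented, and the naive substring match in the hypothetical `searchStore` function stands in for the lunr.js index.

```javascript
// Hypothetical store, shaped like the window.store object built by the template.
const store = {
  'docs-home': { title: 'home.md', content: 'Welcome to the docs', url: '/docs/home' },
  'docs-test-file': { title: 'file.md', content: 'How to search pages', url: '/docs/test/file' }
};

// Naive stand-in for a real index (not lunr.js):
// case-insensitive substring match on title and content.
function searchStore(entries, query) {
  const needle = query.toLowerCase();
  return Object.keys(entries)
    .filter((key) => {
      const doc = entries[key];
      return (doc.title + ' ' + doc.content).toLowerCase().includes(needle);
    })
    .map((key) => ({ id: key, title: entries[key].title, url: entries[key].url }));
}

// Only the second entry mentions "search".
console.log(searchStore(store, 'search'));
```

Each entry needs a title, content, and URL because the results list is built from those three fields.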

rdyar · May 7, 2022, 10:11pm

I tried to reformat your post; I think it is correct?

When posting code, you should use the </> code button in the editor or wrap it in two sets of three backticks (```).

rdyar · May 7, 2022, 10:17pm

When you do that, do you get what is in your screenshot? If not, what do you get?

sandeepv · May 8, 2022, 2:58am

Yes, this is correct. I’m implementing search in my blog using lunr.js. For that, I need to iterate over all the pages in the blog with Jekyll Liquid tags and build a JSON structure. With the code above, the JSON structure breaks because of some HTML tags. I need help removing those HTML tags so that the output is valid JSON.
My Markdown file is:

# SEARCH :mag: 
You can <b>[Search](https://example/search?search=1234)</b> with a combination of keywords for any of our for documentation results, there is always a Markdown file listed (*.md).

support requests to [examplesupport@example.com](mailto:example.example.com)

Below is the search.html code that iterates over the pages and outputs them as JSON:

---
layout: search
---
<form action="/testrepo/search.html" method="get">
    <label for="search-box">Search</label>
    <input type="text" id="search-box" name="query">
    <input type="submit" value="search">
</form>

<ul id="search-results"></ul>

<script>
 window.store = {
    {% for page in site.pages %}
      "{{ page.url | slugify }}": {
        "title": "{{ page.name | xml_escape }}",
        "content": "{{ page.content | strip_html | strip_newlines }}",
        "url": "{{ page.url | xml_escape }}"
      }
      {% unless forloop.last %},{% endunless %}
    {% endfor %}

  };
</script>
<script src="/js/lunr.min.js"></script>
<script src="/js/search.js"></script>

With this code, the loop runs, but the output breaks as shown below.

I’m also pasting my code repository and URL:
Github repo: vanamsandeep/testrepo (github.com)
URL: Jekyll Actions Demo with leap-day theme

Please advise.
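One likely cause of the break, sketched in JavaScript: the template wraps `content` in hand-written double quotes, so any `"` left in the page text ends the string early. The search.html in the repo (quoted later in this thread) instead pipes the value through `jsonify` without surrounding quotes. The example below only illustrates the difference, using `JSON.stringify` as a stand-in for JSON encoding; `pageContent`, `handQuoted`, `encoded`, and `isValidJson` are made-up names, not part of the site.

```javascript
// Hypothetical page text after strip_html: the tags are gone, the quotes are not.
const pageContent = 'You can "Search" with a combination of keywords (*.md).';

// Hand-quoting, as in "content": "{{ page.content | strip_html }}".
const handQuoted = '{"content": "' + pageContent + '"}';

// JSON-encoding the value instead; JSON.stringify stands in for a JSON encoder here.
const encoded = '{"content": ' + JSON.stringify(pageContent) + '}';

// Returns true when the text parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidJson(handQuoted)); // false: the inner quotes end the string early
console.log(isValidJson(encoded));    // true: the inner quotes are escaped as \"
```

The same reasoning applies to backslashes and other characters that need escaping inside a quoted string.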

BillRaymond · May 9, 2022, 5:59pm

I went to your repo and did not see the code I shared on there (or a variant of it). Are you testing on a specific branch?

sandeepv · May 9, 2022, 6:00pm

You can edit on the main branch; this is only my test repo.

sandeepv · May 9, 2022, 6:02pm

github.com

vanamsandeep/testrepo/blob/main/docs/search.html

---
layout: search
---
<form action="/testrepo/search.html" method="get">
    <label for="search-box">Search</label>
    <input type="text" id="search-box" name="query">
    <input type="submit" value="search">
</form>

<ul id="search-results"></ul>

<script>
 window.store = {
    {% for page in site.pages %}
      "{{ page.url | slugify }}": {
        "title": "{{ page.name | xml_escape }}",
        "content": {{ page.content | strip_html | strip_newlines | replace: 'class' , '' | replace: 'Demothi' , 'class' | jsonify }},
        "url": "{{ page.url | xml_escape }}"
      }
      {% unless forloop.last %},{% endunless %}

This file has been truncated.

KhBh · March 2, 2023, 1:21am

In case it helps anyone searching this later, here is my working Lunr.js search index:

github.com

buddhist-uni/buddhist-uni.github.io/blob/deb554ca248c3c5bb6bf0de9ee496ba399aa3a9b/assets/js/search_index.js

---
layout: nil
---
importScripts("/assets/js/lunr.min.js");
importScripts("/assets/js/utils.js");

// Parameters
var BMAX = 250; // Max blurb size in characters
var RMAX = 100;

{%- assign ccurly = "}" -%}
{%- assign ocurly = "{" -%}
{%- assign backtoback = ccurly | append: ocurly -%}
{%- assign doubleo = ocurly | append: ocurly -%}
{%- assign doublec = ccurly | append: ccurly -%}
{%- assign pagesWithoutContent = 'Authors,Highlights,Search' | split: ',' -%}
{%- assign emptylist = '' | split: '' -%}
var store = { {% assign all = site.documents | concat: site.pages %}
  {% for p in all %}
    {% unless p.title %}{% continue %}{% endunless %}

This file has been truncated.

The trickiest part was filtering out Liquid code that sometimes appears in page.content.
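A rough JavaScript sketch of the idea follows. It uses a different technique from the linked file, which does its filtering in Liquid at build time: here, anything that looks like a `{% ... %}` tag or a `{{ ... }}` output is removed from a content string. `stripLiquid` is a hypothetical helper name, and the regular expressions assume tags are not nested.

```javascript
// Hypothetical helper: drop Liquid tags ({% ... %}) and outputs ({{ ... }}) from text.
function stripLiquid(text) {
  return text
    .replace(/\{%[\s\S]*?%\}/g, ' ')   // tags such as {% include note.html %}
    .replace(/\{\{[\s\S]*?\}\}/g, ' ') // outputs such as {{ site.title }}
    .replace(/\s+/g, ' ')              // collapse the gaps left behind
    .trim();
}

const raw = 'Intro {% include note.html %} text {{ site.title }} end';
console.log(stripLiquid(raw)); // "Intro text end"
```

Running this on the client means the raw Liquid still ships in the index, so filtering at build time, as the linked file does, keeps the index smaller.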
