How to iterate all pages in custom folders in Jekyll

Hi team. I have set up a GitHub Pages website, and my project structure is:

repository root folder
Docs folder
test folder
file.md
inner-test folder
file1.md
file2.md
file3.md
sample folder
file4.md
file5.md
innersample folder
file6.md
test2 folder
file7.md
home.md
index.md
_config.yml

I want to iterate through all the folders and get page.name, page.content, and page.url for each page so that I can implement search. Can you please help me?

I tried the code below, but it is not working.

{% for post in site.docs %}
"{{ post.url | slugify }}": {
"title": "{{ post.name | xml_escape }}",
"category": "{{ post.category | xml_escape }}",
"content": {{ post.content | strip_html | strip_newlines | jsonify }},
"url": "{{ post.url | xml_escape }}"
}
{% unless forloop.last %},{% endunless %}
{% endfor %}

I have also included my test repo URL here. Please help.
testrepo/docs at main · vanamsandeep/testrepo (github.com)

Jekyll has two primary document types: posts and pages. Custom collections are a third, if you use them.

Pages and posts may contain their own front matter, so I suggest you treat each one separately.
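One possible reason the site.docs loop in the question returns nothing: site.docs only exists when a collection named docs is declared in _config.yml. Files in an ordinary docs folder show up in site.pages instead. Here is a minimal sketch that narrows site.pages to one folder by path. It assumes the folder is named docs (lowercase, as in the repo URL) and that the files are processed as pages. The names docs_pages and doc are only illustrative.

```liquid
{% comment %} Keep only pages whose source path contains the assumed folder name. {% endcomment %}
{% assign docs_pages = site.pages | where_exp: "doc", "doc.path contains 'docs/'" %}
{% for doc in docs_pages %}
  name: {{ doc.name }}
  url: {{ doc.url }}
{% endfor %}
```

The match is case sensitive, and page paths are relative to the site source. If the site is built from the docs folder itself, filter on a subfolder such as sample/ instead.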

:writing_hand: List of all posts

Read the Jekyll array containing a list of all the posts:

{% assign posts = site.posts %}
{% for post in posts %}
    title: {{post.title}}
{% endfor %}

:page_facing_up: List of all pages

Read the Jekyll array containing a list of all the pages:

{% assign pages = site.pages | where_exp: 'page', 'page.title' %}
{% for page in pages %}
    title: {{page.title}}
{% endfor %}

Notice I used a where_exp filter in the list of pages. Sometimes a page does not have a title in its front matter, and the filter removes those untitled pages from the listing. If you want all pages, even those without a title, the code would look like this instead:

{% assign pages = site.pages %}
{% for page in pages %}
    title: {{page.title}}
{% endfor %}

:arrow_right: More options

I recommend you check out the list of built-in Jekyll site variables. You can learn about the different arrays you can access that Jekyll automatically collects for you. There may be other document types (like HTML docs) that you want to include as well.
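As a rough illustration of two of those site variables: the Jekyll documentation describes site.html_pages as the subset of site.pages ending in .html, and site.documents as a list of all documents in every collection. The sketch below only prints URLs, and the loop variable names p and doc are arbitrary.

```liquid
{% comment %} HTML pages only, a subset of site.pages. {% endcomment %}
{% for p in site.html_pages %}
  url: {{ p.url }}
{% endfor %}

{% comment %} Every document from every collection. {% endcomment %}
{% for doc in site.documents %}
  url: {{ doc.url }}
{% endfor %}
```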

Thank you, it worked. I have one more issue: how can I remove the details below from page.content using filters in the loop above?
I have <img src fields in the content. How can I remove them using filters?
I tried "content": {{ page.content | markdownify | strip_html | markdownify | jsonify }}, but it did not work.
I also tried "content": {{ page.content | strip_html | strip_newlines | jsonify }}
I tried multiple remove and replace filters, but none of them worked.

Screen shot is:
image

Please share the original content you are working with.

I have hundreds of Markdown files. One sample Markdown file is:

---
permalink: "/"
---


# SEARCH :mag: 
You can <b>[Search](ww.example.com)</b> with a combination of keywords. For documentation results, there is always a Markdown file listed (*.md).

# Test

HTML code for iterating:

---
layout: search
---
<form action="/search.html" method="get">
    <label for="search-box">Search</label>
    <input type="text" id="search-box" name="query">
    <input type="submit" value="search">
</form>

<ul id="search-results"></ul>


<script>
 window.store = {
    {% for page in site.pages %}
      "{{ page.url | slugify }}": {
        "title": "{{ page.name | xml_escape }}",
        "content": "{{ page.content | strip_html | strip_newlines | replace: '"', '\"'  }}",
        "url": "{{ page.url | xml_escape }}"
      }
      {% unless forloop.last %},{% endunless %}
    {% endfor %}

  };
</script>

<script src="/js/lunr.min.js"></script>
<script src="/js/search.js"></script>

I tried to reformat your post. I think it is correct?

When posting code, you should use the </> code button in the editor or wrap it in two sets of three backticks ( ```).

When you do that, do you get what is in your screenshot? If not, what do you get?

Yes, this is correct. I’m implementing search in my blog using lunr.js. For that, I need to iterate over all the pages in the blog with Jekyll Liquid tags and build a JSON structure. With the code above, the JSON structure breaks because of some HTML tags. I need help removing those HTML tags and producing a valid JSON structure.
My Markdown file is:

# SEARCH :mag: 
You can <b>[Search](https://example/search?search=1234)</b> with a combination of keywords for any of our for documentation results, there is always a Markdown file listed (*.md).

support requests to [examplesupport@example.com](mailto:example.example.com)

Below is the search.html code that iterates over the pages and outputs them as JSON:

---
layout: search
---
<form action="/testrepo/search.html" method="get">
    <label for="search-box">Search</label>
    <input type="text" id="search-box" name="query">
    <input type="submit" value="search">
</form>

<ul id="search-results"></ul>

<script>
 window.store = {
    {% for page in site.pages %}
      "{{ page.url | slugify }}": {
        "title": "{{ page.name | xml_escape }}",
        "content": "{{ page.content | strip_html | strip_newlines }}",
        "url": "{{ page.url | xml_escape }}"
      }
      {% unless forloop.last %},{% endunless %}
    {% endfor %}

  };
</script>
<script src="/js/lunr.min.js"></script>
<script src="/js/search.js"></script>

With this code, the loop runs, but the output breaks as shown below.

I’m also pasting my code repository and URL:
Github repo: vanamsandeep/testrepo (github.com)
URL: Jekyll Actions Demo with leap-day theme

Please advise.
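A likely cause of the broken structure is that the content sits between hand-written double quotes. Any double quote or backslash left in the text after strip_html then ends the string early. Here is a minimal sketch, assuming the same window.store shape as above, that lets Jekyll's jsonify filter produce the quoted and escaped strings instead. The loop variable p is just an illustrative name, chosen to avoid shadowing the built-in page variable.

```liquid
<script>
  window.store = {
    {% for p in site.pages %}
      {% comment %} jsonify adds the surrounding quotes and escapes the text. {% endcomment %}
      {{ p.url | slugify | jsonify }}: {
        "title": {{ p.name | jsonify }},
        "content": {{ p.content | strip_html | strip_newlines | jsonify }},
        "url": {{ p.url | jsonify }}
      }{% unless forloop.last %},{% endunless %}
    {% endfor %}
  };
</script>
```

There are no hand-typed quotes around the jsonify outputs because the filter emits them itself. strip_html removes inline tags such as <b> and <img>, but it leaves Markdown syntax in place.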

I went to your repo and did not see the code I shared on there (or a variant of it). Are you testing on a specific branch?

You can edit on the main branch. This is only my test repo.

In case it helps anyone searching this later, here is my working Lunr.js search index:

The trickiest part was filtering out Liquid code that sometimes appears in page.content.
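The poster's actual index is not shown here, so the following is not their code. It is only one crude way to keep Liquid source out of the index, assuming it is acceptable to skip any page whose content still contains a Liquid delimiter. The names brace, tag_open, var_open, and p are hypothetical.

```liquid
{% comment %} Build the two opening delimiters from pieces so they never appear literally inside a tag. {% endcomment %}
{% assign brace = "{" %}
{% assign tag_open = brace | append: "%" %}
{% assign var_open = brace | append: brace %}
{% for p in site.pages %}
  {% comment %} Skip pages whose content still holds Liquid tags or output markers. {% endcomment %}
  {% unless p.content contains tag_open or p.content contains var_open %}
    {{ p.url | jsonify }}: {{ p.content | strip_html | strip_newlines | jsonify }}
  {% endunless %}
{% endfor %}
```

Skipping pages means the forloop.last comma trick no longer lines up with the last emitted entry, so a real index would need a different way to place separators.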