Filter posts by variable to use search function

Hi there. I’m new to Jekyll and programming.

I’ve adapted a theme to build a personal website in 3 languages. Basically, I’ve created the variable lang that distinguishes the pages between br, en and es languages.

But I’m having a hard time figuring out how to filter the pages to use a search function. Basically, there’s a search button in the header, which is a window search panel, that searches for any post. I’d like to filter it in a way that if the current user’s page has a variable lang: br, then if she searches something, it will only show results in br (i.e., posts with the same variable lang: br). The problem is that it is showing posts in every language.

The theme has a JS document (I don’t know anything about js) that I guess is working on the search data, which is the following:

window.TEXT_SEARCH_DATA={
  {%- for _collection in site.collections -%}
    {%- unless forloop.first -%},{%- endunless -%}
    '{{ _collection.label }}':[
      {%- for _article in _collection.docs -%}
      {%- unless forloop.first -%},{%- endunless -%}
      {'title':{{ _article.title | jsonify }},
      {%- include snippets/prepend-baseurl.html path=_article.url -%}
      {%- assign _url = __return -%}
      'url':{{ _url | jsonify }}}
      {%- endfor -%}
    ]
  {%- endfor -%}
};

I’ve tried to filter it with Liquid. The closest I got was by adding the line {%- assign _collection = _collection1.docs | where: 'lang', page.lang -%} after '{{ _collection.label }}':[. This way, it showed only the br posts (I have no idea why), but it didn’t show posts in English or Spanish (even if the current page’s lang is en or es).

Does anyone have any tips?

Notes:

site: gustavosabbag.github.io (to test, you can type p in the search button; there are test posts in English and Portuguese)
repo: https://github.com/gustavosabbag/gustavosabbag.github.io/blob/master/_includes/search-providers/default/search-data.js (in the file aforementioned)

There’s another JS that I suppose is calling this function and may have something to do with it: https://github.com/gustavosabbag/gustavosabbag.github.io/blob/master/_includes/search-providers/default/search.js

Thank you for any help

Your approach looks okay

Make sure the loop uses the same variable name that you assigned.

{%- assign _docs = _collection1.docs | where: 'lang', page.lang -%}
{% for _article in _docs %}
...

Also check your value alone

## Language is

{{ page.lang | inspect }}

Also try hardcoding your value to see if you can get English on any page, ignoring the local page lang.

## Debugging my lang 
{%- assign _doc = _collection1.docs | where: 'lang', 'en' -%}

{{ _doc | size }}

{{ _doc | inspect }}

Note I did that outside the JS.

If you still get br results or you get zero results, then something could be wrong with the language on each page

If you get that filtering to work, then proceed to the longer JS logic

Also the JS snippet is meant to work with a collection

Search for Collections on the Jekyll docs.

If you want to use collections, the folder must be _br etc. and you must set up br under collections in config.

If you do that, your code can also check the name of the current collection, show just the pages in that collection, and build up the JS data for just that collection. So your filtering where clause could be at a higher level and easier to follow.

When I hardcode to ‘en’, then it correctly searches only from posts with lang: en;

So I’ve tried hardcoding {% if page.lang == 'br' %} ... {% elsif page.lang == 'en' %}..., but it still keeps searching only pages with the lang value br, regardless of the current page…

Maybe it has something to do with the fact that it is a window search panel?

I also tried to substitute to site.posts, as it follows:

window.TEXT_SEARCH_DATA={
  {%- for _article in site.posts | where: 'lang', page.lang -%}
    {%- unless forloop.first -%},{%- endunless -%}
      {'title':{{ _article.title | jsonify }},
      {%- include snippets/prepend-baseurl.html path=_article.url -%}
      {%- assign _url = __return -%}
      'url':{{ _url | jsonify }}}
      {%- endfor -%}

But it didn’t work either… It seems like page.lang is somehow always recognized as br, so it searches only articles with the lang: br value… Do you have an idea why?

I see you have page.lang set to different values on pages.

So you can do a test in page.html layout

# {{ page.title }}

Lang {{ page.lang }}

Lang is English {% if page.lang == 'en' %}true{% else %}false{% endif %}

That will confirm that your lang value is set correctly on each page. And get a true or false value.

Then change it to

{% if page.lang == 'br' %}
Brazil!
{% elsif page.lang == 'en' %}
English...
{% else %}
Spanish!
{% endif %}

And then you can start using the value in the JS with confidence that it changes.

You can also right click and View Source on your HTML page or check your _site directory to look at the outputted HTML and JS directly

Regarding posts - you only have one post in _posts and it is lang: br. I don’t think Jekyll will recognize an en/_posts folder.

And again, I didn’t see collections set up in your config before, so I would not expect your JS with site.collections to work.

If you have a look at _includes/header.html you can see how your navbar is made using the current page’s page.lang value. This avoids use of if statements.

Also if you have trouble with this project as Jekyll beginner, I would recommend following the blog tutorial on the Jekyll site and configuring a theme in your config instead of copying or forking an entire existing project with content and theme mixed together which can get messy to maintain. There are also forum post topics on multi language sites and plugins if you need help there.

I think I may have explained the problem in a wrong way… I’ll try to be clearer.

Actually, I don’t think the problem is with the lang values. I just checked the pages again and their lang values are ok. Besides, when I code the JS file this way:

window.TEXT_SEARCH_DATA={
  {%- for _collection1 in site.collections -%}
    {%- unless forloop.first -%},{%- endunless -%}
    '{{ _collection.label }}':[
        {%- assign _collection = _collection1.docs | where: 'lang', 'en' -%}
      {%- for _article in _collection -%}
      {%- unless forloop.first -%},{%- endunless -%}
      {'title':{{ _article.title | jsonify }},
      {%- include snippets/prepend-baseurl.html path=_article.url -%}
      {%- assign _url = __return -%}
      'url':{{ _url | jsonify }}}
      {%- endfor -%}
    ]
  {%- endfor -%}
};

it correctly searches only the posts with lang: en. That’s why it seems to me that the problem may be with how this function is working, or with the search.js file that is calling it. And since I don’t know anything about JS, I’m having a hard time finding out.

I didn’t understand what you meant by that. Is it another test?

The site is properly recognizing all the posts (differentiating them by their lang values) when you switch the language with the header language button. The only problem is that the search button searches all posts instead of only the ones with the same lang value as the user’s current language. That is the one thing I’m trying to correct. Maybe I explained it poorly before, I’m sorry. I hope the problem is clear now.

I’m avoiding making collections of the languages (en, es and br) because it breaks some code I’ve written, and it doesn’t seem to be the best way to work with it. But even without the collections, as I said, the search function is searching the br posts, and when I filter it with {%- assign _collection = _collection1.docs | where: 'lang', 'en' -%}, it searches only the en posts. The problem is how to make this process automatic: to recognize the user’s current page.lang and search only the posts with the same lang value, not all posts.

Yes just to confirm language gets used

It sounds like lang gets used on the page but the JS is too broad.

Look at your JS console in the browser. Are there any errors logged?

Try taking away code to find your issue.

One theory I have is that JS is not being used at all - perhaps there is a syntax error or it loads too late.
Make your JS empty as an experiment.

window.TEXT_SEARCH_DATA = {}

If you still get results, it means this snippet is not being used, so changing it won’t help.

If you get zero results in a search, it means your JS snippet is working, so you can put it back the way it was.

So then you can have a look at your JS snippet in the HTML of say an EN page. If you look at the output and see a BR page for example and no EN page, then your problem is your Jekyll part.

If your JS only has the necessary parts for the language, then you can look more at the snippet.

It would be nice if you shared some of the JS snippet to see how it gets rendered too not just as Liquid.
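
To make that check on the rendered data quicker, here is a rough sketch you could paste into the browser console. It assumes the data has the shape the theme's Liquid produces (an object of arrays, each entry with a title and a url) and that the language is the first path segment of each URL. The helper name countByLangPrefix and the sample entries are made up for illustration and are not part of the theme.

```javascript
// Hypothetical console helper (not part of the theme): counts search entries
// by the first path segment of their URL, e.g. "en" for "/en/some-post.html".
function countByLangPrefix(searchData) {
  var counts = {};
  Object.keys(searchData || {}).forEach(function (label) {
    (searchData[label] || []).forEach(function (entry) {
      // "/en/x.html".split('/') gives ['', 'en', 'x.html'], so index 1 is the language.
      var segment = String(entry.url || '').split('/')[1] || '(none)';
      counts[segment] = (counts[segment] || 0) + 1;
    });
  });
  return counts;
}

// Made-up sample in the same shape as the theme's data. In the browser you
// would pass window.TEXT_SEARCH_DATA instead of sampleData.
var sampleData = {
  posts: [
    { title: 'Sample EN post', url: '/en/sample-en.html' },
    { title: 'Sample BR post', url: '/br/sample-br.html' },
    { title: 'Another EN post', url: '/en/another-en.html' }
  ]
};

console.log(countByLangPrefix(sampleData));
```

If an EN page reports entries for other languages, the data going into the search is broader than the page's language.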

I followed your advice about looking at the JS console. In fact, the JS is being used. I managed to reduce the function to this:

  window.TEXT_SEARCH_DATA={
    'post':[
      {%- for _article in site.posts -%}
      {%- if _article.lang == 'br' -%}  
      {%- unless forloop.first or _article.size == 1 -%},{%- endunless -%}
      {'title':{{ _article.title | jsonify }},
      {%- include snippets/prepend-baseurl.html path=_article.url -%}
      {%- assign _url = __return -%}
      'url':{{ _url | jsonify }}}
      {%- endif -%}
      {%- endfor -%}
    ]
};

With it, it searches only the posts with lang: br. If I change the line {%- if _article.lang == 'br' -%} to {%- if _article.lang == 'en' -%}, then it will search the posts with lang: en (which is in fact what I want, I just need to match the posts it will search with the user’s current page lang value).

The problem is I can’t set {%- if _article.lang == page.lang -%}. It’s not recognizing the variable page.lang.

The JS snippet is being called in the layout page, which is the layout of the html pages used in the site (which have the lang variable). I suppose that somehow, because the JS is called in the layout, the custom function isn’t being able to understand page.lang.

Do you know some other way I could call the variable lang of the user’s current page? So I can match it with the posts.lang to be searched by the search data function?

Thank you very much

Add the language above your code using a comment. Or even a console.log

// Lang is: {{ page.lang }}
// Is English: {% if page.lang == 'en' %}true{% else %}false{% endif %}
console.log( '{{ page.lang }}')

window.TEXT_SEARCH_DATA = {
  // add code here
}

Use an HTML comment if you prefer.

<!-- lang {{ page.lang }} -->

You can do this in an includes file or a layout. If you put this in the default.html layout, for example, then you can use that layout for every page on the site (as other layouts will use it too) and page.lang for a given page will be evaluated each time. A layout does not “remember” a value from other pages rendered before.

Another way to set page.lang is using defaults field in the config. So for example all pages in en folder get lang: en set.

But I can’t suggest another way to read or check against page.lang, as the way you are doing it is correct.

I don’t know what else to suggest. Your search works with a hardcoded value. Your lang value changes on each page, as confirmed in your previous comment. Somewhere things are breaking down in the middle, where lang gets used in the JS. That is why I said in a previous comment to look at what your rendered JS looks like, not just the Liquid. My suggestion above will help identify the lang value. I also suggested before printing whether page.lang equals 'en', which gives true or false and tells you whether English pages will go into the JS.

Here is your problem

In your config, you do use defaults to set lang.

But you set everything to br

Do this instead. I made br apply to path: "br" instead of path: "", which matches everything and would make the defaults set after it do nothing.

  - scope:
      path: ""
      type: posts
    values:
      layout: article
      sharing: true
      license: true
      aside:
        toc: true
      show_edit_on_github: true
      show_subscribe: true
      pageview: true
  - scope:
      path: "br"
    values:
      lang: br
      base-url: "/br/"
  - scope:
      path: en
    values:
      lang: en
      base-url: "/en/"
  - scope:
      path: es
    values:
      lang: es
      base-url: "/es/"

Or…

You move the br one last and leave path as "". But this is dangerous, as it would mean pages like /index.md get incorrectly put at /br/index.html instead of /index.html

  - scope:
      path: ""
      type: posts
    values:
      layout: article
      sharing: true
      license: true
      aside:
        toc: true
      show_edit_on_github: true
      show_subscribe: true
      pageview: true
  - scope:
      path: en
    values:
      lang: en
      base-url: "/en/"
  - scope:
      path: es
    values:
      lang: es
      base-url: "/es/"
  - scope:
      path: ""
    values:
      lang: br
      base-url: "/br/"

Nice tip, but it didn’t work either. I tried setting the window search function with {%- if _article.lang == page.lang -%}, like:

window.TEXT_SEARCH_DATA={
    'post':[
      {%- for _article in site.posts -%}
      {%- if _article.lang == page.lang -%}
      {%- unless forloop.first or _article.size == 1 -%},{%- endunless -%}
      {'title':{{ _article.title | jsonify }},
      {%- include snippets/prepend-baseurl.html path=_article.url -%}
      {%- assign _url = __return -%}
      'url':{{ _url | jsonify }}}
      {%- endif -%}
      {%- endfor -%}
    ]
};

And when I tried the first suggestion (path: "br"), it returns an empty object (no result in the search); and when I did the second way, it returns only the lang:br posts to search from…

Maybe we are getting close to the problem.

The JS file is in fact a function, called “window.TEXT_SEARCH_DATA”, which is called in the middle of another JS file, called search.js (this file, line 5: https://github.com/gustavosabbag/gustavosabbag.github.io/blob/master/_includes/search-providers/default/search.js)

When I put the quoted code in search.js, it correctly identifies page.lang. But I need the function window.TEXT_SEARCH_DATA to recognize it, and to pass its output, with page.lang recognized, to search.js.

But when I put the quoted code before “window.TEXT_SEARCH_DATA”, it doesn’t recognize it and the result is empty. So the problem is that this function, specifically, isn’t recognizing page.lang.

I’ve been looking on my mobile up to now. I’ll run it on my laptop today

By the way, the way you’ve set up your site is an anti-pattern - I’m guessing you used the download ZIP flow in the theme’s docs. You have all the layouts, includes files and the gemspec file in your project, which makes your project noisy. A gemspec file especially is for a plugin or a theme, not a site.

The typical Jekyll way would be to create a repo which has just your pages, posts and config, plus any overrides of your theme. You would reference a theme in your config file and Gemfile. The theme is a gem hosted on RubyGems or GitHub, and you can override theme files in your project if you need to. If you need to change themes, you change the configuration without having to redo all the layouts and includes files from another repo.

See the instructions on the theme’s docs which covers what I talk about. https://tianqi.name/jekyll-TeXt-theme/docs/en/quick-start#ruby-gem-method

# Gemfile
gem "jekyll-text-theme"

# _config.yml
theme: jekyll-text-theme

You will actually need the remote-theme plugin if you use GitHub Pages, but the point is that it takes just a few lines to get an entire theme set up locally or on GH Pages, even if you start with a config file and an index.md file and nothing else in the site.

See also my demo project based on the Jekyll tutorial. https://github.com/MichaelCurrin/jekyll-blog-demo

I have one layout to override the theme’s layout. Otherwise I use the theme’s page.html and post.html layouts without adding them directly. I also only have one includes file and that is a custom GitHub banner.

Here is the theme I point to. https://github.com/jekyll/minima

Ok I found the issue. I fixed it locally and have a Pull Request for you to merge.

Problem

Our assumption was that the code snippet was called once on each page. We wanted to use page.lang in there.

_includes/search-providers/default/search-data.js

But, that script is actually only rendered once at build time, as seen by use here:

Which is available at https://gustavosabbag.github.io/assets/search.js

And then that is loaded like this in a <script> tag on each page.

      paths: {
        search_js: '/assets/search.js'
      }

So the idea is that the theme builds up the search data for all pages on the site once and that is used on all searches regardless of which translated page you are on. Which is not what we want. And changing the JS file as before won’t help us. We have to change a different file.


Solution

Here is my approach.

First, let’s check the variable defined on window which contains the data for the search. Run this in your console on any page. Note that the URL in each post item starts with /en etc., and I am going to use that to filter on.

console.log(JSON.stringify( window.TEXT_SEARCH_DATA.posts))
[
  {"title":"Example post in English 1","url":"/en/analysis/2020/08/30/example-post-1-en.html"},
  {"title":"Example post 2","url":"/en/analysis/2020/09/09/example-post-2-en.html"},
  {"title":"Post de Exemplo em portugues","url":"/br/analises/2020/11/09/post-de-exemplo-br.html"}
]

And we want to remove the posts not in the current language.

Here is the JS script which deals with TEXT_SEARCH_DATA which is your search data.

That gets used at the bottom of this file:

You can add a line to it and view your source to check on it.

...
<script>{%- include search-providers/default/search.js -%}</script>

<!-- LANG: {{ page.lang }} -->

And that search.html file in turn gets used in _layouts/page.html

So what you need to do is after the variable is defined using the global value of all pages, you then update the value to a filtered array of only relevant languages. I am going to use /en/... etc. as the URL prefix to filter on.

I ran this locally.

  var searchData = window.TEXT_SEARCH_DATA || {};
  console.log('searchData', searchData.posts.length)
  var lang = '{{ page.lang }}';
  searchData.posts = searchData.posts.filter(p => p.url.startsWith(`/${lang}`))
  console.log('searchData', searchData.posts.length)

The initial value logged is 3 - two EN and one BR.

Depending on which page I am on (EN, BR or ES), the second value logs 2, 1 or 0 respectively. And if I open the search bar and search, then the results match. Huzzah!
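
If it helps, the same idea can be sketched as a small standalone function. This is only an illustration under a few assumptions: the data is an object of arrays (one per collection, not just posts), every URL starts with the language folder, and the site has no baseurl in front of it. The name filterSearchDataByLang is hypothetical and not part of the theme. Compared with my two lines above, it matches "/en/" with a trailing slash so a path like /english-notes would not be caught, and it leaves the data alone when no language is set.

```javascript
// Hypothetical helper (not part of the theme): returns a copy of the search
// data in which every collection keeps only entries whose URL is under "/<lang>/".
function filterSearchDataByLang(searchData, lang) {
  if (!lang) {
    return searchData; // no language known: leave the data untouched
  }
  var prefix = '/' + lang + '/';
  var filtered = {};
  Object.keys(searchData).forEach(function (label) {
    filtered[label] = searchData[label].filter(function (entry) {
      // indexOf(...) === 0 is a prefix test that also works in older browsers.
      return typeof entry.url === 'string' && entry.url.indexOf(prefix) === 0;
    });
  });
  return filtered;
}

// Sample data copied from the console output earlier in this post.
var samplePosts = {
  posts: [
    { title: 'Example post in English 1', url: '/en/analysis/2020/08/30/example-post-1-en.html' },
    { title: 'Example post 2', url: '/en/analysis/2020/09/09/example-post-2-en.html' },
    { title: 'Post de Exemplo em portugues', url: '/br/analises/2020/11/09/post-de-exemplo-br.html' }
  ]
};

var enOnly = filterSearchDataByLang(samplePosts, 'en');
console.log(enOnly.posts.length);
```

In the page itself you would call it with the value rendered from page.lang and assign the result back to window.TEXT_SEARCH_DATA before the search code reads it.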

Sorry, that’s a lot of info. I’ve added a PR for a branch so you can review it and merge it if you like.

If you want to try the code out locally, you can locally use my two new lines in your file.

Or you can checkout my branch locally.

git remote add fork https://github.com/MichaelCurrin/gustavosabbag.github.io.git
git fetch fork
git checkout feat-search-in-language

Start Jekyll with more verbose errors and in livereload mode. If you save a page, the site will rebuild and your localhost pages will refresh themselves.

bundle exec jekyll serve --livereload --trace

Thanks for explaining this to me so clearly. So any theme file that I didn’t change, I can delete from the repo, right?

In fact, I used the first method and downloaded all the files. But with your explanation I could solve some minor errors I had (I was unable to run locally, unless using LC_ALL="en_US.UTF-8" bundle exec jekyll serve, because of some gemspec encoding error).

That’s it! Thank you very much for helping me and for your patience. It worked!!

Best regards and happy holidays!!!

Yes you can delete any layout, includes or assets file you didn’t change and the theme’s one will be used.

You can delete gemspec.

For config file, I would keep most of the file as those values can’t be set by the theme.

In config you should set this.

remote_theme: kitian616/jekyll-TeXt-theme

That uses the Remote Theme plugin, which is necessary if you want to use a theme on GH Pages which isn’t one of the standard 10 or so. This also means you do not need theme: ... in your config.

And I would take everything out of your Gemfile and do this:

source 'https://rubygems.org'

gem "jekyll", "~> 3.9"
gem "kramdown-parser-gfm", "~> 1.1.0"

gem "jekyll-text-theme"

group :jekyll_plugins do
  gem "jekyll-remote-theme", "~> 0.4.2"
end

This installs Jekyll 3.9, a Jekyll rendering dependency and the Remote Theme plugin.

By adding the theme here, Bundler will install all the theme’s necessary dependencies for you. The theme works with Jekyll 3 and 4 and would give you 4 by default, so the version constraint above keeps you at 3 for use on GH Pages.

For reference, here is a site I have on gh-pages branch which builds on GitHub Pages using a custom theme on master https://github.com/MichaelCurrin/jekyll-theme-quickstart/tree/gh-pages

Then delete vendor and Gemfile.lock and run bundle install.

You can actually make the Gemfile and config changes first, without deleting the theme files, then do a bundle install and start Jekyll. Then you can do the removal of files. This means you start from a working point, which makes debugging less annoying.

I would recommend using a branch and PR for this change. Make sure you have it working locally and then merge your PR. It makes it easy to associate multiple file changes with this one purpose. Such as if you want to understand or revert something later.
