Problem with Sitemap and Jekyll Multiple Languages Plugin

Hello,
I am using jekyll-multiple-languages-plugin-1.7.0
and jekyll-sitemap-1.4.0.

My sitemap is only generated for one language (primary).
The same map is created in the _site/ru and _site/en folders.

Example for en:
<loc>https://example.com/ru/about/</loc>

Example for ru:
<loc>https://example.com/ru/about/</loc>
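One way to confirm the symptom after a build is to compare the `<loc>` entries of the two generated files. The following is only a sketch: the function names are made up for illustration, the sample strings stand in for the real `_site/en/sitemap.xml` and `_site/ru/sitemap.xml`, and it matches `<loc>` by local tag name so it does not depend on the exact namespace string.

```python
import xml.etree.ElementTree as ET


def sitemap_locs(xml_text):
    """Return the set of <loc> URLs in a sitemap (any XML namespace)."""
    root = ET.fromstring(xml_text)
    return {
        el.text.strip()
        for el in root.iter()
        # el.tag looks like "{namespace}loc"; keep only the local name.
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    }


def same_urls(xml_a, xml_b):
    """True when both sitemaps list exactly the same URLs."""
    return sitemap_locs(xml_a) == sitemap_locs(xml_b)


# Hypothetical stand-ins for the two generated files described above.
EN_XML = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/ru/about/</loc></url>"
    "</urlset>"
)
RU_XML = EN_XML  # the reported problem: both folders get the same map

print(same_urls(EN_XML, RU_XML))  # True
```

Against a real build, the two arguments could be read with `pathlib.Path("_site/en/sitemap.xml").read_bytes()` and the `ru` equivalent.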

I read that you can create a sitemap_index and link the per-language sitemaps from it:

     <sitemap>
         <loc>https://example.com/en/sitemap.xml</loc>
     </sitemap>
     <sitemap>
         <loc>https://example.com/ru/sitemap.xml</loc>
     </sitemap>

But the sitemaps in the two folders are identical. I don’t understand why it doesn’t work.

Dividing the pages into Russian and English makes sense for readers.

But I think you should just have one single sitemap for your whole site. The sitemap plugin will find all pages and list them in the sitemap.xml file. Crawlers will then find that sitemap and go through all the pages. The grouping and order don’t matter.

I wouldn’t bother splitting into multiple sitemaps. That is only worthwhile if you hit a limit, which I think is 50MB or 50,000 items per sitemap. Then you have to split your sitemap (whether by arbitrary or logical grouping) so Google sees it as valid. But unless you are Pinterest or Instagram, I don’t think you will have that many pages.
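For illustration, this is roughly what an arbitrary split at that item limit could look like. It is a sketch only: the 50,000 figure is the one quoted above, the URLs are invented, and the size limit in bytes is not checked here.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-sitemap item limit mentioned above


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `limit` items."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Hypothetical site with 120,000 pages.
urls = [f"https://example.com/post-{n}/" for n in range(120_000)]
chunks = split_urls(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would become its own sitemap file, and a sitemap index would then list those files.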

Posts are located in the _i18n/en/_posts and _i18n/ru/_posts folders.
But the plugin only creates a map for the default language, ru.

Can you share a repo link so I can reproduce locally?

The sitemap plugin reads site.collections to handle posts (posts are a kind of collection), so you need to figure out what you get if you just output the following. I would expect both the RU and EN posts to be there if the posts are being rendered as HTML pages for users to see.

{{ site.collections | jsonify }} 

Here is the link to the repository: https://github.com/awesomebloging/cmsminers/

Okay thanks. I’ll look.

Hmm I ran your site locally.

When running Jekyll serve/build I noticed it builds your site twice.


Building site for language: "ru" to: /Users/mcurrin/public_repos/cmsminers/_site/ru
...
Building site for language: "en" to: /Users/mcurrin/public_repos/cmsminers/_site/en

Putting sitemaps aside for a moment, it looks weird that the site is built twice.

Here is the structure I get:

_site/
  en/
    blog/
    ...
    index.html
    CNAME
    sitemap.xml
    sitemap_index.html
  ru/
    blog/
    ...
    index.html
    CNAME
    sitemap.xml
    sitemap_index.html

What I would have expected is one site, with one index.html and one sitemap.xml generated at the root which has all the pages in it (as in my earlier comment).

And then language-specific files in dirs.

e.g.

en/
  blog/
  ...
  index.html # English home
ru/
  blog/
  ...
  index.html # Russian home
index.html   # Website home
sitemap.xml

Maybe the two-site way is the way the internationalization plugin is meant to work, or it needs more configuration on your part?


It’s hard to follow your project because of NPM + a bunch of plugins + Minimal Mistakes needing massive config + internationalization + sitemap plugin + your own robots and sitemap files + an unusually high number of layouts and include files.

This is the behavior of jekyll-multiple-languages-plugin
with default_locale_in_subfolder: true in the config.

Ah okay

Still, the sitemap plugin doesn’t understand that. But maybe you can make two sitemap.xml files based on the template in the sitemap plugin repo.

Thanks for your reply.
In my case, creating a sitemap manually through a template works:

---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
  <url>
    <loc>{{ site.url }}{{ site.baseurl }}{{ post.url }}</loc>
    {% if post.lastmod == null %}
    <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
    {% else %}
    <lastmod>{{ post.lastmod | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  {% endfor %}
  {% for page in site.pages %}
  {% if page.sitemap != null and page.sitemap != empty %}
  <url>
    <loc>{{ site.url }}{{ site.baseurl }}{{ page.url }}</loc>
    <lastmod>{{ page.sitemap.lastmod | date_to_xmlschema }}</lastmod>
    <changefreq>{{ page.sitemap.changefreq }}</changefreq>
    <priority>{{ page.sitemap.priority }}</priority>
  </url>
  {% endif %}
  {% endfor %}
</urlset>

This creates the correct map in every language folder.
And you can list the maps like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     <sitemap>
         <loc>https://example.com/ru/sitemap.xml</loc>
     </sitemap>
     <sitemap>
         <loc>https://example.com/en/sitemap.xml</loc>
     </sitemap>
</sitemapindex>
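If the list of languages changes often, the index could also be produced by a small script instead of by hand. This is a sketch under assumptions: `build_sitemap_index` is a made-up helper, `https://example.com` is the placeholder domain used earlier in the thread, and the `<loc>` values are written as absolute URLs because the sitemap protocol expects fully qualified locations.

```python
from xml.sax.saxutils import escape


def build_sitemap_index(base_url, languages):
    """Return sitemap index XML with one <sitemap> entry per language folder."""
    base = base_url.rstrip("/")
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for lang in languages:
        # Escape XML special characters in case the URL contains "&" etc.
        loc = escape(f"{base}/{lang}/sitemap.xml")
        lines.append("  <sitemap>")
        lines.append(f"    <loc>{loc}</loc>")
        lines.append("  </sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


index_xml = build_sitemap_index("https://example.com", ["ru", "en"])
print(index_xml)
```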

Oh yes good idea. Glad it works

For anyone who comes here from Google: this code correctly generates a sitemap for Jekyll Multiple Languages and lets you exclude pages you don’t need:

Update:
---
layout: null
sitemap: false
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
  <url>
    <loc>{{ site.url }}{{ site.baseurl }}{{ post.url }}</loc>
    {% if post.lastmod == null %}
    <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
    {% else %}
    <lastmod>{{ post.lastmod | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  {% endfor %}
  {% for page in site.pages %}
  {% if page.sitemap != false %}{% if page.title %}
  <url>
    <loc>{{ site.url }}{{ site.baseurl }}{{ page.url | remove: "index.html" }}</loc>
    {% if page.sitemap.lastmod %}
    <lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
    {% elsif page.date %}
    <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
    {% else %}
    <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  {% endif %}
  {% endif %}
  {% endfor %}
</urlset>
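
To check after a build that each language folder really got its own map, a small script can flag URLs that belong to another language. This is a sketch only: `foreign_locs` is a hypothetical helper, the sample XML is invented, and it assumes every URL in a language's sitemap should start with `<base URL>/<language>/`, which matches the folder layout discussed above.

```python
import xml.etree.ElementTree as ET


def foreign_locs(xml_text, base_url, lang):
    """Return the <loc> URLs that do not live under base_url/lang/."""
    prefix = f"{base_url.rstrip('/')}/{lang}/"
    root = ET.fromstring(xml_text)
    locs = [
        el.text.strip()
        for el in root.iter()
        # Match on the local tag name so the namespace string does not matter.
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]
    return [url for url in locs if not url.startswith(prefix)]


# Hypothetical English sitemap that wrongly contains a Russian URL.
EN_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/en/about/</loc></url>"
    "<url><loc>https://example.com/ru/about/</loc></url>"
    "</urlset>"
)

print(foreign_locs(EN_SITEMAP, "https://example.com", "en"))
# ['https://example.com/ru/about/']
```

An empty list for both `en` and `ru` would suggest the template is producing a separate map per language.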