Some questions about highlighting words in Rouge

If I use Markdown as the highlight language, lines that contain only plain text don’t get wrapped in a highlighting tag.

For example:

{%- highlight md -%}
## title

here have some random text
{%- endhighlight -%}

renders this:

<figure class="highlight">
<pre>
<code class="language-md" data-lang="md">
<span class="gu">## title</span>
here have some random text
</code>
</pre>
</figure>

Or, if I use an accented word with the config language, Rouge excludes the accented character from the highlighted spans:

{%- highlight config -%}
tílde
no tilde
{%- endhighlight -%}

renders this:

<figure class="highlight">
<pre>
<code class="language-config" data-lang="config">
<span class="n">t</span>
í
<span class="n">lde</span>
<span class="n">no</span>
 <span class="n">tilde</span>
</code>
</pre>
</figure>

Is this normal behavior, given that the devs mostly use English for the extra text in code blocks? Here are my markdown and kramdown options in _config.yml:

markdown: kramdown

kramdown:
  syntax_highlighter: rouge
  syntax_highlighter_opts:
    span:
      line_numbers: false
    block:
      line_numbers: true

Yes, this is normal highlighting behavior for markup languages like Markdown, HTML, XML, etc. Plain text is left as plain text, with no styling span around it. If you need to style the plain text, you can target the .language-* class on the enclosing code element (for example, .language-md) in your CSS.

I’m not familiar with the Config language – maybe it is just an old file format? Possibly, Config only allows ASCII characters in keywords/identifiers, and so the highlighter splits tokens at accented characters?
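
To illustrate that guess, here is a minimal Ruby sketch of how an identifier rule built from ASCII-only character classes would split a word at an accented character. It is not Rouge's actual lexer: the constants ASCII_IDENT and UNICODE_IDENT and the helper split_tokens are hypothetical names made up for this example. It relies only on the fact that Ruby's \w matches [a-zA-Z0-9_] and nothing else.

```ruby
# Hypothetical sketch, not Rouge's real lexer: it only imitates an
# identifier rule written with ASCII-only character classes.

# In Ruby, \w matches only [a-zA-Z0-9_], so "í" is not a word character.
ASCII_IDENT = /[a-z]\w*/i

# \p{L} (any letter) and \p{N} (any digit) are Unicode-aware.
UNICODE_IDENT = /\p{L}[\p{L}\p{N}_]*/

# Split text into identifier tokens; any other character becomes its own token.
def split_tokens(text, ident)
  text.scan(/#{ident}|./m)
end

split_tokens("tílde", ASCII_IDENT)    # => ["t", "í", "lde"]
split_tokens("tílde", UNICODE_IDENT)  # => ["tílde"]
```

The first result has the same shape as the three pieces in the config output above, which would be consistent with an ASCII-only identifier rule, though I have not confirmed that this is what the lexer does.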

Other languages and formats allow multilingual identifiers, and they work as expected. For example, JavaScript:

{%- highlight js -%}
var tílde=2;
{%- endhighlight -%}

Renders as (rewrapped for clarity):

<figure class="highlight">
<pre><code class="language-js" data-lang="js">
<span class="kd">var</span> 
<span class="nx">tílde</span>
<span class="o">=</span><span class="mi">2</span><span class="p">;</span>
</code></pre>
</figure>

Many thanks for the help.