How to merge Medium posts and local posts in the blog list page

This post is published on SO.

Currently, I am creating my personal website: https://hantsy.github.io

And in the past 2 years, I have posted a lot of articles on Medium.

I want to add the Medium entries (title, date, abstract, etc.) to my posts list (/blog), so that the Medium posts and my local posts are mixed together on hantsy.github.io.

  1. Sort all posts (summaries of the Medium posts and the local posts) by date on the /blog page. (This is the step I really need help with.)
  2. When an entry is a Medium post, clicking it redirects to Medium; otherwise it opens under /blog/year/xxx. (This step is easy.)

This post is helpful, but it fully replaces the local posts with Medium posts.

Hi.

Firstly, Jekyll does have a bunch of importers, though Medium is not there.

https://import.jekyllrb.com/docs/home/


Can you show what format you get from Medium as an export? Is it a CSV or a folder of files?

Anyway, once you download the data, you can use a script to convert the posts into Jekyll posts. I can help with this as a Bash shell script or Python script, but I need a sample of posts to work with, even if the data is made up.

The conversion output should be markdown files with metadata that makes sense to Jekyll. Categories and tags are optional.

# _posts/2020-01-03-my-post-name.md
---
title: My title
description: My description
categories: food
tags: [health, cooking]
---

## Heading 2

Content.

More content for body of post.

You may not care about the body, but you might want a custom field like “abstract” or “extract” that you can then use in your posts list.

If you put those converted posts in the _posts directory, then you can mix them in with your existing posts.
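
As a rough illustration of that conversion step, here is a minimal Python sketch that turns one post's metadata into a Jekyll-style filename and a front matter block. The function names (`slugify`, `post_filename`, `front_matter`), the `external_url` key and the sample values are all assumptions made up for this example; they are not something Medium or Jekyll provides.

```python
import json
import re
from datetime import date


def slugify(title):
    """Lowercase the title and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def post_filename(published, title):
    """Build a _posts filename in the YYYY-MM-DD-title.md form Jekyll expects."""
    return f"{published.isoformat()}-{slugify(title)}.md"


def front_matter(title, description, external_url):
    """Build a front matter block for a body-less Medium entry.

    `external_url` is a hypothetical custom key holding the Medium link.
    JSON string quoting is used as a simple way to emit double-quoted
    YAML scalars, so colons or quotes in a title do not break the block.
    """
    lines = [
        "---",
        f"title: {json.dumps(title)}",
        f"description: {json.dumps(description)}",
        "categories: medium",
        f"external_url: {json.dumps(external_url)}",
        "---",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample data, only to show the output shape.
    print(post_filename(date(2020, 1, 3), "My Post: Name!"))
    print(front_matter("My title", "My description",
                       "https://medium.com/@example/my-title"))
```

Writing the returned text to a file with that name under _posts would give an entry with no body, which is enough if the listing only links out to Medium.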

I wouldn’t force the URL to be

/blog/2020/01/02/my-title.html

Jekyll does it like this

/2020/01/02/my-title.html

And then your listing of all posts can be only your homepage or /blog.html or /posts.html


Regarding ordering, Jekyll already sorts site.posts by date with the most recent first. And you are forced to put the date in the filename.

Just iterate over posts.

<ul>
    {% for post in site.posts %}
    <li>
        <h2 class="post-title p-name"><a href="{{ post.url | relative_url }}">{{ post.title }}</a></h2>

        <span>
            <i>{{ post.description | markdownify }}</i>
        </span>

        {% include post-tags.html tags=post.tags %}

        <p class="post-meta">
            {{ post.date | date: site.short_date }}
        </p>
    </li>
    {% endfor %}
</ul>

That is from the index.md of my blog.

Regarding separating Medium and non-Medium posts.

I suggest treating the Medium posts as Jekyll posts but with no content. Or include the body but don’t display it.

Then use a “medium” category or tag on all your Medium posts. This can be added easily during the conversion process.

You can even have a “medium” folder inside _posts and the posts in there will get the Medium category.

Once you have a category or tag to separate Medium and non-Medium posts, you can update your layout so that it handles them differently.

I would avoid using the Jekyll redirect plugin as that adds extra complexity and dependency without much benefit.

On your blog listing pages, you can iterate over all posts.

And then you can add a conditional statement in your blog.md or in the layout for that page.

If it is a standard post, then you link to it internally.

href="{{ post.url | relative_url }}"

result is like

/repo-name/2020/01/02/my-title.html

In the case of your repo being on the root of your GitHub.io site, it will just be outputted as

/2020/01/02/my-title.html

And you can have an if statement which looks for the presence of the “medium” category and then provides an external link to your Medium blog.

Roughly

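{% for post in site.posts %}
  {% comment %} external_url is a custom front matter key set on Medium posts {% endcomment %}
  {% if post.categories contains "medium" %}
    {% assign link = post.external_url %}
  {% else %}
    {% assign link = post.url | relative_url %}
  {% endif %}

  <h2><a href="{{ link }}">{{ post.title }}</a></h2>
  <p>{{ post.date | date: "%Y-%m-%d" }}</p>
  <p>{{ post.description }}</p>
  <p><i>{{ post.abstract | default: post.excerpt | strip_html }}</i></p>
{% endfor %}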

If the abstract key is not set on a post, then fall back to the built-in post.excerpt. Or use the Medium check there.

You may also want a link or arrow symbol or Medium logo to tell the user they are clicking on an external URL.


You may want to set a rel canonical tag on the Medium pages on your Jekyll site, remove them from the sitemap, or disable their output, to avoid duplicating the content. But this is not so important. The ranking of the original Medium posts won’t be affected by duplication on another site; the duplicates just won’t appear in search results.

Medium provides a feed xml for a personal blog.

eg. https://medium.com/feed/@hantsy

PS: The link in my 2nd post describes a solution that uses a Jekyll plugin to parse the Medium feed.
But I know little about Ruby and Jekyll programming.


It is a great idea to split off the Medium post content.

If I use the plugin from my link to do this and add the Medium posts to a separate collection, such as _medium, is there a way to merge _medium and _posts at build time?

I found that Jekyll provides an RSS importer; I will try this.
https://import.jekyllrb.com/docs/rss/


Sorry I skipped the post before.

The plugin supplied there is short so looks like a good idea.

It uses a gem to handle the RSS feed from Medium.

You might want to adapt the plugin to add to _posts, since posts is actually a collection already.

Then when you iterate over posts, you’ll have both kinds.

If you add a separate collection, that could work as well, but it is not as clean in Jekyll, as you’ll have to mix your standard posts and the new collection together yourself. And you’ll lose the ability to use tags, categories and date sorting if you move away from posts. You’ll have to implement those yourself.

Also you may not want to request Medium every time you hit build locally or on the remote. That can take a while if you have a lot of posts and depends on Medium always being up and fast.

You could save your RSS feed as a local XML file dump and parse that instead.

Or parse the RSS once and generate markdown files which you can modify more easily. Then you don’t have to maintain a plugin or deal with Medium changing their structure - just handle the import now to get markdown and then you don’t need to use the script again.
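
To give an idea of the "parse once" approach, here is a small Python sketch using only the standard library. The sample feed is made up in the general shape of an RSS 2.0 document; a real Medium feed has more fields and I am not assuming its exact structure here. The `parse_feed` name and the `external_url` key are hypothetical.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample data; in practice this would be the saved feed dump.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Older post</title>
      <link>https://medium.com/@example/older-post</link>
      <pubDate>Thu, 02 Jan 2020 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Newer post</title>
      <link>https://medium.com/@example/newer-post</link>
      <pubDate>Fri, 03 Jan 2020 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_feed(xml_text):
    """Return one dict per <item>, sorted with the most recent post first."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        # RSS dates use the RFC 822 style, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        posts.append({
            "title": item.findtext("title"),
            "external_url": item.findtext("link"),  # hypothetical key name
            "date": published.date(),
        })
    posts.sort(key=lambda post: post["date"], reverse=True)
    return posts


if __name__ == "__main__":
    for post in parse_feed(SAMPLE_FEED):
        print(post["date"], post["title"], post["external_url"])
```

Each dict holds what is needed to write a markdown file with a date in the filename and the Medium link in the front matter, after which the script is no longer needed at build time.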

I hope the RSS importer you found works, so that you don’t have to write a script yourself.

Oh something else important.

You can’t run custom plugins or custom build flow on GH Pages unless you use GH Actions.

An alternative is to run the conversion once locally and commit the markdown files for the Medium posts. Then you can build without issue using the standard GH Pages set of plugins.

I have added a GitHub Actions workflow to import the Medium posts, and it seems to be working well. But the RSS importer itself provides no options to customize the page headers, etc.

For now, it is enough for me; I may change it later when I am more familiar with Jekyll.


Thanks for sharing. That’s neat. So it will pull in new posts daily and commit them.

Glad that it is working well enough.