Starting with CSV/XML

moorlag · December 25, 2021, 1:18pm

Hi there,

I’ve started a new pet project and I am asking for help.

Heidis Koch Klub (Heidi’s Cooking Club). HKK is a paper-based recipe card system from the '70s, with easy-to-understand recipes to feed a family. In the first step of this project, I am working on a PoC with the first ten recipes; later I want to scale it up to all 750+ cards in 48 different categories (ranging from first starter, second starter, drinks, main courses, and desserts… a lot of desserts). The third phase will include an option to see a translation of the card in a few languages.

I am having trouble understanding how Jekyll handles these files. I followed the quick tutorial - data files and I have a _data folder with the XML files. How can I produce posts with the cards? And how can I, in phase two, add different tags to the different folders (e.g. dessert, starter, main course)?

Parallel to this pet project I am writing an article about starting (and finishing) pet projects. I teach CS at a high school.

rdyar · December 25, 2021, 5:31pm

I don’t think Jekyll supports XML files - it does support CSV, YAML and JSON.

When using a data file you are limited to outputting that data inside a single page - you can’t output pages directly from the data file unfortunately. I think I have seen a custom plugin or two that could do that, but Jekyll on its own does not.

I made a simple recipe website from a family cook book - I scanned all the pages to PDF with OCR and then found a way to save each page as a Markdown file with front matter (some sort of command line script?) and then used them as collection items. Worked pretty well, though there were quite a few OCR issues.
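
The "one Markdown file with front matter per recipe" approach can be sketched in a few lines of Python. This is only an illustration, not the script used for the cook book site: the CSV column names (`title`, `category`, `body`), the `_recipes` output folder and the helper names are all assumptions. For Jekyll to render the files as pages, the collection would also need to be declared in `_config.yml` with `output: true`.

```python
import csv
import io
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and turn runs of non-ASCII-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def row_to_markdown(row: dict) -> str:
    """Render one CSV row as Markdown with YAML front matter.

    Assumes the title contains no double quotes (they would need escaping).
    """
    front_matter = [
        "---",
        f'title: "{row["title"]}"',
        f'category: {row["category"]}',
        "---",
    ]
    return "\n".join(front_matter) + "\n\n" + row["body"].strip() + "\n"


def write_collection(csv_text: str, out_dir: Path) -> list[Path]:
    """Write one .md file per CSV row into out_dir and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = out_dir / f"{slugify(row['title'])}.md"
        path.write_text(row_to_markdown(row), encoding="utf-8")
        written.append(path)
    return written
```

Calling `write_collection(Path("cards.csv").read_text(encoding="utf-8"), Path("_recipes"))` would then produce one collection item per card, with the category available in front matter for later filtering (`cards.csv` is a hypothetical file name).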

pippim · December 25, 2021, 11:38pm

Convert from XML to Markdown

Jekyll uses Kramdown Markdown and Hyper Text Markup Language (HTML). It doesn’t use the eXtensible Markup Language (XML) file format your Recipes are stored in. For your first step I’d recommend converting your XML files to Markdown format:

Can I generate .MD files from XML Documentation? (Stack Overflow, c#)

I haven’t seen XML in 20 years so can’t really judge the solutions presented in the link.

Convert Markdown to Kramdown

The next step would be to convert Markdown to Kramdown. Although Kramdown is a Markdown language already, it does have its idiosyncrasies. For example, you need an extra blank line after a </summary> or </details> HTML element. I had to do that in my own stack-to-blog.py Python program: Convert SE Markdown to Jekyll Kramdown
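
The blank-line fix mentioned above could look roughly like this. It is a guess at the idea, not the code from stack-to-blog.py; the function name is made up.

```python
import re

# A closing </summary> or </details> tag, optional trailing spaces, then a
# single newline that is NOT already followed by a blank line.
_CLOSING_TAG = re.compile(r"(</(?:summary|details)>)[ \t]*\n(?!\n)")


def add_blank_line_after_details(markdown: str) -> str:
    """Insert a blank line after </summary> or </details> where one is missing."""
    return _CLOSING_TAG.sub(r"\1\n\n", markdown)
```

Because of the negative lookahead, running the function twice on the same text changes nothing the second time, so it is safe to apply on every build.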

Create Recipes by Tags

In the Pippim website, under construction, there will be 1,164 answers, which are akin to your “Recipes”: each is basically an answer to “What to eat?” and “How to do it?”. The Answers (“Recipes” if you will) have 771 different tags. Combined there are 3,633 “posts by tag”:

[Screenshot: Pippim post tags]

These “Posts by Tags” are plain old html generated by the same python program linked to earlier.
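
Generating such "posts by tag" pages as plain HTML boils down to grouping titles by tag and printing a list per tag. The following is a minimal sketch with invented sample data and function names, not the Pippim program itself:

```python
import html
from collections import defaultdict


def group_by_tag(recipes: list[dict]) -> dict[str, list[str]]:
    """Map each tag to the alphabetically sorted titles that carry it."""
    by_tag = defaultdict(list)
    for recipe in recipes:
        for tag in recipe["tags"]:
            by_tag[tag].append(recipe["title"])
    # Sort tags and titles so the generated page is stable between runs.
    return {tag: sorted(titles) for tag, titles in sorted(by_tag.items())}


def render_tag_index(by_tag: dict[str, list[str]]) -> str:
    """Render one <h2> heading and one <ul> list per tag."""
    parts = []
    for tag, titles in by_tag.items():
        parts.append(f"<h2>{html.escape(tag)}</h2>")
        parts.append("<ul>")
        parts += [f"  <li>{html.escape(title)}</li>" for title in titles]
        parts.append("</ul>")
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    sample = [
        {"title": "Apple Pie", "tags": ["dessert"]},
        {"title": "Tomato Soup", "tags": ["starter"]},
        {"title": "Ice Cream", "tags": ["dessert"]},
    ]
    print(render_tag_index(group_by_tag(sample)))
```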

As already mentioned the Pippim Website is under construction and changes daily. It’s open-source and you are “free” to have a look today though.

Good luck with your project and, if it’s not offensive, Merry Christmas. Otherwise, enjoy the next four days off including today (Saturday).

moorlag · December 26, 2021, 7:55am

Thanks for the extensive help, this really helps. And wishing you the same, enjoy the holidays and/or Christmas! Prettige Kerstdagen (in Dutch), Happy Christmas!

I can also transform the XML files to JSON with a tool (and to be honest, it has also been 10+ years since I used XML). Let’s have a look at the links you provided (again, really appreciate it!). How can I loop over a list of JSON files in a folder? This is from the Jekyll Docs:

<ul>
{% for org_hash in site.data.orgs %}
{% assign org = org_hash[1] %}
  <li>
    <a href="https://github.com/{{ org.username }}">
      {{ org.name }}
    </a>
    ({{ org.members | size }} members)
  </li>
{% endfor %}
</ul>
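
In that docs example, each file in a `_data` subfolder shows up under `site.data` keyed by its file name, and the loop walks over key/value pairs, which is why the snippet picks `org_hash[1]`. The same mental model in Python, as a rough sketch only (the function name and the `_data/recipes` folder are assumptions, and this is not how Jekyll is implemented):

```python
import json
from pathlib import Path


def load_data_folder(folder: Path) -> dict:
    """Map each JSON file's name (without extension) to its parsed content."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(folder.glob("*.json"))
    }


if __name__ == "__main__":
    # Iterating yields (key, value) pairs, like org_hash[0] and org_hash[1].
    for name, recipe in load_data_folder(Path("_data/recipes")).items():
        print(name, recipe.get("title"))
```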

Process

This process is in my head. The magic from paper to XML/CSV (and even XSL) comes from Nanonets. Their service for OCR from PDF files is great (and free for smaller projects).

moorlag · December 26, 2021, 7:57am

Thanks!

If you still have the scans available, perhaps you can run OCR on them again. Nanonets is a great service for extracting information from PDFs.

Btw, cool website! And your grandmother was wise, very little salt is needed.

Some background information on my ‘data storage device’

The Box

I’ve been researching the box and it’s surprising that there are few or no records left of this product. A lot of dishes are new to me. I’m trying to use at least one card per week. It was called Heidis Kock Klub.

BillRaymond · December 26, 2021, 10:32pm

Hi there,

Basically you will design the HTML and CSS (and JavaScript if needed) for the card. To get at the data, you will want to extract it from your data files.

Here’s a video tutorial I created that shows how to use the data files you create (there are two others which you might want to check out, but this one focuses on your specific need):
