M-dex 0.1.1, a MangaDex REST API client written in Crystal (contributors and testers needed!)

Hi y’all! Yesterday I started a shard called M-dex. It’s a REST API client library that pulls data straight from the MangaDex site. I started the project because I wanted to build a better front-end client using MangaDex’s data. Around the same time, the lead dev of MD announced plans for a public REST API, which as of today still hasn’t been released publicly, so that sparked my interest in doing this.

For those who don’t know (and aren’t weebs), MangaDex is a popular website for reading Japanese manga. One of the key things I like about it is its huge catalog of manga titles compared to other sites. It’s a perfect use case for a REST API, so why not?

Anyway, I used the myhtml shard for scraping the HTML and Crystal’s built-in HTTP standard library for the requests. The codebase is messy right now, and I want to find ways to share methods between the endpoints. Soon I’ll be looking for contributions from you guys to make it easier to maintain.

Here’s an example for you guys:

require "m-dex"

mangadex = Mdex::Client.new

puts mangadex.manga(5) # Manga ID for Naruto

When compiled and run, it prints something like this:

{
  "cover_photo": "https://mangadex.org//images/manga/5.jpg?1526016755",
  "name": "Naruto",
  "id": 5,
  "alternate_names": [
    "NARUTO",
    "NARUTO -ナルト-",
    "นินจาคาถาโอ้โฮเฮะ",
    "ナルト",
    "火影忍者",
    "狐忍",
    "나루토"
  ],
  "author": "Kishimoto Masashi",
  "artist": "Kishimoto Masashi",
  "demographics": [
    "Shounen"
  ],
// .. }

Pretty neat and easy.

I’m also launching a public API to test this, so that you can use it in various applications. It’s just a Kemal web app that maps to the endpoints of M-dex. I need testers for this API so that I can find the limits and measure the performance of the Crystal app under heavy load.

Let me know what you think!

I’m assuming you’re scraping the site and storing the data in some database to serve? Or are you scraping the HTML on each request? On some of the larger data sets, like https://mangadex-api.herokuapp.com/title/999, it took almost 25 seconds to return the data. It might be worth looking into some sort of caching mechanism plus ETags to ease the load.
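
To make the ETag suggestion concrete, here is a minimal sketch in Crystal. It is only an illustration: `etag_for` and `respond` are hypothetical helper names, not part of M-dex or Kemal, and it assumes the JSON body for a title is already available as a string. The idea is to hash the body, send the hash as the `ETag` header, and answer with 304 and an empty body when the client's `If-None-Match` header carries the same value.

```crystal
require "digest/sha1"

# Hypothetical helper: derive a quoted ETag value from the response body.
def etag_for(body : String) : String
  %("#{Digest::SHA1.hexdigest(body)}")
end

# Hypothetical helper: returns {status, body}. When the client's
# If-None-Match value equals the current ETag, answer 304 with no body.
def respond(body : String, if_none_match : String?) : Tuple(Int32, String)
  if if_none_match == etag_for(body)
    {304, ""}
  else
    {200, body}
  end
end
```

Note that this only saves bandwidth on its own. To also skip the slow scrape, the body (or its ETag) would have to come from a cache rather than being rebuilt on each request.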

Also, you might want to implement some validation: https://mangadex-api.herokuapp.com/title/-1 causes things to bomb out and return CSS/JS.

Otherwise, it’s good to see some projects utilizing Crystal.

EDIT: Their actual API, by comparison, takes around 300ms for the same title (https://mangadex.org/api/manga/999).
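
A rough sketch of what that validation could look like, assuming the ID arrives as a raw string from the URL path. `parse_title_id` is a hypothetical name, not something from M-dex; it relies only on `String#to_i?` from the standard library.

```crystal
# Hypothetical helper: turn a path segment into a positive title ID.
# Returns nil for anything that is not a positive integer ("-1", "0", "abc").
def parse_title_id(raw : String) : Int32?
  id = raw.to_i?(whitespace: false)
  id if id && id > 0
end
```

A route handler could then return a 400 or 404 when this yields `nil` instead of passing the value on to the scraper.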

Hi, thanks for your feedback!

  1. For now, it scrapes the HTML on every request. I’ll look into integrating Redis for caching requests.
  2. It really does take a while for titles with larger data sets, since it scrapes the page and at the same time makes another request directly to the MangaDex API to fetch the chapter images. I’ll separate these.
  3. Whoa, didn’t notice that. Thanks for the tip!
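
As a sketch of the caching idea in point 1, here is a small in-memory cache with expiry. It is a stand-in for Redis, not a Redis client: `TTLCache` is a hypothetical class, and the block passed to `fetch` represents the expensive scrape. The `now` parameter exists only so that expiry can be exercised without sleeping.

```crystal
# In-memory cache with expiry; a stand-in for Redis in this sketch.
class TTLCache
  def initialize(@ttl : Time::Span)
    @entries = Hash(Int32, Tuple(String, Time)).new
  end

  # Returns the cached value for `id` if it is still fresh; otherwise
  # runs the block (the expensive scrape) and stores its result.
  def fetch(id : Int32, now : Time = Time.utc, &) : String
    if entry = @entries[id]?
      cached, stored_at = entry
      return cached if now - stored_at < @ttl
    end
    fresh = yield
    @entries[id] = {fresh, now}
    fresh
  end
end
```

Usage would look like `cache.fetch(999) { scrape_title(999) }`, where `scrape_title` is a placeholder for whatever produces the JSON. With Redis the structure stays the same, with the hash lookup replaced by a get and a set with an expiry.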

Cool, sounds good. Yeah, caching seems like a good idea here, since the data probably won’t change often, so you could go with a long cache period. You might also be able to leverage async queues to do the scraping, so that it happens periodically without affecting the load times of the data. However, that would make more sense if you were also storing the data yourself. It wouldn’t really help if you didn’t have a full copy of their data at all times, since kicking off an async task on an HTTP request that wants the data right away wouldn’t be useful.
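
A minimal sketch of that periodic approach, under the assumption that the data is stored locally. Everything here is hypothetical: `TitleStore`, `start_refresher`, and the placeholder string standing in for the real scraper. A background fiber refreshes the store on an interval, and request handlers only ever read from it.

```crystal
# Hypothetical local store; request handlers read from it and never scrape.
class TitleStore
  def initialize
    @pages = Hash(Int32, String).new
  end

  def get(id : Int32) : String?
    @pages[id]?
  end

  # Re-scrapes every given ID; the block stands in for the real scraper.
  def refresh(ids : Array(Int32), &)
    ids.each { |id| @pages[id] = yield id }
  end
end

# Starts a background fiber that refreshes the store on a fixed interval.
def start_refresher(store : TitleStore, ids : Array(Int32), every : Time::Span)
  spawn do
    loop do
      store.refresh(ids) { |id| "scraped page for #{id}" } # placeholder scraper
      sleep every
    end
  end
end
```

As noted above, this only pays off with a known set of IDs to keep fresh. For a title that is not in the store yet, the request still has to wait for a scrape.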