Next Level Documentation

Over the last few months, I’ve been working on a vision for how an Athena application should be configured. I wanted something that could be compile-time type safe, super flexible, and self-documenting. I’ll spare the details in this post (you can read through the linked issue if you’re so inclined) and instead focus on the last part: sharing my experience (and some tricks) of how I managed to make it (mostly) self-documenting.

Background

Crystal code is in some regards self-documenting in and of itself. For example, the generated API docs can point out methods, their parameters, each parameter’s type and default value (if any), and what type each method returns. This kind of context is invaluable when reading the documentation, as it really helps in figuring out how everything fits together. It can be further enhanced by using documentation comments to give extra context, information, or examples for the methods themselves.

I had some requirements when going into this project in regards to its documentability:

  1. Syntax needs to be as close to native Crystal as possible
    1. Keeps things maintainable by not having to deal with some crazy custom DSL that is hard to document
    2. Make use of the built-in document generator
  2. Values need to be available at compile time
    1. Compile time errors if a value of the wrong type is provided, a required value isn’t provided, etc.

In the end I decided to use a module as a container representing the schema of the configuration values. This makes it native Crystal code that will show up in the API docs. Each module will include a special Schema module that acts as an interface to denote a module being a schema type, and includes some utilities for defining the schema properties. To ensure the values are available at compile time (and so things look familiar), I re-purposed the property macro that accepts a TypeDeclaration (just like the real property macro), which is a perfect representation of what I want: a name, type, and optional default value.

For example, a schema representing the configuration options for the Format Listener (with most doc comments removed for brevity):

module ...
  module FormatListener
    include ADI::Extension::Schema

    # If the listener should be enabled.
    property enabled : Bool = false

    # ... Docs for rules ...
    property rules : Array({path: Regex?, stop: Bool?, prefer_extension: Bool}) = [] of NoReturn
  end
end

Where the user would be able to apply their configuration via:

ADI.configure({
  framework: {
    format_listener: {
      enabled: true,
      rules:   [
        {path: /^\/admin/, prefer_extension: false},
        ...
      ],
    },
  },
})
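As a rough illustration of the "name, type, optional default" idea, here is a small Python sketch. It is not how Athena does it: Athena performs these checks at compile time via Crystal macros, whereas this sketch checks at runtime, and the names `Property`, `resolve`, and `FORMAT_LISTENER` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any

_REQUIRED = object()  # sentinel meaning "this property has no default"


@dataclass(frozen=True)
class Property:
    """One schema entry: a name, a type, and an optional default."""
    name: str
    type: type
    default: Any = _REQUIRED


def resolve(schema: list[Property], values: dict[str, Any]) -> dict[str, Any]:
    """Merge user values with defaults, rejecting unknown or mistyped values."""
    unknown = set(values) - {prop.name for prop in schema}
    if unknown:
        raise KeyError(f"unknown properties: {sorted(unknown)}")

    resolved = {}
    for prop in schema:
        if prop.name in values:
            value = values[prop.name]
        elif prop.default is not _REQUIRED:
            value = prop.default
        else:
            raise KeyError(f"missing required property: {prop.name}")

        if not isinstance(value, prop.type):
            raise TypeError(f"{prop.name} must be {prop.type.__name__}")
        resolved[prop.name] = value
    return resolved


# Hypothetical runtime mirror of the FormatListener schema shown earlier.
FORMAT_LISTENER = [
    Property("enabled", bool, False),
    Property("rules", list, []),
]

config = resolve(FORMAT_LISTENER, {"enabled": True})
```

Here `config` ends up with the user’s `enabled` value plus the default for `rules`, while something like `{"enabled": "yes"}` raises a `TypeError`.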

Now that I had come up with how I wanted the API to work, next came the challenge of actually implementing it.

Version One

Right off the bat there was a problem. The Crystal doc generator does not expose instance variables themselves anywhere. As such I could not simply generate @enabled : Bool = false and have it show up how I wanted. My workaround to this was to have the custom property macro generate an abstract method to have it show up in the docs, then automatically add a comment with the default value, which I think worked well enough.

However, this led to another problem. By having the property macro add a comment to the method itself, it essentially discarded any extra docs the user may have added to the property’s definition within the module (related issue). Fortunately, this ended up being a good use case for some previously existing tickets, and soon after the macro @caller variable and ASTNode#doc_comment were born. It was now possible to do something like this:

macro gen_method(name)
 # {{ @caller.first.doc_comment }}
 #
 # Comment added via macro expansion.
 def {{name.id}}
 end
end

# Comment on macro call.
gen_method foo

And have it generate:

Comment on macro call.

Comment added via macro expansion.

Perfect! Now the user can add their comments and have them be merged together with the auto default value comment! However, I think we can do better…

Version Two

The main thing I didn’t like about this pure native approach is the NamedTuple aspect of it. It is very close to what I want, but I realized that there’s no really good way to self-document default values for each rule object, e.g. that stop defaults to false and isn’t actually nilable. Sure, you could put this information into a doc comment on rules itself, but that feels less than ideal.

Because the schemas themselves are not something the average user is going to be working with, I decided straying away from the pure native Crystal approach was okay. I.e. having some custom macros to make the end result better for the end user was worth the extra complexity. As a result, the array_of macro was introduced that allowed defining rules as:

array_of rules,
  path : Regex? = nil,
  stop : Bool = false,
  prefer_extension : Bool = true

Now, similar to the record macro, you can define the name of the array property, and list out the fields that make up that object as TypeDeclarations. Awesome! As usual, fixing one problem led to another: how to generate documentation for it now that the fields are split out? This is where things get interesting…

You may have noticed the version one screenshot looks different than what you may be used to. This is because Athena makes use of mkdocstrings-crystal instead of the default built-in API doc generator. The gist of it is that it uses the JSON output of crystal doc to generate the HTML for the site in a custom way, which also allows more room for customization/integration, including introducing custom templates.

Custom Schema Template

At this point, the schema module is less like an actual Crystal module, and more like a container of configuration that just happens to be a module for implementation reasons. As such, I thought about making it NOT use the default type.html template, but instead a custom schema.html one. This ended up being pretty straightforward via the templates feature of mkdocstrings. I created a templates directory, updated the custom_templates config to use this directory, then created templates/crystal/material/schema.html. Great! Unfortunately there wasn’t a great way to introduce the conditional aspect of it, so I essentially just copied the default type.html as well and edited it to do what I wanted, which ended up looking like:

<div class="doc doc-object doc-type {{ obj.kind }}">
{% if "Athena::DependencyInjection::Extension::Schema" in obj.included_modules  %}
  <div class="doc doc-contents {% if root %}first{% endif %}">
    {% if obj.doc %}{{ obj.doc |convert_markdown_ctx(obj, heading_level, obj.abs_id) }}{% endif %}

    <h2>Configuration Properties</h2>

    {% include "schema.html" with context %}
  </div>
{% else %}
  ...
{% endif %}
</div>

This renders schema.html if the current object includes the Schema module, while still rendering the docs on the module itself. Otherwise it’ll be rendered as a normal Crystal type.

As the old saying goes, “two steps forward, one step back”: we now have a lot more freedom, but need an entirely new way to know what configuration properties we have in order to render them, as well as a better way to display default values.

Getting Creative

Thinking outside the box can often lead to good workable solutions. In this case I needed a way to expose data to the doc generator in a structured way such that I could use it to build out the page. And what’s everyone’s favorite structured textual format? JSON! Because constants’ values are exposed in the API docs, I figured I should be able to have an array constant consisting of JSON strings that could be parsed in the Jinja2 template. In the end this looked something like:

module Athena::DependencyInjection::Extension::Schema
  macro included
    # ...

    CONFIG_DOCS = [] of Nil
  end

  macro property(declaration)
    {%
      # ...

      CONFIG_DOCS << %({"name":"#{declaration.var.id}","type":"`#{declaration.type.id}`","default":"`#{default.id}`"}).id
    %}
    
    # ...
  end
end

Each property call pushes a valid JSON object that includes the name, type, and default of the property into the array, which would then be exposed in the JSON crystal doc output, which would be available in the template to use as needed.
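To make the shape of those entries concrete, here is a small Python sketch that builds the same kind of JSON object for a couple of properties. The helper name `config_doc_entry` is hypothetical and only mimics the string template from the macro above; the real entries are produced by the Crystal macro, not by Python.

```python
import json


def config_doc_entry(name: str, type_: str, default: str) -> str:
    """Serialize one property in the same shape as the macro's template."""
    entry = {
        "name": name,
        # Backticks make the value render as inline code once it is
        # later passed through a markdown conversion.
        "type": f"`{type_}`",
        "default": f"`{default}`",
    }
    # Compact separators match the hand-written template (no spaces).
    return json.dumps(entry, separators=(",", ":"))


entries = [
    config_doc_entry("enabled", "Bool", "false"),
    config_doc_entry(
        "rules",
        "Array({path: Regex?, stop: Bool?, prefer_extension: Bool})",
        "[] of NoReturn",
    ),
]

for entry in entries:
    print(entry)
```

One thing the sketch gets for free is escaping: `json.dumps` takes care of any quotes or backslashes in the values, which a hand-built string has to handle itself.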

Parsing the JSON

Parsing the JSON ended up being a bit complex due to the lack of a built-in from_json Jinja2 function/filter. But after some sleuthing around I figured out a way to add one. The Athena doc site makes use of mkdocs-gen-files to generate the API doc structure for each unique type, which is driven by a Python script. I figured out I could add the custom filter at this stage and have it be available by the time mkdocstrings-crystal renders the templates as mkdocs builds the site.

import json
from typing import Any

import markdown as md
import mkdocs_gen_files

handler = mkdocs_gen_files.config['plugins']['mkdocstrings'].get_handler('crystal')

# get the `update_env` method of the handler
update_env = handler.update_env

# override the `update_env` method of the handler
def patched_update_env(markdown: md.Markdown, config: dict[str, Any]) -> None:
    update_env(markdown, config)

    def from_json(data):
        return json.loads(data.removesuffix('of Nil'))

    # patch the filter
    handler.env.filters["from_json"] = from_json

# patch the method
handler.update_env = patched_update_env

And tada! We now have a nice and new from_json filter. Also note it removes the trailing of Nil from the input string. This is required due to the CONFIG_DOCS being defined with of Nil as part of its default empty array value.
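The wrap-and-delegate pattern can be tried out without a full mkdocs build. The sketch below uses a hypothetical `FakeHandler` stand-in (not a real mkdocstrings class) with a plain dict in place of the Jinja2 filter registry, and the sample `raw` string is an assumption about what the constant’s value roughly looks like in the crystal doc output.

```python
import json
from types import SimpleNamespace


class FakeHandler:
    """Hypothetical stand-in for the mkdocstrings handler; not a real API."""

    def __init__(self):
        # Stand-in for the Jinja2 environment's filter registry.
        self.env = SimpleNamespace(filters={})

    def update_env(self, markdown, config):
        # A real handler would register its own filters here.
        self.env.filters["builtin_filter"] = lambda value: value


def from_json(data: str):
    # Drop the trailing `of Nil` left over from the Crystal array literal.
    return json.loads(data.removesuffix("of Nil"))


handler = FakeHandler()
original_update_env = handler.update_env  # keep a reference to the bound method


def patched_update_env(markdown, config):
    # Let the original set up the environment, then add the extra filter.
    original_update_env(markdown, config)
    handler.env.filters["from_json"] = from_json


handler.update_env = patched_update_env
handler.update_env(None, {})

# Assumed shape of the constant's value as exposed in the doc JSON.
raw = '[{"name":"enabled","type":"`Bool`","default":"`false`"}] of Nil'
params = handler.env.filters["from_json"](raw)
```

Calling the original first matters: the filter is registered on an environment that already exists, and the handler’s own filters remain untouched.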

Rendering the Data

After a few iterations of the schema.html template, I came up with a decent first pass:

image

Template Code
{% for param in (obj.constants['CONFIG_DOCS'].value|from_json) %}
  <div class="doc doc-object doc-method doc-doc-instance_method">
    <p>{{ param['name'] | code_highlight(title='', language="crystal", inline=True) }}</p>
    <div class="schema-type">
      <strong>type: </strong>{{ param['type'] }}
    </div>

    <div class="schema-default">
      <strong>default: </strong>{{ param['default'] }}
    </div>
  </div>
  {% if not loop.last %}<hr>{% endif %}
{% endfor %}

Not bad, not bad. But still not great. The type and default should be code blocks, and we’re missing the doc comments from the schema itself.

Skipping ahead a few versions, I ended up with this as something I was happy with:

Template Code
{% for param in (obj.constants['CONFIG_DOCS'].value|from_json) %}
  {% set obj = obj.instance_methods.__getitem__(param['name']) %}
  <div class="doc doc-object doc-method doc-doc-instance_method">
    {% filter heading(heading_level+2, id=obj.abs_id, class="doc schema-heading", toc_label=obj.short_name) -%}
      {{ param['name'] | code_highlight(title='', language="crystal", inline=True) }}
    {%- endfilter %}
    <div class="schema-type">
      <strong>type: </strong>{{ param['type'] |convert_markdown_ctx(obj, heading_level+2, obj.abs_id) }}
    </div>

    <div class="schema-default">
      <strong>default: </strong>{{ param['default'] |convert_markdown_ctx(obj, heading_level+2, obj.abs_id) }}
    </div>

    <div class="doc doc-contents {% if root %}first{% endif %}">
      {% if obj.doc %}{{ obj.doc | convert_markdown_ctx(obj, heading_level, obj.abs_id) }}{% endif %}
    </div>

    {% if 'members' in param %}
      <div class="schema-members">
        <p>This property consists of an object with the following properties:

        <blockquote style="color: inherit;">
          {% for member in param['members'] %}
            {% filter heading(heading_level+3, id="%s.%s" % (obj.abs_id, member['name']), class="doc schema-heading", toc_label="%s.%s" % (obj.short_name, member['name'])) -%}
              {{ member['name'] | code_highlight(title='', language="crystal", inline=True) }}
            {%- endfilter %}

            <div class="schema-type">
              <strong>type: </strong>{{ member['type'] |convert_markdown_ctx(obj, heading_level+4, obj.abs_id) }}
            </div>

            <div class="schema-default">
              <strong>default: </strong>{{ member['default'] |convert_markdown_ctx(obj, heading_level+4, obj.abs_id) }}
            </div>
            {% if not loop.last %}<hr>{% endif %}
          {% endfor %}
        </blockquote>
      </div>
    {% endif %}
  </div>
  {% if not loop.last %}<hr>{% endif %}
{% endfor %}
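The template above reads an optional members list from each entry and derives a nested heading id per member. As a sketch of the data it expects, here is a hypothetical entry for the rules property and the ids that the "%s.%s" formatting would produce. The type/default strings and the `FormatListener#rules` id are placeholders rather than real output, and `heading_ids` is a hypothetical helper.

```python
import json

# Hypothetical CONFIG_DOCS value for an array_of property. The keys mirror
# what the template reads: name, type, default, and an optional members list.
raw = """[
  {"name": "rules", "type": "`Array(T)`", "default": "`[]`", "members": [
    {"name": "path", "type": "`Regex?`", "default": "`nil`"},
    {"name": "stop", "type": "`Bool`", "default": "`false`"},
    {"name": "prefer_extension", "type": "`Bool`", "default": "`true`"}
  ]}
]"""


def heading_ids(abs_id: str, param: dict) -> list[str]:
    """Return the anchor id of the property followed by one id per member."""
    member_ids = [
        "%s.%s" % (abs_id, member["name"])
        for member in param.get("members", [])
    ]
    return [abs_id] + member_ids


params = json.loads(raw)
ids = heading_ids("FormatListener#rules", params[0])
```

Properties without a members key simply yield a single id, which matches the `'members' in param` guard in the template.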

The abstract methods I was generating back in version one are still around and hold onto the location/comments from the schema module. As such, I am able to look them up and pull the docs off of them without actually rendering the methods themselves. I also use the method as the context within convert_markdown_ctx, which makes it so that if a configuration property defaults to some constant, it renders as a link to that constant’s definition. For example:

The name/type/default of the members of an array_of property are rendered in a blockquote to differentiate them, and also show up nested in the nav, making it possible to link/navigate directly to them. This is definitely a pretty good improvement! But I was still not satisfied; there was still no good way to include contextual documentation for each member outside of putting it directly on the rules property, basically like you had to do for the defaults before version two… I think we can do better.

Version Three

Exposing the documentation for each member proved to be quite tricky. Unfortunately docs are not attached to TypeDeclaration nodes, so you can’t just do something like:

array_of rules,
  # Use this rules configuration if the request's path matches the regex.
  path : Regex? = nil,

  # ...
  stop : Bool = false,

  # ...
  prefer_extension : Bool = true

I’m definitely open to better ideas, but the only way I could think of was putting them directly on the rules in some structured way that I could extract it to include within the JSON object for each member.

Getting Creative Again

In the end I came up with this:

# ... Docs for rules ...
#
# ---
# >>path: Use this rules configuration if the request's path matches the regex.
# >>stop: If `true`, disables the format listener for this and any following rules.
# Can be used as a way to enable the listener on a subset of routes within the application.
# >>prefer_extension: Determines if the `accept` header, or route path `_format` parameter takes precedence.
# For example, say there is a route defined as `/foo.{_format}`. When `false`, the format from `_format` placeholder is checked last against the defined `priorities`.
# Whereas if `true`, it would be checked first.
# ---
array_of rules,
  path : Regex? = nil,
  stop : Bool = false,
  prefer_extension : Bool = true

Where:

  • --- denotes the start/end of the block
  • >> denotes start of the name of a parameter, with : denoting end of the parameter name

I was then able to use the following code to parse the docs associated with the array_of call, both constructing a mapping of member names to their docs and removing the entire block from the doc comment generated as part of the rules property.

doc_string = ""
member_doc_map = {} of Nil => Nil
in_member_docblock = false
current_member = nil

@caller.first.doc.lines.each_with_index do |line, idx|
  # A second --- denotes the member docblock end.
  # This has to be checked first, otherwise the start branch below would always win.
  if "---" == line && in_member_docblock
    in_member_docblock = false
    current_member = nil

    # The first --- denotes the member docblock start
  elsif "---" == line
    in_member_docblock = true

    # >> denotes start of property docs
  elsif in_member_docblock && line.starts_with?(">>")
    current_member, docs = line[2..].split(':')

    member_doc_map[current_member] = "#{docs.id}\\n"
  elsif current_member
    member_doc_map[current_member] += "#{line.id}\\n"
  else
    # The line where the docs are added in already has a `#`, so no need to add another
    doc_string += "#{idx == 0 ? "".id : "\# ".id}#{line.id}\n"
  end
end
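To make the control flow easier to follow outside of the macro language, here is a rough Python port of the same idea. It is only a sketch: `split_member_docs` is a hypothetical name, it assumes the doc string arrives with the leading # markers already stripped, and it treats --- as a simple toggle and splits on the first colon only, which differs slightly from the macro code above.

```python
def split_member_docs(doc: str) -> tuple[str, dict[str, str]]:
    """Split a doc comment into (property docs, {member name: member docs})."""
    prop_lines: list[str] = []
    member_docs: dict[str, str] = {}
    in_block = False
    current = None

    for line in doc.splitlines():
        if line == "---":
            # The same marker opens and closes the block, so just toggle.
            in_block = not in_block
            current = None
        elif in_block and line.startswith(">>"):
            # `>>name: text` starts the docs for a new member.
            current, _, text = line[2:].partition(":")
            member_docs[current] = text.strip() + "\n"
        elif in_block and current is not None:
            # Continuation line for the member currently being read.
            member_docs[current] += line + "\n"
        else:
            # Anything outside the block stays on the property itself.
            prop_lines.append(line)

    return "\n".join(prop_lines), member_docs


doc = """... Docs for rules ...

---
>>path: Use this rules configuration if the request's path matches the regex.
>>stop: If `true`, disables the format listener for this and any following rules.
Can be used as a way to enable the listener on a subset of routes within the application.
---"""

prop_doc, members = split_member_docs(doc)
```

After this runs, `prop_doc` holds only the text before the block, and `members` maps path and stop to their (possibly multi-line) docs.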

From there, I looked up the docs for a related member when iterating over them to build out their JSON object. All that was left was adding in a:

<div class="doc doc-contents {% if root %}first{% endif %}">
  {{ member['doc'] | convert_markdown_ctx(obj, heading_level+4, obj.abs_id) }}
</div>

after the default div, and tada! The final result:

Pretty darn cool if I do say so myself!

Conclusion

Overall I’m really happy with the end result. It’s a lovely DX knowing I can just write the bundle schemas, document them as I do any other Crystal type/method, and have the user-facing documentation for them be handled for me, especially in a more context-specific way. I’m sure there will be more tweaks/improvements by the time this is released, so stay tuned for future posts to see the end result!

As usual feel free to join me in the Athena Discord server if you have any suggestions, questions, or ideas!
