Shards' `postinstall` considered harmful

I’ve never been a fan of shards’ postinstall hook, and I don’t think I have ever explicitly used it myself. So far, I have been mostly indifferent about it. Lately, however, I’ve realized that postinstall is probably more often harmful than not.

ameba is a popular development tool and many shards have it listed as a development dependency. When ameba gets installed, its postinstall script builds the ameba binary in bin/ameba. Every time I run shards install for a shard using ameba, ameba builds itself. But that’s only useful if you want to contribute to the shard. Most of the time, I don’t need it. And I certainly didn’t ask for it.
Why do I have to wait for ameba to build? Ameba shouldn’t build itself every time it’s installed.
I picked ameba because it’s popular and has stressed my patience many times now. But many other shards are very similar.

For demonstration, I have compiled a list of postinstall scripts used by shards that are indexed on shardbox.org:

Without going into details about individual shards, I’m sure that the postinstall scripts should be avoidable for many of them. For a development tool providing an executable, there are less intrusive options. It could install a script at bin/ which lazily builds the binary when requested. User experience would be mostly the same. Except now you don’t have to wait on shards install but the first time you run bin/ameba (and only if you do that).
In other cases, maybe it would be better to have an actual build system in the dependent shard that takes care of setting up dependencies. This requires a little more effort to configure up front, but it offers much more flexibility. The power to decide what is actually necessary lies with the dependent shard’s build system and what part of that the developer needs. You can use the dependency’s build recipes or use custom ones. You’re not locked into a postinstall script that runs without asking and might not do what you need.
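
The lazily building script mentioned above could look roughly like the following. This is only a minimal sketch, not ameba's actual tooling: the function name `lazy_run`, the paths, and the build command in the commented example are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper that could live at bin/<tool>.
# It builds the real binary only the first time it is requested.

lazy_run() {
  binary="$1"      # where the compiled tool is cached
  build_cmd="$2"   # command that produces that binary
  shift 2
  if [ ! -x "$binary" ]; then
    echo "Building $binary (first run only)..." >&2
    sh -c "$build_cmd" || return 1
  fi
  # Forward all remaining arguments to the real binary.
  "$binary" "$@"
}

# Example invocation with made-up paths (commented out so the sketch has no side effects):
# lazy_run lib/sometool/bin/sometool \
#   "crystal build lib/sometool/src/cli.cr -o lib/sometool/bin/sometool" "$@"
```

With something like this, `shards install` only has to put the small script in place; the compile cost is paid on the first real invocation, and not at all if the tool is never used.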

Don’t get me wrong: The general idea is certainly nice: If you install a shard as a dependency, a postinstall script can make sure it’s set up correctly for using the dependency. Shards are distributed as source, so they’re very portable. Binary libraries are not. I think there can be good use cases for postinstall scripts related to installing binary dependencies. But that’s not as trivial as it might seem. Authors ship build instructions that work for their development systems. But postinstall scripts need to be portable across platforms. There’s a lot of variability in what system environment you can expect for a postinstall script: shell, libraries, C toolchain, etc.

I don’t think I have to stress that this is really difficult to do right. And even if it works, it’s fragile. If you don’t run broad integration tests for this, the user with a non-expected system environment is the one who has to deal with it.
Sure, you can just pass --skip-postinstall to shards install. But that skips all postinstall scripts of all dependencies. Not just the one that’s broken. A broken postinstall script also affects others.
When I look at the examples I referenced above, I’m pretty sure many of them will easily break if applied on anything other than what the developers use as their development system.

I ask you to not use postinstall. Whatever you want it to do, I’m sure we can do better differently. I’m happy to discuss individual cases - but let’s not dive too deep into edge cases that become bikeshedding.

Whatever you want it to do, I’m sure we can do better differently.

What better option exists if the shard depends on C code? Either just a shim for the bindings, or a more complete bundling of a larger entity. There are unfortunately constructs made available by C header files that are not callable from Crystal, like macros or functions declared as static inline, so sometimes it isn’t really feasible to avoid shims.

Regarding installing binaries, it may be a good idea to have built-in support for that - I’m fairly certain any distro packagers that pick up a Crystal program would appreciate that.

What good option exists for that anyways? Building even a trivial C library in a portable way isn’t straightforward.

Yes, of course we need some way to build the libraries for shards that bind to C libraries.
But a postinstall script that’s automatically invoked after installation of the dependency just doesn’t seem like a very good idea for that. There’s no easy way to apply configuration.

I think it would be much better to initiate the dependency’s build steps from the dependent’s build system. That requires an extra small step to pull that in instead of having it pushed. But you can actually do what you need and not just have to take what you get.

For example, my shard sass.cr defines a dep target in its makefile which builds libsass. If I want to use that shard as a dependency, I’d add a build step to the dependent shard that runs make -C lib/sass.cr dep.
Maybe I don’t even need a custom library built at all. I can just install the library’s binary distribution package. I do that on deployment, for example.
I have no use for getting a library built automatically if I’m going to link to a different one anyways.
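
As a rough sketch of such a pulled-in build step, the dependent shard's own build script might contain something like this. The `make -C lib/sass.cr dep` call is the one described above; the `USE_SYSTEM_LIBSASS` switch and the overridable `MAKE` variable are hypothetical knobs added for illustration and are not part of sass.cr or shards.

```shell
#!/bin/sh
# Hypothetical build step in a shard that depends on sass.cr.
MAKE="${MAKE:-make}"

build_libsass() {
  # Assumed opt-out: skip the bundled build when linking a distribution package.
  if [ "${USE_SYSTEM_LIBSASS:-0}" = "1" ]; then
    echo "using the system libsass, skipping the bundled build"
    return 0
  fi
  # Delegate to the dependency's own makefile target.
  $MAKE -C lib/sass.cr dep
}
```

The point of the design is that the decision sits with the dependent shard: on a development machine it calls the dependency's recipe, on deployment it can skip it entirely.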

I agree that postinstall feels like a sledgehammer for this tiny nail. The idea of the lazy build script sounds like a great solution to some of the issues I had getting binary builds working for my grpc shard, which depends on protobuf, so I had two levels of dependencies whose binaries I had to get into the dependent application. Combined with a bug in shards, this was rough. I think the lazy-build-script idea would’ve worked around it.

Do you mean that an application depending on a shard that has a development dependency on ameba installs and builds the ameba binary?

Not sure I understand the question. Every time ameba gets installed as a dependency, it builds itself.

And as previously mentioned, some dependencies might not be available on other platforms, or might be called something else or require different steps per platform. Case in point: make for Windows; see e.g. the Stack Overflow question “How to install and use "make" in Windows?”.

I mean that this part here is ambiguous:

The wording implies you’re inside the shard’s directory (you’re running shards install in it), but it also says you’re not contributing to the shard, and I can’t think of another reason you’d be running shards install for that shard.

If you’re not in that shard’s directory, it sounds like if my application has a runtime dependency on a shard that has a development dependency on ameba, ameba will be installed in my app’s lib directory.

Just trying to get clarity on wording and shards functionality.

What good option exists for that anyways?

I’m not saying there are any good solutions, but there are different levels of badness. My main focus here is developer UX, and as far as I’m concerned, a shard where it doesn’t suffice to do shards install && shards build to use it is broken. The lone exception to that are shards depending on system libraries as setting up dependencies to those is an unsolved problem in the whole industry.

I see that you totally missed the point about shims - those

a: are project specific and need to be built.
b: must be rebuilt after each update.

Point b is especially nefarious as failing to rebuild it will only fail the build if you are lucky - if you are unlucky it will build but not contain bugfixes. This means that having people manually recompile the shims will also break shards update && shards build and totally lose reproducible builds in the process.

This will not scale well, as it will mean that all developers will have to run these custom build commands any time any shard has updated. As required build commands may also change from version to version, forcing this onto the developers becomes an unmaintainable mess and will make shards update a lot messier than it should be. Upgrading what libraries a big project depends on is already hard enough in big projects.

That said, I’m not certain these would be built by postinstall optimally - I’d prefer to have these triggered by shards build or perhaps a new shards builddeps to invoke any building steps, as that would simplify fixing errors in the building process.

TLDR: Please don’t kill developer experience in the name of portability.

BTW, regarding system libraries, one thing that would help there is if there was a field in the shards spec about what libs need to be available when linking. Then it would be possible to give the user an informative error about what shard depends on what external dependency, instead of showing a linker error in the face of the user.

I have many reasons for installing a shard without the intention to actively contribute to it.
For a shard that provides an application, a very typical use case would be just building it.
Sometimes I just want to checkout what it does, how it works. Take a look at the code. See if the specs pass. Run benchmarks. Or try experimenting with it.
None of these use cases involves ameba.

Why would it be broken? Many shards require more than shards build. This can really only work perfectly fine if the only component necessary for the project’s build workflow is a working Crystal compiler.

As soon as you have any other dependencies, such as a C library, you already need more. It might not be obvious from the shell command, but when you run shards install and a postinstall script runs to set things up, you most likely have lots of hidden dependencies required for running that. The main problem with that is that it happens automatically in the background and it’s hard to influence it if you need some specific configuration.

As far as developer experience goes, I don’t think it makes much difference if you run shards install && make build instead of shards install && shards build. Or rather just make build because a build system can perfectly take care of installing dependencies.

There is a libraries property in the shard.yml specification for exactly this purpose. It’s purely informational so far, but that could be improved. I’m not sure it could work great though, because library names are also system-specific sometimes.
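
To illustrate what a more informative error could look like, here is a small sketch. It does not read shard.yml; the shard name and library names are passed in by hand, and the function name `check_libraries` and the overridable `LIB_CHECK` probe are assumptions made for this example. `pkg-config --exists` is only one possible probe, and pkg-config module names do not always match library names, which is the system-specific naming problem mentioned above.

```shell
#!/bin/sh
# Hypothetical pre-link check: report which shard needs which missing library.
LIB_CHECK="${LIB_CHECK:-pkg-config --exists}"

check_libraries() {
  shard="$1"
  shift
  missing=0
  for lib in "$@"; do
    # Probe each library and name the shard that requires it on failure.
    if ! $LIB_CHECK "$lib" >/dev/null 2>&1; then
      echo "shard '$shard' needs library '$lib', which was not found" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example with made-up names (commented out so the sketch has no side effects):
# check_libraries sass.cr libsass
```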

Funny thing is I just fixed a bug in Lucky over the weekend where just building a new Lucky app always installed Ameba. This was indeed related to the fact that the postinstall script called shards build from the app dependency’s directory. The fix for us was to change it to shards build --without-development.

With that said, the reason we even use the postinstall script is because Lucky comes with a lot of smaller component shards, and many of them provide their own CLI tasks. When you run shards install for your app, these each get compiled and thrown in to your app’s bin/ so when you run lucky some.task.name, it doesn’t have to compile anything extra.

I’m not sure how we could achieve the same flow without postinstall, so my guess is we’d end up having to tell people to run an extra step just to get Lucky running locally. :man_shrugging:

These developer tool tasks are similar to what I wrote about ameba. Sometimes you might need them, and then it’s nice when they’re already available. But why do I have to wait for them to build with shards install when I don’t even want to use any of them? Or maybe I need just one task, not the other dozen (whatever the number of individually built binaries).

Since you already have a single entry point to delegate to these task binaries, it should be pretty easy to have them built on demand. And then cached, of course, so you don’t have to compile a task again.
That way you only have wait time when you actually use a task. It might have a stronger impact on perception when executing a simple task lucky some.task.name takes a bit longer (on the first run), compared to shards install which already takes a noticeable amount of time anyways.
To mitigate that, there could of course be an option to build all tasks at once. So that’s similar to postinstall taking care of that, but it’s opt-in. Not imposed on you at shards install.
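
A sketch of how such an entry point might build tasks on demand, cache them, and still offer an opt-in "build everything" step. This is not how Lucky works today; the directory layout, the variable names, and the functions `run_task` and `build_all_tasks` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical task dispatcher with on-demand builds and a cache directory.
TASK_SRC_DIR="${TASK_SRC_DIR:-tasks}"         # assumed location of task sources
TASK_CACHE_DIR="${TASK_CACHE_DIR:-bin/tasks}" # assumed location of built tasks
TASK_COMPILER="${TASK_COMPILER:-crystal build}"

build_task() {
  mkdir -p "$TASK_CACHE_DIR"
  $TASK_COMPILER "$TASK_SRC_DIR/$1.cr" -o "$TASK_CACHE_DIR/$1"
}

run_task() {
  name="$1"
  shift
  # Compile only if no cached binary exists yet.
  [ -x "$TASK_CACHE_DIR/$name" ] || build_task "$name" || return 1
  "$TASK_CACHE_DIR/$name" "$@"
}

# Opt-in equivalent of what postinstall does today: build every task up front.
build_all_tasks() {
  for src in "$TASK_SRC_DIR"/*.cr; do
    [ -e "$src" ] || continue
    build_task "$(basename "$src" .cr)" || return 1
  done
}
```

Only the first `run_task some.task.name` pays the compile cost; anyone who prefers the current behavior can call `build_all_tasks` once after installing.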

I’ll admit my first thought was maybe it could just spew out postinstall instructions instead of running a script. You know, instructions to…manually run a script?

Second thought, if we’re brainstorming…talk to the ameba people and get them to convert to an on-demand script?

I think as an industry we lack a common cross-language build tool with an easy syntax. Make works well, and it is what I use, but it is not easy to understand. As soon as you try to do something beyond the basics a Makefile becomes incomprehensible.

Then build better shards that don’t need special configuration. But yes, I agree that I’d prefer for things to be built during a build step.

However, I couldn’t disagree more about this claim, because the difference between something that needs to be maintained by the users of a shard and something that is automated that they don’t need to care to update manually as shards are updated really cannot be overstated. It really is something that needs to be minimized if we want updates to be painless.

That said, having a shards prebuild or something that would trigger (for example) make commands might give the best of both worlds, as having it as a separate step allows anyone who needs special configuration to apply that while the rest of us could have a maintenance-free solution.
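
No such shards command exists at this point, but what a separate step like that would do can be approximated in a few lines. In this sketch the function name `build_deps`, the convention that a dependency ships a `Makefile` with a usable default target, and the overridable `MAKE` variable are all assumptions.

```shell
#!/bin/sh
# Hypothetical stand-in for a separate "build dependencies" step.
MAKE="${MAKE:-make}"

build_deps() {
  lib_dir="${1:-lib}"
  for dir in "$lib_dir"/*/; do
    # Only dependencies that ship a Makefile take part.
    [ -f "${dir}Makefile" ] || continue
    echo "building ${dir%/}" >&2
    $MAKE -C "${dir%/}" || return 1
  done
}
```

Running it is an explicit choice, so anyone who needs special configuration can call individual dependencies' recipes instead, while everyone else gets a single command.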

Hmm, I’ve seen many cases of packagers putting the libraries in differently named packages, but do they also name the linked .so differently? I haven’t seen that, I think, not counting versioning. That would introduce challenges for people linking stuff. My suggestion was about naming the actual library and not the package names as those tend to be more static.

Yeah, that’s a valid point. However, I think it’s relevant to stress that “Users of a shard” actually means developers of a dependent shard. The end users shouldn’t have to deal with that.
Developers have to establish build instructions for their dependencies. Assuming there is a simple, single-invocation step for that (which is a strict necessity for automated postinstall), it should be as simple as make -C lib/foo. Control of the build step is delegated to the dependency. That interface is minimal and shouldn’t require much maintenance.
The interface obviously grows, when you need more build configuration. But the point is that you even can do this if necessary. With automated postinstall, there’s no simple way.

The file extension .so is already system-specific :grin: It probably doesn’t matter much because the extension would be consistent on the respective platform.
But yes, apparently even the names of libraries occasionally differ between systems:

Well, as dependent shards in this context means everyone that develops in Crystal and makes use of shards, I fail to see the relevance. I still think it would be a nightmare for any app of decent size (think medium-size Rails app and up).

No, since it is still up to the user to invoke it at all, or to make sure local automation does it. And then it subsequently fails (or does something that is not wanted in this context, like running specs) when the package refactors its makefile (or adds one, for that matter).

No. It is small, not minimal. A minimal interface is no interface at all where all steps are embedded in the shards file and invoked from shards.

I do agree that postinstall is not the right time for it to be executed, but currently there is no better option. Adding a shards command to invoke it would still allow configuration in the places it is needed while keeping it simple for everyone else.

Yikes. I didn’t even know annotations could be system specific like that :slight_smile:

Not everyone who makes use of shards is a developer. Even if you just want to build the shard, you need to install dependencies. I already mentioned a number of use cases for checking out a shard and running shards install that are not about active development.

That would make shards a build system. I’m strongly convinced this is a very bad idea. We don’t need shards to be a build system. There are others available to do that job. None of them is really universally great, as @chillfox mentioned. But there’s no point in trying to do a better job with shards. It’s a dependency manager.