Testing Shards on the user's computer

Is it possible, or planned to be possible, to easily run the tests (specs) of all the shards a person installs via shards install?

Why would this be useful?

Shards can have tests that are executed by their authors. They might even set up a CI system (see Migrate CI in crystal-lang projects, GitHub Action to install Crystal and Shards: unified CI across Linux, macOS, and Windows, GitHub Actions).

However, even then the matrix of platform/architecture/Crystal version/other dependencies is very limited, and there is a good chance that the person installing a shard has some other mix of these on which the shard was never tested. In most cases the shards will work just fine, but IMHO it would be a good idea to make it easy to run the tests of all the shards that were installed via shards install, or at least those of the immediate dependencies.

In the Perl world, where I spent many years, the tests are packaged in the distribution and the cpan client (the tool to install 3rd-party packages) by default runs all the tests that come with the 3rd-party package.

This is awesome because it verifies that the code I install works in my environment. People can also configure their client to send the test results to a central database called CPAN Testers, which can be very useful both for the authors of these packages and for other users, who might be able to locate an older version of a package that worked on their platform, or just evaluate a package.

It also has the disadvantage of making the installation take a long time and of causing headaches for beginners who might encounter test failures during installation and not know what to do with them.

I think it could be very useful to have similar capabilities in Crystal, where you can opt in to run the tests and opt in to send the reports. It might need further (?) standardization of how to run the tests of a shard.

2 Likes

Pretty sure you could just cd into the directory and run crystal spec, given that the source and spec code are cloned. However, this might be more complicated if the shard has a dev dependency that’s required to run them.
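
For example (shard name hypothetical), assuming shards install has already cloned everything into lib:

cd lib/some_shard
shards install   # only needed if its specs rely on development_dependencies
crystal spec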

Either way, what would be the purpose of this?

I added my explanation in the original post to make it stand out more.

The challenge would be dealing with all the dependencies of the tests.
I have a bunch of specs that expect a fresh dockerised instance of Redis, an SNMP server, an SSH server, etc. to test against, with the test environment configured by the CI.
It would be pretty challenging to unify that into a single CI run.

But it might be useful as an option flag on shards to shell out to crystal spec in the folder after cloning and report on the results at the end of the clone.

Shards can specify a postinstall script within their shard.yml. I would be extremely hesitant to use this for running specs though. As the commenters above have noted, one of the main challenges is that in many cases specs may rely on development_dependencies, which should not be mandated for all users of a lib.
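
For reference, that hook is declared roughly like this in a shard.yml (the command here is made up); abusing it for specs would mean every downstream shards install runs them:

scripts:
  postinstall: crystal spec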

More importantly, if the goal of this is to build a central test-run DB, how do you fully classify the environment? Given that most users are likely running on mutable systems with a large number of moving parts, the signal-to-noise ratio is likely not good.

If you want strong guarantees that software will run in different environments, those environments need to be hermetic. The good news is, there are projects whose whole focus is this, and they already work with Crystal.

I would not want people to use something like postinstall. I imagine it as something like a new command, e.g. shards test-dependencies, that would do the work.

There are two goals. One is to allow the user to verify the dependencies work on their system.
The other one is the central DB. The former can work without the latter and would already give value.

I don’t know what you mean by “fully classify”. The system would collect certain key details of the current system (OS/architecture/Crystal version/versions of the dependencies/etc.) and then the central DB would allow users to filter the data based on any of the fields.

As far as I understand NixOS is orthogonal to what I am talking about. I would like to be able to verify that certain shards work in my environment.

I like the general idea. I’m not sure about the details like a testing database etc. But running a full “dependency check” sounds like it might be useful. I share the concerns that running tests in a standardized way might not be that easy. crystal spec often doesn’t cover it if the test suite has external dependencies such as services or libraries.
But I really think this should be made easy. Running specs on any shard should be a single command. That single command should take care to set up the dependencies in a way that the test suite can run.
So maybe this dependency testing use case could be a way to encourage that.

I think it shouldn’t be hard to get started on this with a prototype to see how it works, what kinds of issues come up, etc., and then consider what kinds of interfaces need to be dealt with, what the common practices regarding shard specs are, etc.

This can totally start out as a contrib command. A simple shell script would probably do. It just has to iterate the directories in lib and run crystal spec (or maybe see if there’s a make recipe for example).
Call it shards-test-dependencies and put it on your PATH. Then you can call it as shards test-dependencies.
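
A minimal sketch of such a contrib command, written in Crystal rather than shell (the name and all details are my own assumptions, not an existing tool):

# shards-test-dependencies.cr -- hypothetical contrib command, not part of shards.
# Iterates the shards checked out in ./lib and runs `crystal spec` in each one.
# Assumes `shards install` has already been run in the current project.
results = {} of String => Bool

Dir.children("lib").sort.each do |name|
  dir = File.join("lib", name)
  next unless File.directory?(dir)
  next unless Dir.exists?(File.join(dir, "spec")) # skip shards that ship no specs

  # A fuller version could run `shards install` inside dir first,
  # so the dependency's own development_dependencies are available.
  puts "==> Running specs for #{name}"
  status = Process.run("crystal", ["spec"], chdir: dir, output: STDOUT, error: STDERR)
  results[name] = status.success?
end

puts ""
puts "Summary:"
results.each do |name, ok|
  puts "  #{ok ? "PASS" : "FAIL"}  #{name}"
end

exit(1) unless results.values.all?

Build it with crystal build shards-test-dependencies.cr and drop the binary on your PATH; per the suggestion above, shards test-dependencies would then pick it up.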

That shouldn’t be an issue though. If there’s a tool to run specs in dependencies, it can easily make sure their dev dependencies are installed as well. (And I think we agree that wouldn’t be the default.)

1 Like

I probably missed a comment along the way, but would it suffice to maybe fork the repo and run the tests on the fork with the applicable dependencies?

I don’t understand how forking would be relevant here. Maybe cloning? But when we run shards install we already get a full copy of all the files in the git repository of the dependency. Recursively. So that’s not an issue.

I think Crystal could set a standard on how to prepare a shard for testing and how to run the tests, and each shard developer could decide what kinds of tests to include in that standard process.

Some of the tests can be environment-dependent. E.g. a generic ORM that supports several types of databases might be tested only on the databases that are installed in the current environment. The developer of the ORM will arrange for all the DBs to be available on the CI, but users of the ORM only care about the DBs they are using.

Some tests might be executed only if some environment variable is set (e.g. DEVELOPER_TESTING=1).
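
As a hypothetical illustration of that convention (file name and spec contents made up), a shard could guard its environment-dependent specs like this:

# spec/env_gated_spec.cr -- hypothetical example of the DEVELOPER_TESTING convention
require "spec"

# Plain `crystal spec` on a user's machine skips this block;
# the developer or the CI opts in explicitly.
if ENV["DEVELOPER_TESTING"]? == "1"
  describe "Postgres adapter" do
    it "connects to the database named in DATABASE_URL" do
      ENV["DATABASE_URL"]?.should_not be_nil
    end
  end
end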

3 Likes

Just my opinion: I have never had to run tests for a dependency in any language. Usually on the GitHub page there’s a CI status. And if the shard is published, surely that means the specs pass? And if something doesn’t work, wouldn’t you find out when running or compiling your app?

So I’m just curious about this use case.

2 Likes

So I’m just curious about this use case

Yeah, I’m not sure what @szabgab’s use case is either. The reason why I mentioned forking it was that if I am trying to use a library and am having issues, I might check whether the repo’s latest tests passed (assuming it has decent test coverage and the code is not too out of date), or I might fork it (or clone it) and try running the tests locally to make sure I can at least get them to pass; for example, to see if maybe I am missing some dependency that was not documented or that I overlooked when trying to use it in my app.

Two answers:

I just (OK, it was more than a year ago) had a case where the upgrade of one pytest plugin (pytest being the library for testing Python code) broke another plugin. The CI of both of them passed because neither tested together with the other plugin.

Shard X uses shard Y, and the CI of X used the latest version of Y, so it passed.
I use an older version of Y.
Or I use it on some operating system that was not in the mix of the CI of X.

When my code fails to work I usually think I made a mistake. Then, sometimes hours or days later (yes, I know I am slow), I find out that X does not even pass its own specs on my machine, either because of the old Y or because of an OS that was never used by the developer of X.

@drhuffman12 what I am saying is similar to what you would do, but in a standard and well-structured way that makes it easy: get the repo of the dependency and run its specs on my computer.

shards install already gets the repo, or at least the files from a specific version of the repo, so I was asking if there is a standard and easy way to run the specs for all the dependencies. (I know I can write a program to do that, I was asking if this already existed or was in the works.)

1 Like

@szabgab, I have had a case where a (nested?) dependency broke. (A Ruby gem’s release was deleted from the repo, and upgrading to the newer release had some issues and wasn’t easily automated by the existing CI.) So, I can imagine scenarios for concern about (nested) dependencies.

Some ideas:

(a) Version-lock your dependencies (via the version: ... setting) in your app’s shard.yml, e.g.:

development_dependencies:
  minitest:
    git: https://github.com/ysbaddaden/minitest.cr.git
    version: ~> 0.3.1

(b) For cases where it is not possible/desirable to version-lock, maybe set up a schedule in the CI config for your app (or whatever you are using to verify that your dependencies are still working)? Then you’ll have a repeating (e.g. weekly) eyeball on any dependency changes that break your library. (NOTE: Your CI might have limits on scheduled jobs. GitHub has some.)

If you are using Github Workflows, it might look like:

on:
  push:
  pull_request:
  schedule:
    - cron: '0 6 * * 1'  # Every monday 6 AM

… like I’m using here: smcr/ci.yml at master · drhuffman12/smcr · GitHub

… and copied/edited from somewhere. (I forget exactly.)

(c) For mission-critical dependencies, fork a copy and use your forked copy instead.
(d) I’m not sure, but I’m guessing you could automate some way to update a forked (or cloned) copy of a repo and re-run tests for it, maybe on a CI schedule.
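
A rough sketch of (d), assuming GitHub Actions and the crystal-lang/install-crystal action. This simpler variant just checks out and tests the upstream repo on a schedule rather than literally syncing a fork; the repository name is a placeholder:

on:
  schedule:
    - cron: '0 6 * * 1'   # weekly, as above

jobs:
  upstream-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: upstream-owner/the-shard
      - uses: crystal-lang/install-crystal@v1
      - run: shards install
      - run: crystal spec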

1 Like