Segmentation fault when running crystal build

Hi,

I’m one of the maintainers of a GitHub repository linked to a YouTube video series that compares the performance of programming languages using prime sieve implementations.
The repo has included a Crystal solution for quite some time, but for about a day now the docker-based GitHub CI builds for the Crystal solution have been failing consistently. When I run a docker build on the solution locally, I get the following output:

 => [internal] load build definition from Dockerfile                                                               0.1s
 => => transferring dockerfile: 399B                                                                               0.0s
 => [internal] load .dockerignore                                                                                  0.1s
 => => transferring context: 2B                                                                                    0.0s
 => [internal] load metadata for docker.io/library/alpine:3.13                                                     5.0s
 => [internal] load build context                                                                                  0.1s
 => => transferring context: 1.96kB                                                                                0.0s
 => CACHED [build 1/5] FROM docker.io/library/alpine:3.13@sha256:f51ff2d96627690d62fee79e6eecd9fa87429a38142b5df8  0.0s
 => => resolve docker.io/library/alpine:3.13@sha256:f51ff2d96627690d62fee79e6eecd9fa87429a38142b5df8a3bfbb26061df  0.0s
 => [build 2/5] RUN sed -i -e 's/v[[:digit:]]\..*\//edge\//g' /etc/apk/repositories     && apk add --no-cache bu  14.9s
 => [build 3/5] WORKDIR /opt/app                                                                                   0.2s
 => [build 4/5] COPY primes.cr .                                                                                   0.1s
 => ERROR [build 5/5] RUN crystal build primes.cr --release --static --no-debug                                    0.5s
------
 > [build 5/5] RUN crystal build primes.cr --release --static --no-debug:
#9 0.413 Invalid memory access (signal 11) at address 0x555ad47af84a
#9 0.453 [0x555a946a4186] ???
#9 0.453 [0x555a946a4153] ???
#9 0.453 [0x7f6e43397c8a] ???

I would like to restore the CI workflow within the repo as soon as possible, so I’d appreciate any pointers towards solving this.

EDIT: I’ve opened issue #456 on the Primes repo concerning this, as it’s blocking CI and benchmark runs. Any input/suggestions concerning what I’ve described can also be added as comments on that issue. I would have included a link to the issue, but the forum post editor won’t allow me to add more than 2 links in this post because I’m a new user.

Thank you,

Rutger

Since the CI workflow is docker-based, you should be able to work around this easily by using the 1.0.0 docker image instead of latest.

EDIT: I see that pull request #457 (“Crystal fix” by marghidanu, PlummersSoftwareLLC/Primes) did just that.

primes.cr builds fine locally with both 1.0.0 and 1.1.0, as well as with the official Alpine and Ubuntu docker images of both versions.

It seems like the way Crystal was manually installed in the Alpine image before #457 is at fault:

sed -i -e 's/v[[:digit:]]\..*\//edge\//g' /etc/apk/repositories \
    && apk add --no-cache build-base crystal

Replacing all APK repositories with edge doesn’t look like a smart move, and it may well be the reason for the failure.
Installing the compiler this way seems to make it unusable: even crystal eval 'puts "Hello World"' fails.
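
To make the effect of that sed expression concrete, here is a minimal sketch that applies it to sample text instead of the real file. The two repository lines are an assumption about what /etc/apk/repositories typically looks like on alpine:3.13; they are not taken from the failing image.

```shell
# Assumed contents of /etc/apk/repositories on alpine:3.13 (illustrative only).
repos='https://dl-cdn.alpinelinux.org/alpine/v3.13/main
https://dl-cdn.alpinelinux.org/alpine/v3.13/community'

# Same expression as in the Dockerfile, applied to the sample text rather than in place.
rewritten=$(printf '%s\n' "$repos" | sed -e 's/v[[:digit:]]\..*\//edge\//g')

# Every versioned repository now points at edge.
printf '%s\n' "$rewritten"
```

So every package, not just crystal, ends up coming from edge on top of a 3.13 base image.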

I’m aware of #457 working around the issue, as the author of that PR is one of the other maintainers of the repo.

The use of Alpine edge doesn’t explain the segfault by itself, because the build has been running stably in exactly that configuration for months, using crystal 1.0.0. The segfault started occurring when Alpine edge switched from crystal 1.0.0 to 1.1.0, so a connection with that switch seems likely to me.

In any case, we consider the aforementioned PR a workaround, because we effectively dropped support for arm64 by merging it.

Yes, the breaking behaviour reproduces with the compiler installed from aports in the Alpine edge base image (alpine:edge). But it does not reproduce with the compiler in the official Alpine-based Crystal image (crystallang/crystal:1.1.0-alpine).

So it seems to be caused by the way the compiler is built in aports. I’m wondering why this hasn’t been caught before. Even the simplest program won’t compile.

/cc @mps

The reason for installing manually was that we needed to support arm64. The Crystal official images are only available for amd64.

Maybe the aarch64 binary from Alpine edge even works. :man_shrugging:

The segfault appears in the parser stage. That’s already quite surprising, actually.
According to strace, it seems to happen while parsing src/crystal/dwarf/abbrev.cr.

... [shortened] ...
#12 0.784 read(6, "require \"crystal/dwarf\"\n{% if fl"..., 4096) = 2048
#12 0.785 read(6, "", 4096)                       = 0
#12 0.785 close(6)                                = 0
#12 0.786 access("lib/crystal/dwarf.cr", F_OK)    = -1 ENOENT (No such file or directory)
#12 0.786 getcwd("/opt/app", 4096)                = 9
#12 0.786 access("/opt/app/lib/crystal/src/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.787 getcwd("/opt/app", 4096)                = 9
#12 0.787 access("/opt/app/lib/crystal/src/crystal/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.788 getcwd("/opt/app", 4096)                = 9
#12 0.788 access("/opt/app/lib/crystal/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.788 getcwd("/opt/app", 4096)                = 9
#12 0.788 access("/opt/app/lib/crystal/src/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.789 getcwd("/opt/app", 4096)                = 9
#12 0.790 access("/opt/app/lib/crystal/src/crystal/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.790 access("/usr/lib/crystal/shards/crystal/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.790 getcwd("/opt/app", 4096)                = 9
#12 0.791 access("/usr/lib/crystal/shards/crystal/src/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.792 getcwd("/opt/app", 4096)                = 9
#12 0.792 access("/usr/lib/crystal/shards/crystal/src/crystal/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.793 getcwd("/opt/app", 4096)                = 9
#12 0.793 access("/usr/lib/crystal/shards/crystal/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.794 getcwd("/opt/app", 4096)                = 9
#12 0.794 access("/usr/lib/crystal/shards/crystal/src/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.795 getcwd("/opt/app", 4096)                = 9
#12 0.795 access("/usr/lib/crystal/shards/crystal/src/crystal/dwarf/dwarf.cr", F_OK) = -1 ENOENT (No such file or directory)
#12 0.795 access("/usr/lib/crystal/core/crystal/dwarf.cr", F_OK) = 0
#12 0.796 getcwd("/opt/app", 4096)                = 9
#12 0.796 open("/usr/lib/crystal/core/crystal/dwarf.cr", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6
#12 0.797 fcntl(6, F_SETFD, FD_CLOEXEC)           = 0
#12 0.797 fcntl(6, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
#12 0.797 fstat(6, {st_mode=S_IFREG|0644, st_size=955, ...}) = 0
#12 0.798 read(6, "require \"./dwarf/abbrev\"\nrequire"..., 4096) = 955
#12 0.798 read(6, "", 4096)                       = 0
#12 0.798 close(6)
#12 0.799 access("/usr/lib/crystal/core/crystal/./dwarf/abbrev.cr", F_OK) = 0
#12 0.799 getcwd("/opt/app", 4096)                = 9
#12 0.799 open("/usr/lib/crystal/core/crystal/dwarf/abbrev.cr", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6
#12 0.800 fcntl(6, F_SETFD, FD_CLOEXEC)           = 0
#12 0.800 fcntl(6, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
#12 0.800 fstat(6, {st_mode=S_IFREG|0644, st_size=9295, ...}) = 0
#12 0.800 read(6, "require \"../dwarf\"\n\nmodule Cryst"..., 4096) = 4096
#12 0.801 read(6, "   DW_AT_artificial           = "..., 4096) = 4096
#12 0.801 read(6, "esent = 0x19 # flag\n      RefSig"..., 4096) = 1103
#12 0.801 read(6, "", 4096)                       = 0
#12 0.802 close(6)                                = 0
#12 0.802 access("/usr/lib/crystal/core/crystal/dwarf/../dwarf.cr", F_OK) = 0
#12 0.803 getcwd("/opt/app", 4096)                = 9
#12 0.803 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x562d84afe82a} ---
#12 0.803 writev(2, [{iov_base="Invalid memory access (signal 11"..., iov_len=60}, {iov_base=NULL, iov_len=0}], 2Invalid memory access (signal 11) at address 0x562d84afe82a
#12 0.805 ) = 60
#12 0.805 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a94bd000
#12 0.805 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a94ad000
#12 0.807 munmap(0x7fe9a94ad000, 65536)           = 0
#12 0.808 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a94bb000
#12 0.808 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a94b9000
#12 0.808 munmap(0x7fe9a94b9000, 8192)            = 0
#12 0.809 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a94b7000
#12 0.810 munmap(0x7fe9a94b7000, 16384)           = 0
#12 0.812 mmap(NULL, 650156, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a941c000
#12 0.812 mmap(NULL, 650156, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a937d000
#12 0.843 munmap(0x7fe9a937d000, 651264)          = 0
#12 0.844 mmap(NULL, 40960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a9412000
#12 0.844 mmap(NULL, 40960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a9408000
#12 0.846 munmap(0x7fe9a9408000, 40960)           = 0
#12 0.846 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a940e000
#12 0.847 munmap(0x7fe9a940e000, 16384)           = 0
#12 0.847 writev(2, [{iov_base="[0x562c849f3186] ???\n", iov_len=21}, {iov_base=NULL, iov_len=0}], 2[0x562c849f3186] ???
#12 0.848 ) = 21
#12 0.848 mmap(NULL, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a940c000
#12 0.848 mmap(NULL, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a9406000
#12 0.849 munmap(0x7fe9a9406000, 24576)           = 0
#12 0.849 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a940a000
#12 0.850 munmap(0x7fe9a940a000, 8192)            = 0
#12 0.850 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe9a940b000
#12 0.850 munmap(0x7fe9a940b000, 4096)            = 0
#12 0.851 writev(2, [{iov_base="[0x562c849f3153] ???\n", iov_len=21}, {iov_base=NULL, iov_len=0}], 2[0x562c849f3153] ???
#12 0.851 ) = 21
#12 0.851 writev(2, [{iov_base="[0x7fe9b3254c8a] ???\n", iov_len=21}, {iov_base=NULL, iov_len=0}], 2[0x7fe9b3254c8a] ???
#12 0.852 ) = 21
#12 0.852 exit_group(11)                          = ?
#12 0.855 +++ exited with 11 +++
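
For anyone repeating this on a longer trace, a small filter can pull out the last file that was opened before the crash. This is only a sketch: the function name is made up for illustration, and it assumes the strace output format shown above, with the sample lines copied from that trace.

```shell
# Hypothetical helper: print the path of the last open() call seen before SIGSEGV.
# Reads strace output on stdin; access() calls are deliberately ignored.
last_open_before_segv() {
  awk '
    /open\("/     { split($0, parts, "\""); last = parts[2] }  # remember the opened path
    /--- SIGSEGV/ { print last; exit }                         # report it at the crash
  '
}

# Sample lines copied from the trace above.
last_open_before_segv <<'EOF'
#12 0.797 open("/usr/lib/crystal/core/crystal/dwarf.cr", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6
#12 0.799 open("/usr/lib/crystal/core/crystal/dwarf/abbrev.cr", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6
#12 0.802 access("/usr/lib/crystal/core/crystal/dwarf/../dwarf.cr", F_OK) = 0
#12 0.803 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x562d84afe82a} ---
EOF
```

On a full log saved to a file (for example with strace's -o option), the same function can be fed with `last_open_before_segv < trace.log`, where trace.log is a placeholder name.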

I’ve opened an issue on the aports bug tracker to report the broken package: “Crystal 1.1.0 binary seems to be broken” (#12854, alpine/aports).

Tested Alpine’s crystal 1.1.0 on aarch64 Alpine edge with some simple programs, and it works. It looks like it only segfaults on x86_64 Alpine edge.
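
Given that result, one possible interim setup would be to pick the source of the compiler per architecture until the aports package is fixed. A minimal sketch, assuming the official image is used on x86_64 and the Alpine edge package on aarch64; the helper name is hypothetical and not part of the Primes repo.

```shell
# Hypothetical helper: choose a base image for the Crystal build by CPU architecture.
# Image names are the ones mentioned in this thread; the mapping is an illustration only.
crystal_base_image() {
  case "$1" in
    x86_64|amd64)  echo "crystallang/crystal:1.1.0-alpine" ;;  # official image, amd64 only
    aarch64|arm64) echo "alpine:edge" ;;                        # still needs: apk add build-base crystal
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Pick an image for the machine this runs on.
crystal_base_image "$(uname -m)" || true
```

The Dockerfile would still need separate install steps for the two cases, since only the alpine:edge variant has to install crystal through apk.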