Lessons from the trenches, with map and sort

In Ruby this works fine, and as expected:

rangeres.each_cons(2).map { |n1, n2| n2 - n1 }.tally.sort

It takes an array of numbers (increasing +ints), finds the diff between each pair,
tallys their values into hash pairs, then sorts them in ascending order per the keys.

Here’s sample output:

> print rangeres.each_cons(2).map { |n1, n2| n2 - n1 }.tally.sort

=> [[2, 291946], [4, 291658], [6, 507719], [8, 220068], [10, 282635], [12, 351830], [14, 190819], [16, 139980]]

But the same code produces this error in Crystal (1.6.1):

In rangeres.cr:6:29

 6 | print rangeres.each_cons(2).map { |n1, n2| n2 - n1 }.tally.sort
                                 ^--
Error: wrong number of block parameters (given 2, expected 1)

OK, a little surprising, so after looking at the docs, I tried this and got:

In rangeres.cr:6:62

 6 | print rangeres.each.cons_pair.map { |n1, n2| n2 - n1 }.tally.sort
                                                                  ^---
Error: undefined method 'sort' for Hash(Int32, Int32)

Ok, map works now, and sort is the problem, so getting rid of it for now works:
This produces unsorted output.

print rangeres.each.cons_pair.map { |n1, n2| n2 - n1 }.tally

After reading the Docs again, I got the other form to work too:

print rangeres.each_cons(2).map { |pair| pair[1] - pair[0] }.tally.

I finally got around the sort issue from experience with Crystal.
It wants an array, so I gave it an array.

print rangeres.each_cons(2).map { |pair| pair[1] - pair[0] }.tally.to_a.sort

Now I’m not saying Crystal had the problems to get this snippet to finally work.
After I took the time to search for, and read, the docs it was clear what to do.

However, compared to Ruby it was sooooo unituitive what the Crystal syntax had to be.

I share this story, because like it not, Life Is Unfair!

I took at least 30 minutes to figure this out, because I knew Crystal could do this too.
I just needed to know how it did it. Then I took another hour to write this up.

Because Crystal is the yougin on the block, it’s going to be compared to older siblings,
and for Crystal that’s mostly (firstly) going to be Ruby. And small little differences
between them like this may turn the not so invested potential user away, because they
ain’t going to go thru the trouble of tracking the Docs down and reading them.

So please, if possible, somewhere in the Docs put this example as something to be aware
of, for both people coming from Ruby to Crystal, and vice versa.

But…I think sort should work with hashes the same way Ruby does. It should be easy to do.
Maybe for 2.0 (or sooner) you can consider and include that behavior. :slightly_smiling_face:

ADD:
Ah…I just remembered, from a long ago prior problem, this works too. Still not intuitively to do.

print rangeres.each.cons(2).map { |(n1, n2)| n2 - n1 }.tally.to_a.sort

Yes, Crystal does not have implicitly deconstructuring of arrays in block arguments, that works only for tuples.
But for this exact purpose there’s Enumerable#each_cons_pair. It is even recommended as an improved alternative in the API docs of #each_cons.
As a result you get a more performant implementation without unnecessary array allocations (Ruby has no tuples, only arrays).

However, compared to Ruby it was sooooo unituitive what the Crystal syntax had to be.

I’m wondering how much of that is biased by the similar, but slightly different Ruby syntax.
If you were starting in a language sufficiently different from Ruby and Crystal and had to translate an equivalent piece of code, I think both, Ruby and Crystal, versions would be similarly intuitive.

You have an expression in Ruby and you might expect it to work in Crystal (it’s similar enough and often it does actually work), but when it doesn’t is it Crystal’s fault? Crystal is decidedly not a Ruby dialect with typing. It takes a whole lot of inspiration from Ruby, which is great, but it deviates in some places very much with the intention of making it even better.

Talking about Hash#sort: In Ruby the array transformation is implicit. It’s basically just a short cut for to_a.sort. That saves five key strokes for writing it. But I’d argue it’s less intuitive. The implicitl transformation is quite surprising.
I expect sort to return an exact copy of the collection just with the contents sorted. That’s how it usually operates. This implies having the same type, of course. The signature in Crystal is always def sort : self, for example.
So I think it’s better to not have a short cut method with unexpected effects. This forces you to make a conscious decision about the data type conversion and - maybe the most important aspect - explicitly document that in the code.

4 Likes

For another discussion of not implementing Hash#sort see `sort` and `sort_by` methods of Hash class · Issue #1915 · crystal-lang/crystal · GitHub

1 Like

Thank you, I was looking for that reference :bowing_man:

Actually, this was the first thing I tried, like this, and got errors.

print rangeres.each_cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort

In rangeres.cr:9:16

 9 | print rangeres.each_cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort
                    ^-------------
Error: 'Array(Int32)#each_cons_pair' is expected to be invoked with a block, but no block was given

I didn’t (still don’t) understand the error message, so I tried the other variants.
Now I know that I need to pass the internal members as one group, so this works.

print rangeres.each_cons_pair.map { |(n1, n2)| n2 - n1 }.tally.to_a.sort

I consider this a deficiency in documentation, not the language, because from the (1) use example in the docs you’d never know that it wouldn’t work in that form with map.

https://crystal-lang.org/api/1.6.2/Enumerable.html#each_cons_pair(&:T,T->):Nil-instance-method

In fact, I think the docs should explicitly show its use as:

each_cons_pair do |(a, b)| ...

and not

each_cons_pair do |a, b| ...

But again, if you came to Crystal and didn’t know anything about Ruby (others) it would just be the way it was supposed to be. But that won’t be the case (currently) with most potential users.

Oh that’s a good point. There’s no equivalent to #each_cons_pair(&) without a block. This could be a worthy addition. Maybe each_cons_pair could just be a shortcut for each.cons_pair :thinking:

1 Like

Something doesn’t make sense in your last comment: each_cons_pair do |(a, b)| is not even valid unless the collection item type can be destructured. It has an entirely different meaning than each_cons_pair do |a, b|.

I’m sorry, got too many things mixed up.

These work:

rangeres.each.cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort
rangeres.each.cons(2).map { |(n1, n2)| n2 - n1 }.tally.to_a.sort
rangeres.each_cons(2).map { |pair| pair[1] - pair[0] }.tally.to_a.sort!

These don’t:

rangeres.each_cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort
rangeres.each_cons_pair.map { |(n1, n)| n2 - n1 }.tally.to_a.sort
rangeres.each_cons_pair.map { |pair| pair[1] - pair[0] }.tally.to_a.sort

So the docs are silent on how the method won’t work chained with map, and doesn’t explain you must use the other variants to do that.

In think this behavior with each_cons_pair is certainly surprising and unexpected.
I would also consider it an implementation bug, especially if its intent is to be an optimized replacement for each.cons_pair, since it works chained with map.

I’m sorry, got too many things mixed up.

These work:

rangeres.each.cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort
rangeres.each.cons(2).map { |(n1, n2)| n2 - n1 }.tally.to_a.sort
rangeres.each_cons(2).map { |pair| pair[1] - pair[0] }.tally.to_a.sort!

These don’t:

rangeres.each_cons_pair.map { |n1, n2| n2 - n1 }.tally.to_a.sort
rangeres.each_cons_pair.map { |(n1, n)| n2 - n1 }.tally.to_a.sort
rangeres.each_cons_pair.map { |pair| pair[1] - pair[0] }.tally.to_a.sort

So the docs are silent on how the method won’t work chained with map, and doesn’t explain you must use the other variants to do that.

In think this behavior with each_cons_pair is certainly surprising and unexpected.
I would also consider it an implementation bug, especially if its intent is to be an optimized replacement for each.cons_pair, since it works chained with map.

I wouldn’t say the behavior of #each_cons_pair(&) is surprising. It takes a block as its argument, like #each(&), and it doesn’t return (or returns nil, rather). So when you fail to give it a block you’re trying to call an overload that doesn’t exist. I think what you’re looking for is that “missing” overload, #each_cons_pair(), which would return an Iterator (probably the same Iterator as would be returned by Iterator#cons_pair).

1 Like

Iterable#each_cons_pair() is definitely missing. It should be straightforward to create a PR for that.

3 Likes