Path#join ""

Path.new("a", "b") == Path.new("a", "b")          # => true
Path.new("a", "b") == Path.new("a", "b").join("") # => false

And yet…

Path.new("a", "b").join("c") == Path.new("a", "b").join("").join("c") # => true

Is this a political disagreement between the lexical and the semantic?

The greater irony is that the Crystal community seems to want Crystal to be as faithful to Ruby as possible.

However, in Ruby:

Pathname.new("a/b") == Pathname.new("a/b").join("")  # true
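
A small Ruby sketch, using only the standard-library Pathname, may help narrow down where the difference comes from. As far as I can tell, Ruby's join simply drops the empty component, while Pathname#== is still a plain string comparison, so a path written with an explicit trailing slash compares unequal there too. The variable names are just for illustration.

```ruby
require "pathname"

base   = Pathname.new("a/b")
joined = base.join("")         # the empty component is dropped by join
slash  = Pathname.new("a/b/")  # trailing slash written explicitly

puts joined.to_s      # a/b (no trailing slash is added)
puts base == joined   # true
puts base == slash    # false, because == compares the path strings
```

So the Ruby result above seems to say more about join than about equality.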

This would have been ironic if the behavior were intentional. But I think this is just a bug. Could you please report it? Thank you!

I’m not so sure this is a bug. It could as well be expected behaviour.

Path.new("a","b") is Path["a/b"]. Joining an empty component gives Path["a/b/"]. The question is whether a/b and a/b/ should be considered equal. It depends. In most cases they are equivalent, but some programs interpret a trailing slash as indicating a directory and treat the path differently. An example is ls -l with a path that is a symbolic link to a directory: without the trailing slash it lists only the link itself, but with the trailing slash it lists the contents of the linked directory.

So when in doubt, a/b and a/b/ should not be considered equal, because the trailing slash makes a difference in some circumstances.

Path.new("a","b").join("c") == Path.new("a","b").join("").join("c") is not a contradiction, because joining c to either a/b or a/b/ results in a/b/c.

This is actually similar to paths that include a current directory entry, like ./a/b:

Path["a/b"] == Path["./a/b"] # => false
Path["a/b"] == Path["a/./b"] # => false

So Path#== is very strict about equality. And I think that’s good, because those syntactic differences usually result in the same semantic meaning, but not always.

If you want more flexible equality, you can operate on normalized paths:

Path.new("a/b").normalize == Path.new("a/b").join("").normalize # => true
Path.new("a/b").normalize == Path.new("./a/b").normalize        # => true
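
For readers coming from Ruby, the rough counterpart of this "normalize first, then compare" approach is probably Pathname#cleanpath. The following is a sketch of the Ruby side only; it says nothing about how Crystal's Path#normalize is implemented.

```ruby
require "pathname"

plain    = Pathname.new("a/b")
dotted   = Pathname.new("./a/b")
trailing = Pathname.new("a/b/")

puts plain == dotted                        # false, strict string comparison
puts plain.cleanpath == dotted.cleanpath    # true, the leading ./ is removed
puts plain.cleanpath == trailing.cleanpath  # true, the trailing slash is removed
```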

Oh, sorry, yeah, it’s probably not a bug.

I was just being a bit defensive against the “Hey, you said you wanted Ruby compatibility, but this is not working like in Ruby” argument.

Actually, I’m in favor of their inequality. However, if they are to be unequal then perhaps…

Path.new("a","b").basename == "b" # as it is
Path.new("a","b").join("").basename == "" # as it isn't, but might be?

After all, given:

Path.new("a","b").join("c").basename == "c" # true

… shouldn’t one expect:

Path.new("a","b").join("").basename == ""

Considering:

Path.new("a","b").join("").to_s == "a/b/"

It’s “lexically reasonable” to read “a/b/” as a path with an empty/null basename.

That said, a more sophisticated ‘conventional’ interpretation may exist: several *nix tools interpret “a/b/” as “a/b/.”

Additionally, from the Path docs:

A Path is considered to be an empty path if it consists solely of one name element that is empty. Accessing a file using an empty path is equivalent to accessing the default directory of the process.

… and from Path#parent docs:

Path[""].parent # => Path["."]

So, perhaps a very reasonable argument exists for:

Path.new("a","b").join("").basename == "."

I’m not an expert, just a concerned pre-1.0-code-jockey. Forgive me if I sense what seems to be some lingering ‘unclarity’ even within the core team on this matter. Best for all of us for the Path class to be officially affirmed as “as correct as possible” before 1.0. Yes?

Yes, that’s probably the best interpretation and why I mentioned the similarity to "./a/b".

So with that explanation, maybe Path.new("a","b").join("").basename == "." would make sense.
However, I think it would also be highly unexpected, especially considering that the Unix tool basename prints b for a/b/.
So IMO everything’s fine as it is, except that a spec for join("") could perhaps be added.
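
As a point of comparison only (this is Ruby, not Crystal's Path), Ruby's standard library appears to side with the Unix basename tool here: the trailing slash is ignored, and "." only shows up when it is written out explicitly.

```ruby
require "pathname"

puts File.basename("a/b")                # b
puts File.basename("a/b/")               # b, the trailing slash is ignored
puts File.basename("a/b/.")              # . (only when the dot is explicit)
puts Pathname.new("a/b/").basename.to_s  # b
```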

These sound like some good “gotchas” that should probably be well documented.

Well… to be specific, basename simply prints the file name with any leading directory components removed. Although it’s technically correct that the basename of a/b/ is ., the . is not the real file name but a standardized reference to the current directory, which in this context is b.

So I don’t think there’s anything ambiguous or inconsistent with keeping the current functionality of basename and the path equality. But that doesn’t mean we shouldn’t add a little extra documentation here.
