Deleting arbitrary string char

I have strings that I want to delete chars at arbitrary index locations.

From the strings api there are methods like lchop and rchop to delete a char off the left|right.

What I want is something like a chop_index(index).
Thus: abcde.chop_index(2) -> abde

What’s the fastest|safest way to currently do this in Crystal, and would you consider adding a method like this to the stdlib?

Yeah, there’s no method to do that, but I do think it’s useful. In Ruby it’s called slice! and it has different flavors: you can delete at an index, at a range, or even a string and a regex.

The current state of the API is that there’s:

  • delete(Char): deletes all occurrences of char removed
  • delete(*sets): some weird API that deletes some sets of characters. For example "aabbccdd".delete("a-c") gives “dd” (all chars from ‘a’ to ‘c’ are deleted)

My preference would be to remove delete(*sets) because it’s a pretty strange method and I’m not sure how intuitive that is or how useful. Or we can rename it to delete_sets.

Then we can add these overloads:

  • delete(index: Int): deletes the char at the given index
  • delete(start: Int, count : Int): deletes count chars starting from start
  • delete(range : Range(Int?, Int?): deletes chars at the given range
  • delete(String, first : Bool = false): deletes all or just the first occurrence of the given string
  • delete(Regex, first : Bool = false): deletes all or just the first occurrence of the given regex match
  • delete(Char, first : Bool = false): deletes all or just the first occurrence of the given regex match (an overload for this exists, but one that just deletes the first occurrence is missing)

Thoughts?

For now you can do:

string = "abcde"
index = 2
"#{string[0...index]}#{string[index + 1..]}" # => "abde"
1 Like

It may be simpler to just allow [l|r]chop to take an index, at least this one particular use case.

Something like: "abcde".rchop(3) → abde``
This would actually be more intuitive, chop the third char from the right.

But yes, these types of string methods are necessary to make string manipulation easier and more intuitive.

rchop(3) could also be confused with deleting 3 chars from the right.

I’m curious @jzakiya, what is your use case?

Yes, that’s true, I was merely thinking out loud. In fact, that would be the more logical interpretation. It would be more consistent with e.g. last(3). That’s why I suggested chop_index initially. (Now that you mention it, that would be a good feature to add to [l|r]drop.) Documentation would make clear the use|context.

It initially sprang from thinking about doing this Rosetta Code task, but there are other tasks involving string manipulations where this would be useful too.

delete_at might be best to match array.

2 Likes

That’s good too, though chop_at may be more consistent with the Strings api nomenclature, as it visually denotes object types. (You chop string chars and delete array elements) But I’m agnostic on names, as long as they’re well documented.

Oh, yeah, delete_at is probably best because of it’s consistency. “chop” always has the meaning of “from one of the ends”.

Hm I think in terms of naming in stdlib the word delete is only used in names of mutating methods.

Ah, with the important exception of String itself :grinning:

FYI.
In the above referenced Rosetta Code task, I changed this:

palindrome_half = (front_half // 10).to_s 
palindrome = (palindrome_half + "0" + palindrome_half.reverse).to_u64

to this:

palindrome_half = front_half.to_s
palindrome = (palindrome_half.rchop + palindrome_half.reverse).to_u6

and got an every so slight performance increase. To prove it was real, I changed the last output example from the 10 millionth to 20 millionth palindromic gapful numbers, and saw a consistent ~1.5ish second improvement.

But probably a bigger consideration, by getting rid of that (..// 10) math operation, the code will now run “as is” on versions < 0.31, when the / to // change was made. Having version agnostic code is a big consideration in code construction.

I’ll continue to play around doing the architecture for using @asterite’s hack for the equivalent of delete_at to see if it would actually be “better”, for this task. I know for other tasks I’m looking at it would definitely make life easier, and the code nicer.

OK, here’s an example how delete_at(index) would have simplified and quickened the initial design to solve this task.

Here’s the short version for the task, which takes about 164 seconds for outputs.

def palindromicgapfuls(digit, count)
  gapfuls = [] of UInt64              # array of palindromic gapfuls
  nn = digit * 11                     # digit gapful divisor: 11, 22,...88, 99
  palindrome = 0_u64                  # init for compiler
  (2..).select do |power|
    base    = 10_u64**(power >> 1)    # value of middle digit position: 10..
    this_lo = base * digit            # starting half for this digit: 10.. to  90..
    next_lo = base * (digit + 1)      # starting half for next digit: 20.. to 100..
    this_lo.step(to: next_lo - 1, by: 10) do |front_half|   # d_00; d_10; d_20; ...
      num_half = front_half.to_s                            # d_n0
      (front_half..front_half + 9).each_with_index do |upper_half, i|
        palindrome = (upper_half.to_s + upper_half.to_s.reverse).to_u64    if power.odd?
        palindrome = (num_half.rchop + num_half.reverse).to_u64 + base * i if power.even?
        gapfuls << palindrome if palindrome % nn == 0
        return gapfuls if gapfuls.size == count
      end
    end
  end
end

Here’s @asterite’s functional hack for delete_at, which took 660 seconds.

def palindromicgapfuls(digit, count)
  gapfuls = [] of UInt64              # array of palindromic gapfuls
  nn = digit * 11                     # digit gapful divisor: 11, 22,...88, 99
  (2..).select do |power|
    mid     = (power >> 1) + 1        # mid index of righthand side
    base    = 10_u64**(power >> 1)    # value of middle digit position: 10..
    this_lo = base * digit            # starting half for this digit: 10.. to  90..
    next_lo = base * (digit + 1)      # starting half for next digit: 20.. to 100..
    this_lo.step(to: next_lo - 1, by: 10) do |front_half|   # d_00; d_10; d_20; ...
      (front_half..front_half + 9).each do |upper_half|
        palindrome = (upper_half.to_s + upper_half.to_s.reverse).to_u64
        palindrome = (palindrome.to_s[0..mid-2] + palindrome.to_s[mid..-1]).to_u64 if power.even?
        gapfuls << palindrome if palindrome % nn == 0
        return gapfuls if gapfuls.size == count
      end
    end
  end
end

It’s shorter, and if you don’t care about speed, it’s the easiest conceptual way to do it.

Now here’s how simple it would be using delete_at.

def palindromicgapfuls(digit, count)
  gapfuls = [] of UInt64              # array of palindromic gapfuls
  nn = digit * 11                     # digit gapful divisor: 11, 22,...88, 99
  (2..).select do |power|
    mid     = (power >> 1) + 1        # mid index of righthand side
    base    = 10_u64**(power >> 1)    # value of middle digit position: 10..
    this_lo = base * digit            # starting half for this digit: 10.. to  90..
    next_lo = base * (digit + 1)      # starting half for next digit: 20.. to 100..
    this_lo.step(to: next_lo - 1, by: 10) do |front_half|   # d_00; d_10; d_20; ...
      (front_half..front_half + 9).each do |upper_half|
        palindrome = (upper_half.to_s + upper_half.to_s.reverse).to_u64
        palindrome = (palindrome.to_s.delete_at(mid-1)).to_u64 if power.even?
        gapfuls << palindrome if palindrome % nn == 0
        return gapfuls if gapfuls.size == count
      end
    end
  end
end

Now it’s even more conceptually clear what’s going on.
If you’re doing any kind of text processing, this stuff is done all the time.
Surprisingly, even Ruby doesn’t have an equivalent method to do exactly this.

Addition:
This is even more concise and efficient, and much faster, only ~340 seconds vs 660.

def palindromicgapfuls(digit, count)
  gapfuls = [] of UInt64              # array of palindromic gapfuls
  nn = digit * 11                     # digit gapful divisor: 11, 22,...88, 99
  (2..).select do |power|
    mid     = (power >> 1) + 1        # mid index of righthand side
    base    = 10_u64**(power >> 1)    # value of middle digit position: 10..
    this_lo = base * digit            # starting half for this digit: 10.. to  90..
    next_lo = base * (digit + 1)      # starting half for next digit: 20.. to 100..
    this_lo.upto(next_lo - 1) do |left_half|
      palindrome = left_half.to_s + left_half.to_s.reverse
      palindrome = (palindrome[0..mid-2] + palindrome[mid..-1]) if power.even?
      palindrome = palindrome.to_u64
      gapfuls << palindrome if palindrome % nn == 0
      return gapfuls if gapfuls.size == count
    end
  end
end

PR sent here: https://github.com/crystal-lang/crystal/pull/9398

Thank you @asterite. Will this make it into 0.35?

I’m not sure, 0.35.0 is pretty much closed by now.

Is it possible to include this in 0.35.1?

No, minor versions are just about bug fixes. But it will be included for sure in 1.0. And you can always reopen String and those methods from the PR I linked.

OK. And thanks again for creating them.

1.0 will be something to look forward to, in many ways. :relaxed: