6
votes

I have a long text in which I'd like to replace dots with spaces but only in the middle of the text. For example:

Domain:...................google.com

I need this to be:

Domain:                   google.com

I came upon this regex that replaces the dots with a single space:

str.gsub!(/(?<=:)\.+(?=[^\.])/, ' ')

But it isn't sufficient because it produces:

Domain: google.com

I need to keep as many spaces as dots were. How would you solve it?

3
I'd suggest .gsub(/(?<!^|\w)\.|\.(?!\w|$)/, ' '), no need for blocks. Or with a block - /\.{2,}/ (and use the same block as in the answer below) - Wiktor Stribiżew
@WiktorStribiżew I thought about that, but in my opinion regexps should be as precise as possible, to minimize false positives. The non-block version will fail on dots used as sentence delimiters, and the latter on ellipses. - Aleksei Matiushkin
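
To make the trade-off in these comments concrete, here is a small sketch that runs both suggested patterns on a padded key/value line and on ordinary prose. The patterns are quoted from the comment; the variable names and sample strings are made up for illustration.

```ruby
# Patterns quoted from the comment above; the sample strings are illustrative.
lookaround = /(?<!^|\w)\.|\.(?!\w|$)/
dot_run    = /\.{2,}/

# Both patterns handle the padded key/value line.
padded = "Domain:.....google.com"
a = padded.gsub(lookaround, ' ')
b = padded.gsub(dot_run) { |m| ' ' * m.length }

# False positives: a sentence-ending dot and an ellipsis.
c = "Hi. Bye".gsub(lookaround, ' ')
d = "Wait... what".gsub(dot_run) { |m| ' ' * m.length }

p a  # "Domain:     google.com"
p b  # "Domain:     google.com"
p c  # "Hi  Bye"
p d  # "Wait    what"
```

Whether these false positives matter depends on what else the long text contains.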

3 Answers

4
votes

You are nearly there; your regexp is fine. Just use the block form of String#gsub to build the replacement from the length of the match:

▶ str = 'Domain:...................google.com'
#⇒ "Domain:...................google.com"
▶ str.gsub(/(?<=:)\.+(?=[^\.])/) { |m| ' ' * m.length }
#⇒ "Domain:                   google.com"
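
Since the question mentions a long text, here is a minimal sketch of the same block form applied to several lines at once. The method name `pad_dots` and the sample records are assumptions made for illustration; the regex is the one from the question.

```ruby
# Hypothetical helper: replaces each run of dots after a colon
# with the same number of spaces.
def pad_dots(text)
  text.gsub(/(?<=:)\.+(?=[^\.])/) { |m| ' ' * m.length }
end

# Illustrative input with two records of different padding width.
text = "Domain:.....google.com\nOwner:......example.org"
padded = pad_dots(text)
puts padded
# Domain:     google.com
# Owner:      example.org
```

The block is called once per run of dots, so each line keeps its own width.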

3
votes

If you need to do that in the context you describe (a key/value separated with : where the value is a domain name), you can simply use:

> s='Domain:............www.google.com'
 => "Domain:............www.google.com" 
> s.gsub(/(?<=[:.])\./, ' ')
 => "Domain:            www.google.com"

This works because a domain name contains neither a : nor consecutive dots.

For more general use, see @mudasobwa's answer, or use this instead:

s.gsub(/(?:\G(?!\A)|\A[^:]*:\K)\./, ' ')

(The \G anchor matches the position where the previous match ended, which forces successive matches to be contiguous.)
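
To illustrate the difference between the two patterns, here is a short sketch. The names `simple` and `contiguous` and the second sample string are assumptions made for illustration only.

```ruby
simple     = /(?<=[:.])\./                  # a dot preceded by ':' or '.'
contiguous = /(?:\G(?!\A)|\A[^:]*:\K)\./    # only the dots right after the first ':'

# On the key/value line both give the same result.
domain = "Domain:....www.google.com"
r1 = domain.gsub(simple, ' ')
r2 = domain.gsub(contiguous, ' ')

# With a second run of dots later in the value, they differ.
other = "Key:..a..b"
r3 = other.gsub(simple, ' ')
r4 = other.gsub(contiguous, ' ')

p r1  # "Domain:    www.google.com"
p r2  # "Domain:    www.google.com"
p r3  # "Key:  a. b"
p r4  # "Key:  a..b"
```

The \G version stops at the first character that is not a dot, so later dots in the value are left untouched.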

1
vote

It sounds like you wish to replace a period with a space if it is preceded or followed by a period, and I've assumed that there is not necessarily a colon preceding a string of periods. If so, here are two ways to do that.

str = "Domain:...................google.com"

Use Enumerable#each_cons instead of a regex

" #{str} ".each_char.each_cons(3).map { |before,ch,after|
  ch=='.' && (before=='.' || after== '.') ? ' ' : ch }.join
  #=> "Domain:                   google.com"

The steps are as follows.

s = " #{str} "
  #=> " Domain:...................google.com " 
a = s.each_char
  #=> #<Enumerator: " Domain:...................google.com ":each_char> 
e = a.each_cons(3)
  #=> #<Enumerator: #<Enumerator: " Domain:...................google.com ":
  #     each_char>:each_cons(3)> 

Notice how e can be thought of as a compound enumerator. We can see the elements that will be generated by this enumerator by converting it to an array.

e.to_a
  #=> [[" ", "D", "o"], ["D", "o", "m"], ["o", "m", "a"], ["m", "a", "i"],
  #    ["a", "i", "n"], ["i", "n", ":"], ["n", ":", "."], [":", ".", "."],
  #    [".", ".", "."], [".", ".", "."], [".", ".", "."], [".", ".", "."],
  #    [".", ".", "."], [".", ".", "."], [".", ".", "."], [".", ".", "."], 
  #    [".", ".", "."], [".", ".", "."], [".", ".", "."], [".", ".", "."],
  #    [".", ".", "."], [".", ".", "."], [".", ".", "."], [".", ".", "."],
  #    [".", ".", "."], [".", ".", "g"], [".", "g", "o"], ["g", "o", "o"],
  #    ["o", "o", "g"], ["o", "g", "l"], ["g", "l", "e"], ["l", "e", "."],
  #    ["e", ".", "c"], [".", "c", "o"], ["c", "o", "m"], ["o", "m", " "]] 

Continuing,

b = e.map { |before,ch,after| ch=='.' && (before=='.' || after== '.') ? ' ' : ch }
  #=> ["D", "o", "m", "a", "i", "n", ":", " ", " ", " ", " ", " ", " ", " ",
  #    " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", "g", "o",
  #    "o", "g", "l", "e", ".", "c", "o", "m"] 
b.join
  #=> "Domain:                   google.com" 
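
As a rough sketch, the steps above can be wrapped in a method and tried on input that has no colon. The method name `blank_dot_runs` and the sample strings are assumptions made for illustration.

```ruby
# Hypothetical wrapper around the each_cons approach shown above.
def blank_dot_runs(str)
  # Pad with spaces so the first and last characters also get neighbours.
  " #{str} ".each_char.each_cons(3).map { |before, ch, after|
    ch == '.' && (before == '.' || after == '.') ? ' ' : ch
  }.join
end

p blank_dot_runs("name....value.txt")  # "name    value.txt"
p blank_dot_runs(".hidden")            # ".hidden"
```

A lone period is kept wherever it appears, because neither of its neighbours is a period.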

Use a regex

r = /
    (?<=\A|\.) # match the beginning of the string or a period in a positive lookbehind
    \.         # match a period
    |          # or
    \.         # match a period
    (?=\.|\z)  # match a period or the end of the string in a positive lookahead
    /x         # free-spacing regex definition mode 

str.gsub(r,' ')
  #=> "Domain:                   google.com"
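
Note that, because of the \A and \z alternatives, the regex above also replaces a single period at the very start or end of the string, whereas the each_cons version leaves such a period alone. If the rule should be exactly "preceded or followed by a period", a variant without those anchors might look like the sketch below. The name `strict` and the second sample string are assumptions, not part of the answer.

```ruby
# Assumed variant: a period is replaced only when a neighbouring character is a period.
strict = /(?<=\.)\.|\.(?=\.)/

s1 = "Domain:.....google.com".gsub(strict, ' ')
s2 = ".hidden.file.".gsub(strict, ' ')

p s1  # "Domain:     google.com"
p s2  # ".hidden.file."
```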