
I have a set of special characters that act as brackets. When a user encloses some text between a pair of these brackets, I want the program to replace the brackets with < and >. So, if * and * are the brackets and the string is "Hello *world*", the program should return "Hello <world>".

The problem is that I want to ignore nested occurrences of these bracket pairs and process only the outermost pair. In other words, once an opening bracket is encountered, all following characters should be treated as normal characters until that bracket is closed.

For example, if my brackets are *, * and #, # and my string is "Hello *wo#rl#d*", the program should return: "Hello <wo#rl#d>" instead of: "Hello <wo<rl>d>"

I've tried using string.gsub to find all patterns of text between the defined special characters, but of course, it won't ignore nested occurrences of them.

local specialChars = {"*", "#", "-"}
local text = "Hello, world. *Won#der#ful* day, -don't- you #th*in*k?#"

for i = 1, #specialChars do
    local bracket = specialChars[i]
    local escBracket = "%" .. bracket

    text = string.gsub(text, escBracket .. "(.-)" .. escBracket, function(content)
        return "<" .. content .. ">"
    end)
end

print(text)

The code above should display:

"Hello, world. <Won#der#ful> day, <don't> you <th*in*k?>"

but instead displays:

"Hello, world. <Won<der>ful> day, <don't> you <th<in>k?>"

Any help would be greatly appreciated.

text = text:gsub("([%*%#%-])(.-)%1", "<%2>") - Egor Skriptunoff
@EgorSkriptunoff Thank you very much! That does exactly what I wanted. Could this also work if I wanted my brackets to be more than one character, e.g. **, **? - WillWillington
No. But you can replace all multi-character combinations with 1-character items (for example, unused characters such as \001, \002, etc.) prior to converting, and replace them back after conversion. - Egor Skriptunoff
@EgorSkriptunoff Wow, you're a genius. Thank you so much for your help and for the follow-up support. - WillWillington
@EgorSkriptunoff I suppose I have one more concern, if it's not asking too much. Now I'm facing a problem where I can't assume the closing bracket will be the same characters as the opening bracket, for example when [[, ]] or &_, _& are the bracket pairs. For this, I feel like my only option may be to iterate over all the characters and parse the string that way... - WillWillington
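
For readers following the comments, here is a minimal sketch of the one-line pattern from the first comment, applied to the sample text from the question. The variable name `result` is only illustrative. It covers single-character brackets whose opener and closer are the same character, not the multi-character case raised in the later comments.

```lua
local text = "Hello, world. *Won#der#ful* day, -don't- you #th*in*k?#"

-- ([%*%#%-]) captures the opening bracket, (.-) lazily captures the content,
-- and %1 requires the closing bracket to be the same character as the opener.
-- gsub resumes scanning after each match, so brackets inside a matched pair
-- are never treated as openers.
local result = text:gsub("([%*%#%-])(.-)%1", "<%2>")

print(result)
```

Because all bracket characters are handled in a single pass, the leftmost opener always wins, which is what the per-character loop in the question could not guarantee.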

1 Answer

local text = "[[**Hello**, &_world_&.]] &_*Won#der#ful* day_&, **-don't- you** #th*in*k?#"
print(text)

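local single_char = "*#-"
-- Each multi-character bracket is replaced by a 3-character token: a control
-- character "\1".."\6" identifying the pair, followed by a 2-character tag.
-- This assumes the text itself never contains the characters "\1".."\6".
-- "o."=open, ".c"=close, "oc"=both open and close
local multi_char = { -- use chars "\1","\2",...,"\6" to start each group
   ["\1o."] = "[[",
   ["\1.c"] = "]]",
   ["\2o."] = "&_",
   ["\2.c"] = "_&",
   ["\3oc"] = "**",
}
-- Step 1: replace every multi-character bracket with its token
-- (v:gsub("%p", "%%%0") escapes punctuation so that v is matched literally)
for k, v in pairs(multi_char) do
   text = text:gsub(v:gsub("%p", "%%%0"), k)
end
text = text
   -- Step 2: tag every single-character bracket as "both open and close"
   :gsub("["..single_char:gsub("%p", "%%%0").."]", "%0oc")
   -- Step 3: replace each outermost pair; "." is a wildcard, so "oc" fits both "o." and ".c"
   :gsub("([\1-\6"..single_char:gsub("%p", "%%%0").."])o.(.-)%1.c", "<%2>")
   -- Step 4: strip the tags from any single-character brackets that remain
   :gsub("(["..single_char:gsub("%p", "%%%0").."])..", "%1")
   -- Step 5: turn any remaining tokens back into their multi-character brackets
   :gsub("[\1-\6]..", multi_char)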
print(text)

Output:

[[**Hello**, &_world_&.]] &_*Won#der#ful* day_&, **-don't- you** #th*in*k?#
<**Hello**, &_world_&.> <*Won#der#ful* day>, <-don't- you> <th*in*k?>
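
As an alternative to the pattern-based approach, the idea from the last comment (walking through the string and parsing it directly) can be sketched as below. This is only an illustration under stated assumptions: `replace_outermost` and the `brackets` table are hypothetical names, longer openers are listed before shorter ones so that "**" is tried before "*", and unclosed openers are not handled specially.

```lua
-- Hypothetical helper: replace only the outermost bracket pairs with < and >.
-- `brackets` is a list of {open, close} pairs, longest openers first.
local function replace_outermost(text, brackets)
   local out = {}
   local pos = 1
   while pos <= #text do
      local matched = false
      for _, pair in ipairs(brackets) do
         local open, close = pair[1], pair[2]
         if text:sub(pos, pos + #open - 1) == open then
            -- Plain-text search (4th argument true) for the closer,
            -- starting right after the opener.
            local s, e = text:find(close, pos + #open, true)
            if s then
               -- Copy the content verbatim, so nested brackets stay untouched.
               out[#out + 1] = "<" .. text:sub(pos + #open, s - 1) .. ">"
               pos = e + 1
               matched = true
               break
            end
         end
      end
      if not matched then
         -- Ordinary character: copy it through unchanged.
         out[#out + 1] = text:sub(pos, pos)
         pos = pos + 1
      end
   end
   return table.concat(out)
end

local brackets = {
   {"[[", "]]"}, {"&_", "_&"}, {"**", "**"},
   {"*", "*"}, {"#", "#"}, {"-", "-"},
}
local sample = "[[**Hello**, &_world_&.]] &_*Won#der#ful* day_&, **-don't- you** #th*in*k?#"
print(replace_outermost(sample, brackets))
```

This version needs no placeholder characters and places no limit on the number or length of bracket pairs, at the cost of an explicit loop instead of a chain of gsub calls.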