4
votes

My goal is to count how many times a substring occurs within a string. The substring I'm looking for has the form "[n]", where n can be any value.

My attempt involved splitting the string up using the words function, then creating a new list containing each string whose 'head' was '[' and whose 'last' was ']'.

The problem I ran into was that, for some inputs, splitting with words produces a String that looks like this: "[2]," (with a trailing comma). I still want this to count as an occurrence of the type "[n]".

For example, I would want this String,

asdf[1]jkl[2]asdf[1]jkl

to return 3.

Here's the code I have:

-- String that will be tested on references function
txt :: String
txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."

-- Function that will take a list of Strings and return a list that contains
-- any String of the type [n], where n is a variable
ref :: [String] -> [String]
ref [] = []
ref xs = [x | x <- xs, head x == '[', last x == ']']

-- Function takes a text with references in the format [n] and returns
-- the total number of references.
-- Example :  ghci> references txt -- -> 3
references :: String -> Integer
-- length returns an Int, so convert it to the declared Integer result
references txt = toInteger (length (ref (words txt)))

If anyone can enlighten me on how to search for a substring within a string or how to parse a string given a substring, that would be greatly appreciated.

3 Answers

4
votes

I would just use a regular expression, and write it like this:

import Text.Regex.Posix

txt :: String
txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."


-- references counts the number of references in the input string
references :: String -> Int
references str = str =~ "\\[[0-9]*\\]"

main :: IO ()
main = print (references txt) -- outputs 3
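
If the matched references themselves are wanted rather than just the count, the same pattern can be reused. This is only a sketch: it assumes the same regex-posix package as above, and the name referenceList is made up for illustration.

```haskell
import Text.Regex.Posix

-- Hypothetical helper (not part of the answer above): return every
-- substring matching the same pattern, instead of counting them.
referenceList :: String -> [String]
referenceList str = getAllTextMatches (str =~ "\\[[0-9]*\\]")
```

For the txt in the question this should give ["[1]","[2]","[2]"], and taking its length gives the same 3 as references.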
2
votes

Regex is huge overkill for such a simple problem.

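references :: String -> Int
references = length . consume

-- Skip characters until a '[', then collect that bracket's contents.
consume :: String -> [String]
consume []       = []
consume ('[':xs) = let (v,rest) = consume' xs in v:consume rest
consume (_  :xs) = consume xs

-- Gather characters up to the next ']'; returns (contents, remaining input).
consume' :: String -> (String, String)
consume' []       = ([], [])
consume' (']':xs) = ([], xs)
consume' (x  :xs) = let (v,rest) = consume' xs in (x:v, rest)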

consume skips ahead to a '[', then calls consume', which gathers everything up to the next ']'.
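
One thing to be aware of: consume' also returns when it reaches the end of the input, so a '[' that is never closed still counts as a reference. If that is not wanted, here is a rough variant of the same idea written with break from the Prelude. The names bracketed and countRefs are made up for this sketch.

```haskell
-- Hypothetical helper: collect the text between each '[' and the next ']'.
-- An opening '[' with no closing ']' is not counted.
bracketed :: String -> [String]
bracketed s = case break (== '[') s of
  (_, [])       -> []                      -- no more '[' in the input
  (_, _ : rest) -> case break (== ']') rest of
    (_, [])            -> []               -- unterminated '[': stop here
    (inner, _ : rest') -> inner : bracketed rest'

-- Count the bracketed references in a string.
countRefs :: String -> Int
countRefs = length . bracketed
```

With the example from the question, bracketed "asdf[1]jkl[2]asdf[1]jkl" gives ["1","2","1"], so countRefs returns 3.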

0
votes

Here's a solution with sepCap from the replace-megaparsec package.

import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char
import Data.Either
import Data.Maybe
import Data.Void

txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."

pattern = single '[' *> anySingle <* single ']' :: Parsec Void String Char

-- Evaluated in GHCi:
-- ghci> length $ rights $ fromJust $ parseMaybe (sepCap pattern) txt
-- 3