22
votes

I need a way to find an EXACT word or phrase in a string. Everything I have read online only shows how to search for letters in a string, so

98787This is correct

will still come back as true in an if statement.

This is what I have so far.

  elif 'This is correct' in text:
    print("correct")

This matches with any combination of characters before the phrase. For example, fkrjCorrect, 4123Correct and lolcorrect all come back as true in the if statement, when I want it to come back as true only if the text exactly matches "This is correct".

7
Strange that you know about in operator, and not == operator. - Rohit Jain
in searches for substrings while == checks for exact equality. From your question title I got the impression that you are indeed searching for substrings (in which case in is correct). But from your text I am getting the impression you actually want to check equality? (In which case == is correct) - Michael Aquilina
I do want to check for substrings, I just want to make sure it's got nothing before it - user2750103
What do you mean by "EXACT word"? Do you mean delimited by spaces? Punctuation? - Andrew Jaffe
You can use spaCy PhraseMatcher. spacy.io/usage/rule-based-matching#matcher - JStrahl

7 Answers

27
votes

You can use the word-boundaries of regular expressions. Example:

import re

s = '98787This is correct'
for word in ['This is correct', 'This', 'is', 'correct']:
    # re.escape keeps any regex metacharacters in the search term literal
    if re.search(r'\b' + re.escape(word) + r'\b', s):
        print('{0} found'.format(word))

That yields:

is found
correct found

EDIT: For an exact match, replace the \b assertions with ^ and $ to anchor the match to the beginning and end of the string.
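
As a minimal sketch of the anchored variant from the edit above (the helper names is_exact_match and is_exact_match_strict are hypothetical, not from the answer), the following compares the ^...$ pattern with re.fullmatch. One detail worth knowing: in Python, $ also matches just before a trailing newline, which re.fullmatch does not allow.

```python
import re

def is_exact_match(phrase, text):
    """True if text is exactly phrase, using ^ and $ anchors."""
    # re.escape keeps any regex metacharacters in phrase literal.
    pattern = r'^' + re.escape(phrase) + r'$'
    return re.search(pattern, text) is not None

def is_exact_match_strict(phrase, text):
    """Same idea with re.fullmatch, which rejects a trailing newline."""
    return re.fullmatch(re.escape(phrase), text) is not None

print(is_exact_match('This is correct', 'This is correct'))         # True
print(is_exact_match('This is correct', '98787This is correct'))    # False
print(is_exact_match('This is correct', 'This is correct\n'))       # True
print(is_exact_match_strict('This is correct', 'This is correct\n'))  # False
```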

14
votes

Use the comparison operator == instead of in then:

if text == 'This is correct':
    print("Correct")

This checks whether the whole string is exactly 'This is correct'. If it isn't, the comparison evaluates to False.

5
votes

Actually, you should look for 'This is correct' string surrounded by word boundaries.

So

import re

if re.search(r'\bThis is correct\b', text):
    print('correct')

should work for you.
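
To see how the word-boundary pattern behaves on inputs like the ones in the question, here is a small sketch. The helper name contains_phrase and the sample strings are illustrative assumptions; re.escape is added so that a phrase containing regex metacharacters is still treated literally.

```python
import re

def contains_phrase(phrase, text):
    """True if phrase occurs in text with a word boundary on each side."""
    pattern = r'\b' + re.escape(phrase) + r'\b'
    return re.search(pattern, text) is not None

samples = [
    'This is correct',            # exact text
    'He said: This is correct.',  # surrounded by punctuation and spaces
    '98787This is correct',       # digits glued to the front
    'This is correctly',          # letters glued to the end
]
for sample in samples:
    print(repr(sample), contains_phrase('This is correct', sample))
```

Only the first two samples match, because digits and letters count as word characters and therefore leave no boundary next to the phrase.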

4
votes

I suspect that you are looking for the startswith() method. It checks whether a string begins with the given characters:

"abcde".startswith("abc") -> True

"abcde".startswith("bcd") -> False

There is also the endswith() function, for checking at the other end.
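
Applied to the strings from the question, a quick sketch might look like this (the variable names are made up for illustration). Note that startswith() only rules out text before the phrase; anything after it is still accepted.

```python
phrase = 'This is correct'

clean = 'This is correct, well done'
prefixed = '98787This is correct'

starts_clean = clean.startswith(phrase)        # nothing before the phrase
starts_prefixed = prefixed.startswith(phrase)  # digits come first
ends_prefixed = prefixed.endswith(phrase)      # phrase is at the end

print(starts_clean)     # True
print(starts_prefixed)  # False
print(ends_prefixed)    # True
```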

3
votes

You can make a few changes.

elif 'This is correct' in text[:len('This is correct')]:

or

elif ' This is correct ' in ' '+text+' ':

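Both work. The first matches only when text begins with the phrase; the second matches the phrase anywhere in text, as long as it is delimited by spaces or by the ends of the string. The latter is more flexible.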
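
The difference between the two variants is easiest to see side by side. This is only a sketch: the function names matches_at_start and space_delimited and the sample strings are assumptions added for illustration.

```python
phrase = 'This is correct'

def matches_at_start(text):
    # First variant: look at only the leading len(phrase) characters.
    return phrase in text[:len(phrase)]

def space_delimited(text):
    # Second variant: pad both sides so the phrase must be bounded by spaces.
    return ' ' + phrase + ' ' in ' ' + text + ' '

samples = [
    'This is correct',
    '98787This is correct',
    'Yes. This is correct',
    'This is correctly',
    'This is correct.',
]
for sample in samples:
    print(repr(sample), matches_at_start(sample), space_delimited(sample))
```

The first variant accepts 'This is correctly' because it ignores whatever follows the phrase, while the second rejects 'This is correct.' because punctuation is not a space. If punctuation matters, a word-boundary regex is probably the safer choice.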

0
votes

Below is a solution without using regular expressions. The program searches for an exact word, in this case 'CASINO', ignoring case, and prints the sentences that contain it.

    words_list = [
        "The Learn Python Challenge Casino.",
        "They bought a car while at the casino",
        "Casinoville",
    ]
    search_string = 'CASINO'

    def text_manipulation(words_list, search_string):
        search_result = []
        for sentence in words_list:
            # Strip basic punctuation, then split the sentence into words.
            words = sentence.replace('.', '').replace(',', '').split(' ')
            # Keep the sentence if any word matches, ignoring case.
            if any(w.upper() == search_string for w in words):
                search_result.append(sentence)
        print(search_result)

    text_manipulation(words_list, search_string)

This will print the results - ['The Learn Python Challenge Casino.', 'They bought a car while at the casino']

-1
votes

Break up the string into a list of strings with .split() then use the in operator.

This is much simpler than using regular expressions.
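
A minimal sketch of that idea, using the string from the question (the variable names are illustrative). It suits single words only: after splitting, a multi-word phrase such as 'This is correct' is never a single list item, and punctuation stays attached to the neighbouring word.

```python
text = '98787This is correct'
words = text.split()  # ['98787This', 'is', 'correct']

has_correct = 'correct' in words         # whole-word match
has_this = 'This' in words               # only '98787This' is present
has_phrase = 'This is correct' in words  # a phrase is never one list item

print(has_correct)  # True
print(has_this)     # False
print(has_phrase)   # False
```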