I want to extract the article titles from a webpage with a multi-page list of articles.
I get the article titles on the first page using:
titles = browser.find_elements_by_xpath(r'path')
for title in titles:
    titles_list.append(title.text)
I navigate to the next page using:
next_page = browser.find_element_by_xpath(r'path')
next_page.click()
Then, I return to the first step (i.e. getting the article titles).
The problem is that, with the code above, I sometimes get the article titles of a page twice and sometimes miss the article titles of a page.
I believe the solution is to wait until the page fully loads after the second step and before repeating the first step: I should store something unique to the first page (e.g. the first article's title) in a variable (e.g. 'first_item'), and I should wait until the corresponding element does not contain that text.
I found an answer to my question, but it is in Java and uses ExpectedConditions.not. The following code (the EC.not() part) is not valid in Python and raises a SyntaxError:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
next_page.click()
wait = WebDriverWait(browser, 10)
wait.until(EC.not(EC.text_to_be_present_in_element((By.XPATH, r'path'), first_item)))
How can I wait until a text is not present in an element in Python?
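For reference, here is a minimal sketch of the kind of condition I am after, assuming that WebDriverWait.until() accepts any callable that takes the driver and returns a truthy value once the wait should end. The helper name text_not_in_element is hypothetical (it is not part of Selenium), and the locator is the same placeholder r'path' as above.

```python
def text_not_in_element(locator, text):
    """Build a wait condition that becomes True once `text` is gone.

    `locator` is a (by, value) tuple such as (By.XPATH, r'path').
    This helper is hypothetical; it is not part of Selenium.
    """
    def condition(driver):
        # Look the element up again on every poll so that a freshly
        # loaded page is inspected rather than a cached element.
        element = driver.find_element(*locator)
        return text not in element.text
    return condition


# Intended usage (assumes `browser`, `next_page`, `first_item`, and the
# imports from the snippet above are already available):
#
#   next_page.click()
#   wait = WebDriverWait(browser, 10)
#   wait.until(text_not_in_element((By.XPATH, r'path'), first_item))
#
# A possible alternative without a custom helper is WebDriverWait.until_not:
#
#   wait.until_not(
#       EC.text_to_be_present_in_element((By.XPATH, r'path'), first_item))
```

If the element is replaced while the page changes, the lookup may raise a stale-element error; passing ignored_exceptions to WebDriverWait might be needed in that case.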