I am new to Python and web crawling in general. I started with BeautifulSoup but quickly learned that sites that render their content with JavaScript can't be crawled with bs4 alone, so I switched to Selenium. Selenium, however, also fails with an error and can't find the element (a search box) I am trying to interact with. So far I have also learned that the page I am trying to crawl probably uses Angular, which somehow hides the elements I am looking for. Is there a way I could still use Selenium, or another package, to enter search queries and crawl the site?
None of the elements I look for can be found. I've also tried locating them by XPath and by name, without luck. I believe nothing inside <app-root></app-root> can be found with plain Selenium lookups.
Here is my code so far:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
import time
import sys
chrome_driver_path = "path"
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument('--no-sandbox')
webdriver = webdriver.Chrome(
    executable_path=chrome_driver_path,
    options=chrome_options
)
useBaseURL = "https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/home"
with webdriver as driver:
    # timeout
    wait = WebDriverWait(driver, 10)
    driver.get(useBaseURL)
    searchbox = driver.find_element_by_class_name("ng-tns-c6-0 ui-inputtext ui-widget ui-state-default ui-corner-all ui-autocomplete-input ng-star-inserted")
    driver.close()
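One direction that might be worth trying, sketched here under assumptions rather than verified against the site: as far as I understand, find_element_by_class_name expects a single class name, so a space-separated list of classes has to be written as a compound CSS selector instead, and the lookup has to wait until Angular has actually rendered the element. The helper names classes_to_css and find_search_box are hypothetical, and the choice of ui-inputtext and ui-autocomplete-input as the "stable" classes is a guess (names like ng-tns-c6-0 and ng-star-inserted look framework-generated and may change between builds).

```python
def classes_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute into a compound CSS selector."""
    return tag + "".join("." + name for name in class_attr.split())


def find_search_box(driver, timeout=10):
    """Wait until the search input is present in the DOM, then return it."""
    # Imported lazily so classes_to_css can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: these two classes identify the search input and are stable.
    selector = classes_to_css("ui-inputtext ui-autocomplete-input", tag="input")
    # Poll the page until the element exists or the timeout expires.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, selector))
    )


print(classes_to_css("ui-inputtext ui-autocomplete-input", tag="input"))
# prints: input.ui-inputtext.ui-autocomplete-input
```

In the script above this would be called as searchbox = find_search_box(driver) right after driver.get(useBaseURL), replacing the find_element_by_class_name line. If the wait times out, the selector guess is probably wrong for this page.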