
I am using Watir WebDriver for web scraping: I fill in a form and read the results. The results include a drop-down list, from which I need to extract the currently selected text. I locate the list with the following query:

selectedList = browser.select_list(:id => "itemType")

From this, I can get the selected text with the following query:

selectedText = selectedList.selected_options.map(&:text)[0]

This gives me the selected text. The problem is that the drop-down list contains thousands of options, and Watir takes too long to find the selected option this way.

Any faster method would be appreciated. I have also tried the following queries:

selected = selectedList.selected_options()[0]
selectedText = selected.text

But the problem is the same. I have other drop-downs with fewer options, where both of these queries perform well, but with thousands of options they are really slow.

1 Answer


The problem is that to get the selected options, a call to the browser is made for each individual option. Even if each call takes a fraction of a second, it adds up quite quickly.

You could get the selected options in a single wire call by using execute_script:

selected_list = browser.select_list(id: 'itemType')
selected_options = browser.execute_script("return arguments[0].selectedOptions;", selected_list)
selected_text = selected_options.map(&:text)

For a page with a select list of just 1,000 options, this dropped the execution time from 64 seconds to only 0.2 seconds.
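To check how the two approaches compare on your own page, a small timing wrapper is enough. The sketch below is only an illustration: `timed` is a hypothetical helper (not part of Watir), and the commented usage assumes `browser` is an open Watir browser already on the page with the `itemType` list.

```ruby
# Hypothetical helper (not part of Watir): runs the block once and returns
# [result, elapsed_seconds], measured with a monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Usage with Watir (assumes `browser` is open on the page in question):
#   list = browser.select_list(id: 'itemType')
#   _, slow = timed { list.selected_options.map(&:text) }
#   _, fast = timed do
#     browser.execute_script('return arguments[0].selectedOptions;', list).map(&:text)
#   end
#   puts format('selected_options: %.2fs, execute_script: %.2fs', slow, fast)
```

The actual numbers will depend on the browser, the driver and the number of options, so treat the figures quoted here as one data point rather than a guarantee.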

The above works for both dropdowns and multi-selects. If you know it is a dropdown (i.e., a single selected option), you can go even faster by returning just the text (rather than the collection of selected options):

selected_list = browser.select_list(id: 'itemType')
selected_text = browser.execute_script("return arguments[0].selectedOptions[0].text;", selected_list)

This cut the time by more than half, to 0.08 seconds. However, for such a small gain, I personally prefer the first approach, as it minimizes the JavaScript code.
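If you do return the text directly, note that `selectedOptions[0]` is undefined when nothing is selected (for example, an empty list or a selection cleared by script), so reading `.text` from it would raise a JavaScript error. A minimal sketch of a guarded version follows. `selected_text` and `SELECTED_TEXT_JS` are hypothetical names introduced here, and `browser` is assumed to respond to `select_list` and `execute_script` the way a Watir browser does.

```ruby
# JavaScript run in the page; arguments[0] is the <select> element.
# Returns null (nil in Ruby) when no option is selected instead of raising.
SELECTED_TEXT_JS = <<~JS
  var option = arguments[0].selectedOptions[0];
  return option ? option.text : null;
JS

# Hypothetical helper: fetches the selected option's text in one browser call.
# `browser` is assumed to behave like a Watir browser.
def selected_text(browser, id)
  list = browser.select_list(id: id)
  browser.execute_script(SELECTED_TEXT_JS, list)
end

# Usage (assumes an open Watir browser on the page):
#   selected_text(browser, 'itemType')
```

The value comes from the option's JavaScript `text` property, so its whitespace handling may differ slightly from what Watir's own `text` method returns.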