12
votes

I've got a suite of Selenium tests that work perfectly in my local environment and on BrowserStack Automate, but fail on Azure DevOps.

There are no configuration or setting changes when running on Azure DevOps.

We've followed all the documentation here: https://docs.microsoft.com/en-us/azure/devops/pipelines/test/continuous-test-selenium?view=vsts

Random tests fail, never the same ones.

The tests always fail because of timeouts. I wait for the pages to load for up to 5 minutes, so it's not a case of the timeouts being too low.
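
For reference, the WaitForElement helper named in the stack trace below is, per the trace, a WebDriverWait/DefaultWait with a 300-second timeout. A simplified sketch (the body of the condition is approximated here, not copied from the project):

    // Hypothetical sketch of the WaitForElement helper named in the stack trace;
    // the real implementation may differ in the details of the condition.
    using System;
    using System.Linq;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Support.UI;

    public static class WebDriverUtilities
    {
        public static IWebElement WaitForElement(IWebDriver driver, By by, bool mustBeDisplayed = true)
        {
            // 5-minute ceiling, matching the "Timed out after 300 seconds" in the log.
            var wait = new WebDriverWait(driver, TimeSpan.FromMinutes(5));
            return wait.Until(d => d.FindElements(by)
                                    .FirstOrDefault(e => !mustBeDisplayed || e.Displayed));
        }
    }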

There are no firewalls in place; the application is public.

Authentication always succeeds, so the tests are able to load the application.

Not sure what to try next.

Below is a copy of the Azure DevOps log. 4 tests passed but all the others failed. Usually, only 4-5 tests fail.

These tests work perfectly using BrowserStack Automate (remote Selenium) and locally.

2018-11-17T05:40:28.6300135Z  Failed   StripeAdmin_WhenOnTab_DefaultSortIsByIdDescending
2018-11-17T05:40:28.6300461Z Error Message:
2018-11-17T05:40:28.6304198Z  Test method CS.Portal.E2e.Tests.Admin.StripeAdmin.StripeAdminTests.StripeAdmin_WhenOnTab_DefaultSortIsByIdDescending threw exception: 
2018-11-17T05:40:28.6305677Z OpenQA.Selenium.WebDriverTimeoutException: Timed out after 300 seconds
2018-11-17T05:40:28.6307041Z Stack Trace:
2018-11-17T05:40:28.6307166Z     at OpenQA.Selenium.Support.UI.DefaultWait`1.ThrowTimeoutException(String exceptionMessage, Exception lastException)
2018-11-17T05:40:28.6307999Z    at OpenQA.Selenium.Support.UI.DefaultWait`1.Until[TResult](Func`2 condition)
2018-11-17T05:40:28.6308188Z    at CS.Portal.E2e.Tests.Utility.WebDriverUtilities.WaitForElement(IWebDriver driver, By by, Boolean mustBeDisplayed) in D:\a\1\s\CS.Portal.E2e.Tests\Utility\WebDriverUtilities.cs:line 26
2018-11-17T05:40:28.6319651Z    at CS.Portal.E2e.Tests.Admin.StripeAdmin.StripeAdminTests.StripeAdmin_WhenOnTab_DefaultSortIsByIdDescending() in D:\a\1\s\CS.Portal.E2e.Tests\Admin\StripeAdmin\StripeAdminTests.cs:line 51
2018-11-17T05:40:28.6319982Z 
2018-11-17T05:40:34.4671568Z Results File: D:\a\1\s\TestResults\VssAdministrator_factoryvm-az416_2018-11-17_03_08_24.trx
2018-11-17T05:40:34.4692222Z 
2018-11-17T05:40:34.4695222Z Attachments:
2018-11-17T05:40:34.4697610Z   D:\a\1\s\TestResults\672f4d28-5082-42e9-a7e7-f5645aadcfd8\VssAdministrator_factoryvm-az416 2018-11-17 03_02_43.coverage
2018-11-17T05:40:34.4697943Z 
2018-11-17T05:40:34.4698278Z Total tests: 34. Passed: 4. Failed: 30. Skipped: 0.
Is there a common exception when the tests fail? – Guy
Do you use a Hosted agent or a Private agent? – Shayki Abramczyk
@Guy Hosted, the exceptions are always timeouts. – John Farrell
Does the timeout occur only on page loads, or in driver.findElement() as well? – Guy
@jfar Update the question with your code trials and the error stack trace. – DebanjanB

2 Answers

7
votes

A few lines from your code would have helped to analyze your issue better.

However, as your tests always fail because of timeouts, it is worth mentioning that, in general, a TimeoutException is the outcome of a failed ExpectedCondition, although there can be other causes as well.

Some of the approaches to avoid these issues are as follows:

  • As you mentioned, I wait for the pages to load for 5 minutes..., that goes against best practice. Instead, you need to implement a PageLoad timeout, an ImplicitWait or, preferably, an explicit WebDriverWait (a minimal sketch appears after this list).

WARNING: Do not mix implicit and explicit waits. Doing so can cause unpredictable wait times.

  • You can find a detailed discussion in How can I make sure if some HTML elements are loaded for Selenium

  • If you are using ChromeDriver and the Chrome browser, you must ensure that the binaries are compatible, as per the entries below:

    • ChromeDriver v2.44 : Supports Chrome v69-71 (same as ChromeDriver 2.43, but with additional bug fixes, released Nov 20, 2018)
    • ChromeDriver v2.43 : Supports Chrome v69-71
    • ChromeDriver v2.42 : Supports Chrome v68-70
    • ChromeDriver v2.41 : Supports Chrome v67-69
  • Different browsers render the HTML DOM differently, so you need to ensure that the locator strategies you are using are optimized.
  • As per the current WebDriver W3C Recommendation, the following is the list of defined locator strategies:

    • css selector
    • link text
    • partial link text
    • tag name
    • xpath

  • There is some difference in performance between CssSelector and XPath. A few takeaways:
    • For starters, there is no dramatic difference in performance between XPath and CSS.
    • Traversing the DOM in older browsers like IE8 does not work with CSS but is fine with XPath. Also, XPath can walk up the DOM (e.g. from child to parent), whereas CSS can only traverse down the DOM (e.g. from parent to child). However, not being able to traverse the DOM with CSS in older browsers isn't necessarily a bad thing; it is more of an indicator that your page has a poor design and could benefit from some helpful markup.
    • An argument in favor of CSS selectors is that they are more readable, brief, and concise, though that is a subjective call.
    • Ben Burton mentions you should use CSS because that's how applications are built. This makes the tests easier to write, talk about, and have others help maintain.
    • Adam Goucher says to adopt a more hybrid approach -- focusing first on IDs, then CSS, and leveraging XPath only when you need it (e.g. walking up the DOM) and that XPath will always be more powerful for advanced locators.
    • You can find a detailed discussion in Why should I ever use CSS selectors as opposed to XPath for automated testing?
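
To illustrate the wait guidance in the first bullet above, here is a minimal sketch (not the asker's code) of an explicit wait combined with a bounded page-load timeout. The URL, locator and timeout values are placeholders; the ChromeWebDriver variable is the one Microsoft-hosted agents expose for the pre-installed, version-matched driver:

    using System;
    using System.Linq;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;
    using OpenQA.Selenium.Support.UI;

    class ExplicitWaitSketch
    {
        static void Main()
        {
            // On Microsoft-hosted agents, ChromeWebDriver points to a ChromeDriver
            // that matches the installed Chrome; fall back to PATH locally.
            string driverDir = Environment.GetEnvironmentVariable("ChromeWebDriver");

            using (IWebDriver driver = string.IsNullOrEmpty(driverDir)
                                           ? new ChromeDriver()
                                           : new ChromeDriver(driverDir))
            {
                // A bounded page-load timeout instead of a blanket 5-minute wait.
                driver.Manage().Timeouts().PageLoad = TimeSpan.FromSeconds(60);
                driver.Navigate().GoToUrl("https://example.com/admin");    // placeholder URL

                // Explicit wait: poll until the element is present and displayed,
                // or throw WebDriverTimeoutException after 30 seconds.
                var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(30));
                IWebElement grid = wait.Until(d =>
                    d.FindElements(By.CssSelector("table.results"))        // placeholder locator
                     .FirstOrDefault(e => e.Displayed));
            }
        }
    }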

Conclusion

Keeping the above-mentioned factors in mind, you need to implement your locator strategy wisely, along with the other approaches discussed above; this will help you get rid of the timeouts.
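
For illustration, here is one hypothetical element located by ID, CSS selector and XPath (the element and its attributes are made up); prefer the simplest strategy that uniquely identifies the element:

    // Assumes an existing IWebDriver instance named driver; all locators below are made up.
    IWebElement byId  = driver.FindElement(By.Id("stripe-admin-grid"));
    IWebElement byCss = driver.FindElement(By.CssSelector("table#stripe-admin-grid"));
    IWebElement byXp  = driver.FindElement(By.XPath("//table[@id='stripe-admin-grid']"));

    // XPath can also walk back up the DOM, which CSS selectors cannot do,
    // e.g. from a cell back to its row:
    IWebElement firstCell = byId.FindElement(By.CssSelector("tbody tr td"));
    IWebElement itsRow    = firstCell.FindElement(By.XPath("./ancestor::tr"));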

0
votes

Here are some steps I would take:

  1. What helped us in a similar case was to temporarily add a video recorder to the tests and then watch the test execution on the VM from start to failure. A video can reveal clues about what is actually going wrong. I was able to find this link for a C# example.

  2. Also, I would double-check that the browser versions on Azure are exactly the same as in the runs where everything works. Making them identical is crucial to rule out any 'magic'. The same goes for the default browser window size (it can also be pinned in code; see the note after this list).

  3. I would do a more detailed analysis of the places where the different tests fail.

    • Is it possible to spot similarities between the different test failures? Does it always happen after clicks? After reloading pages? After anything else similar? If yes, try the weirdest yet simplest and sometimes lifesaving solution: add a 3-5 second sleep before/after the action that precedes the failure, with a condition so it only happens on Azure runs (see the sketch after this list). Yes, sleeps are not recommended, and a lot of well-known reasons why could be listed here, but if they magically save your runs you can then replace them with some smart waits.
    • Is it possible that the failures happen at a certain time? The same amount of time after the run starts? At the same time of day?
  4. If you use date/time APIs in your code, make sure the system time/locale/timezone settings are exactly the same, and that the date does not change during the test runs. All in all, investigate around dates.
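
For point 3, a minimal sketch of the conditional sleep, keyed off the TF_BUILD environment variable that Azure Pipelines sets on its agents (the helper name is made up):

    using System;
    using System.Threading;

    static class CiPause
    {
        // Azure Pipelines sets TF_BUILD=True on its agents; locally it is normally unset.
        static readonly bool IsAzureRun =
            string.Equals(Environment.GetEnvironmentVariable("TF_BUILD"), "True",
                          StringComparison.OrdinalIgnoreCase);

        // Hypothetical helper: a short pause that only happens on the CI agent.
        public static void Maybe(int seconds = 3)
        {
            if (IsAzureRun)
            {
                Thread.Sleep(TimeSpan.FromSeconds(seconds));
            }
        }
    }

    // Usage around a suspect step:
    //   loginButton.Click();
    //   CiPause.Maybe();   // give the page a moment, but only on the Azure agent

For point 2, the window size can likewise be pinned in code, e.g. driver.Manage().Window.Size = new System.Drawing.Size(1920, 1080);, so that local, BrowserStack and Azure runs all render at the same resolution.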

I know that the above is more general advice, but in my experience such "random failures" can be caused by literally anything that seems "not worth attention".