
We use Travis CI for our git-hosted project. On Travis we have 2 processes, each running a random selection of specs with a different seed number. When there is a failure, I try to run:

  1. the exact spec with the seed number
  2. the exact spec without a seed number
  3. the spec file with a seed number
  4. the spec file with a seed number and --bisect
  5. the spec file without a seed number but with --bisect

In all 5 scenarios above, whether locally or over SSH while debugging the Travis build, I find no failures, and bisect always fails.

In a completely different scenario, if I run parallel:spec locally with the default 8 processes, I do get failures, but if I run each failing spec alone with the 'rspec' command, I get no failures.

I've also tried running parallel:spec locally with the --bisect option in the .parallel-spec file in the root of our application. The minimal reproduction commands I get still give no failures.

What am I missing here? Does this issue have to do with running multiple processes and then having to run the minimal reproduction lines with rspec? Currently it seems to me that if specs are run on more than 1 process, I'm never able to reproduce the failing specs. On the other hand, if I run rspec --bisect locally, after 8 hours it has not even started 1 process, and I'm on a MacBook Pro (but yes, we have around 4k specs).

P.S. We're on Rails 4.2.7.1, Ruby 2.3.3 and RSpec 3.4.4.

Thanks

Update: I ran parallel:spec in verbose mode to obtain the spec order, then ran the command of the process in which a spec fails with the seed number, and again with the seed number and --bisect. Still no failures.


1 Answer


Among those 5 things you tried, I can't see this one: run all the specs from the failing process with the seed.

It is possible that your specs interfere with each other and if spec A is run before spec B, it will cause B to fail... Or even if A is run before B it can cause C to fail.
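To make the idea concrete, here is a minimal sketch in plain Ruby (no RSpec needed). The "specs" A and B and the CACHE hash are hypothetical stand-ins I made up: CACHE plays the role of any global state that one spec leaves behind for the next.

```ruby
# Hypothetical stand-ins for specs: each lambda returns true when it "passes".
# CACHE represents leaked global state (a class variable, a config value, a DB row).
CACHE = {}

SPECS = {
  "A" => -> { CACHE[:locale] = :de; true },        # passes, but leaves state behind
  "B" => -> { CACHE.fetch(:locale, :en) == :en }   # passes only if A has not run first
}

# Run the named specs in the given order, starting from a clean CACHE,
# and return the names of the ones that failed.
def run_in_order(names)
  CACHE.clear
  names.reject { |name| SPECS.fetch(name).call }
end

p run_in_order(%w[B A])  # => []     B runs before A pollutes the state
p run_in_order(%w[A B])  # => ["B"]  A runs first, so B fails
p run_in_order(%w[B])    # => []     B alone always passes
```

This is why running the failing spec alone, or only its file, can pass every time: the spec that pollutes the state is not in the run.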

So if you run all the specs from one process with the seed, you may reproduce the failure. Only then run the same thing with --bisect to find the smallest set that gives you the failure.
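A small helper can assemble the two commands for you. This is only a sketch: --seed and --bisect are real rspec options, but the helper name rspec_command, the file names and the seed 1234 are made up for illustration. Substitute the file list and seed from the verbose log of the failing process.

```ruby
require "shellwords"

# Build an rspec command line for a list of spec files and a seed.
# Pass bisect: true to append --bisect for the second step.
def rspec_command(files, seed:, bisect: false)
  args = ["rspec", *files, "--seed", seed.to_s]
  args << "--bisect" if bisect
  Shellwords.join(args)
end

# Hypothetical: the files one parallel process ran, copied from its log.
files = %w[spec/models/user_spec.rb spec/models/order_spec.rb]

puts rspec_command(files, seed: 1234)                # step 1: try to reproduce
puts rspec_command(files, seed: 1234, bisect: true)  # step 2: minimize
```

Running step 2 only after step 1 actually fails matters, because bisect has nothing to narrow down when the full set passes.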

If you cannot reproduce it that way, I can see another option: your parallel specs are using a shared resource (DB, files?) and the failures are caused by race conditions. Those are hard to find, especially among specs. Make sure each process actually uses a separate DB (a simple mistake, like forgetting to change database.yml, can cause that).
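As a sketch of what "a separate DB per process" means: parallel_tests sets the TEST_ENV_NUMBER environment variable for each process (an empty string for the first one by default, then "2", "3", ...), and its README suggests appending that value to the test database name in database.yml. The base name myapp_test and both helper names below are hypothetical.

```ruby
# Database name one parallel process would use, given its environment.
# `base` is a made-up name; use the test database name from your database.yml.
def test_database_name(base, env = ENV)
  "#{base}#{env['TEST_ENV_NUMBER']}"
end

# True when every process number maps to a different database name.
def distinct_databases?(base, env_numbers)
  names = env_numbers.map { |n| test_database_name(base, "TEST_ENV_NUMBER" => n) }
  names.uniq.size == names.size
end

p test_database_name("myapp_test", "TEST_ENV_NUMBER" => "")   # => "myapp_test"
p test_database_name("myapp_test", "TEST_ENV_NUMBER" => "2")  # => "myapp_test2"
p distinct_databases?("myapp_test", ["", "2", "3"])           # => true
```

If database.yml has a fixed name without the variable, every process ends up on the same database, which is exactly the shared resource case described above.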

If that doesn't help, inspect your code for other possible shared resources. You didn't mention how many specs usually fail. If it's a small number, you can focus on those.