We are using Amazon Web Services EC2 to create two servers, which are then attached to an Elastic Load Balancer (ELB). The instances eventually use the URL of the load balancer itself to request WCF services.

In a few situations, an instance is not able to resolve the load balancer’s URL for 10 minutes or so, and then it works fine. Here, in summary, is what we do:

  1. We create a load balancer
  2. We create two instances in the same zone
  3. We connect the instances to the load balancer and wait for them both to be ready (i.e., able to process requests).

Sometimes an instance attached to the load balancer is not able to resolve the load balancer’s URL once we start testing. After about 10 minutes, it is able to resolve the name. Here is the error we are getting:

---> System.Net.WebException: The remote name could not be resolved: 'nightlyblb13083105564592203800-455163519.us-east-1.elb.amazonaws.com'

Any ideas? We added all the checks to make sure that both instances are ready by the time we start using the load balancer, and we are pretty confident that this is the case. However, the problem described above happens in about 1 out of 20 tests.
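
A readiness check could also confirm that the load balancer's name resolves from the instance before testing starts. Below is a minimal Python sketch of that idea, not the setup described above: the function name `wait_for_dns`, the 15-minute timeout, and the 15-second retry interval are all assumptions chosen for illustration, and the actual services here are WCF/.NET.

```python
import socket
import time


def wait_for_dns(hostname, timeout=900.0, interval=15.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Retry a DNS lookup until it succeeds or `timeout` seconds pass.

    Hypothetical helper: timeout and interval values are assumptions.
    `resolve`, `sleep`, and `clock` are injectable so the loop can be
    exercised without real network access.
    """
    deadline = clock() + timeout
    while True:
        try:
            infos = resolve(hostname, 80)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the resolved IP address.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror:
            if clock() >= deadline:
                raise TimeoutError(f"{hostname} did not resolve in {timeout}s")
            sleep(interval)
```

Calling `wait_for_dns("<load balancer name>")` from each instance before the first request would delay the test until the name resolves, instead of failing on the first lookup.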

This sounds like a DNS resolution issue. Are you resolving directly with the AWS DNS servers, or against your local or ISP DNS servers? - datasage
We resolve directly with the AWS DNS servers; in other words, we have not changed the Amazon machine configuration. - AWS User

1 Answer

This is normal, if I understand your testing framework correctly. ELB scales by starting out on a very small machine and, as traffic increases, moving to larger and larger machines. However, ELB is not configured to handle flash traffic, especially from a small number of hosts, as is the case in a load-testing scenario. This is because the DNS record changes whenever ELB scales, and the change sometimes takes a while to propagate. Load-testing frameworks sometimes cache the DNS lookup, which makes things even slower. The official ELB documentation (http://aws.amazon.com/articles/1636185810492479) states that traffic should not be increased by more than 50% every 5 minutes. I found that scaling takes even longer if you're looking to get over 150-200k RPM.
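
To get a feel for what the 50%-every-5-minutes guideline means for a load test, here is a rough Python sketch that computes a ramp-up schedule. The function name `ramp_schedule` and the example rates are assumptions for illustration only; it just applies the growth limit quoted above and is not taken from any AWS tool.

```python
def ramp_schedule(start_rpm, target_rpm, growth=0.5, step_minutes=5):
    """Return (minute, rpm) pairs that grow by at most `growth` per step.

    Hypothetical helper: defaults mirror the guideline of no more than
    a 50% increase every 5 minutes.
    """
    if start_rpm <= 0 or target_rpm <= 0:
        raise ValueError("rates must be positive")
    schedule = []
    minute = 0
    rpm = float(start_rpm)
    while rpm < target_rpm:
        schedule.append((minute, int(rpm)))
        rpm *= 1 + growth          # next step: at most 50% more traffic
        minute += step_minutes
    # The final step is capped at the target rate.
    schedule.append((minute, target_rpm))
    return schedule


for minute, rpm in ramp_schedule(1000, 5000):
    print(f"t+{minute:>3} min: {rpm} requests/min")
```

Under these assumptions, going from 1,000 to 5,000 requests per minute already takes 20 minutes, so a test that jumps straight to full load will be well ahead of what the load balancer has scaled to.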