0
votes

I am trying to set up Route53 so that instances on the same VPC as the consul cluster can hit .consul endpoints.

For experimental purposes, I set up one of the three server nodes (private IP 172.31.56.55) with DNS forwarding using BIND so that it acts as the nameserver, as suggested here, with the addition of allow-query { any; }; and listen-on port 53 { any; };

I have a "consul." hosted zone with the following SOA, NS, and A (glue) records:

SOA:

ns1.consul. hostmaster.consul. 1 7200 900 1209600 86400

NS:

ns1.consul.

A (ns1.consul):

172.31.56.55

If I specify @ns1.consul in the dig command it works, but if I leave it out it doesn't. What am I missing/misconfiguring?

[ec2-user@ip-172-31-56-55 ~]$ dig @ns1.consul consul.service.dc1.consul

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.39.amzn1 <<>> @ns1.consul consul.service.dc1.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46403
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;consul.service.dc1.consul. IN  A

;; ANSWER SECTION:
consul.service.dc1.consul. 0    IN  A   172.31.51.192
consul.service.dc1.consul. 0    IN  A   172.31.56.55
consul.service.dc1.consul. 0    IN  A   172.31.52.9

;; Query time: 5 msec
;; SERVER: 172.31.56.55#53(172.31.56.55)
;; WHEN: Sat Oct 17 18:07:32 2015
;; MSG SIZE  rcvd: 91
[ec2-user@ip-172-31-56-55 ~]$ dig consul.service.dc1.consul

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.39.amzn1 <<>> consul.service.dc1.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24575
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;consul.service.dc1.consul. IN  A

;; AUTHORITY SECTION:
consul.         60  IN  SOA ns1.consul. hostmaster.consul. 1 7200 900 1209600 86400

;; Query time: 2 msec
;; SERVER: 172.31.0.2#53(172.31.0.2)
;; WHEN: Sat Oct 17 18:20:34 2015
;; MSG SIZE  rcvd: 94
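
The two dig runs above differ in both the response status and the server that answered. As a rough illustration, here is a minimal Python sketch that pulls those two fields out of dig's text output so the runs can be compared side by side. The function name `summarize_dig` and the trimmed sample strings are made up for this example; only the status and SERVER lines are copied from the output above.

```python
import re

def summarize_dig(output):
    """Return (status, server_ip) parsed from the text of one dig run."""
    # The header line carries the response code, e.g. "status: NXDOMAIN".
    status = re.search(r"status: (\w+)", output)
    # The footer names the server that actually answered, e.g. "172.31.0.2#53".
    server = re.search(r";; SERVER: ([0-9a-fA-F.:]+)#\d+", output)
    return (status.group(1) if status else None,
            server.group(1) if server else None)

# Trimmed excerpts of the two runs shown in the question.
explicit = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46403
;; SERVER: 172.31.56.55#53(172.31.56.55)"""
default = """;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24575
;; SERVER: 172.31.0.2#53(172.31.0.2)"""

print(summarize_dig(explicit))
print(summarize_dig(default))
```

The query that succeeds was answered by the BIND node at 172.31.56.55, while the failing one was answered by 172.31.0.2, the server dig used when no @server was given.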
Can you post your resolv.conf as well, please? The @ns1.consul result tells me that the .consul domain is probably being resolved properly, but the default resolver path isn't following it. Also, are you trying to do a VPC-private DNS: docs.aws.amazon.com/Route53/latest/DeveloperGuide/… ? dig is actually querying two different servers here: ;; SERVER: 172.31.56.55#53(172.31.56.55) and ;; SERVER: 172.31.0.2#53(172.31.0.2) - HighlyUnavailable
I've been trying to sort this out as well. It looks like Route53 does not support zone delegation at this time. It will reply with the NS record you give it, but it will answer authoritatively for the zone and not forward the request to the NS provided. Can anyone validate this? - lmickh
I may have the same problem, and I'm pretty sure it's not a Route53 problem, as I have a couple of other subdomains delegated to either Route53 (public domain) or BIND (internal only), and both work fine. It seems to me that Consul does not play well with DNS; for example, it returns an SOA record with ns. while there is no such A record. If we knew the difference between dig and dig +trace, we would know the reason: Consul did not respond to the +trace request properly, and I got "BAD REFERRAL" from it. - C.B.

2 Answers

1
votes

It seems that there is no option to set this up using private hosted zones in Route 53: https://forums.aws.amazon.com/thread.jspa?threadID=218389

One solution could be to assign public IP addresses to your Consul servers, configure Consul to use a domain under a valid TLD (e.g. my-consul-zone.com), and use a public hosted zone.

0
votes

I think the problem is that BIND does not play well with Consul. I didn't dig into it, so I cannot tell which side is at fault, but once I had Consul act as the DNS server itself (listening on port 53 of a non-localhost interface), everything worked just fine.
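
To check whether a given server answers for a .consul name without going through the default resolver, one option is to send the query to that server directly. Below is a minimal Python sketch using only the standard library. The helper names (`build_a_query`, `response_code`, `query`) are invented for this example, and the address and port in the usage comment are assumptions taken from the question (Consul's DNS interface listens on 8600 by default, so port 53 only applies if it has been reconfigured as described above).

```python
import socket
import struct

def build_a_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)

def response_code(packet):
    """Extract the RCODE from a DNS response (0 = NOERROR, 3 = NXDOMAIN)."""
    flags = struct.unpack("!H", packet[2:4])[0]
    return flags & 0x000F

def query(server, port, name, timeout=2.0):
    """Send the query over UDP to one specific server and return its RCODE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(name), (server, port))
        data, _ = sock.recvfrom(512)
    return response_code(data)

# Hypothetical usage, with the address from the question:
#   query("172.31.56.55", 53, "consul.service.dc1.consul")
```

This only reports the response code; it does not parse the answer records, so it is a reachability check rather than a replacement for dig.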