I have an internal domain, say example.com, in Windows AD DNS. I have created a subdomain delegation, aws.example.com, with a glue record pointing to a BIND 9.8 instance in AWS (reached over a site-to-site VPN). The BIND instance has a single zone configured as forward only, with a forwarder pointing to the AWS VPC subnet resolver, which has an AWS Route 53 zone (aws.example.com) associated with it.

The problem is that resolution fails intermittently. From my internal network, if I dig or nslookup against the Windows DNS server for hosts in the Route 53 zone, I get no answer (although I do see the query hitting BIND). If I then dig/nslookup against the BIND instance directly, it works. If I now go back to the first step and dig/nslookup against Windows DNS, I do get successful resolution.

It's as if the initial query, which arrives via Windows DNS, isn't triggering the forward-only behavior, while the direct query does trigger it and the answer is then cached. Can anyone provide insight into what I've done wrong or how to change this behavior?
BIND config:
acl goodclients {
    172.31.0.0/16;
    192.168.0.0/16;
    localhost;
    localnets;
};
options {
    directory "/var/cache/bind";
    recursion yes;
    allow-query { goodclients; };
    forwarders {
        172.31.0.2;
    };
    #forward only;
    dnssec-enable yes;
    dnssec-validation yes;
    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
    querylog yes;
};
zone "aws.example.com" {
    type forward;
    forward only;
    forwarders { 172.31.0.2; };
};
Here's a sample of the fail-succeed-succeed sequence, running queries against Windows DNS, then BIND, then Windows DNS again, from two different clients. The topology is:
Windows AD DNS, domain example.com
\_ subdomain aws.example.com -> NS 172.31.32.5 (BIND instance in AWS)
   \_ forwarding to 172.31.0.2 (AWS VPC resolver IP) -> Route 53 associated zone
client 1:
user1@vfvps-server:~ #date
Wed Sep 14 14:18:41 EDT 2016
user1@vfvps-server:~ #nslookup
> lserver 192.168.4.147   <-- Windows DNS
Default server: 192.168.4.147
Address: 192.168.4.147#53
> server1.aws.example.com
Server: 192.168.4.147
Address: 192.168.4.147#53
** server can't find server1.aws.example.com: NXDOMAIN
> exit
client 2:
KWK-MAC:~ user1$ date
Wed Sep 14 14:19:29 EDT 2016
KWK-MAC:~ user1$ dig @172.31.32.5 server1.aws.example.com   <-- 172.31.32.5 = BIND
; <<>> DiG 9.8.3-P1 <<>> @172.31.32.5 server1.aws.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23154
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 13, ADDITIONAL: 0
;; QUESTION SECTION:
;server1.aws.example.com. IN A
;; ANSWER SECTION:
server1.aws.example.com. 300 IN A 172.31.14.41
client 1:
user1@vfvps-server:~ #date
Wed Sep 14 14:19:40 EDT 2016
user1@vfvps-server:~ #nslookup
> lserver 192.168.4.147
Default server: 192.168.4.147
Address: 192.168.4.147#53
> server1.aws.example.com
Server: 192.168.4.147
Address: 192.168.4.147#53
Non-authoritative answer:
Name: server1.aws.example.com
Address: 172.31.14.41
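One way to test the hypothesis that the two paths send different kinds of queries: dig and nslookup set the recursion-desired (RD) flag by default, whereas a server following a delegation typically queries the delegated name server with RD cleared. Below is a minimal Python sketch (standard library only) that sends the same A-record question to a server with RD set and with RD cleared, so the two responses can be compared. The function names `build_query`, `query_summary`, and `compare_rd` are my own, for illustration only, and the sketch assumes plain UDP on port 53 with no EDNS.

```python
import socket
import struct


def build_query(name, recursion_desired, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # RD is the 0x0100 bit of the 16-bit flags word; everything else stays 0.
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def query_summary(server, name, recursion_desired, timeout=3.0):
    """Send one query over UDP and return (rcode, answer_count)."""
    query = build_query(name, recursion_desired)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, 53))
        data, _ = sock.recvfrom(4096)
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    return flags & 0x000F, ancount  # RCODE is the low 4 bits of the flags


def compare_rd(server, name):
    """Print the response summary for RD=1 and RD=0 against one server."""
    for rd in (True, False):
        rcode, answers = query_summary(server, name, rd)
        print(f"RD={int(rd)} rcode={rcode} answers={answers}")
```

Calling `compare_rd("172.31.32.5", "server1.aws.example.com")` from a client covered by `goodclients`, ideally before anything has been cached, would show whether BIND answers the two kinds of query differently. The same comparison can be made with `dig +norecurse` against the BIND address.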
allow-recursion { goodclients; };
– Dusan Bajic