Busted DNS
We've been having considerable issues at work over the move of a particular DNS record on Monday. I had to take the domain over with little or no warning and as a result something appears to have gone awry. All this week, the boss and I have been racking our brains trying to figure out what exactly is broken (if anything) and I thought I would post the progress here.
Ted, as the Elder Geek on my viewership list, if you have anything helpful to contribute, PLEASE comment or email me.
We have 2 DNS servers:
cohen.somedomain.com (aa.bb.cc.189)
brazilian.somedomain.com (xx.yy.zz.12)
Cohen is the master, Brazilian the slave, but both are visible to the world:
dns-1.somedomain.com -> xx.yy.zz.12 -> Brazilian
dns-2.somedomain.com -> aa.bb.cc.189 -> Cohen
Given that Brazilian is a slave of Cohen, the records for someotherdomain.com are the same. Here is what we have for that domain:
$TTL 14400
@ IN SOA dns-1.somedomain.com. root.someotherdomain.com. (
        2006050301 ; serial
        7200
        3600
        1209600
        86400 )
someotherdomain.com. 14400 IN NS dns-1.somedomain.com.
someotherdomain.com. 14400 IN NS dns-2.somedomain.com.
localhost 14400 IN A 127.0.0.1
someotherdomain.com. 14400 IN A xx.yy.zz.43
dns-1 14400 IN A aa.bb.cc.189
dns-2 14400 IN A xx.yy.zz.12
dns-3 14400 IN A xx.yy.zz.12
dns-4 14400 IN A xx.yy.zz.12
dns-5 14400 IN A xx.yy.zz.12
; local
cohen 14400 IN A aa.bb.cc.189
; atlanta
brazilian 14400 IN A xx.yy.zz.12
entropy 14400 IN A xx.yy.zz.43
; cnames ---------------------------------------------------------------------
smtp 14400 IN CNAME cohen
www 14400 IN CNAME entropy
ad 14400 IN CNAME some.other.domain.thats.not.ours1.
as 14400 IN CNAME some.other.domain.thats.not.ours2.
content 14400 IN CNAME some.other.domain.thats.not.ours3.
; mx -------------------------------------------------------------------------
someotherdomain.com. IN MX 0 cohen.someotherdomain.com.
someotherdomain.com. IN MX 10 brazilian.someotherdomain.com.
Now you might see a problem with the dns-# A records listed there. They were created because I had to take over this domain from someone who was running it independently of any other domain, and I was asked to link it to somedomain.com's NS records instead, as I have above. Nonetheless, there still appear to be requests for dns-#.someotherdomain.com out there, so I created these A records as a stand-in. Please let me know if you feel this to be a Bad Idea.
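To make it clearer why those stand-in records answer for dns-#.someotherdomain.com, here is a rough sketch of how owner names in a zone file get expanded. It assumes the origin is someotherdomain.com. (the file has no $ORIGIN line, so I'm assuming the server takes it from the zone name). The qualify function is a made-up helper for illustration, not part of any DNS library.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name the way a name server reads it.

    Hypothetical helper for illustration only.
    """
    if not origin.endswith("."):
        origin += "."
    if name == "@":
        return origin          # "@" stands for the origin itself
    if name.endswith("."):
        return name            # trailing dot: already fully qualified
    return f"{name}.{origin}"  # no trailing dot: relative to the origin


# Assumed origin for the zone file shown above.
ORIGIN = "someotherdomain.com."

for owner in ("dns-1", "cohen", "@", "dns-1.somedomain.com."):
    print(owner, "->", qualify(owner, ORIGIN))
```

So the bare dns-1 A record lives at dns-1.someotherdomain.com., while the NS records, written with a trailing dot, still point at dns-1.somedomain.com. The same rule applies to CNAME targets: cohen without a dot means cohen.someotherdomain.com.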
I am aware of the fact that a CNAME to a record outside of the zone is considered Bad Form and likely even illegal, but since we had to point those domains to these other hosts, I know of no other way to do it.
Also, I have reservations as to the content of the 2nd line in the file. As this version was adapted from an example on another server I'd like to know if it's alright to have an SOA record for dns-1.somedomain.com in the someotherdomain.com zone file.
Lastly, Reverse-DNS for our subclass has been delegated to us as well. So, instead of our ISP managing reverse lookups, I've had to set that up on Cohen and slave it out to Brazilian (note that Cohen is on a different network).
Here are the contents of the reverse lookup file:
$ORIGIN 2-62.zz.yy.xx.in-addr.arpa.
$TTL 86400
@ IN SOA cohen.somedomain.com. root.somedomain.com. (
        2006062705 ; serial
        21600 ; refresh after 6 hours
        3600 ; retry after 1 hour
        604800 ; expire after 1 week
        86400 ) ; minimum TTL of 1 day
   IN NS dns-1.somedomain.com.
   IN NS dns-2.somedomain.com.
2 IN PTR dallaire.somedomain.com.
12 IN PTR brazilian.somedomain.com.
13 IN PTR ethiopian.somedomain.com.
14 IN PTR survivor.somedomain.com.
15 IN PTR tsing-tao.somedomain.com.
16 IN PTR kenyan.somedomain.com.
22 IN PTR absinthe.somedomain.com.
23 IN PTR absolut.somedomain.com.
24 IN PTR bailey.somedomain.com.
25 IN PTR espresso.somedomain.com.
26 IN PTR laurier.somedomain.com.
27 IN PTR margarita.somedomain.com.
28 IN PTR martini.somedomain.com.
29 IN PTR mcclung.somedomain.com.
30 IN PTR packeteer.somedomain.com.
42 IN PTR anomaly.somedomain.com.
43 IN PTR entropy.somedomain.com.
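The 2-62 label in that $ORIGIN looks like a classless delegation in the style of RFC 2317, where the ISP's parent zone is expected to alias each address to a name under the range label. That is my assumption about how the ISP set it up, not something I've confirmed. Under that assumption, this sketch shows the name a PTR record would actually live at. The 192.0.2.x documentation range stands in for xx.yy.zz, and classless_ptr_name is a hypothetical helper.

```python
import ipaddress


def classless_ptr_name(ip: str, range_label: str) -> str:
    """Return the PTR owner name under an RFC 2317-style range label.

    Hypothetical helper; assumes the delegated range sits inside a /24.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    # reverse_pointer gives the ordinary name, e.g. "12.2.0.192.in-addr.arpa"
    host, parent = addr.reverse_pointer.split(".", 1)
    # Insert the range label between the host octet and the parent zone.
    return f"{host}.{range_label}.{parent}."


# 192.0.2.12 is a placeholder standing in for xx.yy.zz.12 (Brazilian).
print(classless_ptr_name("192.0.2.12", "2-62"))
```

If the parent zone's alias for an address is missing or wrong, a reverse lookup for that address never reaches this file, however correct the PTR records in it are.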
I took over this domain on Monday, but the servers hosting DNS for this domain had been offline since Friday (oops). When I brought up the domain on my own servers, there were the usual hiccups that could have been caught by some, but the experimental period was short.
Now the problem: We have two issues, one more pressing than the other, but they may be related.
A significant percentage (>5%, <30%) of sites running ad-code using this domain (as.someotherdomain.com) have been complaining of dead images. Instructions from our end asking them to flush their DNS have been met with "I did, but it's still broken".
One site administrator was quick enough to try out DNSReport.com and found this for as.someotherdomain.com:
A timeout occurred getting the NS records from your nameservers! None of your nameservers responded fast enough. They are probably down or unreachable. I can't continue since your nameservers aren't responding. If you have a Watchguard Firebox, it's due to a bug in their DNS Proxy, which must be disabled.
However, when I looked into this, I re-ran the report using only someotherdomain.com and everything checked out.
I'd very much like to know if this is indeed a problem or if I'm worrying about nothing.
From some Windows machines, the following command returns some very odd responses when querying Brazilian for information on any domain it controls:
nslookup someotherdomain.com xx.yy.zz.12
(root) nameserver = E.ROOT-SERVERS.NET
(root) nameserver = F.ROOT-SERVERS.NET
(root) nameserver = G.ROOT-SERVERS.NET
(root) nameserver = H.ROOT-SERVERS.NET
(root) nameserver = I.ROOT-SERVERS.NET
(root) nameserver = J.ROOT-SERVERS.NET
(root) nameserver = K.ROOT-SERVERS.NET
(root) nameserver = L.ROOT-SERVERS.NET
(root) nameserver = M.ROOT-SERVERS.NET
(root) nameserver = A.ROOT-SERVERS.NET
(root) nameserver = B.ROOT-SERVERS.NET
(root) nameserver = C.ROOT-SERVERS.NET
(root) nameserver = D.ROOT-SERVERS.NET
*** Can't find server name for address xx.yy.zz.12: No information
Server: UnKnown
Address: xx.yy.zz.12
Name: someotherdomain.com
Address: xx.yy.zz.43
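As far as I can tell, nslookup starts by asking the server for the name that goes with its own address, so some of that output may come from the reverse lookup rather than from the forward query I actually cared about. To look at the forward query on its own, one option is to send a bare DNS question and read the response header. This is only a minimal sketch: build_query, summarize and ask are names I made up, and it reads nothing beyond the header.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for one name (QTYPE 1 = A, 2 = NS)."""
    # Header: ID, flags (all zero, so no recursion requested), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT and ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def summarize(response: bytes) -> dict:
    """Pull the interesting fields out of a DNS response header."""
    _id, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": flags & 0x000F,                # 0 means no error
        "answers": an,
        "authority": ns,
    }


def ask(server: str, name: str, qtype: int = 1, timeout: float = 3.0) -> dict:
    """Send one UDP query to server:53 and summarize the reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        data, _ = sock.recvfrom(512)
    return summarize(data)


# Example call (the address is a placeholder for Brazilian's real one):
# print(ask("192.0.2.12", "someotherdomain.com"))
```

If the server is answering properly for a zone it hosts, I would expect authoritative to be true with at least one answer. A reply with no answers and only authority records would suggest the server is handing out a referral instead.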
Here's hoping it "fixes itself" somehow by tomorrow...
Comments
Carolina
4 May 2006, 12:01 a.m.
I don't know if you check your blog comments from a year ago so I thought that I would pretend to comment on this blog entry. Actually, I'm getting back to your question from July of last year when you asked me how I found your blog. Mike Munro of all people found your blog. Isn't that bizarre? Anyway, I think you sent me an e-mail a while back and I totally rudely spaced out on writing you back. My bad. But I had a slow day at work so I'm all caught up on what's going on with you but I'd love to hear from you anyways.
Daniel
4 May 2006, 1:33 a.m.
Heh. I do in fact check old comments. Actually, I've coded my site to send me a text message every time someone posts one (this caused quite the headache when I was ambushed by spammers over the Christmas holiday).
Thanks for catching up to me though. I'll be sure to put some time aside to send you a proper email soon.
noreen
4 May 2006, 1:42 a.m.
ok i had no idea what i was doing for you earlier with regards to above but hope it helped?
Daniel
4 May 2006, 1:48 a.m.
Heh, yeah. That was an interesting experience. It sorta helped though... it at least helped us see how broken things were. You weren't reporting any problem, whereas other users were, which sounds like a caching issue, but *we* were having the problem, which kinda creeped me out. We'll see how things go tomorrow I guess.
Ted
7 May 2006, 8:49 p.m.
I wish I could help you in this regard, but I can't think of anything off the top of my head. Mind you, I'm not a network admin either so it's kinda outside of my area of expertise.
I know a lot about security-related issues and DNS (cache poisoning, DNS tunneling [neat stuff], data exfiltration, network recon, etc.). I also know how to secure one (securing dynamic DNS, preventing zone xfers, spoof protection, shadow DNS, etc.) but I've neither administered a DNS server nor set one up.
That being said, if it's still a problem at this point, I can see if any of my network admin friends can come up with something for you if you so desire - let me know.
Cheers,
Ted
Daniel
8 May 2006, 2:38 a.m.
Actually, it looks like the problem has solved itself. My guess is that the DNS setup is just fine and that the not-working clients that were complaining were using a badly-configured DNS that was caching the data for longer than the specified TTL. Regardless, I appreciate the effort ;-)
I'd really like to learn about how to do all the stuff you do though. It sounds interesting to me.