4
votes

I'm writing a client that grabs a page from a web server. On one particular server, it would work fine from my web browser, but my code was consistently getting the response:

HTTP/1.1 503 Service Unavailable
Content-Length:62
Connection: close
Cache-Control: no-cache,no-store
Pragma: no-cache

<html><body><b>Http/1.1 Service Unavailable</b></body> </html>

I eventually narrowed this down to the User-Agent header I was sending: if it contains Mozilla, everything is fine (I tried many variations of this). If not, I get 503. As soon as I realized it was User-Agent, I remembered having this same issue in the past (different project, different servers), but I've never figured out why.

In this particular case, the web server I'm connecting to is running IIS 7.5, but I'm not sure if there are any proxies/firewalls/etc in front of it (I suspect there probably is something because of this behaviour).

There's an interesting history to User-Agents which you can read about on this question: Why do all browsers' user agents start with "Mozilla/"?

It's clearly no issue for me to have Mozilla in my User-Agent, but my question is simply: what is the configuration or server that causes this to happen, and why would anyone want this behaviour?
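For anyone hitting the same symptom, here is a minimal sketch (Python standard library, hypothetical URL) of the workaround described above: explicitly setting a Mozilla-compatible User-Agent instead of the client library's default, which is what such filters typically reject.

```python
# Minimal sketch of the workaround: send a Mozilla-compatible
# User-Agent instead of the library default (e.g. "Python-urllib/3.x"),
# which some servers/proxies answer with 503.
# The URL and UA string here are placeholders, not from the question.
import urllib.request

url = "http://example.com/page"  # hypothetical server

req = urllib.request.Request(url, headers={
    "User-Agent": "Mozilla/5.0 (compatible; MyClient/1.0)"
})
# response = urllib.request.urlopen(req)  # would actually send the request
```

Any client library has an equivalent knob; the only thing that matters to the server is that the header value contains "Mozilla".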

2
Actually, I would love to find the answer to this question as well. There are many mobile phones whose user-agent doesn't start with Mozilla, such as LG, Samsung, RT. Also, the Opera browsers never start with Mozilla and always start with "Opera". – Erx_VB.NExT.Coder
FWIW I can use Linux's curl to make GET requests to my IIS 7.5 server. There must be something in front of the server that you are trying to make a request to. – chue x

2 Answers

2
votes

Here is an interesting history of this phenomenon: User Agent String History

The main reason this exists is that the internet, web, and browsers were not designed but evolved, with a high degree of backwards compatibility plus a lot of vendor-exclusive extensions. In particular, frames (which are widely considered a bad idea these days) were not well supported by Mosaic, but were by Netscape (which had Mozilla as its user agent).

Server administrators then had a choice: did they use the new hip cool frames and only support Netscape, or did they use old boring pages that everyone could use? Their choice was a hack: if someone says they are Mozilla, send them frames; if not, send them the frameless version.

This ruined everything. IE had to call itself Mozilla-compatible, everyone impersonated everyone else; it's all well detailed in the link at the top. But this problem more or less went away in the modern era, as everyone impersonated everyone and everyone supported more and more of a common subset of features.

And then mobile and smartphone browsers became widespread. Suddenly, there weren't just a few main browsers with basically the same features and a few outliers you could easily ignore. Now there were dozens of small browsers, with less power, less ability, and an odd, disjoint set of capabilities. And so many servers took the easy road and simply did not send the proper data, or any data at all, to any browser they did not recognize.

Now rather than a poorly rendered or inoperable website, you had... no website at all on certain platforms, and a perfect one on others. This worked, but wasn't tolerable for many businesses; they wanted to work correctly on ALL platforms, because that's how the web was supposed to work.

Mobile versions, mobile-first, responsive design, media queries: all of these were designed to fill in those gaps. But for the most part, a lot of websites still just ignore less-than-modern browsers. And media queries were quickly subverted: no one wants to declare their browser is handheld, oh no. We're a real display browser, even if our screen is only 3 inches, yes sir!

In summary, some servers are configured to drop any browser which is not Mozilla compatible because they think it's better to serve no page than a poorly rendered one.

I've also seen some arguments that this improves security, because then the server doesn't have to deal with rogue programs that aren't browsers (much like your own) connecting to it. Since the user agent is trivial to change, this holds no water for me; it's simply security through obscurity.

0
votes

Many firewalls are configured to drop any request which does not have a "proper" user agent, since many DDoS attacks do not bother to send one; it's an easy, reliable filter.
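The filter described above can be sketched in a few lines. This is a hypothetical illustration of the logic (the function name and the "Mozilla" substring check are assumptions about what such a rule might look like), not the configuration of any particular firewall product:

```python
def allow_request(user_agent):
    """Hypothetical firewall-style rule: reject requests whose
    User-Agent is missing or not Mozilla-compatible, on the theory
    that crude bots and DDoS tools often omit or fake it poorly."""
    return user_agent is not None and "Mozilla" in user_agent

# A browser-like client passes; a bare script or missing header does not.
allow_request("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")  # allowed
allow_request("curl/7.68.0")                                # dropped
allow_request(None)                                         # dropped
```

This also explains the behaviour in the question: the check is cheap and coarse, so any legitimate client that doesn't say "Mozilla" gets caught in the same net.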