0
votes

Very simple problem. When I go to my Web site in any browser, it redirects. When I do a requests.get(my_url) on the exact same URL and protocol, I get a 200 response. Why? It doesn't matter if I set allow_redirects to True or False, I still get the same behavior. I tried setting a header so requests pretends to be Firefox, I still get the same behavior.

How can I get requests.get to give me the redirect that a browser gets on the exact same URL?

#!/usr/bin/env python

import requests

result = requests.get("https://bgjkfgbjfgghbjdfbhdgfjkh.fake")

print(result.status_code)

The above prints "200".

1
Hard to say without the real URL... - CherryDT
The site is not publicly available so it wouldn't help you. It is the same as the URL I gave except with a different (internal) domain name. - Jeff White
Yes but the URL (and the user agent) are not the only things sent in a request. I'd suggest using Fiddler (you'd have to make Python trust its certificate) as Man-in-the-middle proxy to compare the exact requests sent by browser and Python and then manually play with those differences until you identify what causes it. I'd have done it for you, that's why I asked for the URL, but when it's internal, then you have to do it yourself. - CherryDT
Is the response content the same as expected or totally different ? - bruno desthuilliers
@JeffWhite then given your comments ("result.is_redirect is false, result.next is empty, result.history is empty"), it's probably either a javascript redirect or a test on some cookie. - bruno desthuilliers

1 Answer

-1
votes

From all your comments, it sounds like the 200 status does not mean that the end page was fetched correctly (you say response.text is completely different in the two cases, i.e. it isn't the end page you expect).

In this case, either:

  • You are getting a simple 'click this to redirect' page, and the automatic redirect that you see in Firefox is implemented in JavaScript (which requests.get can't handle, because it does not execute scripts).
  • Or the website is returning some other 'error' page (without a 4xx status code) because something else is missing, such as a cookie.
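
One way to check for the first case is to search the body of the 200 response for markup that only a browser would act on. The following is a minimal sketch, not a complete detector: the function name find_client_side_redirects and both regular expressions are my own assumptions, and they only catch the most common meta-refresh and location-assignment patterns.

```python
import re

# Heuristic patterns (assumptions, not exhaustive) for redirects that are
# carried out by the browser rather than by an HTTP 3xx status code.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*>', re.IGNORECASE)
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\'][^"\']+["\']'
    r'|location\.(?:replace|assign)\s*\(', re.IGNORECASE)


def find_client_side_redirects(html):
    """Return the snippets of html that look like a meta-refresh or JS redirect."""
    hits = [m.group(0) for m in META_REFRESH.finditer(html)]
    hits += [m.group(0) for m in JS_LOCATION.finditer(html)]
    return hits
```

Calling find_client_side_redirects(result.text) on the response from the question would return a non-empty list if the page contains one of these patterns. An empty list does not rule out a script-driven redirect, since the script may be loaded from a separate file.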

The fact that response.history is empty even when allow_redirects is True is further evidence that you never reach the end page that you see in Firefox, and that the 200 you are getting belongs to a 'click here' or error page rather than to the end page.
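
To see all of the redirect-related evidence in one place, the relevant fields of the response can be collected into a dictionary. This is only a sketch: describe_response is a hypothetical helper name, and it assumes the object passed in exposes the status_code, history, url, headers and cookies attributes that a requests response provides.

```python
def describe_response(resp):
    """Collect the fields that show whether an HTTP-level redirect happened.

    resp is assumed to be a requests response, e.g. the result of
    requests.get(my_url) from the question.
    """
    return {
        "status": resp.status_code,
        # One (status, url) pair per redirect that was followed; empty if none.
        "hops": [(r.status_code, r.url) for r in resp.history],
        "final_url": resp.url,
        # Present on a 3xx response that was not followed.
        "location": resp.headers.get("Location"),
        # Names of cookies the server set, which a browser would send back.
        "cookies_set": sorted(resp.cookies.keys()),
    }
```

If "hops" is empty and "location" is None, the server never issued an HTTP redirect for this request. Any names under "cookies_set" are worth comparing with the cookies the browser sends, since a missing cookie is the second possibility listed above.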