I would like to subscribe to an RSS/XML feed from Google News that captures the following query:
Articles mentioning "studie" (German for "study"), written in German, emanating from any country.
I'm using https://news.google.com/rss/search, but the UI output at https://news.google.com/search is easier to inspect, so I'll use the latter URL base in this example.
Now, in the XML API reference, Google mentions four different parameters that influence either language or country:
- hl (host language): the language that the end user is assumed to be typing in. I.e., an English-language speaker types "study," and Google assumes that term is in English and then machine-translates the results back to English. For me, navigating to Google News redirects to a URL with hl=en-US (full URL is https://news.google.com/?hl=en-US&gl=US&ceid=US:en).
- gl: boosts search results whose country of origin matches the parameter value. The default in my web browser is gl=US.
- lr (language restrict): restricts search results to documents written in a particular language.
- cr (country restrict): restricts search results to documents originating in a particular country.
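As a minimal sketch of how these parameters combine into a query string, the snippet below assembles a search URL with Python's standard urllib.parse. The helper name build_search_url is hypothetical, and whether Google News actually honors lr or cr on this endpoint is exactly what is in doubt here; the values are the ones from the footnote.

```python
from urllib.parse import urlencode

# UI endpoint used throughout this question; the RSS variant lives under /rss/search.
BASE = "https://news.google.com/search"


def build_search_url(query, hl=None, gl=None, lr=None, cr=None):
    """Hypothetical helper: build a search URL, skipping parameters left unset."""
    params = {"q": query, "hl": hl, "gl": gl, "lr": lr, "cr": cr}
    # Drop unset parameters so they do not appear as empty values in the URL.
    params = {key: value for key, value in params.items() if value is not None}
    return f"{BASE}?{urlencode(params)}"


url = build_search_url("study", hl="en-US", lr="lang_de")
print(url)
```

Swapping the query for "studie" or adding cr only changes the arguments; the sketch says nothing about how the server treats them.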
Based on all of the above, that would imply a URL of*:
That attempt, however, fails miserably; it shows English-language results from the U.S., and it 302 redirects to:
https://news.google.com/search?q=study&lr=lang_de&hl=en-US&gl=US&ceid=US:en
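To see exactly what the redirect changed, one option is to diff the query parameters of the requested and redirected URLs. This is only a sketch using the two query strings quoted in this question; the helper name flat_params is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Query string that was requested (see the footnote) and the URL Google redirected to.
requested = "q=study&hl=en-US&lr=lang_de"
redirected = "https://news.google.com/search?q=study&lr=lang_de&hl=en-US&gl=US&ceid=US:en"


def flat_params(query_string):
    """Hypothetical helper: parse a query string, keeping the first value per key."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}


before = flat_params(requested)
after = flat_params(urlsplit(redirected).query)

# Parameters the redirect introduced, and parameters whose value it altered.
added = {key: after[key] for key in sorted(after.keys() - before.keys())}
changed = {key: (before[key], after[key])
           for key in sorted(before.keys() & after.keys())
           if before[key] != after[key]}

print("added:", added)
print("changed:", changed)
```

With these two strings, the redirect keeps q, hl, and lr unchanged and only appends gl and ceid, which suggests lr is being passed through but ignored rather than rewritten.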
So, to that end:
- How can I properly structure URL parameters to capture 'Articles mentioning "studie" (German for "study"), written in German, from any country.'?
- What the heck is ceid, and why is it documented absolutely nowhere by Google?
* I.e.:
>>> import urllib.parse
>>> urllib.parse.parse_qs('q=study&hl=en-US&lr=lang_de')
{'q': ['study'], 'hl': ['en-US'], 'lr': ['lang_de']}
Related but not resolving any of this:
client, output, and cx parameters are all required – Ezphares
hl and lr to also be valid only in that context – Ezphares