6
votes

We are getting a lot of 404 "page not found" errors that all follow the same pattern, and we can't find the cause. We are not redirecting to these 404 pages, and we don't link to these broken URLs anywhere.

What these errors have in common is that the URLs are manipulated in the same way: the last segment of the URL is removed. For example:

These are the correct URLs:

site.com/category/news/title-of-the-content

site.com/another-category/news/another-title-of-the-content

...

These are the URLs that return 404:

site.com/category/news/

site.com/another-category/news/

...
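To confirm that the 404s really are valid URLs with the last segment stripped, something like the following could be run against the logs. This is only a rough sketch: the function names `parent_path` and `truncated_requests` are made up for illustration, and it assumes you can export the logged 404 paths and the known valid paths as plain lists of strings.

```python
def parent_path(path):
    """Return the path with its last segment removed, keeping a trailing slash."""
    trimmed = path.rstrip("/")
    head, _, _ = trimmed.rpartition("/")
    return head + "/"


def truncated_requests(missing_paths, valid_paths):
    """Return the 404 paths that equal a valid path minus its last segment.

    Both arguments are assumed to be lists of URL paths (no host, no query).
    """
    parents = {parent_path(p) for p in valid_paths}
    matches = []
    for missing in missing_paths:
        # Treat "/category/news" and "/category/news/" as the same request.
        normalized = missing if missing.endswith("/") else missing + "/"
        if normalized in parents:
            matches.append(missing)
    return matches


valid = [
    "/category/news/title-of-the-content",
    "/another-category/news/another-title-of-the-content",
]
missing = ["/category/news/", "/another-category/news", "/robots.txt"]
print(truncated_requests(missing, valid))
```

Any 404 path that does not show up in the result follows some other pattern and probably has a different cause.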

We are collecting the user agents for the 404 errors, and here they are:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/6.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; InfoPath.2)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; InfoPath.2; .NET4.0C; .NET4.0E)

So this error only occurs in Internet Explorer.
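When reading these strings, it may help to separate the version IE reports in the `MSIE` token from the engine version in the `Trident` token, because IE in compatibility view reports an older `MSIE` version. The sketch below is a minimal example of that; the helper name `ie_versions` is made up for illustration, and the Trident-to-IE table only covers IE 8 through 11. Keep in mind that the user agent is sent by the client, so it cannot prove the request came from a real browser.

```python
import re

# Trident engine versions and the IE release each one shipped with.
TRIDENT_TO_IE = {"4.0": "8", "5.0": "9", "6.0": "10", "7.0": "11"}


def ie_versions(user_agent):
    """Return (reported_ie_version, engine_ie_version) as strings or None."""
    msie = re.search(r"MSIE (\d+)\.\d+", user_agent)
    trident = re.search(r"Trident/(\d+\.\d+)", user_agent)
    reported = msie.group(1) if msie else None
    engine = TRIDENT_TO_IE.get(trident.group(1)) if trident else None
    return reported, engine


ua = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; "
      "Trident/6.0; .NET4.0E; .NET4.0C)")
reported, engine = ie_versions(ua)
if reported != engine:
    print("Reports IE", reported, "but the engine is IE", engine)
```

For the second user agent in the list, the two values differ, which suggests IE 10 running in compatibility view rather than a real IE 7.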

We think some IE plugin causes this, but we can't find which one. Does anyone have the same problem? What can we do about this situation?

Thanks in advance.

Bonus: [image]

2
Guilty. Sometimes I manually remove the article title from the URL because I want to see ALL the news articles. Speaking seriously though, is there a referrer for the 404 entries? - chue x
@chue I don't think that's common behavior :) We are getting thousands of errors with this pattern, and there is no referrer. - musa
@chuex I found some requests whose referrer is the correct URL, but most of them come without a referrer. - musa
At first glance, my thought was that it could be an automated crawler looking for common patterns in your URLs (possibly even to try and determine what CMS or site framework you're using). The only reasons I can think of would be to crawl and scrape content for syndication on other sites, or to discover what CMS you're using and look for exploits (WordPress, Joomla, etc.) - trnelson
Is site.com/category/news (without the last slash) a valid page? Is it reachable by clicking links on the home page? Nice bonus, btw. - ADTC

2 Answers

1
votes

It could have something to do with users running their browsers in "safe mode". This link shows how Mozilla Support addressed a similar situation: https://support.mozilla.org/en-US/questions/1003468

0
votes

This might be the cause. In site.com/category/news/title-of-the-content, category and news refer to folders on your server. title-of-the-content also refers to a folder, but that folder might contain a web page, which is what the browser shows. When you shorten the URL, you are referring to a folder that might not contain any page, so a 404 is returned because the server cannot find a page to show.