0
votes

Lately there's been a rash of bot posts to our forms for the purpose of posting links (backlink bots).

We've added reCaptcha only to find it's been cracked and the bots can figure out the proper response.

We've added a honeypot field, but these submissions are being posted without using the form at all. It looks like the bots first catalog the form's required fields, then submit only those, crack the captcha, and post their spam to a comment field.

I'm sure they are blindly posting to any form they find, assuming it is a blog comment form, so our contact-us forms are getting caught up in the mess.

The next step is to block the submission if its contents contain a link. This seems like the most logical method, since it uses the bots' own goal as the trigger for the block.

Question: Would looking for URL characters be the best way to identify a spam submission, or could FILTER_SANITIZE_URL be used as a hook to trigger the denial?

3
Add a nonce as a hidden form field. If the right random nonce from the request is not returned with the form submission, bin it. - Anigel
Take the shotgun approach and just remove all HTML from the submission? Can't have links if there's no HTML. - Marc B
Do you use CSRF tokens? - piddl0r
Only reCaptcha and a honeypot at this time. I'm looking for the next level, as these methods are not working. I'll research nonces. Simply removing the HTML will not help: we want to block any submission that contains HTML, not strip the HTML out and allow the post. - Burndog

3 Answers

0
votes

Just expanding on my comment as a few people seemed to think it had merit.

There are several approaches to this.

I would suggest adding a nonce to your forms as a hidden field. When the user requests a form from your site a nonce is created that has to be returned with the form by that user for it to be a valid submission.

This answer gives more information about that: "How to create and use nonces".
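As a rough illustration of the nonce idea, here is a minimal sketch. The function names `issue_nonce` and `verify_nonce` and the `form_nonce` key are my own placeholders, not part of any library, and I'm assuming the session array (normally `$_SESSION` after `session_start()`) is where the nonce is kept between rendering the form and receiving the post.

```php
<?php
// Hypothetical helpers: the names and the 'form_nonce' key are assumptions.

// Create a fresh random nonce, remember it in the session, and return it
// so it can be written into a hidden form field.
function issue_nonce(array &$session): string
{
    $nonce = bin2hex(random_bytes(16)); // 16 random bytes -> 32 hex characters
    $session['form_nonce'] = $nonce;
    return $nonce;
}

// Check the nonce that came back with the submission. The stored value is
// removed on every check, so each nonce can be used at most once.
function verify_nonce(array &$session, ?string $submitted): bool
{
    $expected = $session['form_nonce'] ?? null;
    unset($session['form_nonce']);

    return is_string($expected)
        && is_string($submitted)
        && hash_equals($expected, $submitted); // constant-time comparison
}
```

When rendering the form you would call `issue_nonce($_SESSION)` and echo the result into a hidden input; in the handler, drop the post if `verify_nonce($_SESSION, $_POST['nonce'] ?? null)` returns false. A bot that posts directly without first requesting the form has no valid nonce to send.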

You also mention blocking all entries that contain any HTML. A simple way to do that is:

if (strip_tags($string) !== $string) {
    // Drop the post: it contained HTML
}
0
votes
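// stripos() is case-insensitive; also check for https:// links
if (stripos($data, 'http://') !== false || stripos($data, 'https://') !== false) {
    echo "contains link";
}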
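If a plain substring check turns out to be too narrow, a pattern match is one possible alternative. This is only a sketch: `contains_link` is a hypothetical name, and the pattern (a scheme such as `http://`, `https://`, `ftp://`, or a bare `www.` prefix) is my assumption about what counts as a link; bots can still evade it with other forms. Note that `FILTER_SANITIZE_URL` removes characters that are not allowed in URLs rather than detecting whether a URL is present, so it is not a natural fit as a trigger here.

```php
<?php
// Hypothetical helper: returns true when the text appears to contain a link.
function contains_link(string $text): bool
{
    // Case-insensitive match on a URL scheme or a bare "www." prefix.
    return preg_match('~\b(?:https?|ftp)://|\bwww\.~i', $text) === 1;
}
```

In the form handler you could then reject the submission when `contains_link($data)` is true, instead of echoing a message.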
-1
votes

You can pass a checksum along with the form data to validate it. On the client side, add an event listener for the form's "submit" event in which you, for example, calculate the sum of the lengths of all form fields and put it in a hidden field. On the server side, calculate the checksum the same way and compare it with the value you received in that hidden field.

The sum of lengths is the simplest check, but it would already cut off all automated bots that are not specifically targeting your site. If that is not enough, you can create a more complex algorithm to calculate the checksum and then obfuscate the JavaScript, so that it takes a lot more effort to adapt these bots to your protection system.
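The server-side half of the sum-of-lengths check might look something like the sketch below. The function names and the hidden field name `checksum` are assumptions for illustration. It also assumes single-byte (ASCII) input: `strlen()` counts bytes, while JavaScript's `length` counts UTF-16 code units, and browsers may normalize newlines in textareas on submit, so the two sides need to agree on how they count before this is used on real traffic.

```php
<?php
// Hypothetical helpers; the 'checksum' field name is an assumption.

// Sum the byte lengths of all string fields except the checksum field itself.
function length_checksum(array $fields, string $checksumField = 'checksum'): int
{
    $sum = 0;
    foreach ($fields as $name => $value) {
        if ($name === $checksumField || !is_string($value)) {
            continue;
        }
        $sum += strlen($value);
    }
    return $sum;
}

// True only when the submitted checksum is a plain non-negative integer
// and equals the value recomputed on the server.
function checksum_matches(array $fields, string $checksumField = 'checksum'): bool
{
    $submitted = $fields[$checksumField] ?? null;

    return is_string($submitted)
        && preg_match('/^[0-9]+\z/', $submitted) === 1
        && (int) $submitted === length_checksum($fields, $checksumField);
}
```

A handler could call `checksum_matches($_POST)` and drop the post when it returns false. A bot that submits only the required fields without running the page's JavaScript would not send a matching value.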