To protect a web application from XSS and other injection attacks, we would like to decode all input coming from the client (browser) before validating it.
Attackers encode their payloads to bypass standard validation. For example:
<IMG SRC=javascript:alert('XSS')>
That gets translated to
<IMG SRC=javascript:alert('XSS')>
In C#, we can use HttpUtility.HtmlDecode and HttpUtility.UrlDecode to decode client input, but they do not cover every type of encoding. For example, the following encoded value is not translated by these methods, yet all browsers decode and execute it correctly. This can be verified at https://mothereff.in/html-entities as well.
<img src=x onerror="javascript:alert('XSS')">
It gets decoded to <img src=x onerror="javascript:alert('XSS')">
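For reference, the decode-before-validate step described above can be sketched roughly as follows. This is only a minimal illustration: `InputCanonicalizer`, `Canonicalize` and `maxPasses` are hypothetical names, and the only library calls used are HttpUtility.UrlDecode and HttpUtility.HtmlDecode. It repeats the decoding until the text stops changing, so doubly encoded input is also unwrapped, but it inherits every gap those two methods have.

```csharp
using System;
using System.Web;

public static class InputCanonicalizer
{
    // Hypothetical helper: decodes URL and HTML encoding repeatedly until the
    // value stops changing (or maxPasses is reached), so that validation can
    // run on the decoded form. The result is meant for validation only;
    // note that UrlDecode also turns '+' into a space.
    public static string Canonicalize(string input, int maxPasses = 5)
    {
        string current = input ?? string.Empty;
        for (int pass = 0; pass < maxPasses; pass++)
        {
            string decoded = HttpUtility.HtmlDecode(HttpUtility.UrlDecode(current));
            if (decoded == current)
            {
                return current; // fixed point reached, nothing left to decode
            }
            current = decoded;
        }
        return current; // still changing after maxPasses: treat as suspicious
    }
}
```

With this sketch, `InputCanonicalizer.Canonicalize("&#60;script&#62;")` and `InputCanonicalizer.Canonicalize("%3Cscript%3E")` both yield `<script>`, which the normal validation can then reject.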
There are other encoded forms that the HtmlDecode method does not decode either. In Java, https://github.com/unbescape/unbescape handles all such varieties.
Is there a similar library in .NET, or how should such scenarios be handled?
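One variety that, as far as I can tell, falls into this category is a numeric character reference written without its trailing semicolon (for example `&#106` instead of `&#106;`), which browsers still accept. Assuming that is one of the problem cases, a hand-rolled fallback might look like the sketch below. `LenientEntityDecoder` and `DecodeNumericRefs` are hypothetical names, not part of any library, and the sketch covers numeric references only, not named entities.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LenientEntityDecoder
{
    // Matches decimal (&#106, &#106;) and hex (&#x6A, &#x6A;) references;
    // the trailing ';' is optional, as browsers tolerate its absence.
    private static readonly Regex NumericRef = new Regex(
        @"&#(?:[xX](?<hex>[0-9a-fA-F]{1,8})|(?<dec>[0-9]{1,10}));?",
        RegexOptions.Compiled);

    public static string DecodeNumericRefs(string input)
    {
        if (string.IsNullOrEmpty(input)) return input ?? string.Empty;

        return NumericRef.Replace(input, m =>
        {
            bool isHex = m.Groups["hex"].Success;
            string digits = isHex ? m.Groups["hex"].Value : m.Groups["dec"].Value;
            NumberStyles style = isHex ? NumberStyles.AllowHexSpecifier : NumberStyles.None;

            if (!long.TryParse(digits, style, CultureInfo.InvariantCulture, out long codePoint))
                return m.Value; // leave unparsable references untouched

            // Skip NUL, values beyond Unicode, and lone surrogates.
            bool invalid = codePoint <= 0 || codePoint > 0x10FFFF
                           || (codePoint >= 0xD800 && codePoint <= 0xDFFF);
            return invalid ? m.Value : char.ConvertFromUtf32((int)codePoint);
        });
    }
}
```

For instance, `LenientEntityDecoder.DecodeNumericRefs("&#106&#97&#x76;a")` returns `java`. A sketch like this is no substitute for a vetted library, which is why the question below still stands.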
<IMG .. looks an awful lot like html to me. – Sam Axe