3
votes

I am using the following code in a servlet:

protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    // Set the content type before obtaining the writer so that any charset applies.
    response.setContentType("text/html");
    PrintWriter out = response.getWriter();

    out.println("<html>");
    out.println("<body>");
    out.println("<script>alert(1)</script>");
    out.println("</body>");
    out.println("</html>");
}

And the following code for the filter:

public class SampleFilter implements Filter {
  protected FilterConfig config;

  public void init(FilterConfig config) throws ServletException {
    this.config = config;
  }

  public void destroy() {
  }

  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws ServletException, IOException {
    long startTime = System.currentTimeMillis();
    ServletResponse newResponse = response;

    if (request instanceof HttpServletRequest) {
      System.out.println("in filter if1");
      newResponse = new CharResponseWrapper((HttpServletResponse) response);
    }
    System.out.println("after filter if1");
    chain.doFilter(request, newResponse);
    long elapsed = System.currentTimeMillis() - startTime;
    if (newResponse instanceof CharResponseWrapper) {
      System.out.println("in filter if2");
      String text = newResponse.toString();
      if (text != null) {
        text = SampleFilter.HTMLEntityEncode(text);//.toUpperCase();
        response.getWriter().write(text);
      }
    }
    System.out.println("after filter if2");
    config.getServletContext().log(" took " + elapsed + " ms");
    System.out.println(elapsed);
  }

  private static String HTMLEntityEncode(String input) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); i++) {
      char ch = input.charAt(i);
      if (Character.isLetterOrDigit(ch) || Character.isWhitespace(ch)) {
        sb.append(ch);
      } else {
        // Encode every other character as a numeric character reference.
        sb.append("&#").append((int) ch).append(';');
      }
    }
    return sb.toString();
  }

}

I want the browser to display the following as text:

<script>alert(1)</script>

Instead, I am getting

<html>
<body>
<script>alert(1)</script>
</body>
</html>

in the browser.

Any help would be great.

4 Answers

3
votes

Don't do it the hard way. Just use JSP for generating HTML output. The JSP Standard Tag Library (JSTL) offers built-in ways to escape user-controlled data and so close XSS holes: the <c:out> tag and the ${fn:escapeXml()} function.

<p>Welcome, <c:out value="${user.name}" />!</p>
...
<input type="text" name="foo" value="${fn:escapeXml(param.foo)}" />

They escape the predefined XML entities, for example replacing < with &lt; and > with &gt;, so that the data becomes totally harmless.
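To make the effect of this escaping concrete, here is a minimal plain-Java sketch of the same idea. The class name `XmlEscapeSketch` and the exact entity spellings chosen for the quote characters are assumptions for illustration; this is not JSTL's implementation, and its output may differ from JSTL's in how quotes are written.

```java
// Hypothetical helper for illustration only; not part of JSTL.
final class XmlEscapeSketch {
    private XmlEscapeSketch() {}

    /** Replaces the five predefined XML special characters; null becomes "". */
    static String escapeXml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            switch (ch) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;script&gt;alert(1)&lt;/script&gt;
        System.out.println(escapeXml("<script>alert(1)</script>"));
    }
}
```

The browser then renders the script tag as visible text instead of executing it.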

Servlets are not designed for generating HTML output. They're designed with the purpose to control the request/response.

3
votes

When trying to prevent XSS attacks, you have to separate trusted code from potentially dangerous data. There are different techniques to achieve this:

Escaping bound data: In this case you have to use some kind of templating technology. Anything defined in the template is considered secure. In the simplest case, all bound data is considered dangerous and is therefore escaped. One solution that makes this simple is Snippetory. (Yes, I develop it. You can get it from SourceForge or the Maven repository.) The template might look like this:

 <html>
   <body>
     $attack
     $text
   </body>
 </html>

Then the binding code could look like this:

Template page = Syntaxes.FLUYT_X.readResource("template.html")
    .encoding(Encodings.html);
page.set("attack", "<script>alert(0)</script>");
page.set("text", "text <--> escaping");
page.render(response.getWriter());

However, the disadvantage is that the entire output processing has to be done the right way. But I think for serious projects this is the most important approach.

Now some approaches that can be applied after processing. They are typically used in combination with escaping of bound data to implement complex things such as the editor field here on Stack Overflow:

Whitelisting: Essentially, you analyse the data (maybe using an HTML parser) and escape everything that is not part of a tag on your whitelist, and you remove every attribute that you don't allow. This is pretty secure, but very restrictive, too. In addition, it's pretty complex, so I can't provide an example here.

Blacklisting: Pretty much the same, except that you let through whatever is not on your blacklist. If you forget something dangerous, attacks are still possible.
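A production-grade whitelist needs a real HTML parser, but the core idea can be shown with a deliberately tiny sketch: escape everything first, then restore only a fixed set of attribute-free tags. The class name `TagWhitelistSketch` and the list of allowed tags are assumptions for illustration, not a recommendation for real input.

```java
import java.util.regex.Pattern;

// Hypothetical, deliberately restrictive whitelist sketch; not a full sanitizer.
final class TagWhitelistSketch {
    // Matches an already-escaped opening or closing tag without attributes,
    // e.g. "&lt;b&gt;" or "&lt;/b&gt;". Only these lowercase tags are allowed.
    private static final Pattern ALLOWED =
            Pattern.compile("&lt;(/?)(b|i|em|strong|p)&gt;");

    private TagWhitelistSketch() {}

    /** Escapes all markup, then re-enables the whitelisted bare tags. */
    static String sanitize(String input) {
        String escaped = escapeAll(input);
        return ALLOWED.matcher(escaped).replaceAll("<$1$2>");
    }

    /** Escapes the characters that are significant in HTML text and attributes. */
    private static String escapeAll(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            switch (ch) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("<b>bold</b> <script>alert(1)</script>"));
    }
}
```

Because any tag carrying an attribute fails the pattern, it stays escaped. The sketch does not balance tags, so a stray closing tag can remain in the output; a parser-based sanitizer would handle that.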

1
votes

In your case, using a filter is impossible, since there is no way of separating legitimate content from content that has been injected. A heuristic black-box defense against XSS could be applied by filtering input rather than output.
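The point about separation can be illustrated with a small sketch: if encoding happens where the untrusted value is inserted, the surrounding markup stays intact, whereas a filter that only sees the finished page has to encode all of it. The class `PageBuilderSketch` and its method names are hypothetical; the encoding rule is the same one the question's `HTMLEntityEncode` uses.

```java
// Hypothetical sketch: encode only the untrusted value, not the whole page.
final class PageBuilderSketch {
    private PageBuilderSketch() {}

    /** Encodes every character except letters, digits and whitespace numerically. */
    static String encode(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (Character.isLetterOrDigit(ch) || Character.isWhitespace(ch)) {
                sb.append(ch);
            } else {
                sb.append("&#").append((int) ch).append(';');
            }
        }
        return sb.toString();
    }

    /** Builds the page: trusted markup is written as-is, the value is encoded. */
    static String renderPage(String untrusted) {
        return "<html>\n<body>\n" + encode(untrusted) + "\n</body>\n</html>";
    }

    public static void main(String[] args) {
        System.out.println(renderPage("<script>alert(1)</script>"));
    }
}
```

With this structure the browser would show the script tag as text inside an otherwise normal page, which is the display the question asks for.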

1
votes

I've implemented an XSS Filter for a Jersey REST API. The code can easily be extracted and applied to a standard Java Filter.

Most people recommend encoding the output, but as our data can be accessed through a JavaScript API and there is no way of guaranteeing our customers will filter out XSS vulnerabilities, we opted for filtering out the XSS vulnerabilities on input. An additional benefit of this approach is that the filtering is done once and not every time data is output.

Note that the filter needs to be used in conjunction with the @SafeHtml annotation (a Hibernate Validator constraint used with JSR 303 bean validation) to ensure that the contents of POST data are correctly filtered.

I've documented this on my blog here: http://codehustler.org/blog/jersey-cross-site-scripting-xss-filter-for-java-web-apps/