I have a web service that returns an encoded PDF, but when I try to extract data from it with the Regular Expression Extractor in JMeter, nothing is extracted. When I check the value of the variable, it is null. Once the data is extracted, I want to save it to a file.

I searched and went through several sites without success. Here are some of the references I used: https://dzone.com/articles/how-to-read-a-pdf-file-in-apache-jmeter https://www.blazemeter.com/blog/what-every-performance-tester-should-know-about-extracting-data-files-jmeter/

The variable is empty when I look at it in the Debug Sampler.


1 Answer


If you want to extract text from the PDF file into a JMeter Variable, one way of doing it is to use a JSR223 PostProcessor with Apache Tika, which relies on PDFBox for PDF parsing:

  1. Download tika-app.jar and put it on the JMeter classpath (for example, the "lib" folder of your JMeter installation)
  2. Restart JMeter to pick the .jar up
  3. Add a JSR223 PostProcessor as a child of the request which returns the PDF
  4. Put the following code into the "Script" area:

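    // Collects the text found in the document body
    // (the default handler stops after 100,000 characters; pass -1 to lift the limit)
    def handler = new org.apache.tika.sax.BodyContentHandler();
    def metadata = new org.apache.tika.metadata.Metadata();
    // prev is the result of the parent sampler; its response data is the PDF
    def inputstream = new ByteArrayInputStream(prev.getResponseData());
    def context = new org.apache.tika.parser.ParseContext();
    def pdfparser = new org.apache.tika.parser.pdf.PDFParser();
    // Parse the PDF and push the extracted text into the handler
    pdfparser.parse(inputstream, handler, metadata, context);
    // Expose the extracted text as the pdfText JMeter Variable
    vars.put('pdfText', handler.toString())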
    
  5. That's it: the text from the PDF file should now be available as the ${pdfText} JMeter Variable
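
The question mentions an "encoded" PDF and saving the extracted data to a file, which the steps above do not cover. Below is a minimal sketch of both pieces using only JDK classes. It assumes the encoding is Base64, which the question does not confirm. The class name `PdfResponseHelper` and its method names are hypothetical helpers for illustration, not part of JMeter or Tika.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: names are illustrative, not a JMeter or Tika API.
class PdfResponseHelper {

    // Returns raw PDF bytes. If the body already starts with the "%PDF"
    // signature it is returned unchanged; otherwise it is ASSUMED to be
    // Base64 text and is decoded.
    static byte[] toPdfBytes(byte[] responseBody) {
        if (startsWithPdfSignature(responseBody)) {
            return responseBody;
        }
        String text = new String(responseBody, StandardCharsets.US_ASCII).trim();
        // The MIME decoder tolerates line breaks inside the Base64 text.
        return Base64.getMimeDecoder().decode(text);
    }

    // True when the data begins with the four bytes "%PDF".
    static boolean startsWithPdfSignature(byte[] data) {
        return data.length >= 4
                && data[0] == '%' && data[1] == 'P'
                && data[2] == 'D' && data[3] == 'F';
    }

    // Writes the extracted text to the target file as UTF-8,
    // creating the file or replacing its previous content.
    static void saveText(String text, Path target) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }
}
```

In the PostProcessor script, the result of `toPdfBytes(prev.getResponseData())` would be what goes into the `ByteArrayInputStream`, and `saveText(vars.get('pdfText'), path)` would write the extracted text to a path of your choice. If the service wraps the Base64 string in JSON or XML, the string has to be pulled out of that wrapper first.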
