1
votes

I am trying to load JSON data from http://swapi.co/api/films/ into Pentaho.

I used three steps: Generate Rows, HTTP Client, and Json Input.

Generate Rows Step :

Limit: 1
Name: movies
Type: string
Value: http://swapi.co/api/films/?format=json

HTTP Client Step:

General Tab
 Accept URL from field? Yes
 URL field name: movies
 Result fieldname: json

fields Tab
 Name: movies
 Parameters: movies

Json Input Step:

(Fields Tab)
  (I would like to get all the fields in the "results" array, e.g. title, episode, director, ...)
  Name: title
  Path: $.results[0]
  Type: String
(fields Tab)
  Name: movies
  Parameters: movies

I get this error:

2016/02/24 12:05:00 - Json Input.0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at  org.pentaho.di.trans.steps.jsoninput.JsonReader.readString(JsonReader.java:127)
2016/02/24 12:05:00 - Json Input.0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : ... 7 more 
2016/02/24 12:05:00 - Json Input.0 - Finished processing (I=0, O=0, R=1, W=0, U=0, E=1)
2
Please post code, or screenshots of your setup. - bolav
Hi @bolav, I have just amended the question with more details. Any suggestions on where I went wrong? I am new to Pentaho PDI. Regards - tottihope
What errors do you get? - bolav
@bolav 2016/02/24 12:05:00 - Json Input.0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.trans.steps.jsoninput.JsonReader.readString(JsonReader.java:127) 2016/02/24 12:05:00 - Json Input.0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : ... 7 more 2016/02/24 12:05:00 - Json Input.0 - Finished processing (I=0, O=0, R=1, W=0, U=0, E=1) - tottihope
I have updated my answer. It looks like the problem is with the API when it is called from the Pentaho HTTP Client. - bolav

2 Answers

0
votes

That URL doesn't return pure JSON output. Try using http://swapi.co/api/films/?format=json instead.

0
votes

It seems that swapi.co is using Cloudflare's Browser Integrity Check, which does not return a real response to the Pentaho HTTP Client step:

      <div class="cf-wrapper cf-header cf-error-overview">
        <h1>
          <span class="cf-error-type" data-translate="error">Error</span>
          <span class="cf-error-code">1010</span>
          <small class="heading-ray-id">Ray ID: 279b7723caa3426d &bull; 2016-02-24 13:20:00 UTC</small>
        </h1>
        <h2 class="cf-subheadline" data-translate="error_desc">Access denied</h2>
      </div><!-- /.header -->

          <div class="cf-column">
            <h2 data-translate="what_happened">What happened?</h2>
            <p>The owner of this website (swapi.co) has banned your access based on your browser's signature (279b7723caa3426d-ua31).</p>
          </div>
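
To check the same thing outside Pentaho, a minimal Python sketch along these lines might help. It only uses the standard library. The helper names and the User-Agent string are made up for illustration, and there is no guarantee that a different User-Agent gets past the Cloudflare check; the point is mainly to tell a JSON body apart from an HTML error page like the one above.

```python
import json
import urllib.request

# Arbitrary example value, not a header known to be accepted by swapi.co.
EXAMPLE_USER_AGENT = "Mozilla/5.0 (compatible; example-client)"


def build_request(url, user_agent=EXAMPLE_USER_AGENT):
    """Build a GET request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def parse_json_or_none(body):
    """Return the parsed JSON document, or None if the body is not JSON
    (for example, an HTML 'Access denied' page)."""
    try:
        return json.loads(body)
    except ValueError:
        return None


def fetch_json(url):
    """Fetch the URL and return parsed JSON, or None for a non-JSON body.

    Performs a real network call; HTTP errors such as 403 are raised
    by urlopen as urllib.error.HTTPError.
    """
    with urllib.request.urlopen(build_request(url)) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_json_or_none(body)
```

If `fetch_json("http://swapi.co/api/films/?format=json")` returns `None` or raises an HTTP error, the body reaching the Json Input step is probably not JSON either, which would explain why the step fails in `JsonReader.readString`.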

Here is an example of how to use the JSON Input step:

JSON Input Step
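
As a rough illustration of what the step needs to extract, the Python sketch below produces one row per element of the "results" array. In the JSON Input step this would roughly correspond to one field per path, such as `$.results[*].title`, rather than a single `$.results[0]`. The sample document is hand-written to match the shape described in the question (it is not a captured API response), and the field name `episode_id` is an assumption about the API.

```python
import json

# Hand-written sample shaped like the response described in the question.
SAMPLE = """
{
  "count": 2,
  "results": [
    {"title": "A New Hope", "episode_id": 4, "director": "George Lucas"},
    {"title": "The Empire Strikes Back", "episode_id": 5, "director": "Irvin Kershner"}
  ]
}
"""

# Fields to pull from each element of "results" (assumed names).
FIELDS = ["title", "episode_id", "director"]


def rows_from_results(document, fields=FIELDS):
    """Return one dict per element of "results", keeping only the given fields.

    Missing fields come back as None instead of raising an error.
    """
    data = json.loads(document)
    return [{name: item.get(name) for name in fields} for item in data["results"]]


if __name__ == "__main__":
    for row in rows_from_results(SAMPLE):
        print(row)
```

Each dictionary corresponds to one output row of the step, with one column per configured field.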