0
votes

I have two questions about Pentaho Kettle, and I need some help please! So, I have a CSV file with some data. In one the column's, the file have some dates(in years).. The first problem its, some rows have the "None" in that column and other rows have the date in the right format. This image should help to "see" the problem:

Problem One

To resolve this problem, I changed the data type in input file and in the database to String. That works, but i thing that's not the correct way to do. I also tryed to use the "Filter Rows " step, but don't worked.. Some help please? :)

The second problem its about a null value in the date field. The database expects to received a date value, but some of tha values are null.. Once again, this image should help to "see" the problem:

Problem Two

What I can do to resolve the both problems? What is the right way to not only resolve the problem, but have a good performance to query the data later?

Thanks very much!

Best regards!

3

3 Answers

2
votes

for the first query use input step as a string that's fine after that use select value step use can change string to date formate. for the second step use filter rows step and separate rows which has none after that replace none with null and link to your next step.

0
votes

For the "None" String value in the Year column you can first read that column as String then you can use the Step called "Null if" and give "None" as the Value to turn to NULL. Later you can make this Year column as Integer type in the Select Values.

For the second problem, since you are table design expects a non-null value for the date column, you could either change the not-null constraint to nullable. Or if you want a default value for such null values then you can use the step "If field value is null" and you can specify the default value there. If you want to use the non-null value of the date from the previous previous rows, you can set Repeat to Y in Fields tab of the step Text file input

0
votes

Alternatively, for both cases, you can try to use a "Value Mapper" from None to something your database can accept.