0
votes

I have a CSV file that doesn't have a fixed number of columns, like this:

  col1,col2,col3,col4,col5    
  val1,val2,val3,val4,val5 
  column1,column2,column3
  value1,value2,value3

Is there any way to read this kind of CSV file with Spring Batch?

I tried to do this:

<bean id="ItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">

    <!-- Read a csv file -->
    <property name="resource" value="classpath:file.csv" />

    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <!-- split it -->
            <property name="lineTokenizer">
                <bean
                    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names"
                        value="col1,col2,col3,col4,col5,column1,column2,column3" />
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean
                    class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="myBean" />
                </bean>
            </property>

        </bean>
    </property>

</bean>

But the result was this error:

IncorrectTokenCountException stack trace

2
Take a look at AbstractLineTokenizer#setStrict(boolean) (which DelimitedLineTokenizer inherits) and set it to false. - fateddy
This method doesn't work :( - marie

2 Answers

2
votes

You can use the PatternMatchingCompositeLineMapper to delegate each line to the appropriate LineTokenizer and FieldSetMapper based on a pattern matched against the line. Each pattern gets its own DelimitedLineTokenizer, configured with the column names for that line layout, and a FieldSetMapper to map the resulting fields.

You can read more about this in the documentation here: http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/file/mapping/PatternMatchingCompositeLineMapper.html
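
As a rough sketch of what that could look like for the file in the question: the bean ids compositeLineMapper and myBeanFieldSetMapper are made up for illustration, and the patterns "val?,*" and "value*" are assumptions derived from the sample rows (? stands for one character, * for any number of characters). Real data would need patterns that actually distinguish its line layouts.

```xml
<!-- Hypothetical id: one mapper reused for both line layouts, targeting the myBean prototype from the question -->
<bean id="myBeanFieldSetMapper"
      class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
    <property name="prototypeBeanName" value="myBean" />
</bean>

<!-- Hypothetical id: picks a tokenizer and a field set mapper per line, by pattern -->
<bean id="compositeLineMapper"
      class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
    <property name="tokenizers">
        <map>
            <!-- Assumed pattern for the five-column rows, e.g. val1,val2,val3,val4,val5 -->
            <entry key="val?,*">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="col1,col2,col3,col4,col5" />
                </bean>
            </entry>
            <!-- Assumed pattern for the three-column rows, e.g. value1,value2,value3 -->
            <entry key="value*">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="column1,column2,column3" />
                </bean>
            </entry>
        </map>
    </property>
    <property name="fieldSetMappers">
        <map>
            <!-- Same keys as above; both layouts are mapped onto the same prototype bean here -->
            <entry key="val?,*" value-ref="myBeanFieldSetMapper" />
            <entry key="value*" value-ref="myBeanFieldSetMapper" />
        </map>
    </property>
</bean>
```

The reader would then reference it with <property name="lineMapper" ref="compositeLineMapper" /> instead of the inline DefaultLineMapper. Note that this sketch does not cover the header rows (col1,... and column1,...): as far as I know, a line that matches none of the patterns is rejected, so those rows would need their own entries or would have to be filtered out some other way.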

1
votes

AbstractLineTokenizer#setStrict(boolean) in your DelimitedLineTokenizer should do the job.

From the Javadoc:

Public setter for the strict flag. If true (the default) then number of tokens in line must match the number of tokens defined (by Range, columns, etc.) in LineTokenizer. If false then lines with less tokens will be tolerated and padded with empty columns, and lines with more tokens will simply be truncated.

You should change this part of your configuration to:

<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <property name="names" value="col1,col2,col3,col4,col5,column1,column2,column3" />
    <property name="strict" value="false" />
</bean>