
I am trying to load a tab-delimited file from a remote Linux server into a Postgres database using Spring Cloud Data Flow, but I am stuck choosing the appropriate source and sink.

For the source, I tried the File source as well as the SFTP source. The File source doesn't seem to have any option to connect to a remote Linux server, and the SFTP source has those options but mostly appears to be meant for transferring files rather than parsing them.

For the sink, I have installed the PgCopy sink and am planning to use it to load the data. I would like to know if this is the right sink for my use case.

Thanks.


1 Answer


You can use the sftp source with --mode=lines to split the file content by line and send a message for each line.
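A stream definition for that source might look like the sketch below. The host, credentials, and directory are placeholders, and the exact property names depend on the version of the app starters you have registered (newer releases express line-splitting as file.consumer.mode=lines rather than mode=lines):

```
stream create --name tab-file-load --definition "sftp --host=my-linux-host --username=myuser --password=**** --remote-dir=/data/incoming --mode=lines | log"
```

The log sink at the end is just for verifying that each line arrives as its own message before you wire in the real processing.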

Then you can parse each line with a CSV parser. I wrote a processor that splits each line into a java.util.Map using jackson-csv, given a configuration. I think you could use it as a base for your own, or use it as I show in this video (in French).
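As a rough sketch of what such a processor does per message, here is a plain-Java stand-in (no jackson-csv dependency) that turns one tab-delimited line into a column-name-to-value map; the column names are hypothetical and would come from your configuration in a real processor:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LineToMap {

    // Hypothetical column names for the tab-delimited file;
    // in a real processor these would be supplied via configuration.
    static final String[] COLUMNS = {"id", "name", "amount"};

    // Split one tab-delimited line into a column-name -> value map,
    // mirroring what a jackson-csv based processor would emit downstream.
    static Map<String, String> toMap(String line) {
        String[] fields = line.split("\t", -1); // -1 keeps trailing empty fields
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length && i < fields.length; i++) {
            row.put(COLUMNS[i], fields[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toMap("42\tAlice\t9.99")); // {id=42, name=Alice, amount=9.99}
    }
}
```

Each resulting map is then what gets published as the message payload for the sink to consume.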

To finish, you can publish each map to your pgsql table using the jdbc sink, as I did in the video with this configuration.
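Putting the pieces together, the full pipeline could be sketched as follows. The processor name, table, columns, and datasource URL are all placeholders, and the jdbc sink property names (table-name, columns) should be checked against the version of the sink you deploy:

```
stream create --name file-to-pg --definition "sftp --host=my-linux-host --username=myuser --password=**** --remote-dir=/data/incoming --mode=lines | my-csv-processor | jdbc --table-name=my_table --columns=id,name,amount --spring.datasource.url=jdbc:postgresql://localhost:5432/mydb" --deploy
```

The jdbc sink maps the keys of each incoming map onto the listed columns, which is why emitting a java.util.Map from the processor fits it well.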