I am using Flume 1.7 with a spooling directory source. I have set recursiveDirectorySearch=true so that it looks into subdirectories for files.

source.spoolDir=/tmp/test

Under /tmp/test, subdirectories containing data files get created, for example /tmp/test/data1/file.csv and /tmp/test/data2/file2.csv.

I want the exact same subdirectory structure to be created under the HDFS sink path:

/sink/data1/file.csv
/sink/data2/file2.csv

When I use %{file} in the HDFS sink file path, I get the complete absolute path, and %{basename} gives me only the file name. I want to extract the subdirectory structure relative to the spoolDir source path. Is there any way to achieve this?

1 Answer

You can use the fileHeader and fileHeaderKey properties of the spooling directory source, and then reference that header in your sink configuration to get the absolute path.

https://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
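
Since the file header only carries the absolute path, the subdirectory part still has to be derived from it. Below is a minimal sketch of that step in plain Java, assuming a Unix-like file system. The class name SpoolSubdir and the method relativeDir are hypothetical and are not part of Flume; one possible use would be to call such a helper from a custom interceptor (a class implementing org.apache.flume.interceptor.Interceptor) and store the result in an extra event header, which the HDFS sink path could then reference.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Flume: derives the subdirectory of a
// spooled file relative to the configured spoolDir.
class SpoolSubdir {

    // Returns the directory of absoluteFile relative to spoolDir,
    // or an empty string when the file sits directly in spoolDir.
    static String relativeDir(String spoolDir, String absoluteFile) {
        Path root = Paths.get(spoolDir).normalize();
        Path parent = Paths.get(absoluteFile).normalize().getParent();
        if (parent == null || !parent.startsWith(root)) {
            throw new IllegalArgumentException(
                absoluteFile + " is not located under " + spoolDir);
        }
        return root.relativize(parent).toString();
    }

    public static void main(String[] args) {
        // Paths taken from the question.
        System.out.println(relativeDir("/tmp/test", "/tmp/test/data1/file.csv"));
        System.out.println(relativeDir("/tmp/test", "/tmp/test/data2/file2.csv"));
    }
}
```

If an interceptor stored this value under a header named, say, subdir (an assumed name), the sink path could be written as /sink/%{subdir}, with %{basename} supplying the file name part.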