6
votes

I have a customer who FTPs a file to our server. I have a route defined to select certain files from this directory and move them to a different directory to be processed. The problem is that the route picks up the file as soon as it sees it and doesn't wait until the FTP transfer is complete. The result is a 0-byte file in the path given in the to URI. I have tried each of the readLock options (markerFile, rename, changed, fileLock), but none have worked. I am using the Spring DSL to define my Camel routes. Here is an example of one that is not working. The Camel version is 2.10.0.

    <route>
        <from uri="file:pathName?initialDelay=10s&amp;move=ARCHIVE&amp;sortBy=ignoreCase:file:name&amp;readLock=fileLock&amp;readLockCheckInterval=5000&amp;readLockTimeout=10m&amp;filter=#FileFilter" />
        <to uri="file:pathName/newDirectory/" />
    </route>

Any help would be appreciated. Thanks!

Just to note: at one point this route was running on a different server, and I had to FTP the file to another server that processed it. When I was using the FTP component in Camel, that route worked fine; that is, it waited until the file was fully received before doing the FTP transfer. I had the same options defined on that route. That's why I think there should be a way to do it, since the FTP component uses the file component's options in Camel.


I am taking @PeteH's suggestion #2 and did the following. I am still hoping there is another way, but this will work.

I added the following method, which returns a Date equal to the current time shifted by the given number of seconds (pass a negative value to get a time in the past):

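    import java.util.Calendar;
    import java.util.Date;

    // Returns the current time shifted by the given number of seconds;
    // pass a negative value to get a time in the past.
    public static Date getDateMinusSeconds(Integer seconds) {
        Calendar cal = Calendar.getInstance();
        cal.add(Calendar.SECOND, seconds);
        return cal.getTime();
    }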

Then, within my filter, I check whether the initial filtering result is true. If it is, I compare the file's last-modified date to getDateMinusSeconds(-30). If the file was modified within the last 30 seconds, the filter returns false so the file is skipped on this poll.

    if(filter){
        if(new Date(pathname.getLastModified()).after(DateUtil.getDateMinusSeconds(-30))){
            return false;
        }
    } 
3
Doesn't readLock=changed, plus adjusting readLockCheckInterval and readLockTimeout, do the same thing? - Kenneth Xu
That is true, @Ken. I had looked at the code for readLock=changed, and at the time it looked like it should just work, but I remember that for whatever reason it was not working. I don't recall whether I tried adjusting the readLockCheckInterval or not. I would have thought I did. - Curt
We ended up taking this out because of what I mentioned in my other comment below. - Curt

3 Answers

5
votes

I have not done any of this in your environment, but have had this kind of problem before with FTP.

The better option of the two I can suggest is if you can get the customer to send two files. File1 is their data, File2 can be anything. They send them sequentially. You trap when File2 arrives, but all you're doing is using it as a "signal" that File1 has arrived safely.

The less good option (and this is the one we ended up implementing because we couldn't control the files being sent) is to write your code such that you refuse to process any file until its last modified timestamp is at least x minutes old. I think we settled on 5 minutes. This is pretty horrible since you're essentially firing, checking, sleeping, checking etc. etc.
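The timestamp check described above can be sketched in plain Java. This is a minimal illustration, not the code we actually ran: the class name `QuietPeriodCheck` and the method `isOldEnough` are made up for the example, and the 5-minute value is just the figure mentioned above.

```java
import java.io.File;
import java.time.Duration;

public class QuietPeriodCheck {

    // Pure check: true when the last write happened at least
    // quietPeriod before nowMillis (both in epoch milliseconds).
    public static boolean isOldEnough(long lastModifiedMillis, long nowMillis, Duration quietPeriod) {
        return nowMillis - lastModifiedMillis >= quietPeriod.toMillis();
    }

    // Convenience overload for a file on disk. File.lastModified() returns 0
    // for a missing file, so a missing file is rejected explicitly.
    public static boolean isOldEnough(File file, Duration quietPeriod) {
        return file.exists()
                && isOldEnough(file.lastModified(), System.currentTimeMillis(), quietPeriod);
    }
}
```

A poller would call something like `QuietPeriodCheck.isOldEnough(file, Duration.ofMinutes(5))` on each pass and skip the file until it returns true. The trade-off is the one already noted: every file is delayed by at least the quiet period, and a stalled upload that resumes later can still be picked up too early.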

But the problem you describe is quite well known with FTP. Like I say, I don't know whether either of these approaches will work in your environment, but certainly at a high level they're sound.

3
votes

The Camel FTP component inherits from the file component. The file component documentation has a note at the top describing this very thing:

Beware the JDK File IO API is a bit limited in detecting whether another application is currently writing/copying a file. And the implementation can be different depending on OS platform as well. This could lead to that Camel thinks the file is not locked by another process and start consuming it. Therefore you have to do you own investigation what suites your environment. To help with this Camel provides different readLock options and doneFileName option that you can use. See also the section Consuming files from folders where others drop files directly.

To get around this problem, I had my publishers put out a "done" file once each data file was fully written. This solves the problem.
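As a rough sketch of the "done" file idea in plain Java: the consumer only treats a data file as complete once a marker file exists next to it. The class name `DoneMarkerCheck` and the `.done` suffix are assumptions for illustration, not something my publishers necessarily used. In Camel itself the `doneFileName` option mentioned in the quoted documentation covers this; I believe a value such as `doneFileName=${file:name}.done` expresses the same convention, but check the file component documentation for your version.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class DoneMarkerCheck {

    // Assumed convention: "data.csv" is complete once "data.csv.done"
    // exists in the same directory.
    public static Path markerFor(Path dataFile) {
        return dataFile.resolveSibling(dataFile.getFileName().toString() + ".done");
    }

    // True only when both the data file and its marker are present.
    public static boolean isComplete(Path dataFile) {
        return Files.exists(dataFile) && Files.exists(markerFor(dataFile));
    }
}
```

The publisher has to write the marker only after the data file is fully uploaded, so this only works when you control (or can negotiate with) the sending side.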

0
votes

One way to do this is to use a watcher that triggers the job once a file is deposited, and to delay consuming the file for long enough to be sure that its upload has finished.

    from("file-watch://{{ftp.file_input}}?events=CREATE&recursive=false")
        .id("FILE_WATCHER")
        .log("File event: ${header.CamelFileEventType} occurred on file ${header.CamelFileName} at ${header.CamelFileLastModified}")
        .delay(20000)
        .to("direct:file_processor");

    from("direct:file_processor")
        .id("FILE_DISPATCHER")
        .log("Sending To SFTP Uploader")
        .to("sftp://{{ftp.user}}@{{ftp.host}}:{{ftp.port}}//upload?password={{ftp.password}}&fileName={{file_pattern}}-${date:now:yyyyMMdd-HH:mm}.csv")
        .log("File sent to SFTP");

It's never too late to respond. I hope this helps someone struggling in the deepest, creepiest places of the SFTP world...