I was about to answer no, but it turns out this is possible.
You can cache variables inside the ExecuteScript processor.
general idea
A simple script run by the ExecuteScript processor with the ECMAScript engine shows that you can in fact keep state inside the processor between invocations.
    var flowFile = session.get();
    if (flowFile !== null) {
        var x = (x || 0) + 1;
        log.error('this is round: ' + x);
        session.transfer(flowFile, REL_SUCCESS);
    }
Running this script inside the processor results in something along these lines being logged:
    ...
    ExecuteScript[id=...] this is round: 3
    ExecuteScript[id=...] this is round: 2
    ExecuteScript[id=...] this is round: 1
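To see why `var x = (x || 0) + 1` can count up, here is a minimal sketch that runs outside NiFi. It assumes (I have not verified this against the processor internals) that the processor evaluates the script repeatedly against the same long-lived set of global bindings. Node's `vm` module stands in for the script engine here, and `bindings` and `rounds` are names made up for the illustration:

```javascript
const vm = require('vm');

// Stand-in for the engine's long-lived global bindings (assumption, see above).
const bindings = vm.createContext({ rounds: [] });

// Same counting trick as in the NiFi script; `rounds` replaces the logger.
const script = new vm.Script(`
    var x = (x || 0) + 1;
    rounds.push(x);
`);

// Each run re-declares x with `var`, which does not reset an existing global.
for (let i = 0; i < 3; i++) {
    script.runInContext(bindings);
}

console.log(bindings.rounds.join(', ')); // prints "1, 2, 3"
```

If the bindings were recreated for every run, each round would log 1 instead.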
updating the file at most every x time units
I borrowed the base code from the existing NiFi ValidateXML processor.
The basic idea is to update the file when
- it is not set yet or
- at least x units of time have passed since the last update
The following code achieves this, where SCHEMA_FILE_PATH is the path to the schema file. In this case x is thirty seconds:
    // type definitions
    var File = Java.type("java.io.File");
    var FileNotFoundException = Java.type("java.io.FileNotFoundException");
    var System = Java.type("java.lang.System");

    // constants
    var SCHEMA_FILE_PATH = "/foo/bar"; // exchange with real path
    var timeoutInMillis = 30 * 1000; // 30 seconds

    // initialize
    var schemaFile = schemaFile || null;
    var lastUpdateMillis = lastUpdateMillis || 0;
    var flowFile = session.get();

    function updateSchemaFile() {
        schemaFile = new File(SCHEMA_FILE_PATH);
        if (!schemaFile.exists()) {
            throw new FileNotFoundException("Schema file not found at specified location: " + schemaFile.getAbsolutePath());
        }
        lastUpdateMillis = System.currentTimeMillis();
    }

    if (flowFile !== null) {
        var now = System.currentTimeMillis();
        var schemaFileShouldBeUpdated = (schemaFile == null) || ((lastUpdateMillis || 0) + timeoutInMillis) < now;
        if (schemaFileShouldBeUpdated) {
            updateSchemaFile();
        }
        // TODO Do with the file whatever you want
        log.error('was file updated this round? ' + schemaFileShouldBeUpdated + '; last update millis: ' + lastUpdateMillis);
        session.transfer(flowFile, REL_SUCCESS);
    }
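The refresh rule itself (reload when nothing is loaded yet, or when more than x time units have passed) can be tried outside NiFi. The following is only a sketch: `createTimedCache`, `load` and `now` are hypothetical names I made up, and the clock is faked so the behaviour can be checked without waiting thirty seconds:

```javascript
// Hypothetical helper: wraps a loader so it runs at most once per interval.
function createTimedCache(load, intervalMillis, now) {
    var value = null;
    var loaded = false;
    var lastUpdateMillis = 0;
    return function get() {
        var current = now();
        // Same condition as in the NiFi script: not set yet, or interval elapsed.
        if (!loaded || lastUpdateMillis + intervalMillis < current) {
            value = load();
            loaded = true;
            lastUpdateMillis = current;
        }
        return value;
    };
}

// Fake clock and a loader that counts how often it is called.
var fakeTime = 1000;
var loads = 0;
var getSchema = createTimedCache(
    function () { loads += 1; return 'schema v' + loads; },
    30 * 1000,
    function () { return fakeTime; }
);

var first = getSchema();   // nothing cached yet, so this loads
fakeTime += 10 * 1000;
var second = getSchema();  // only 10 s later, served from the cache
fakeTime += 30 * 1000;
var third = getSchema();   // 40 s after the load, so this reloads

console.log(first, second, third, loads);
```

Inside ExecuteScript the returned function would itself have to survive between runs, for example via `var getSchema = getSchema || createTimedCache(...)`, which relies on the same caching behaviour described above.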
DISCLAIMER
I cannot tell if, let alone when, the variables may be purged. Inspecting the source code of the ExecuteScript processor indicates that the script file is reloaded periodically. I am not sure what the consequences of that are.
Also, I haven't tried any of the other supported scripting languages, as I'm most familiar with JavaScript.
See the "Handling processor start & stop" section in nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/… - daggett