TailFile

Description:

"Tails" a file, or a list of files, ingesting data from the file as it is written to the file. The file is expected to be textual. Data is ingested only when a new line is encountered (carriage return or new-line character or combination). If the file to tail is periodically "rolled over", as is generally the case with log files, an optional Rolling Filename Pattern can be used to retrieve data from files that have rolled over, even if the rollover occurred while NiFi was not running (provided that the data still exists upon restart of NiFi). It is generally advisable to set the Run Schedule to a few seconds, rather than running with the default value of 0 secs, as this Processor will consume a lot of resources if scheduled very aggressively. At this time, this Processor does not support ingesting files that have been compressed when 'rolled over'.

Additional Details...
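
The Description states that data is ingested only when a new line is encountered. The following is a minimal Python sketch of that idea, not the Processor's actual implementation: the function name `read_complete_lines` and the byte-offset bookkeeping are assumptions made for illustration. It reads from a saved offset and consumes only data up to the last line terminator, leaving any unfinished line for the next run.

```python
def read_complete_lines(path, offset):
    """Return (lines, new_offset), consuming only terminated lines.

    Hypothetical helper for illustration; not part of NiFi.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Locate the last line terminator; bytes after it form an unfinished line.
    # Simplification: a "\r" at the very end of the buffer could be the first
    # half of "\r\n", which this sketch does not handle specially.
    end = max(data.rfind(b"\n"), data.rfind(b"\r"))
    if end == -1:
        return [], offset  # no complete line yet, so do not advance
    complete = data[: end + 1]
    return complete.splitlines(), offset + len(complete)
```

Calling the function again with the returned offset picks up where the previous call stopped, so a line written in two parts is emitted once, after its terminator arrives.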

Tags:

tail, file, log, text, source

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Name | Default Value | Allowable Values | Description
Tailing mode
  Default Value: Single file
  Allowable Values:
    • Single file: Only the one file indicated in the 'File(s) to Tail' property is watched by the processor. In this mode, the file does not need to exist when the processor starts.
    • Multiple files: The 'File(s) to Tail' property accepts a regular expression, and the processor looks in 'Base directory' for the files to tail.
  Description: Mode to use. Single file mode tails only one file; Multiple files mode looks for a list of files. In Multiple files mode, the Base directory is required.
File(s) to Tail
  Description: Path of the file to tail in Single file mode. In Multiple files mode, a regular expression used to find the files to tail in the base directory. If Recursive lookup is set to true, the regular expression is matched against the path starting from the base directory (see additional details for examples).
  Supports Expression Language: true (will be evaluated using variable registry only)
Rolling Filename Pattern
  Description: If the file to tail "rolls over", as is the case with log files, this filename pattern is used to identify files that have rolled over so that, if NiFi is restarted and the file has rolled over in the meantime, the processor can pick up where it left off. The pattern supports the wildcard characters * and ?, as well as the notation ${filename} to specify a pattern based on the name of the file (without extension). Rolled-over files are assumed to live in the same directory as the file being tailed. The same glob pattern is used for all files.
Base directory
  Description: Base directory used to look for files to tail. This property is required when using Multiple files mode.
  Supports Expression Language: true (will be evaluated using variable registry only)
Initial Start Position
  Default Value: Beginning of File
  Allowable Values:
    • Beginning of Time: Start with the oldest data that matches the Rolling Filename Pattern and then begin reading from the File to Tail.
    • Beginning of File: Start with the beginning of the File to Tail. Do not ingest any data that has already been rolled over.
    • Current Time: Start with the data at the end of the File to Tail. Do not ingest any data that has already been rolled over or any data in the File to Tail that has already been written.
  Description: When the Processor first begins to tail data, this property specifies where the Processor should begin reading data. Once data has been ingested from a file, the Processor will continue from the last point from which it has received data.
State Location
  Default Value: Local
  Allowable Values:
    • Local: State is stored locally. Each node in a cluster will tail a different file.
    • Remote: State is located on a remote resource. This Processor will store state across the cluster so that it can be run on Primary Node Only and a new Primary Node can pick up where the last one left off.
  Description: Specifies where the state is located, either local or cluster, so that state can be stored appropriately and all data is consumed without duplication when NiFi restarts.
Recursive lookup
  Default Value: false
  Allowable Values:
    • true
    • false
  Description: When using Multiple files mode, this property defines whether files in the base directory are listed recursively.
Lookup frequency
  Default Value: 10 minutes
  Description: Only used in Multiple files mode and Changing name rolling strategy. Specifies the minimum duration the processor will wait before listing the files to tail again.
Maximum age
  Default Value: 24 hours
  Description: Only used in Multiple files mode and Changing name rolling strategy. Specifies the minimum time since a file's last modification after which the processor considers that no new messages will be appended to it. This should not be set too low, to avoid duplication of data in case new messages are appended at a lower frequency.
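
The file-selection rules described by the properties above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Processor's code: the helper names `files_to_tail` and `rolled_over_candidates` are hypothetical, the regular expression is assumed to have to match the whole relative path, paths are assumed to use "/" as separator, and Python's `fnmatch` is used as a stand-in for the glob matcher (it also accepts `[seq]` classes, which the Rolling Filename Pattern description does not mention).

```python
import fnmatch
import os
import re

def files_to_tail(base_dir, file_regex, relative_paths, recursive=False):
    """Multiple files mode: select files under base_dir whose relative
    path matches file_regex (assumed here to be a full match)."""
    pattern = re.compile(file_regex)
    selected = []
    for rel in relative_paths:
        # Without recursive lookup, only direct children are considered.
        if not recursive and "/" in rel:
            continue
        if pattern.fullmatch(rel):
            selected.append(base_dir.rstrip("/") + "/" + rel)
    return selected

def rolled_over_candidates(tailed_path, rolling_pattern, names_in_dir):
    """Expand ${filename} to the tailed file's name without extension,
    then glob-match file names from the same directory."""
    base = os.path.basename(tailed_path)
    stem = os.path.splitext(base)[0]
    glob = rolling_pattern.replace("${filename}", stem)
    # Sorting by name is a simplification made for this sketch.
    return sorted(
        n for n in names_in_dir
        if n != base and fnmatch.fnmatchcase(n, glob)
    )
```

For example, with a tailed file `/var/log/app.log` and the pattern `${filename}.log.*`, the second helper would select names such as `app.log.1` and `app.log.2` but not `other.log.1`.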

Relationships:

Name | Description
success | All FlowFiles are routed to this Relationship.

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
tailfile.original.path | Path of the original file the flow file comes from.

State management:

Scope | Description
LOCAL, CLUSTER | Stores state about where in the tailed file it left off so that on restart it does not have to duplicate data. State is stored either locally or in the cluster, depending on the State Location property.
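
To illustrate why stored state avoids duplicated data after a restart, here is a minimal Python sketch. It is not how NiFi's state management is implemented: the JSON state file, the helper names `load_state`, `save_state` and `resume_offset`, and the rule "restart from 0 when the file is shorter than the saved offset" are all assumptions chosen for the example.

```python
import json
import os

def load_state(state_path):
    """Load the saved {file path: byte offset} map, or {} if none exists."""
    if not os.path.exists(state_path):
        return {}
    with open(state_path) as f:
        return json.load(f)

def save_state(state_path, state):
    """Write the state to a temporary file, then rename it into place,
    so a crash mid-write does not leave a truncated state file."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, state_path)

def resume_offset(state, path, current_size):
    """Return the offset to resume from. If the file is now shorter than
    the saved offset, assume (for this sketch) that it was rolled over
    and start again from the beginning."""
    saved = state.get(path, 0)
    return saved if saved <= current_size else 0
```

On restart, a reader would call `load_state`, look up the resume offset for each tailed file, and save the updated offsets after each successful read.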

Restricted:

Required Permission | Explanation
read filesystem | Provides operator the ability to read from any file that NiFi has access to.

Input requirement:

This component does not allow an incoming relationship.

System Resource Considerations:

None specified.