RecordReaderFactory
Implementations: CSVReader
JsonPathReader
AvroReader
CEFReader
Syslog5424Reader
JsonTreeReader
WindowsEventLogReader
XMLReader
SyslogReader
JASN1Reader
ReaderLookup
ParquetReader
GrokReader
ScriptedReader
YamlTreeReader
ExcelReader
Supports Sensitive Dynamic Properties: No
Dynamic Properties allow the user to specify both the name and value of a property.
Name | Value | Description |
---|---|---|
The name of the relationship to route data to | A SQL SELECT statement that is used to determine what data should be routed to this relationship. | Each user-defined property specifies a SQL SELECT statement to run over the data, with the data that is selected being routed to the relationship whose name is the property name. Supports Expression Language: true (will be evaluated using FlowFile attributes and environment variables) |
Name | Description |
---|---|
failure | If a FlowFile fails processing for any reason (for example, the SQL statement contains columns not present in input data), the original FlowFile will be routed to this relationship |
original | The original FlowFile is routed to this relationship |
A Dynamic Relationship may be created based on how the user configures the Processor.
Name | Description |
---|---|
<Property Name> | Each user-defined property defines a new Relationship for this Processor. |
Name | Description |
---|---|
mime.type | Sets the mime.type attribute to the MIME Type specified by the Record Writer |
record.count | The number of records selected by the query |
QueryRecord.Route | The relationship to which the FlowFile was routed |
Filter out records based on the values of the records' fields
"Record Reader" should be set to a Record Reader that is appropriate for your data.
"Record Writer" should be set to a Record Writer that writes out data in the desired format.
One additional property should be added.
The name of the property should be a short description of the data to keep.
Its value is a SQL statement that selects all columns from a table named FLOWFILE for relevant rows.
The WHERE clause selects the data to keep, i.e., it is the exact opposite of the condition describing the data we want to remove.
It is recommended to always quote column names using double-quotes in order to avoid conflicts with SQL keywords.
For example, to remove records where either the name is George OR the age is less than 18, we would add a property named "adults not george" with a value that selects records where the name is not George AND the age is greater than or equal to 18. So the value would be SELECT * FROM FLOWFILE WHERE "name" <> 'George' AND "age" >= 18
Adding this property now gives us a new Relationship whose name is the same as the property name. So, the "adults not george" Relationship should be connected to the next Processor in our flow.
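The inverted condition above can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the processor's SQL engine, which is an assumption made purely for illustration; the processor's own SQL dialect may differ in details. The sample records are hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the FLOWFILE table (assumption:
# SQLite is only a local stand-in, not the engine the processor uses).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE FLOWFILE ("name" TEXT, "age" INTEGER)')
conn.executemany(
    "INSERT INTO FLOWFILE VALUES (?, ?)",
    [("George", 40), ("Alice", 30), ("Bob", 12), ("George", 10)],  # hypothetical records
)

# Value of the "adults not george" property from the example above.
query = """SELECT * FROM FLOWFILE WHERE "name" <> 'George' AND "age" >= 18"""

# Records that would be routed to the "adults not george" relationship.
kept = conn.execute(query).fetchall()
print(kept)
```

Only Alice's record is kept: every record named George and every record with an age below 18 is dropped, which matches the "remove George OR under 18" requirement.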
Keep only specific records
"Record Reader" should be set to a Record Reader that is appropriate for your data.
"Record Writer" should be set to a Record Writer that writes out data in the desired format.
One additional property should be added.
The name of the property should be a short description of the data to keep.
Its value is a SQL statement that selects all columns from a table named FLOWFILE for relevant rows.
The WHERE clause selects the data to keep.
It is recommended to always quote column names using double-quotes in order to avoid conflicts with SQL keywords.
For example, to keep only records where the person is an adult (aged 18 or older), add a property named "adults" with a value that is a SQL statement that selects records where the age is at least 18. So the value would be SELECT * FROM FLOWFILE WHERE "age" >= 18
Adding this property now gives us a new Relationship whose name is the same as the property name. So, the "adults" Relationship should be connected to the next Processor in our flow.
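As a rough illustration of the "adults" query and of what the record.count attribute describes, the sketch below runs the same SELECT against an in-memory SQLite table. SQLite is an assumed stand-in for the processor's SQL engine, and the sample records and variable names are hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the FLOWFILE table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE FLOWFILE ("name" TEXT, "age" INTEGER)')
conn.executemany(
    "INSERT INTO FLOWFILE VALUES (?, ?)",
    [("Ana", 17), ("Ben", 18), ("Cy", 42)],  # hypothetical records
)

# Value of the "adults" property from the example above.
adults_query = 'SELECT * FROM FLOWFILE WHERE "age" >= 18'
adults = conn.execute(adults_query).fetchall()

# Analogous to the record.count attribute: the number of records selected.
record_count = len(adults)
print(record_count, adults)
```

Note that the boundary value 18 is included, because the condition is "at least 18" (>=) rather than "greater than 18" (>).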
Keep only specific fields in a Record, where the names of the fields to keep are known
"Record Reader" should be set to a Record Reader that is appropriate for your data.
"Record Writer" should be set to a Record Writer that writes out data in the desired format.
One additional property should be added.
The name of the property should be a short description of the data to keep, such as "relevant fields".
Its value is a SQL statement that selects the desired columns from a table named FLOWFILE.
There is no WHERE clause, so every record is kept and only the set of fields changes.
It is recommended to always quote column names using double-quotes in order to avoid conflicts with SQL keywords.
For example, to keep only the name, age, and address fields, add a property named "relevant fields" with a value of SELECT "name", "age", "address" FROM FLOWFILE
Adding this property now gives us a new Relationship whose name is the same as the property name. So, the "relevant fields" Relationship should be connected to the next Processor in our flow.
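The effect of selecting named columns can be sketched locally. This example assumes SQLite (via Python's sqlite3 module) as a stand-in for the processor's SQL engine; the extra email and phone fields and the sample record are hypothetical and exist only to show that they are dropped.

```python
import sqlite3

# In-memory SQLite table standing in for the FLOWFILE table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE FLOWFILE '
    '("name" TEXT, "age" INTEGER, "address" TEXT, "email" TEXT, "phone" TEXT)'
)
conn.execute(
    "INSERT INTO FLOWFILE VALUES (?, ?, ?, ?, ?)",
    ("Alice", 30, "1 Main St", "alice@example.com", "555-0100"),  # hypothetical record
)

# Value of the "relevant fields" property from the example above.
cursor = conn.execute('SELECT "name", "age", "address" FROM FLOWFILE')

# Column names of the result set, then the projected rows.
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

The record count is unchanged because there is no WHERE clause; only the fields not named in the SELECT list are removed.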
Route record-oriented data for processing based on its contents
"Record Reader" should be set to a Record Reader that is appropriate for your data.
"Record Writer" should be set to a Record Writer that writes out data in the desired format.
For each route that you want to create, add a new property.
The name of the property should be a short description of the data that should be selected for the route.
Its value is a SQL statement that selects all columns from a table named FLOWFILE. The WHERE clause selects the data that should be included in the route.
It is recommended to always quote column names using double-quotes in order to avoid conflicts with SQL keywords.
A new outbound relationship is created for each property that is added. The name of the relationship is the same as the property name.
For example, to route data based on whether or not it is a large transaction, we would add two properties:
"small transaction" would have a value such as SELECT * FROM FLOWFILE WHERE transactionTotal < 100
"large transaction" would have a value of SELECT * FROM FLOWFILE WHERE transactionTotal >= 100
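The one-query-per-route idea can be sketched as a mapping from property name to SQL statement, with each statement run over the same input. This is a hedged illustration that assumes SQLite as a stand-in for the processor's SQL engine; the `routes` dictionary, the `id` field, and the sample records are hypothetical. Column names are double-quoted here, following the recommendation above.

```python
import sqlite3

# In-memory SQLite table standing in for the FLOWFILE table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE FLOWFILE ("id" INTEGER, "transactionTotal" REAL)')
conn.executemany(
    "INSERT INTO FLOWFILE VALUES (?, ?)",
    [(1, 50.0), (2, 100.0), (3, 250.0)],  # hypothetical records
)

# Property name -> property value; each name doubles as a relationship name.
routes = {
    "small transaction": 'SELECT * FROM FLOWFILE WHERE "transactionTotal" < 100',
    "large transaction": 'SELECT * FROM FLOWFILE WHERE "transactionTotal" >= 100',
}

# Every query runs over the same input; results are grouped per relationship.
routed = {name: conn.execute(sql).fetchall() for name, sql in routes.items()}
for name, records in routed.items():
    print(name, records)
```

Because the two conditions are complementary (< 100 and >= 100), each record lands in exactly one route. With overlapping or incomplete conditions, a record could be selected by several queries or by none, since each query is evaluated independently.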