QueryRecord

Description:

Evaluates one or more SQL queries against the contents of a FlowFile. The result of the SQL query then becomes the content of the output FlowFile. This can be used, for example, for field-specific filtering, transformation, and row-level filtering. Columns can be renamed, simple calculations and aggregations performed, etc. The Processor is configured with a Record Reader Controller Service and a Record Writer service so as to allow flexibility in incoming and outgoing data formats. The Processor must be configured with at least one user-defined property. The name of the Property is the Relationship to route data to, and the value of the Property is a SQL SELECT statement that is used to specify how input data should be transformed/filtered. The SQL statement must be valid ANSI SQL and is powered by Apache Calcite. If the transformation fails, the original FlowFile is routed to the 'failure' relationship. Otherwise, the data selected will be routed to the associated relationship. If the Record Writer chooses to inherit the schema from the Record, it is important to note that the schema that is inherited will be from the ResultSet, rather than the input Record. This allows a single instance of the QueryRecord processor to have multiple queries, each of which returns a different set of columns and aggregations. As a result, though, the schema that is derived will have no schema name, so it is important that the configured Record Writer not attempt to write the Schema Name as an attribute if inheriting the Schema from the Record. See the Processor Usage documentation for more information.

Additional Details...

Tags:

sql, query, calcite, route, record, transform, select, update, modify, etl, filter, record, csv, json, logs, text, avro, aggregate

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display NameAPI NameDefault ValueAllowable ValuesDescription
Record Readerrecord-readerController Service API:
RecordReaderFactory
Implementations: GrokReader
JsonTreeReader
WindowsEventLogReader
ReaderLookup
ParquetReader
CSVReader
Syslog5424Reader
ExcelReader
CEFReader
XMLReader
ScriptedReader
SyslogReader
JsonPathReader
AvroReader
YamlTreeReader
Specifies the Controller Service to use for parsing incoming data and determining the data's schema
Record Writerrecord-writerController Service API:
RecordSetWriterFactory
Implementations: FreeFormTextRecordSetWriter
CSVRecordSetWriter
ParquetRecordSetWriter
RecordSetWriterLookup
ScriptedRecordSetWriter
XMLRecordSetWriter
JsonRecordSetWriter
AvroRecordSetWriter
Specifies the Controller Service to use for writing results to a FlowFile
Include Zero Record FlowFilesinclude-zero-record-flowfilestrue
  • true
  • false
When running the SQL statement against an incoming FlowFile, if the result has no data, this property specifies whether or not a FlowFile will be sent to the corresponding relationship
Cache Schemacache-schematrue
  • true
  • false
This property is no longer used. It remains solely for backward compatibility in order to avoid making existing Processors invalid upon upgrade. This property will be removed in future versions. Now, instead of forcing the user to understand the semantics of schema caching, the Processor caches up to 25 schemas and automatically rolls off the old schemas. This provides the same performance when caching was enabled previously and in some cases very significant performance improvements if caching was previously disabled.
Default Decimal Precisiondbf-default-precision10When a DECIMAL/NUMBER value is written as a 'decimal' Avro logical type, a specific 'precision' denoting number of available digits is required. Generally, precision is defined by column data type definition or database engines default. However undefined precision (0) can be returned from some database engines. 'Default Decimal Precision' is used when writing those undefined precision numbers.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Default Decimal Scaledbf-default-scale0When a DECIMAL/NUMBER value is written as a 'decimal' Avro logical type, a specific 'scale' denoting number of available decimal digits is required. Generally, scale is defined by column data type definition or database engines default. However when undefined precision (0) is returned, scale can also be uncertain with some database engines. 'Default Decimal Scale' is used when writing those undefined numbers. If a value has more decimals than specified scale, then the value will be rounded-up, e.g. 1.53 becomes 2 with scale 0, and 1.5 with scale 1.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Dynamic Properties:

Supports Sensitive Dynamic Properties: No

Dynamic Properties allow the user to specify both the name and value of a property.

NameValueDescription
The name of the relationship to route data toA SQL SELECT statement that is used to determine what data should be routed to this relationship.Each user-defined property specifies a SQL SELECT statement to run over the data, with the data that is selected being routed to the relationship whose name is the property name
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Relationships:

NameDescription
failureIf a FlowFile fails processing for any reason (for example, the SQL statement contains columns not present in input data), the original FlowFile it will be routed to this relationship
originalThe original FlowFile is routed to this relationship

Dynamic Relationships:

A Dynamic Relationship may be created based on how the user configures the Processor.

NameDescription
<Property Name>Each user-defined property defines a new Relationship for this Processor.

Reads Attributes:

None specified.

Writes Attributes:

NameDescription
mime.typeSets the mime.type attribute to the MIME Type specified by the Record Writer
record.countThe number of records selected by the query
QueryRecord.RouteThe relation to which the FlowFile was routed

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.