PutKudu

Description:

Reads records from an incoming FlowFile using the provided Record Reader, and writes those records to the specified Kudu table. The schema for the table must be provided in the processor properties or by your source. If any error occurs while reading records from the input, or while writing records to Kudu, the FlowFile is routed to failure.

Tags:

put, database, NoSQL, kudu, HDFS, record

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Name: Kudu Masters
Description: Comma-separated list of all Kudu masters, each given as an IP address with port (e.g. 7051).
Supports Expression Language: true (will be evaluated using variable registry only)

Name: Table Name
Description: The name of the Kudu table to put data into.
Supports Expression Language: true (will be evaluated using variable registry only)

Name: Skip head line
Default Value: false
Allowable Values:
  • true
  • false
Description: Deprecated. Used to ignore header lines, but this should be handled by a RecordReader (e.g. the "Treat First Line as Header" property of CSVReader).

Name: Record Reader
Allowable Values: Controller Service API: RecordReaderFactory
Implementations:
  • Syslog5424Reader
  • AvroReader
  • JsonPathReader
  • ScriptedReader
  • XMLReader
  • GrokReader
  • JsonTreeReader
  • SyslogReader
  • CSVReader
Description: The service for reading records from incoming FlowFiles.

Name: Insert Operation
Default Value: INSERT
Allowable Values:
  • INSERT
  • INSERT_IGNORE
  • UPSERT
Description: Specifies the operation type for this processor. INSERT_IGNORE ignores duplicate rows.

Name: Flush Mode
Default Value: AUTO_FLUSH_BACKGROUND
Allowable Values:
  • AUTO_FLUSH_SYNC
  • AUTO_FLUSH_BACKGROUND
  • MANUAL_FLUSH
Description: Sets the flush mode for the Kudu session.
  • AUTO_FLUSH_SYNC: the call returns when the operation is persisted, else it throws an exception.
  • AUTO_FLUSH_BACKGROUND: the call returns when the operation has been added to the buffer. This call should normally perform only fast in-memory operations, but it may have to wait when the buffer is full and another buffer is being flushed.
  • MANUAL_FLUSH: the call returns when the operation has been added to the buffer; if the buffer is full, it throws a KuduException instead.

Name: Batch Size
Default Value: 100
Description: The maximum number of FlowFiles to process in a single execution, between 1 and 100000. Choose a batch size appropriate to the available memory and the data size per row. Increase this number gradually to find the value that gives the best performance.
Supports Expression Language: true (will be evaluated using variable registry only)
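
The three Insert Operation values differ in how they treat a row whose primary key already exists. The following is a minimal sketch of those semantics, using an in-memory dictionary as a stand-in for a Kudu table. It does not use the Kudu client, and the names apply_operation and DuplicateKeyError are hypothetical. The INSERT and UPSERT behaviour shown reflects general Kudu semantics rather than anything stated in the property descriptions, and UPSERT is simplified to replace the whole row.

```python
class DuplicateKeyError(Exception):
    """Raised when INSERT meets a primary key that is already present."""


def apply_operation(table, operation, key, row):
    """Apply one row to `table`, a dict keyed by primary key.

    Returns True if the table was changed, False if the row was ignored.
    Hypothetical helper for illustration only.
    """
    if operation == "INSERT":
        if key in table:
            raise DuplicateKeyError(key)  # duplicate key is an error
        table[key] = row
        return True
    if operation == "INSERT_IGNORE":
        if key in table:
            return False  # duplicate row is silently skipped
        table[key] = row
        return True
    if operation == "UPSERT":
        table[key] = row  # insert, or overwrite the existing row (simplified)
        return True
    raise ValueError(f"unknown operation: {operation}")


table = {1: {"name": "a"}}
ignored = apply_operation(table, "INSERT_IGNORE", 1, {"name": "b"})
apply_operation(table, "UPSERT", 1, {"name": "c"})
```

In this sketch the INSERT_IGNORE call leaves the existing row untouched, while the UPSERT call replaces it.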

Relationships:

Name: success
Description: A FlowFile is routed to this relationship after it has been successfully stored in Kudu.

Name: failure
Description: A FlowFile is routed to this relationship if it cannot be sent to Kudu.

Reads Attributes:

None specified.

Writes Attributes:

Name: record.count
Description: Number of records written to Kudu.
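
Taken together, the relationships and the record.count attribute suggest the following outcome for each FlowFile. This is a minimal sketch, not the processor's implementation: route_flowfile, parse_lines and failing_write are hypothetical names, the two callables stand in for the Record Reader and the Kudu write, and the sketch assumes record.count is set only on success.

```python
def route_flowfile(content, read_records, write_record):
    """Return (relationship, attributes) for one FlowFile.

    read_records(content) yields records; write_record(record) stores one.
    Both are hypothetical callables supplied by the caller.
    """
    count = 0
    try:
        for record in read_records(content):
            write_record(record)
            count += 1
    except Exception:
        # Any read or write error sends the FlowFile to failure.
        return "failure", {}
    # Attribute values are kept as strings in this sketch.
    return "success", {"record.count": str(count)}


def parse_lines(content):
    # Stand-in reader: one record per non-empty line.
    return (line for line in content.splitlines() if line)


def failing_write(record):
    # Stand-in for a write that cannot reach Kudu.
    raise IOError("Kudu unavailable")


stored = []
ok = route_flowfile("a\nb\n", parse_lines, stored.append)
bad = route_flowfile("a\n", parse_lines, failing_write)
```

Here the first call takes the success path with a count of two records, and the second takes the failure path because the write raises.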

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.