PutHBaseRecord

Description:

Adds rows to HBase based on the contents of a flowfile using a configured record reader.

Tags:

hadoop, hbase, put, record

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Record Reader
  API Name: record-reader
  Controller Service API: RecordReaderFactory
  Implementations: GrokReader, JsonTreeReader, WindowsEventLogReader, ReaderLookup, ParquetReader, CSVReader, Syslog5424Reader, ExcelReader, CEFReader, XMLReader, ScriptedReader, SyslogReader, JsonPathReader, AvroReader, YamlTreeReader
  Description: Specifies the Controller Service to use for parsing incoming data and determining the data's schema.

HBase Client Service
  API Name: HBase Client Service
  Controller Service API: HBaseClientService
  Implementation: HBase_2_ClientService
  Description: Specifies the Controller Service to use for accessing HBase.

Table Name
  API Name: Table Name
  Description: The name of the HBase table to put data into.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Row Identifier Field Name
  API Name: Row Identifier Field Name
  Description: Specifies the name of a record field whose value should be used as the row id for the given record.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Row Identifier Encoding Strategy
  API Name: Row Identifier Encoding Strategy
  Default Value: String
  Allowable Values:
  • String: Stores the value of the row id as a UTF-8 String.
  • Binary: Stores the value of the row id as a binary byte array. Expects the row id to be a binary-formatted string.
  Description: Specifies the data type of the row id used when inserting data into HBase. The default behavior is to convert the row id to a UTF-8 byte array. Choosing Binary converts a binary-formatted string to the correct byte[] representation. Use the Binary option if you are using binary row keys in HBase.

Null Field Strategy
  API Name: hbase-record-null-field-strategy
  Default Value: Skip Field
  Allowable Values:
  • Empty Bytes: Use empty bytes. This can be used to overwrite existing fields or to put an empty placeholder value if you want every field to be present even if it has a null value.
  • Skip Field: Skip the field (don't process it at all).
  Description: Specifies whether null field values are written as empty bytes or skipped altogether.

Column Family
  API Name: Column Family
  Description: The Column Family to use when inserting data into HBase.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Default Visibility String
  API Name: hbase-default-vis-string
  Description: When using visibility labels, any value set in this field will be applied to all cells that are written unless an attribute with the convention "visibility.COLUMN_FAMILY.COLUMN_QUALIFIER" is present on the flowfile. If this field is left blank, it will be assumed that no visibility is to be set unless visibility-related attributes are set. NOTE: this configuration will have no effect on your data if you have not enabled visibility labels in the HBase cluster.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Visibility String Record Path Root
  API Name: put-hb-rec-visibility-record-path
  Description: A record path that points to part of the record which contains a path to a mapping of visibility strings to record paths.

Timestamp Field Name
  API Name: timestamp-field-name
  Description: Specifies the name of a record field whose value should be used as the timestamp for the cells in HBase. The value of this field must be a number, string, or date that can be converted to a long. If this field is left blank, HBase will use the current time.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Batch Size
  API Name: Batch Size
  Default Value: 1000
  Description: The maximum number of records to be sent to HBase at any one time from the record set.

Complex Field Strategy
  API Name: Complex Field Strategy
  Default Value: Text
  Allowable Values:
  • Fail: Route the entire FlowFile to failure if any elements contain complex values.
  • Warn: Provide a warning and do not include the field in the row sent to HBase.
  • Ignore: Silently ignore the field and do not include it in the row sent to HBase.
  • Text: Use the string representation of the complex field as the value of the given column.
  Description: Indicates how to handle complex fields, i.e. fields that do not have a single text value.

Field Encoding Strategy
  API Name: Field Encoding Strategy
  Default Value: String
  Allowable Values:
  • String: Stores the value of each field as a UTF-8 String.
  • Bytes: Stores the value of each field as the byte representation of the type derived from the record.
  Description: Indicates how to store the value of each field in HBase. The default behavior is to convert each value from the record to a String and store the UTF-8 bytes. Choosing Bytes will interpret the type of each field from the record and convert the value to the byte representation of that type, meaning an integer will be stored as the byte representation of that integer.
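
The difference between the encoding options for row ids and fields can be illustrated with a small Python sketch. It is not the processor's code. The function names row_id_to_bytes and field_to_bytes are hypothetical, the binary-formatted string is assumed to use \xNN hex escapes for non-printable bytes, and the Bytes strategy is assumed to use big-endian fixed-width encodings as HBase's Bytes utility does.

```python
import re
import struct


def row_id_to_bytes(row_id: str, strategy: str = "String") -> bytes:
    """Encode a row id under the String or Binary strategy (illustrative only)."""
    if strategy == "String":
        return row_id.encode("utf-8")
    if strategy == "Binary":
        # Assumed input format: literal characters mixed with \xNN hex escapes.
        out = bytearray()
        i = 0
        while i < len(row_id):
            hex_pair = row_id[i + 2:i + 4]
            if row_id.startswith("\\x", i) and re.fullmatch(r"[0-9a-fA-F]{2}", hex_pair):
                out.append(int(hex_pair, 16))  # escaped byte
                i += 4
            else:
                out.extend(row_id[i].encode("utf-8"))  # literal character
                i += 1
        return bytes(out)
    raise ValueError(f"unknown strategy: {strategy}")


def field_to_bytes(value, record_type: str, strategy: str = "String") -> bytes:
    """Encode a field value under the String or Bytes strategy (illustrative only)."""
    if strategy == "String" or record_type == "string":
        return str(value).encode("utf-8")
    if strategy == "Bytes":
        # Assumption: big-endian, fixed-width layouts per record type.
        formats = {"int": ">i", "long": ">q", "double": ">d"}
        return struct.pack(formats[record_type], value)
    raise ValueError(f"unknown strategy: {strategy}")


print(field_to_bytes(42, "int", "String"))  # the two characters '4' and '2'
print(field_to_bytes(42, "int", "Bytes"))   # four bytes, big-endian
```

With the String strategy the integer 42 occupies two bytes (the characters "4" and "2"), while with the Bytes strategy it occupies four. Readers of the table need to know which strategy was used to decode the cells correctly.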

Dynamic Properties:

Supports Sensitive Dynamic Properties: No

Dynamic Properties allow the user to specify both the name and value of a property.

visibility.<COLUMN FAMILY>
  Value: visibility label for <COLUMN FAMILY>
  Description: Visibility label for everything under that column family when a specific label for a particular column qualifier is not available.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

visibility.<COLUMN FAMILY>.<COLUMN QUALIFIER>
  Value: visibility label for <COLUMN FAMILY>:<COLUMN QUALIFIER>
  Description: Visibility label for the specified column qualifier, qualified by a configured column family.
  Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
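
The lookup order implied by these descriptions can be sketched in Python as follows. This is only an illustration: the function resolve_visibility and the labels mapping (standing in for the configured dynamic properties and flowfile attributes) are hypothetical, and placing the family-wide label ahead of the Default Visibility String is an assumption drawn from the descriptions, not confirmed processor behavior.

```python
from typing import Mapping, Optional


def resolve_visibility(
    family: str,
    qualifier: str,
    labels: Mapping[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Pick the visibility label for one cell (illustrative only)."""
    # 1. A label for the exact column family and qualifier wins.
    specific = labels.get(f"visibility.{family}.{qualifier}")
    if specific:
        return specific
    # 2. Otherwise fall back to the label for the whole column family.
    family_wide = labels.get(f"visibility.{family}")
    if family_wide:
        return family_wide
    # 3. Otherwise use the default visibility string; None means no label is set.
    return default or None


labels = {"visibility.cf.ssn": "PII&ADMIN", "visibility.cf": "INTERNAL"}
print(resolve_visibility("cf", "ssn", labels, "PUBLIC"))
print(resolve_visibility("cf", "name", labels, "PUBLIC"))
```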

Relationships:

success: A FlowFile is routed to this relationship after it has been successfully stored in HBase.
failure: A FlowFile is routed to this relationship if it cannot be sent to HBase.

Reads Attributes:

restart.index: Reads restart.index when it needs to replay part of a record set that did not get into HBase.

Writes Attributes:

restart.index: Writes restart.index when a batch fails to be inserted into HBase.
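
How Batch Size and restart.index might interact can be sketched in Python. This is a simplified model, not the processor's implementation: put_in_batches and send_batch are hypothetical names, and the sketch assumes restart.index holds the number of records already stored, so that a replay skips exactly those records.

```python
from typing import Callable, Dict, Sequence, Tuple


def put_in_batches(
    records: Sequence,
    send_batch: Callable[[Sequence], None],
    attributes: Dict[str, str],
    batch_size: int = 1000,
) -> Tuple[str, Dict[str, str]]:
    """Send records in batches, resuming from restart.index (illustrative only)."""
    # Assumption: restart.index is the count of records already in HBase.
    start = int(attributes.get("restart.index", "0"))
    sent = start
    try:
        for offset in range(start, len(records), batch_size):
            batch = records[offset:offset + batch_size]
            send_batch(batch)  # hypothetical callable that raises on failure
            sent = offset + len(batch)
    except Exception:
        # Remember how far we got so the FlowFile can be replayed later.
        updated = dict(attributes)
        updated["restart.index"] = str(sent)
        return "failure", updated
    return "success", dict(attributes)
```

Under this model, looping the failure relationship back to the processor would resend only the records that were not yet stored.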

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.