ConsumeKafka_0_10

Description:

Consumes messages from Apache Kafka. This processor is built specifically against the Kafka 0.10.x Consumer API. Note that there are cases where the consumer can get into an indefinitely stuck state. We are closely monitoring how this evolves in the Kafka community and will take advantage of those fixes as soon as we can. In the meantime, it is possible to enter states where the only resolution is to restart the JVM that NiFi runs on. The complementary NiFi processor for sending messages is PublishKafka_0_10.

Tags:

Kafka, Get, Ingest, Ingress, Topic, PubSub, Consume, 0.10.x

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Name | Default Value | Allowable Values | Description
Kafka Brokers | localhost:9092 | | A comma-separated list of known Kafka Brokers in the format <host>:<port>. Supports Expression Language: true
Security Protocol | PLAINTEXT | PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL | Protocol used to communicate with brokers. Corresponds to Kafka's 'security.protocol' property.
Kerberos Service Name | | | The Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config. Corresponds to Kafka's 'sasl.kerberos.service.name' property. It is ignored unless one of the SASL options of the <Security Protocol> property is selected.
Kerberos Principal | | | The Kerberos principal that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This principal will be set in Kafka's 'sasl.jaas.config' property.
Kerberos Keytab | | | The Kerberos keytab that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This keytab will be set in Kafka's 'sasl.jaas.config' property.
SSL Context Service | | Controller Service API: SSLContextService; Implementation: StandardSSLContextService | Specifies the SSL Context Service to use for communicating with Kafka.
Topic Name(s) | | | The name of the Kafka topic(s) to pull from. More than one can be supplied as a comma-separated list. Supports Expression Language: true
Topic Name Format | names | names (Topic is a full topic name or a comma-separated list of names); pattern (Topic is a regex using the Java Pattern syntax) | Specifies whether the Topic(s) provided are a comma-separated list of names or a single regular expression.
Group ID | | | A Group ID is used to identify consumers that are within the same consumer group. Corresponds to Kafka's 'group.id' property.
Offset Reset | latest | earliest (automatically reset the offset to the earliest offset); latest (automatically reset the offset to the latest offset); none (throw an exception to the consumer if no previous offset is found for the consumer's group) | Allows you to manage the condition when there is no initial offset in Kafka or when the current offset no longer exists on the server (e.g. because that data has been deleted). Corresponds to Kafka's 'auto.offset.reset' property.
Key Attribute Encoding | utf-8 | UTF-8 Encoded (the key is interpreted as a UTF-8 encoded string); Hex Encoded (the key is interpreted as arbitrary binary data and is encoded using hexadecimal characters with uppercase letters) | FlowFiles that are emitted have an attribute named 'kafka.key'. This property dictates how the value of the attribute should be encoded.
Message Demarcator | | | Since KafkaConsumer receives messages in batches, you can output FlowFiles that each contain all Kafka messages in a single batch for a given topic and partition. This property lets you provide a string (interpreted as UTF-8) used to demarcate the individual Kafka messages. This property is optional; if it is not provided, each Kafka message received results in its own FlowFile. To enter a special character such as 'new line', use CTRL+Enter or Shift+Enter, depending on the OS. Supports Expression Language: true
Max Poll Records | 10000 | | Specifies the maximum number of records Kafka should return in a single poll.
Max Uncommitted Time | 1 secs | | Specifies the maximum amount of time allowed to pass before offsets must be committed. This value determines how often offsets are committed. Committing offsets less often increases throughput but also widens the window of potential data duplication if a rebalance or JVM restart occurs between commits. This value is also related to Max Poll Records and the use of a message demarcator. With a message demarcator, far more messages can remain uncommitted than without one, because there is much less to keep track of in memory.
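
The difference between the two Topic Name Format values can be illustrated with a small sketch. This is not NiFi's implementation: the function name select_topics and the sample topic names are hypothetical, trimming whitespace around names and matching the whole topic name against the regex are assumptions, and Python's re syntax is close to but not identical to the Java Pattern syntax the processor expects.

```python
import re


def select_topics(topic_property, topic_format, available_topics):
    """Return the topics a consumer would read from (illustrative only)."""
    if topic_format == "names":
        # Comma-separated literal topic names; trimming whitespace is an assumption.
        return [name.strip() for name in topic_property.split(",") if name.strip()]
    if topic_format == "pattern":
        # A single regular expression; full-match semantics are an assumption.
        regex = re.compile(topic_property)
        return [topic for topic in available_topics if regex.fullmatch(topic)]
    raise ValueError(f"unknown topic format: {topic_format!r}")


# Hypothetical topics that exist on the cluster.
available = ["orders", "orders-eu", "payments"]

print(select_topics("orders, payments", "names", available))
print(select_topics("orders.*", "pattern", available))
```

With 'names', the property value alone decides the subscription. With 'pattern', the result depends on which topics exist on the cluster.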

Dynamic Properties:

Dynamic Properties allow the user to specify both the name and value of a property.

Name | Value | Description
The name of a Kafka configuration property. | The value of a given Kafka configuration property. | These properties will be added to the Kafka configuration after loading any provided configuration properties. If a dynamic property represents a property that was already set, its value will be ignored and a WARN message logged. For the list of available Kafka properties, please refer to: http://kafka.apache.org/documentation.html#configuration.
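
The precedence rule for dynamic properties can be sketched as follows. This is a minimal illustration, not the processor's code: the function merge_kafka_config, the logger name, and the sample values (including the group names) are hypothetical. The keys shown are standard Kafka consumer configuration names.

```python
import logging

logger = logging.getLogger("kafka_config_sketch")  # hypothetical logger name


def merge_kafka_config(processor_config, dynamic_properties):
    """Add dynamic properties without overriding values that are already set."""
    merged = dict(processor_config)
    for name, value in dynamic_properties.items():
        if name in merged:
            # Already set from the processor's own properties: ignore and warn.
            logger.warning("Ignoring dynamic property %r: already set to %r",
                           name, merged[name])
            continue
        merged[name] = value
    return merged


# Values derived from the processor's regular properties (sample values).
base = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "nifi-consumers",
    "auto.offset.reset": "latest",
}
# User-supplied dynamic properties; 'group.id' collides with a value set above.
dynamic = {"group.id": "other-group", "fetch.min.bytes": "1024"}

print(merge_kafka_config(base, dynamic))
```

Here 'fetch.min.bytes' is added, while the dynamic 'group.id' is dropped with a warning because the Group ID property already supplied it.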

Relationships:

Name | Description
success | FlowFiles received from Kafka. Depending on the demarcation strategy, each FlowFile holds either a single message or a bundle of messages grouped by topic and partition.
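
The two demarcation strategies can be sketched like this. The function bundle_messages and the dictionaries standing in for FlowFiles are hypothetical; how the processor batches records internally is not described here, so the grouping below only illustrates the documented outcome.

```python
from collections import defaultdict


def bundle_messages(records, demarcator=None):
    """Turn (topic, partition, value_bytes) records into FlowFile-like dicts."""
    if demarcator is None:
        # No demarcator: one FlowFile per Kafka message.
        return [{"kafka.topic": t, "kafka.partition": p, "content": v}
                for t, p, v in records]

    # Demarcator set: group messages by topic and partition.
    groups = defaultdict(list)
    for topic, partition, value in records:
        groups[(topic, partition)].append(value)

    separator = demarcator.encode("utf-8")  # the demarcator is interpreted as UTF-8
    flowfiles = []
    for (topic, partition), values in groups.items():
        flowfile = {"kafka.topic": topic, "kafka.partition": partition,
                    "content": separator.join(values)}
        if len(values) > 1:
            # kafka.count is written only when more than one message is bundled.
            flowfile["kafka.count"] = len(values)
        flowfiles.append(flowfile)
    return flowfiles


# Hypothetical batch returned by a single poll.
batch = [("orders", 0, b"a"), ("orders", 0, b"b"), ("orders", 1, b"c")]

print(len(bundle_messages(batch)))
print(bundle_messages(batch, demarcator="\n"))
```

Without a demarcator the batch yields three FlowFiles. With a newline demarcator it yields two, one per topic and partition pair.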

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
kafka.count | The number of messages written, if more than one.
kafka.key | The key of the message, if present and if the FlowFile holds a single message. How the key is encoded depends on the value of the 'Key Attribute Encoding' property.
kafka.offset | The offset of the message in the partition of the topic.
kafka.partition | The partition of the topic the message or message bundle is from.
kafka.topic | The topic the message or message bundle is from.
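
How the 'Key Attribute Encoding' property shapes the kafka.key attribute can be sketched as below. The function encode_key is hypothetical, and the selector string "hex" is an assumption (the property table lists only the display name 'Hex Encoded'). Handling of keys that are not valid UTF-8 is also not specified, so this sketch simply lets the decode error propagate.

```python
def encode_key(key_bytes, encoding="utf-8"):
    """Render a Kafka message key as a string attribute value (illustrative)."""
    if key_bytes is None:
        # No key on the message: no kafka.key attribute.
        return None
    if encoding == "utf-8":
        # Interpret the key as a UTF-8 encoded string.
        return key_bytes.decode("utf-8")
    if encoding == "hex":
        # Treat the key as arbitrary binary data; hexadecimal with uppercase letters.
        return key_bytes.hex().upper()
    raise ValueError(f"unknown key encoding: {encoding!r}")


print(encode_key(b"user-42"))
print(encode_key(bytes([0x00, 0xAB, 0xFF]), encoding="hex"))
```

Hex encoding is the safer choice when keys are not text, since every byte sequence has a hexadecimal representation.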

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component does not allow an incoming relationship.