ForkEnrichment 2.0.0

Bundle
org.apache.nifi | nifi-standard-nar
Description
Used in conjunction with the JoinEnrichment processor, this processor is responsible for adding the attributes that are necessary for the JoinEnrichment processor to perform its function. Each incoming FlowFile will be cloned. The original FlowFile will have appropriate attributes added and then be transferred to the 'original' relationship. The clone will have appropriate attributes added and then be routed to the 'enrichment' relationship. See the documentation for the JoinEnrichment processor (and especially its Additional Details) for more information on how these Processors work together and how to perform enrichment tasks in NiFi by using these Processors.
Tags
enrich, fork, join, record
Input Requirement
REQUIRED
Supports Sensitive Dynamic Properties
false
  • Additional Details for ForkEnrichment 2.0.0

    ForkEnrichment

    Introduction

    The ForkEnrichment processor is designed to be used in conjunction with the JoinEnrichment Processor. Used together, they provide a powerful mechanism for transforming data into a separate request payload for gathering enrichment data, gathering that enrichment data, optionally transforming the enrichment data, and finally joining together the original payload with the enrichment data.

    Typical Dataflow

    The flow starts with a ForkEnrichment processor, which takes in a FlowFile and produces two copies of it: one routed to the “original” relationship and the other to the “enrichment” relationship. Each copy has its own set of attributes added to it.

    The “original” FlowFile is routed to the JoinEnrichment processor, while the “enrichment” FlowFile is routed in a different direction. Each of these FlowFiles will have an attribute named “enrichment.group.id” with the same value. The JoinEnrichment processor then uses this attribute to correlate the two FlowFiles. The “enrichment.role” attribute will also be added to each FlowFile, but with a different value: the FlowFile routed to “original” will have an enrichment.role of ORIGINAL, while the FlowFile routed to “enrichment” will have an enrichment.role of ENRICHMENT.
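
    The attribute handling described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not NiFi's implementation: the function name fork_enrichment and the sample attribute "filename" are hypothetical, and the use of a random UUID for the group ID is an assumption (the text only states that both FlowFiles carry the same value).

```python
import uuid


def fork_enrichment(attributes: dict) -> tuple:
    """Return (original_attrs, enrichment_attrs) for one incoming FlowFile.

    Hypothetical helper: models only the attributes, not the FlowFile content.
    """
    # Assumption: a random UUID stands in for the shared group ID.
    group_id = str(uuid.uuid4())

    # Both copies keep the incoming attributes and share the same group ID,
    # but each one gets a different role.
    original = {
        **attributes,
        "enrichment.group.id": group_id,
        "enrichment.role": "ORIGINAL",
    }
    enrichment = {
        **attributes,
        "enrichment.group.id": group_id,
        "enrichment.role": "ENRICHMENT",
    }
    return original, enrichment


original, enrichment = fork_enrichment({"filename": "orders.csv"})
```

    After the call, both dictionaries hold the same “enrichment.group.id” and differ only in “enrichment.role”, which is the information a downstream join needs to match them up again.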

    The Processors that make up the “enrichment” path will vary from use case to use case. In a typical flow, we use the JoltTransformJSON processor to transform the original payload into the payload expected by our web service. We then use the InvokeHTTP processor to gather enrichment data that is relevant to our use case. Other common processors to use in this path include QueryRecord, UpdateRecord, ReplaceText, JoltTransformRecord, and ScriptedTransformRecord. It is also common to transform the response from the web service invoked via InvokeHTTP using one or more of these processors.

    After the enrichment data has been gathered, it does us little good unless we are able to somehow combine our enrichment data back with our original payload. To achieve this, we use the JoinEnrichment processor. It is responsible for combining records from both the “original” FlowFile and the “enrichment” FlowFile.
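
    As a rough sketch of the correlation step, the following Python pairs up FlowFiles by their “enrichment.group.id” and “enrichment.role” attributes. The function name pair_by_group, the sample group IDs, and the choice to skip incomplete groups are assumptions made for illustration; how JoinEnrichment actually waits for and handles unmatched FlowFiles is not covered here.

```python
from collections import defaultdict


def pair_by_group(flowfiles):
    """Group attribute maps by enrichment.group.id and return complete pairs.

    Hypothetical helper: each element of `flowfiles` is a dict of attributes.
    """
    groups = defaultdict(dict)
    for attrs in flowfiles:
        # Index each FlowFile by its group ID, then by its role.
        groups[attrs["enrichment.group.id"]][attrs["enrichment.role"]] = attrs

    # Assumption: only groups that contain both roles are returned.
    return {
        group_id: (roles["ORIGINAL"], roles["ENRICHMENT"])
        for group_id, roles in groups.items()
        if "ORIGINAL" in roles and "ENRICHMENT" in roles
    }


arrivals = [
    {"enrichment.group.id": "g1", "enrichment.role": "ORIGINAL"},
    {"enrichment.group.id": "g2", "enrichment.role": "ORIGINAL"},
    {"enrichment.group.id": "g1", "enrichment.role": "ENRICHMENT"},
]
pairs = pair_by_group(arrivals)
```

    In this sample, only group “g1” has both an ORIGINAL and an ENRICHMENT FlowFile, so it is the only pair returned.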

    The JoinEnrichment Processor is configured with a separate RecordReader for the “original” FlowFile and for the “enrichment” FlowFile. This means that the original data and the enrichment data can have entirely different schemas and can even be in different data formats. For example, our original payload may be CSV data, while our enrichment data is a JSON payload. Because we make use of RecordReaders, this is entirely okay. The Processor also requires a RecordWriter to use for writing out the enriched payload (i.e., the payload that contains the join of both the “original” and the “enrichment” data).
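
    To illustrate why separate readers make mixed formats unproblematic, here is a minimal Python sketch that reads CSV “original” data and JSON “enrichment” data into records and combines them. The sample data, the helper names, and the position-based pairing are all assumptions chosen for illustration; the join strategies that JoinEnrichment actually offers are described in its own documentation.

```python
import csv
import io
import json

# Hypothetical sample payloads in two different formats.
original_csv = "id,name\n1,Alice\n2,Bob\n"
enrichment_json = '[{"id": 1, "city": "Paris"}, {"id": 2, "city": "Oslo"}]'


def read_csv_records(text):
    """Stand-in for a CSV record reader: one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text):
    """Stand-in for a JSON record reader: expects a JSON array of objects."""
    return json.loads(text)


def join_by_position(original_records, enrichment_records):
    """Assumption: pair the Nth original record with the Nth enrichment record."""
    return [
        {"original": o, "enrichment": e}
        for o, e in zip(original_records, enrichment_records)
    ]


joined = join_by_position(
    read_csv_records(original_csv),
    read_json_records(enrichment_json),
)

# Stand-in for a record writer: emit the joined records as JSON.
output = json.dumps(joined, indent=2)
```

    Once both inputs have been parsed into records, the join no longer depends on the source format. Note that the CSV reader in this sketch yields string values ("1") while the JSON reader yields numbers (1), the kind of schema difference that separate readers are meant to absorb.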

    For details on how to join the original payload with the enrichment data, see the Additional Details of the JoinEnrichment Processor documentation.

Properties
Relationships
Name | Description
enrichment | A clone of the incoming FlowFile will be routed to this relationship, after adding appropriate attributes.
original | The incoming FlowFile will be routed to this relationship, after adding appropriate attributes.
Writes Attributes
Name | Description
enrichment.group.id | The Group ID to use in order to correlate the 'original' FlowFile with the 'enrichment' FlowFile.
enrichment.role | The role to use for enrichment. This will either be ORIGINAL or ENRICHMENT.
See Also