MergeContent 2.0.0
- Bundle
- org.apache.nifi | nifi-standard-nar
- Description
- Merges a Group of FlowFiles together based on a user-defined strategy and packages them into a single FlowFile. It is recommended that the Processor be configured with only a single incoming connection, as Groups of FlowFiles will not be created from FlowFiles in different connections. This processor updates the mime.type attribute as appropriate. NOTE: this processor should NOT be configured with Cron Driven for the Scheduling Strategy.
- Tags
- archive, concatenation, content, correlation, flowfile-stream, flowfile-stream-v3, merge, stream, tar, zip
- Input Requirement
- REQUIRED
- Supports Sensitive Dynamic Properties
- false
Properties
-
Attribute Strategy
Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.
- Display Name
- Attribute Strategy
- Description
- Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.
- API Name
- Attribute Strategy
- Default Value
- Keep Only Common Attributes
- Allowable Values
-
- Keep Only Common Attributes
- Keep All Unique Attributes
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
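The difference between the two strategies is easiest to see on concrete attribute maps. The following is a minimal Python sketch of the semantics described above, using plain dicts; it is an illustration, not NiFi's implementation.

```python
# Illustration of the two Attribute Strategy behaviors on plain dicts.
# This mirrors the semantics described above; it is not NiFi's code.
def keep_only_common(attr_maps: list[dict]) -> dict:
    # Keep only keys that appear on every FlowFile with an identical value.
    common = dict(attr_maps[0])
    for attrs in attr_maps[1:]:
        common = {k: v for k, v in common.items() if attrs.get(k) == v}
    return common

def keep_all_unique(attr_maps: list[dict]) -> dict:
    # Keep every key whose value never conflicts across FlowFiles.
    seen: dict = {}
    conflicting: set = set()
    for attrs in attr_maps:
        for key, value in attrs.items():
            if key in seen and seen[key] != value:
                conflicting.add(key)
            seen.setdefault(key, value)
    return {k: v for k, v in seen.items() if k not in conflicting}

maps = [{"a": "1", "b": "2"}, {"a": "1", "b": "3", "c": "4"}]
assert keep_only_common(maps) == {"a": "1"}
assert keep_all_unique(maps) == {"a": "1", "c": "4"}
```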
-
Compression Level
Specifies the compression level to use when using the Zip Merge Format; if not using the Zip Merge Format, this value is ignored
- Display Name
- Compression Level
- Description
- Specifies the compression level to use when using the Zip Merge Format; if not using the Zip Merge Format, this value is ignored
- API Name
- Compression Level
- Default Value
- 1
- Allowable Values
-
- 0
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Format is set to any of [ZIP]
-
Correlation Attribute Name
If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.
- Display Name
- Correlation Attribute Name
- Description
- If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.
- API Name
- Correlation Attribute Name
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Strategy is set to any of [Bin-Packing Algorithm]
-
Delimiter Strategy
Determines if Header, Footer, and Demarcator should point to files containing the respective content, or if the values of the properties should be used as the content.
- Display Name
- Delimiter Strategy
- Description
- Determines if Header, Footer, and Demarcator should point to files containing the respective content, or if the values of the properties should be used as the content.
- API Name
- Delimiter Strategy
- Default Value
- Do Not Use Delimiters
- Allowable Values
-
- Do Not Use Delimiters
- Filename
- Text
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Format is set to any of [Binary Concatenation]
-
Demarcator
Filename or text specifying the demarcator to use. If not specified, no demarcator is supplied.
- Display Name
- Demarcator
- Description
- Filename or text specifying the demarcator to use. If not specified, no demarcator is supplied.
- API Name
- Demarcator File
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Format is set to any of [Binary Concatenation]
- Delimiter Strategy is set to any of [Filename, Text]
-
Footer
Filename or text specifying the footer to use. If not specified, no footer is supplied.
- Display Name
- Footer
- Description
- Filename or text specifying the footer to use. If not specified, no footer is supplied.
- API Name
- Footer File
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Format is set to any of [Binary Concatenation]
- Delimiter Strategy is set to any of [Filename, Text]
-
Header
Filename or text specifying the header to use. If not specified, no header is supplied.
- Display Name
- Header
- Description
- Filename or text specifying the header to use. If not specified, no header is supplied.
- API Name
- Header File
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Format is set to any of [Binary Concatenation]
- Delimiter Strategy is set to any of [Filename, Text]
-
Keep Path
If using the Zip or Tar Merge Format, specifies whether or not the FlowFiles' paths should be included in their entry names.
- Display Name
- Keep Path
- Description
- If using the Zip or Tar Merge Format, specifies whether or not the FlowFiles' paths should be included in their entry names.
- API Name
- Keep Path
- Default Value
- false
- Allowable Values
-
- true
- false
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Format is set to any of [ZIP, TAR]
-
Max Bin Age
The maximum age of a Bin that will trigger a Bin to be complete. Expected format is <duration> <time unit> where <duration> is a positive integer and time unit is one of seconds, minutes, hours
- Display Name
- Max Bin Age
- Description
- The maximum age of a Bin that will trigger a Bin to be complete. Expected format is <duration> <time unit> where <duration> is a positive integer and time unit is one of seconds, minutes, hours
- API Name
- Max Bin Age
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
-
Maximum Group Size
The maximum size for the bundle. If not specified, there is no maximum.
- Display Name
- Maximum Group Size
- Description
- The maximum size for the bundle. If not specified, there is no maximum.
- API Name
- Maximum Group Size
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Strategy is set to any of [Bin-Packing Algorithm]
-
Maximum number of Bins
Specifies the maximum number of bins that can be held in memory at any one time
- Display Name
- Maximum number of Bins
- Description
- Specifies the maximum number of bins that can be held in memory at any one time
- API Name
- Maximum number of Bins
- Default Value
- 5
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Maximum Number of Entries
The maximum number of files to include in a bundle
- Display Name
- Maximum Number of Entries
- Description
- The maximum number of files to include in a bundle
- API Name
- Maximum Number of Entries
- Default Value
- 1000
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Strategy is set to any of [Bin-Packing Algorithm]
-
Merge Format
Determines the format that will be used to merge the content.
- Display Name
- Merge Format
- Description
- Determines the format that will be used to merge the content.
- API Name
- Merge Format
- Default Value
- Binary Concatenation
- Allowable Values
-
- TAR
- ZIP
- FlowFile Stream, v3
- FlowFile Stream, v2
- FlowFile Tar, v1
- Binary Concatenation
- Avro
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Merge Strategy
Specifies the algorithm used to merge content. The 'Defragment' algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. The 'Bin-Packing Algorithm' generates a FlowFile populated by arbitrarily chosen FlowFiles
- Display Name
- Merge Strategy
- Description
- Specifies the algorithm used to merge content. The 'Defragment' algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. The 'Bin-Packing Algorithm' generates a FlowFile populated by arbitrarily chosen FlowFiles
- API Name
- Merge Strategy
- Default Value
- Bin-Packing Algorithm
- Allowable Values
-
- Bin-Packing Algorithm
- Defragment
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
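For the Bin-Packing strategy, the interplay between the minimum/maximum entry and size thresholds, and the optional Correlation Attribute, can be sketched as follows. This is a conceptual illustration under assumed thresholds and names, not the processor's actual binning code.

```python
# Conceptual sketch of Bin-Packing binning (illustrative thresholds and names,
# not the processor's actual code). FlowFiles accumulate in a bin, keyed by
# the Correlation Attribute value when one is configured, until the minimums
# are met; a bin never grows past the maximums. Oversized FlowFiles are
# ignored here for simplicity.
from dataclasses import dataclass, field

MIN_ENTRIES, MAX_ENTRIES = 1, 1000
MIN_SIZE, MAX_SIZE = 128 * 1024**2, 256 * 1024**2  # 128 MB and 256 MB

@dataclass
class Bin:
    sizes: list = field(default_factory=list)

    def offer(self, size: int) -> bool:
        # Reject the FlowFile if it would push the bin past a maximum.
        if len(self.sizes) >= MAX_ENTRIES or sum(self.sizes) + size > MAX_SIZE:
            return False
        self.sizes.append(size)
        return True

    def ready(self) -> bool:
        # A bin may be merged once both minimums are satisfied.
        return len(self.sizes) >= MIN_ENTRIES and sum(self.sizes) >= MIN_SIZE

bins: dict[str, Bin] = {}

def route(attrs: dict, size: int, correlation_attr: str = "") -> None:
    key = attrs.get(correlation_attr, "") if correlation_attr else ""
    current = bins.setdefault(key, Bin())
    if not current.offer(size):
        merge(bins.pop(key))                    # bin is at capacity
        bins.setdefault(key, Bin()).offer(size)
    elif current.ready():
        merge(bins.pop(key))

def merge(b: Bin) -> None:
    print(f"merged {len(b.sizes)} FlowFiles totalling {sum(b.sizes)} bytes")
```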
-
Metadata Strategy
For FlowFiles whose input format supports metadata (e.g., Avro), this property determines which metadata should be added to the bundle. If 'Use First Metadata' is selected, the metadata keys/values from the first FlowFile to be bundled will be used. If 'Keep Only Common Metadata' is selected, only the metadata that exists on all FlowFiles in the bundle, with the same value, will be preserved. If 'Ignore Metadata' is selected, no metadata is transferred to the outgoing bundled FlowFile. If 'Do Not Merge Uncommon Metadata' is selected, any FlowFile whose metadata values do not match those of the first bundled FlowFile will not be merged.
- Display Name
- Metadata Strategy
- Description
- For FlowFiles whose input format supports metadata (e.g., Avro), this property determines which metadata should be added to the bundle. If 'Use First Metadata' is selected, the metadata keys/values from the first FlowFile to be bundled will be used. If 'Keep Only Common Metadata' is selected, only the metadata that exists on all FlowFiles in the bundle, with the same value, will be preserved. If 'Ignore Metadata' is selected, no metadata is transferred to the outgoing bundled FlowFile. If 'Do Not Merge Uncommon Metadata' is selected, any FlowFile whose metadata values do not match those of the first bundled FlowFile will not be merged.
- API Name
- mergecontent-metadata-strategy
- Default Value
- Do Not Merge Uncommon Metadata
- Allowable Values
-
- Use First Metadata
- Keep Only Common Metadata
- Do Not Merge Uncommon Metadata
- Ignore Metadata
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Format is set to any of [Avro]
-
Minimum Group Size
The minimum size for the bundle
- Display Name
- Minimum Group Size
- Description
- The minimum size for the bundle
- API Name
- Minimum Group Size
- Default Value
- 0 B
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Strategy is set to any of [Bin-Packing Algorithm]
-
Minimum Number of Entries
The minimum number of files to include in a bundle
- Display Name
- Minimum Number of Entries
- Description
- The minimum number of files to include in a bundle
- API Name
- Minimum Number of Entries
- Default Value
- 1
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Merge Strategy is set to any of [Bin-Packing Algorithm]
-
Tar Modified Time
If using the Tar Merge Format, specifies the modified timestamp to store for each Tar entry, either as an expression (e.g. ${file.lastModifiedTime}) or as a static value, both of which must match the ISO 8601 format 'yyyy-MM-dd'T'HH:mm:ssZ'.
- Display Name
- Tar Modified Time
- Description
- If using the Tar Merge Format, specifies the modified timestamp to store for each Tar entry, either as an expression (e.g. ${file.lastModifiedTime}) or as a static value, both of which must match the ISO 8601 format 'yyyy-MM-dd'T'HH:mm:ssZ'.
- API Name
- Tar Modified Time
- Default Value
- ${file.lastModifiedTime}
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Merge Format is set to any of [TAR]
System Resource Considerations
Resource | Description |
---|---|
MEMORY | While content is not stored in memory, the FlowFiles' attributes are. The configuration of MergeContent (Maximum Group Size, Max Bin Age, Maximum Number of Entries, Maximum number of Bins) will influence how much memory is used. If merging together many small FlowFiles, a two-stage approach may be necessary in order to avoid excessive use of memory. |
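To get a feel for why attribute memory matters, a rough back-of-envelope calculation follows; the per-attribute byte cost and attribute count are assumed figures for illustration, not measured NiFi values.

```python
# Rough back-of-envelope for attribute memory held in bins. The per-attribute
# cost and attribute count below are assumptions for illustration only.
ATTRS_PER_FLOWFILE = 20      # assumed typical attribute count
BYTES_PER_ATTR = 200         # assumed key + value + object overhead
MAX_ENTRIES_PER_BIN = 1_000
MAX_BINS = 5

worst_case = MAX_BINS * MAX_ENTRIES_PER_BIN * ATTRS_PER_FLOWFILE * BYTES_PER_ATTR
print(f"~{worst_case / 1024**2:.0f} MB of heap for attributes alone")

# Merging 1,000,000 small FlowFiles in one stage would mean ~4 GB of
# attributes in a single bin; two stages of 1,000 entries each keep the
# per-bin footprint near ~4 MB, which is why a two-stage approach helps.
```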
Relationships
Name | Description |
---|---|
merged | The FlowFile containing the merged content |
original | The FlowFiles that were used to create the bundle |
failure | If the bundle cannot be created, all FlowFiles that would have been used to create the bundle will be transferred to failure |
Reads Attributes
Name | Description |
---|---|
fragment.identifier | Applicable only if the <Merge Strategy> property is set to Defragment. All FlowFiles with the same value for this attribute will be bundled together. |
fragment.index | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates the order in which the fragments should be assembled. This attribute must be present on all FlowFiles when using the Defragment Merge Strategy and must be a unique (i.e., unique across all FlowFiles that have the same value for the "fragment.identifier" attribute) integer between 0 and the value of the fragment.count attribute. If two or more FlowFiles have the same value for the "fragment.identifier" attribute and the same value for the "fragment.index" attribute, the first FlowFile processed will be accepted and subsequent FlowFiles will not be accepted into the Bin. |
fragment.count | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates how many FlowFiles should be expected in the given bundle. At least one FlowFile must have this attribute in the bundle. If multiple FlowFiles contain the "fragment.count" attribute in a given bundle, all must have the same value. |
segment.original.filename | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute will be used for the filename of the completed merged FlowFile. |
tar.permissions | Applicable only if the <Merge Format> property is set to TAR. The value of this attribute must be 3 characters; each character must be in the range 0 to 7 (inclusive) and indicates the file permissions that should be used for the FlowFile's TAR entry. If this attribute is missing or has an invalid value, the default value of 644 will be used |
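The Defragment contract described in the fragment.* rows above can be expressed as a short validation-and-reassembly sketch. This is a Python illustration of the documented semantics, not NiFi source: fragments are validated against fragment.count, de-duplicated on fragment.index with the first occurrence winning, and concatenated in index order.

```python
# Sketch of the Defragment contract described above (illustration only).
# Each fragment is a dict with 'attributes' (str -> str) and 'content' (bytes).
def defragment(fragments: list[dict]) -> bytes:
    counts = {f["attributes"]["fragment.count"]
              for f in fragments if "fragment.count" in f["attributes"]}
    if len(counts) != 1:
        raise ValueError("fragment.count missing or inconsistent in bundle")
    expected = int(counts.pop())

    by_index: dict[int, bytes] = {}
    for f in fragments:
        idx = int(f["attributes"]["fragment.index"])
        by_index.setdefault(idx, f["content"])  # first occurrence wins

    if len(by_index) != expected:
        raise ValueError(f"expected {expected} fragments, got {len(by_index)}")
    return b"".join(by_index[i] for i in sorted(by_index))

parts = [
    {"attributes": {"fragment.identifier": "x", "fragment.index": "1",
                    "fragment.count": "2"}, "content": b"world"},
    {"attributes": {"fragment.identifier": "x", "fragment.index": "0",
                    "fragment.count": "2"}, "content": b"hello "},
]
assert defragment(parts) == b"hello world"
```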
Writes Attributes
Name | Description |
---|---|
filename | When more than 1 file is merged, the filename comes from the segment.original.filename attribute. If that attribute does not exist in the source FlowFiles, the filename is set to the current system time in nanoseconds. A filename extension may then be applied: if Merge Format is TAR, .tar is appended; if Merge Format is ZIP, .zip is appended; if Merge Format is FlowFileStream, .pkg is appended |
merge.count | The number of FlowFiles that were merged into this bundle |
merge.bin.age | The age of the bin, in milliseconds, when it was merged and output. Effectively this is the greatest amount of time that any FlowFile in this bundle remained waiting in this processor before it was output |
merge.uuid | UUID of the merged FlowFile; this value is also added to the attributes of the original FlowFiles. |
merge.reason | This processor allows several thresholds to be configured for merging FlowFiles. This attribute indicates which of those thresholds caused the FlowFiles to be merged. For an explanation of each of the possible values and their meanings, see the Processor's usage documentation and the 'Additional Details' page. |
Use Cases
-
Concatenate FlowFiles with textual content together in order to create fewer, larger FlowFiles.
- Description
- Concatenate FlowFiles with textual content together in order to create fewer, larger FlowFiles.
- Keywords
- concatenate, bundle, aggregate, bin, merge, combine, smash
- Configuration
"Merge Strategy" = "Bin Packing Algorithm" "Merge Format" = "Binary Concatenation" "Delimiter Strategy" = "Text" "Demarcator" = "\n" (a newline can be inserted by pressing Shift + Enter) "Minimum Number of Entries" = "1" "Maximum Number of Entries" = "500000000" "Minimum Group Size" = the minimum amount of data to write to an output FlowFile. A reasonable value might be "128 MB" "Maximum Group Size" = the maximum amount of data to write to an output FlowFile. A reasonable value might be "256 MB" "Max Bin Age" = the maximum amount of time to wait for incoming data before timing out and transferring the FlowFile along even though it is smaller than the Max Bin Age. A reasonable value might be "5 mins"
-
Concatenate FlowFiles with binary content together in order to create fewer, larger FlowFiles.
- Description
- Concatenate FlowFiles with binary content together in order to create fewer, larger FlowFiles.
- Notes
- Not all binary data can be concatenated together. Whether or not this configuration is valid depends on the type of your data.
- Keywords
- concatenate, bundle, aggregate, bin, merge, combine, smash
- Configuration
"Merge Strategy" = "Bin Packing Algorithm" "Merge Format" = "Binary Concatenation" "Delimiter Strategy" = "Text" "Minimum Number of Entries" = "1" "Maximum Number of Entries" = "500000000" "Minimum Group Size" = the minimum amount of data to write to an output FlowFile. A reasonable value might be "128 MB" "Maximum Group Size" = the maximum amount of data to write to an output FlowFile. A reasonable value might be "256 MB" "Max Bin Age" = the maximum amount of time to wait for incoming data before timing out and transferring the FlowFile along even though it is smaller than the Max Bin Age. A reasonable value might be "5 mins"
-
Reassemble a FlowFile that was previously split apart into smaller FlowFiles by a processor such as SplitText, UnpackContent, SplitRecord, etc.
- Description
- Reassemble a FlowFile that was previously split apart into smaller FlowFiles by a processor such as SplitText, UnpackContent, SplitRecord, etc.
- Keywords
- reassemble, repack, merge, recombine
- Configuration
"Merge Strategy" = "Defragment" "Merge Format" = the value of Merge Format depends on the desired output format. If the file was previously zipped together and was split apart by UnpackContent, a Merge Format of "ZIP" makes sense. If it was previously a .tar file, a Merge Format of "TAR" makes sense. If the data is textual, "Binary Concatenation" can be used to combine the text into a single document. "Delimiter Strategy" = "Text" "Max Bin Age" = the maximum amount of time to wait for incoming data before timing out and transferring the fragments to 'failure'. A reasonable value might be "5 mins" For textual data, "Demarcator" should be set to a newline (\n), set by pressing Shift+Enter in the UI. For binary data, "Demarcator" should be left blank.
See Also