-
Processors
- AttributeRollingWindow
- AttributesToCSV
- AttributesToJSON
- CalculateRecordStats
- CaptureChangeMySQL
- CompressContent
- ConnectWebSocket
- ConsumeAMQP
- ConsumeAzureEventHub
- ConsumeElasticsearch
- ConsumeGCPubSub
- ConsumeIMAP
- ConsumeJMS
- ConsumeKafka
- ConsumeKinesisStream
- ConsumeMQTT
- ConsumePOP3
- ConsumeSlack
- ConsumeTwitter
- ConsumeWindowsEventLog
- ControlRate
- ConvertCharacterSet
- ConvertRecord
- CopyAzureBlobStorage_v12
- CopyS3Object
- CountText
- CryptographicHashContent
- DebugFlow
- DecryptContentAge
- DecryptContentPGP
- DeduplicateRecord
- DeleteAzureBlobStorage_v12
- DeleteAzureDataLakeStorage
- DeleteByQueryElasticsearch
- DeleteDynamoDB
- DeleteFile
- DeleteGCSObject
- DeleteGridFS
- DeleteMongo
- DeleteS3Object
- DeleteSFTP
- DeleteSQS
- DetectDuplicate
- DistributeLoad
- DuplicateFlowFile
- EncodeContent
- EncryptContentAge
- EncryptContentPGP
- EnforceOrder
- EvaluateJsonPath
- EvaluateXPath
- EvaluateXQuery
- ExecuteGroovyScript
- ExecuteProcess
- ExecuteScript
- ExecuteSQL
- ExecuteSQLRecord
- ExecuteStreamCommand
- ExtractAvroMetadata
- ExtractEmailAttachments
- ExtractEmailHeaders
- ExtractGrok
- ExtractHL7Attributes
- ExtractRecordSchema
- ExtractText
- FetchAzureBlobStorage_v12
- FetchAzureDataLakeStorage
- FetchBoxFile
- FetchDistributedMapCache
- FetchDropbox
- FetchFile
- FetchFTP
- FetchGCSObject
- FetchGoogleDrive
- FetchGridFS
- FetchS3Object
- FetchSFTP
- FetchSmb
- FilterAttribute
- FlattenJson
- ForkEnrichment
- ForkRecord
- GenerateFlowFile
- GenerateRecord
- GenerateTableFetch
- GeoEnrichIP
- GeoEnrichIPRecord
- GeohashRecord
- GetAsanaObject
- GetAwsPollyJobStatus
- GetAwsTextractJobStatus
- GetAwsTranscribeJobStatus
- GetAwsTranslateJobStatus
- GetAzureEventHub
- GetAzureQueueStorage_v12
- GetDynamoDB
- GetElasticsearch
- GetFile
- GetFTP
- GetGcpVisionAnnotateFilesOperationStatus
- GetGcpVisionAnnotateImagesOperationStatus
- GetHubSpot
- GetMongo
- GetMongoRecord
- GetS3ObjectMetadata
- GetSFTP
- GetShopify
- GetSmbFile
- GetSNMP
- GetSplunk
- GetSQS
- GetWorkdayReport
- GetZendesk
- HandleHttpRequest
- HandleHttpResponse
- IdentifyMimeType
- InvokeHTTP
- InvokeScriptedProcessor
- ISPEnrichIP
- JoinEnrichment
- JoltTransformJSON
- JoltTransformRecord
- JSLTTransformJSON
- JsonQueryElasticsearch
- ListAzureBlobStorage_v12
- ListAzureDataLakeStorage
- ListBoxFile
- ListDatabaseTables
- ListDropbox
- ListenFTP
- ListenHTTP
- ListenOTLP
- ListenSlack
- ListenSyslog
- ListenTCP
- ListenTrapSNMP
- ListenUDP
- ListenUDPRecord
- ListenWebSocket
- ListFile
- ListFTP
- ListGCSBucket
- ListGoogleDrive
- ListS3
- ListSFTP
- ListSmb
- LogAttribute
- LogMessage
- LookupAttribute
- LookupRecord
- MergeContent
- MergeRecord
- ModifyBytes
- ModifyCompression
- MonitorActivity
- MoveAzureDataLakeStorage
- Notify
- PackageFlowFile
- PaginatedJsonQueryElasticsearch
- ParseEvtx
- ParseNetflowv5
- ParseSyslog
- ParseSyslog5424
- PartitionRecord
- PublishAMQP
- PublishGCPubSub
- PublishJMS
- PublishKafka
- PublishMQTT
- PublishSlack
- PutAzureBlobStorage_v12
- PutAzureCosmosDBRecord
- PutAzureDataExplorer
- PutAzureDataLakeStorage
- PutAzureEventHub
- PutAzureQueueStorage_v12
- PutBigQuery
- PutBoxFile
- PutCloudWatchMetric
- PutDatabaseRecord
- PutDistributedMapCache
- PutDropbox
- PutDynamoDB
- PutDynamoDBRecord
- PutElasticsearchJson
- PutElasticsearchRecord
- PutEmail
- PutFile
- PutFTP
- PutGCSObject
- PutGoogleDrive
- PutGridFS
- PutKinesisFirehose
- PutKinesisStream
- PutLambda
- PutMongo
- PutMongoBulkOperations
- PutMongoRecord
- PutRecord
- PutRedisHashRecord
- PutS3Object
- PutSalesforceObject
- PutSFTP
- PutSmbFile
- PutSNS
- PutSplunk
- PutSplunkHTTP
- PutSQL
- PutSQS
- PutSyslog
- PutTCP
- PutUDP
- PutWebSocket
- PutZendeskTicket
- QueryAirtableTable
- QueryAzureDataExplorer
- QueryDatabaseTable
- QueryDatabaseTableRecord
- QueryRecord
- QuerySalesforceObject
- QuerySplunkIndexingStatus
- RemoveRecordField
- RenameRecordField
- ReplaceText
- ReplaceTextWithMapping
- RetryFlowFile
- RouteHL7
- RouteOnAttribute
- RouteOnContent
- RouteText
- RunMongoAggregation
- SampleRecord
- ScanAttribute
- ScanContent
- ScriptedFilterRecord
- ScriptedPartitionRecord
- ScriptedTransformRecord
- ScriptedValidateRecord
- SearchElasticsearch
- SegmentContent
- SendTrapSNMP
- SetSNMP
- SignContentPGP
- SplitAvro
- SplitContent
- SplitExcel
- SplitJson
- SplitPCAP
- SplitRecord
- SplitText
- SplitXml
- StartAwsPollyJob
- StartAwsTextractJob
- StartAwsTranscribeJob
- StartAwsTranslateJob
- StartGcpVisionAnnotateFilesOperation
- StartGcpVisionAnnotateImagesOperation
- TagS3Object
- TailFile
- TransformXml
- UnpackContent
- UpdateAttribute
- UpdateByQueryElasticsearch
- UpdateCounter
- UpdateDatabaseTable
- UpdateRecord
- ValidateCsv
- ValidateJson
- ValidateRecord
- ValidateXml
- VerifyContentMAC
- VerifyContentPGP
- Wait
-
Controller Services
- ADLSCredentialsControllerService
- ADLSCredentialsControllerServiceLookup
- AmazonGlueSchemaRegistry
- ApicurioSchemaRegistry
- AvroReader
- AvroRecordSetWriter
- AvroSchemaRegistry
- AWSCredentialsProviderControllerService
- AzureBlobStorageFileResourceService
- AzureCosmosDBClientService
- AzureDataLakeStorageFileResourceService
- AzureEventHubRecordSink
- AzureStorageCredentialsControllerService_v12
- AzureStorageCredentialsControllerServiceLookup_v12
- CEFReader
- ConfluentEncodedSchemaReferenceReader
- ConfluentEncodedSchemaReferenceWriter
- ConfluentSchemaRegistry
- CSVReader
- CSVRecordLookupService
- CSVRecordSetWriter
- DatabaseRecordLookupService
- DatabaseRecordSink
- DatabaseTableSchemaRegistry
- DBCPConnectionPool
- DBCPConnectionPoolLookup
- DistributedMapCacheLookupService
- ElasticSearchClientServiceImpl
- ElasticSearchLookupService
- ElasticSearchStringLookupService
- EmailRecordSink
- EmbeddedHazelcastCacheManager
- ExcelReader
- ExternalHazelcastCacheManager
- FreeFormTextRecordSetWriter
- GCPCredentialsControllerService
- GCSFileResourceService
- GrokReader
- HazelcastMapCacheClient
- HikariCPConnectionPool
- HttpRecordSink
- IPLookupService
- JettyWebSocketClient
- JettyWebSocketServer
- JMSConnectionFactoryProvider
- JndiJmsConnectionFactoryProvider
- JsonConfigBasedBoxClientService
- JsonPathReader
- JsonRecordSetWriter
- JsonTreeReader
- Kafka3ConnectionService
- KerberosKeytabUserService
- KerberosPasswordUserService
- KerberosTicketCacheUserService
- LoggingRecordSink
- MapCacheClientService
- MapCacheServer
- MongoDBControllerService
- MongoDBLookupService
- PropertiesFileLookupService
- ProtobufReader
- ReaderLookup
- RecordSetWriterLookup
- RecordSinkServiceLookup
- RedisConnectionPoolService
- RedisDistributedMapCacheClientService
- RestLookupService
- S3FileResourceService
- ScriptedLookupService
- ScriptedReader
- ScriptedRecordSetWriter
- ScriptedRecordSink
- SetCacheClientService
- SetCacheServer
- SimpleCsvFileLookupService
- SimpleDatabaseLookupService
- SimpleKeyValueLookupService
- SimpleRedisDistributedMapCacheClientService
- SimpleScriptedLookupService
- SiteToSiteReportingRecordSink
- SlackRecordSink
- SmbjClientProviderService
- StandardAsanaClientProviderService
- StandardAzureCredentialsControllerService
- StandardDropboxCredentialService
- StandardFileResourceService
- StandardHashiCorpVaultClientService
- StandardHttpContextMap
- StandardJsonSchemaRegistry
- StandardKustoIngestService
- StandardKustoQueryService
- StandardOauth2AccessTokenProvider
- StandardPGPPrivateKeyService
- StandardPGPPublicKeyService
- StandardPrivateKeyService
- StandardProxyConfigurationService
- StandardRestrictedSSLContextService
- StandardS3EncryptionService
- StandardSSLContextService
- StandardWebClientServiceProvider
- Syslog5424Reader
- SyslogReader
- UDPEventRecordSink
- VolatileSchemaCache
- WindowsEventLogReader
- XMLFileLookupService
- XMLReader
- XMLRecordSetWriter
- YamlTreeReader
- ZendeskRecordSink
SearchElasticsearch 2.0.0
- Bundle
- org.apache.nifi | nifi-elasticsearch-restapi-nar
- Description
- A processor that allows the user to repeatedly run a paginated query (with aggregations) written with the Elasticsearch JSON DSL. Search After/Point in Time queries must include a valid "sort" field. The processor will retrieve multiple pages of results until either no more results are available or the Pagination Keep Alive expiration is reached, after which the query will restart with the first page of results being retrieved.
- Tags
- elasticsearch, elasticsearch5, elasticsearch6, elasticsearch7, elasticsearch8, json, page, query, scroll, search
- Input Requirement
- FORBIDDEN
- Supports Sensitive Dynamic Properties
- false
Properties
-
Query Attribute
If set, the executed query will be set on each result flowfile in the specified attribute.
- Display Name
- Query Attribute
- Description
- If set, the executed query will be set on each result flowfile in the specified attribute.
- API Name
- el-query-attribute
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
-
Client Service
An Elasticsearch client service to use for running queries.
- Display Name
- Client Service
- Description
- An Elasticsearch client service to use for running queries.
- API Name
- el-rest-client-service
- Service Interface
- org.apache.nifi.elasticsearch.ElasticSearchClientService
- Service Implementations
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Index
The name of the index to use.
- Display Name
- Index
- Description
- The name of the index to use.
- API Name
- el-rest-fetch-index
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- true
-
Aggregation Results Format
Format of Aggregation output.
- Display Name
- Aggregation Results Format
- Description
- Format of Aggregation output.
- API Name
- el-rest-format-aggregations
- Default Value
- FULL
- Allowable Values
-
- FULL
- BUCKETS_ONLY
- METADATA_ONLY
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Search Results Format
Format of Hits output.
- Display Name
- Search Results Format
- Description
- Format of Hits output.
- API Name
- el-rest-format-hits
- Default Value
- FULL
- Allowable Values
-
- FULL
- SOURCE_ONLY
- METADATA_ONLY
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Output No Hits
Output a "hits" flowfile even if no hits found for query. If true, an empty "hits" flowfile will be output even if "aggregations" are output.
- Display Name
- Output No Hits
- Description
- Output a "hits" flowfile even if no hits found for query. If true, an empty "hits" flowfile will be output even if "aggregations" are output.
- API Name
- el-rest-output-no-hits
- Default Value
- false
- Allowable Values
-
- true
- false
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Pagination Keep Alive
Pagination "keep_alive" period. Period Elasticsearch will keep the scroll/pit cursor alive in between requests (this is not the time expected for all pages to be returned, but the maximum allowed time for requests between page retrievals).
- Display Name
- Pagination Keep Alive
- Description
- Pagination "keep_alive" period. Period Elasticsearch will keep the scroll/pit cursor alive in between requests (this is not the time expected for all pages to be returned, but the maximum allowed time for requests between page retrievals).
- API Name
- el-rest-pagination-keep-alive
- Default Value
- 10 mins
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Pagination Type
Pagination method to use. Not all types are available for all Elasticsearch versions, check the Elasticsearch docs to confirm which are applicable and recommended for your service.
- Display Name
- Pagination Type
- Description
- Pagination method to use. Not all types are available for all Elasticsearch versions, check the Elasticsearch docs to confirm which are applicable and recommended for your service.
- API Name
- el-rest-pagination-type
- Default Value
- pagination-scroll
- Allowable Values
-
- SCROLL
- SEARCH_AFTER
- POINT_IN_TIME
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Query
A query in JSON syntax, not Lucene syntax. Ex: {"query":{"match":{"somefield":"somevalue"}}}. If the query is empty, a default JSON Object will be used, which will result in a "match_all" query in Elasticsearch.
- Display Name
- Query
- Description
- A query in JSON syntax, not Lucene syntax. Ex: {"query":{"match":{"somefield":"somevalue"}}}. If the query is empty, a default JSON Object will be used, which will result in a "match_all" query in Elasticsearch.
- API Name
- el-rest-query
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [full]
-
Query Clause
A "query" clause in JSON syntax, not Lucene syntax. Ex: {"match":{"somefield":"somevalue"}}. If the query is empty, a default JSON Object will be used, which will result in a "match_all" query in Elasticsearch.
- Display Name
- Query Clause
- Description
- A "query" clause in JSON syntax, not Lucene syntax. Ex: {"match":{"somefield":"somevalue"}}. If the query is empty, a default JSON Object will be used, which will result in a "match_all" query in Elasticsearch.
- API Name
- el-rest-query-clause
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Query Definition Style
How the JSON Query will be defined for use by the processor.
- Display Name
- Query Definition Style
- Description
- How the JSON Query will be defined for use by the processor.
- API Name
- el-rest-query-definition-style
- Default Value
- full
- Allowable Values
-
- FULL_QUERY
- BUILD_QUERY
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Aggregation Results Split
Output a flowfile containing all aggregations or one flowfile for each individual aggregation.
- Display Name
- Aggregation Results Split
- Description
- Output a flowfile containing all aggregations or one flowfile for each individual aggregation.
- API Name
- el-rest-split-up-aggregations
- Default Value
- splitUp-no
- Allowable Values
-
- PER_HIT
- PER_RESPONSE
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Search Results Split
Output a flowfile containing all hits or one flowfile for each individual hit or one flowfile containing all hits from all paged responses.
- Display Name
- Search Results Split
- Description
- Output a flowfile containing all hits or one flowfile for each individual hit or one flowfile containing all hits from all paged responses.
- API Name
- el-rest-split-up-hits
- Default Value
- splitUp-no
- Allowable Values
-
- PER_HIT
- PER_RESPONSE
- PER_QUERY
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Type
The type of this document (used by Elasticsearch for indexing and searching).
- Display Name
- Type
- Description
- The type of this document (used by Elasticsearch for indexing and searching).
- API Name
- el-rest-type
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
-
Aggregations
One or more query aggregations (or "aggs"), in JSON syntax. Ex: {"items": {"terms": {"field": "product", "size": 10}}}
- Display Name
- Aggregations
- Description
- One or more query aggregations (or "aggs"), in JSON syntax. Ex: {"items": {"terms": {"field": "product", "size": 10}}}
- API Name
- es-rest-query-aggs
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Fields
Fields of indexed documents to be retrieved, in JSON syntax. Ex: ["user.id", "http.response.*", {"field": "@timestamp", "format": "epoch_millis"}]
- Display Name
- Fields
- Description
- Fields of indexed documents to be retrieved, in JSON syntax. Ex: ["user.id", "http.response.*", {"field": "@timestamp", "format": "epoch_millis"}]
- API Name
- es-rest-query-fields
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Script Fields
Fields to created using script evaluation at query runtime, in JSON syntax. Ex: {"test1": {"script": {"lang": "painless", "source": "doc['price'].value * 2"}}, "test2": {"script": {"lang": "painless", "source": "doc['price'].value * params.factor", "params": {"factor": 2.0}}}}
- Display Name
- Script Fields
- Description
- Fields to created using script evaluation at query runtime, in JSON syntax. Ex: {"test1": {"script": {"lang": "painless", "source": "doc['price'].value * 2"}}, "test2": {"script": {"lang": "painless", "source": "doc['price'].value * params.factor", "params": {"factor": 2.0}}}}
- API Name
- es-rest-query-script-fields
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Sort
Sort results by one or more fields, in JSON syntax. Ex: [{"price" : {"order" : "asc", "mode" : "avg"}}, {"post_date" : {"format": "strict_date_optional_time_nanos"}}]
- Display Name
- Sort
- Description
- Sort results by one or more fields, in JSON syntax. Ex: [{"price" : {"order" : "asc", "mode" : "avg"}}, {"post_date" : {"format": "strict_date_optional_time_nanos"}}]
- API Name
- es-rest-query-sort
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Size
The maximum number of documents to retrieve in the query. If the query is paginated, this "size" applies to each page of the query, not the "size" of the entire result set.
- Display Name
- Size
- Description
- The maximum number of documents to retrieve in the query. If the query is paginated, this "size" applies to each page of the query, not the "size" of the entire result set.
- API Name
- es-rest-size
- Expression Language Scope
- Environment variables and FlowFile Attributes
- Sensitive
- false
- Required
- false
- Dependencies
-
- Query Definition Style is set to any of [build]
-
Max JSON Field String Length
The maximum allowed length of a string value when parsing a JSON document or attribute.
- Display Name
- Max JSON Field String Length
- Description
- The maximum allowed length of a string value when parsing a JSON document or attribute.
- API Name
- Max JSON Field String Length
- Default Value
- 20 MB
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
Dynamic Properties
-
The name of a URL query parameter to add
Adds the specified property name/value as a query parameter in the Elasticsearch URL used for processing. These parameters will override any matching parameters in the query request body. For SCROLL type queries, these parameters are only used in the initial (first page) query as the Elasticsearch Scroll API does not support the same query parameters for subsequent pages of data.
- Name
- The name of a URL query parameter to add
- Description
- Adds the specified property name/value as a query parameter in the Elasticsearch URL used for processing. These parameters will override any matching parameters in the query request body. For SCROLL type queries, these parameters are only used in the initial (first page) query as the Elasticsearch Scroll API does not support the same query parameters for subsequent pages of data.
- Value
- The value of the URL query parameter
- Expression Language Scope
- ENVIRONMENT
State Management
Scopes | Description |
---|---|
LOCAL | The pagination state (scrollId, searchAfter, pitId, hitCount, pageCount, pageExpirationTimestamp) is retained in between invocations of this processor until the Scroll/PiT has expired (when the current time is later than the last query execution plus the Pagination Keep Alive interval). |
System Resource Considerations
Resource | Description |
---|---|
MEMORY | Care should be taken on the size of each page because each response from Elasticsearch will be loaded into memory all at once and converted into the resulting flowfiles. |
Relationships
Name | Description |
---|---|
aggregations | Aggregations are routed to this relationship. |
hits | Search hits are routed to this relationship. |
Writes Attributes
Name | Description |
---|---|
mime.type | application/json |
aggregation.name | The name of the aggregation whose results are in the output flowfile |
aggregation.number | The number of the aggregation whose results are in the output flowfile |
page.number | The number of the page (request), starting from 1, in which the results were returned that are in the output flowfile |
hit.count | The number of hits that are in the output flowfile |
elasticsearch.query.error | The error message provided by Elasticsearch if there is an error querying the index. |
See Also