Sniffing

The Elasticsearch Sniffer can be used to locate Elasticsearch Nodes within the Cluster to which you are connecting. This can be beneficial if your cluster changes dynamically over time, e.g. when new Nodes are added to maintain performance during heavy load.

Sniffing can also be used to update the list of Hosts within the Cluster if a connection Failure is encountered during operation. To use "Sniff on Failure", you must also enable "Sniff Cluster Nodes".

Sniffing does not make sense in every situation, for example if:

It may also be necessary to set some of the Elasticsearch Networking Advanced Settings, such as network.publish_host, to ensure that the HTTP Hosts found by the Sniffer are accessible by NiFi. For example, Elasticsearch may publish a network-internal publish_host that is inaccessible to NiFi, when it should instead publish an address/IP that NiFi can reach. It may also be necessary to add this same address to Elasticsearch's network.bind_host list.
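To check what the Sniffer would hand back to NiFi, it can help to inspect the HTTP publish address each node reports. The following is a minimal sketch, not the client's actual implementation. It assumes the address is a string of the form "ip:port" or "hostname/ip:port", and the class and method names (PublishAddress, parse) are hypothetical.

```java
// Hypothetical helper: splits a node's published HTTP address into host and port
// so it can be compared with what is reachable from the NiFi host.
final class PublishAddress {
    final String host;
    final int port;

    PublishAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Assumed input forms: "ip:port" or "hostname/ip:port".
    static PublishAddress parse(String raw) {
        String address = raw;
        String hostname = null;
        int slash = raw.indexOf('/');
        if (slash >= 0) {
            // A hostname precedes the IP; prefer it as the host to connect to.
            hostname = raw.substring(0, slash);
            address = raw.substring(slash + 1);
        }
        // Use the last colon so bracketed IPv6 literals such as "[::1]:9200" still split correctly.
        int colon = address.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No port in publish address: " + raw);
        }
        String ip = address.substring(0, colon);
        int port = Integer.parseInt(address.substring(colon + 1));
        return new PublishAddress(hostname != null ? hostname : ip, port);
    }

    public static void main(String[] args) {
        PublishAddress a = PublishAddress.parse("es-node-1/10.0.0.5:9200");
        System.out.println(a.host + " on port " + a.port);
    }
}
```

If the parsed host is an internal address that NiFi cannot resolve or route to, that is a sign network.publish_host needs adjusting or that Sniffing is not a good fit.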

See "Elasticsearch sniffing best practices: What, when, why, how" for more details.

Resource Usage Considerations

This Elasticsearch client relies on a RestClient that uses the Apache HTTP Async Client. By default, it starts one dispatcher thread and a number of worker threads used by the connection manager. There are as many worker threads as there are locally detected processors/cores on the NiFi host. Consequently, it is highly recommended to have only one instance of this controller service per remote Elasticsearch destination, and to share that instance across all of the Elasticsearch processors in the NiFi flows. A very high number of instances could lead to resource starvation and result in out-of-memory (OOM) errors.
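The arithmetic above can be sketched as follows, assuming the default configuration described here (one dispatcher thread plus one worker thread per detected core for each controller service instance). The class and method names are hypothetical and only illustrate the estimate.

```java
// Hypothetical estimate of client threads created by N controller service instances,
// assuming the defaults described above: 1 dispatcher + 1 worker per core, per instance.
final class ClientThreadEstimate {

    static int threadsPerInstance(int cores) {
        return 1 + cores;
    }

    static int totalThreads(int instances, int cores) {
        return instances * threadsPerInstance(cores);
    }

    public static void main(String[] args) {
        // Number of processors/cores detected on the local host.
        int cores = Runtime.getRuntime().availableProcessors();
        for (int instances : new int[] {1, 10, 50}) {
            System.out.println(instances + " instance(s) on " + cores + " core(s): about "
                    + totalThreads(instances, cores) + " client threads");
        }
    }
}
```

On an 8-core host, for instance, a single shared instance would account for about 9 threads under these assumptions, while 20 separate instances would account for about 180.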