GetHTTP

Description:

Fetches data from an HTTP or HTTPS URL and writes the data to the content of a FlowFile. Once the content has been fetched, the ETag and Last-Modified values are remembered (if the web server supports these concepts). This allows the Processor to fetch the data again only if the remote data has changed or the state has been cleared. That is, once the content has been fetched from the given URL, it will not be fetched again until the content on the remote server changes. Note that due to limitations on state management, stored Last-Modified and ETag values never expire. If the URL in GetHTTP uses Expression Language that can evaluate to an unbounded number of distinct URLs, the stored state grows without limit and Out of Memory errors may occur.
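
The change detection described above corresponds to an HTTP conditional GET. The following Python sketch illustrates the idea only; it is not the Processor's implementation. The function names conditional_headers and fetch_if_changed, the layout of the state dictionary, and the single timeout parameter (which loosely mirrors the 30 sec defaults) are assumptions made for this example.

```python
import urllib.error
import urllib.request


def conditional_headers(state):
    """Build conditional-request headers from previously remembered validators."""
    headers = {}
    if state.get("etag"):
        headers["If-None-Match"] = state["etag"]
    if state.get("last_modified"):
        headers["If-Modified-Since"] = state["last_modified"]
    return headers


def fetch_if_changed(url, state, timeout=30):
    """Return (body, new_state); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(state))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
            # Remember the validators the server sent, if any.
            new_state = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
            return body, new_state
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: nothing to fetch, keep the existing state.
            return None, state
        raise
```

With an empty state the request carries no conditional headers, so the first call always fetches. Later calls fetch again only if the server reports a change or the state has been reset to an empty dictionary.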

Tags:

get, fetch, poll, http, https, ingest, source, input

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.

Name | Default Value | Allowable Values | Description
URL |  |  | The URL to pull from. Supports Expression Language: true
Filename |  |  | The filename to assign to the file when pulled. Supports Expression Language: true
SSL Context Service |  | Controller Service API: SSLContextService; Implementation: StandardSSLContextService | The Controller Service to use in order to obtain an SSL Context
Username |  |  | Username required to access the URL
Password |  |  | Password required to access the URL. Sensitive Property: true
Connection Timeout | 30 sec |  | How long to wait when attempting to connect to the remote server before giving up
Data Timeout | 30 sec |  | How long to wait between receiving segments of data from the remote server before giving up and discarding the partial file
User Agent |  |  | What to report as the User Agent when we connect to the remote server
Accept Content-Type |  |  | If specified, requests will only accept the provided Content-Type
Follow Redirects | false | true; false | If we receive a 3xx HTTP Status Code from the server, indicates whether or not we should follow the redirect that the server specifies
Redirect Cookie Policy | default | default: Default cookie policy that provides a higher degree of compatibility with common cookie management of popular HTTP agents for non-standard (Netscape style) cookies; standard: RFC 6265 compliant cookie policy (interoperability profile); strict: RFC 6265 compliant cookie policy (strict profile); netscape: Netscape draft compliant cookie policy; ignore: A cookie policy that ignores cookies | When an HTTP server responds to a request with a redirect, this is the cookie policy used to copy cookies to the following request.
Proxy Host |  |  | The fully qualified hostname or IP address of the proxy server
Proxy Port |  |  | The port of the proxy server

Dynamic Properties:

Dynamic Properties allow the user to specify both the name and value of a property.

Name | Value | Description
Header Name | The Expression Language to be used to populate the header value | The additional headers to be sent by the processor whenever making a new HTTP request. Setting a dynamic property name to XYZ and value to ${attribute} will result in the header 'XYZ: attribute_value' being sent to the HTTP endpoint
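
The XYZ example above can be sketched in Python as a simple substitution. This is a deliberately simplified illustration, not the NiFi Expression Language: it handles only plain ${name} references (no functions or nesting). The function name resolve_headers, the values dictionary, and the choice to substitute an empty string for unknown names are all assumptions of this sketch.

```python
import re

# Matches a plain ${name} reference; Expression Language functions are not handled.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def resolve_headers(dynamic_properties, values):
    """Turn dynamic properties into HTTP headers by substituting ${name} references."""
    def substitute(text):
        # Unknown names are replaced by an empty string in this sketch.
        return _REFERENCE.sub(lambda m: str(values.get(m.group(1), "")), text)

    return {name: substitute(value) for name, value in dynamic_properties.items()}


headers = resolve_headers({"XYZ": "${attribute}"}, {"attribute": "attribute_value"})
print(headers)  # {'XYZ': 'attribute_value'}
```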

Relationships:

Name | Description
success | All files are transferred to the success relationship

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
filename | The filename is set to the name of the file on the remote server
mime.type | The MIME Type of the FlowFile, as reported by the HTTP Content-Type header

State management:

Scope | Description
LOCAL | Stores Last Modified Time and ETag headers returned by server so that the same data will not be fetched multiple times.
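
Because stored entries never expire, a URL that keeps changing adds a new entry on every distinct value. The Python sketch below illustrates that growth under the assumption that state is kept per URL; the ValidatorStore class and the example URLs are hypothetical and do not reflect how NiFi actually lays out its state.

```python
class ValidatorStore:
    """Hypothetical per-URL store for ETag and Last-Modified values."""

    def __init__(self):
        self._entries = {}

    def remember(self, url, etag, last_modified):
        # Entries are only overwritten for the same URL, never expired.
        self._entries[url] = {"etag": etag, "last_modified": last_modified}

    def lookup(self, url):
        return self._entries.get(url, {})

    def clear(self):
        # Clearing the state is the only way entries are removed.
        self._entries.clear()

    def __len__(self):
        return len(self._entries)


store = ValidatorStore()
# A URL built from a changing value yields a new entry each time.
for day in range(1, 4):
    store.remember(f"https://example.com/report?day={day}", f'"v{day}"', None)
print(len(store))  # 3
```

A fixed URL keeps the store at a single entry, while a URL that embeds a timestamp or counter grows it on every run until the state is cleared.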

Restricted:

This component is not restricted.

Input requirement:

This component does not allow an incoming relationship.