FetchAzureDataLakeStorage

Description:

Fetch the specified file from Azure Data Lake Storage

Tags:

azure, microsoft, cloud, storage, adlsgen2, datalake

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name: ADLS Credentials
API Name: adls-credentials-service
Allowable Values: Controller Service API: ADLSCredentialsService. Implementations: ADLSCredentialsControllerServiceLookup, ADLSCredentialsControllerService
Description: Controller Service used to obtain Azure Credentials.

Display Name: Filesystem Name
API Name: filesystem-name
Description: Name of the Azure Storage File System (also called Container). It is assumed to be already existing.
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: Directory Name
API Name: directory-name
Description: Name of the Azure Storage Directory. The Directory Name cannot contain a leading '/'. The root directory can be designated by the empty string value. In case of the PutAzureDataLakeStorage processor, the directory will be created if not already existing.
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: File Name
API Name: file-name
Default Value: ${azure.filename}
Description: The filename
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: Range Start
API Name: range-start
Description: The byte position at which to start reading from the object. An empty value or a value of zero will start reading at the beginning of the object.
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: Range Length
API Name: range-length
Description: The number of bytes to download from the object, starting from the Range Start. An empty value or a value that extends beyond the end of the object will read to the end of the object.
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: Number of Retries
API Name: number-of-retries
Default Value: 0
Description: The number of automatic retries to perform if the download fails.
Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)

Display Name: Proxy Configuration Service
API Name: proxy-configuration-service
Allowable Values: Controller Service API: ProxyConfigurationService. Implementation: StandardProxyConfigurationService
Description: Specifies the Proxy Configuration Controller Service to proxy network requests. Supported proxies: HTTP, SOCKS. In case of SOCKS, it is not guaranteed that the selected SOCKS Version will be used by the processor.
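
The interaction between Range Start and Range Length can be illustrated with a small Python sketch. It is not the processor's implementation; it only restates the rules given in the property descriptions. The function name resolve_range is hypothetical, None stands in for an empty property value, values are assumed to be non-negative byte counts, and the handling of a start position beyond the end of the object is an assumption, since the descriptions do not cover that case.

```python
from typing import Optional, Tuple


def resolve_range(object_size: int,
                  range_start: Optional[int] = None,
                  range_length: Optional[int] = None) -> Tuple[int, int]:
    """Return (first_byte, byte_count) for a download of an object.

    None represents an empty property value.
    """
    # An empty value or zero starts reading at the beginning of the object.
    start = range_start or 0
    # Assumption: a start past the end yields an empty read, not an error.
    start = min(start, object_size)
    remaining = object_size - start
    # An empty length, or one extending past the end, reads to the end.
    if range_length is None or range_length > remaining:
        return start, remaining
    return start, range_length


if __name__ == "__main__":
    print(resolve_range(100))          # both properties empty
    print(resolve_range(100, 10, 20))  # a window inside the object
    print(resolve_range(100, 90, 50))  # length runs past the end
```

Under these assumptions, leaving both properties empty reads the whole object, and an oversized Range Length is truncated to the bytes that remain after Range Start.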

Relationships:

Name: failure
Description: Files that could not be fetched from Azure storage for some reason are transferred to this relationship

Name: success
Description: Files that have been successfully fetched from Azure storage are transferred to this relationship

Reads Attributes:

None specified.

Writes Attributes:

Name: azure.datalake.storage.statusCode
Description: The HTTP error code (if available) from the failed operation

Name: azure.datalake.storage.errorCode
Description: The Azure Data Lake Storage moniker of the failed operation

Name: azure.datalake.storage.errorMessage
Description: The Azure Data Lake Storage error message from the failed operation
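
As an illustration of how these attributes might be consumed downstream, the following Python sketch builds a one-line summary from a dictionary of flow file attributes. The function name describe_failure and the sample values are hypothetical; only the three attribute names come from the list above. Because the status code is documented as present only "if available", every lookup falls back to a placeholder.

```python
from typing import Mapping

PREFIX = "azure.datalake.storage."


def describe_failure(attributes: Mapping[str, str]) -> str:
    """Summarize the failure attributes of a flow file in one line."""
    # Any of the three attributes may be absent, so each lookup has a fallback.
    status = attributes.get(PREFIX + "statusCode", "n/a")
    code = attributes.get(PREFIX + "errorCode", "unknown")
    message = attributes.get(PREFIX + "errorMessage", "")
    return f"HTTP {status} [{code}] {message}".rstrip()


if __name__ == "__main__":
    # Sample values for illustration only.
    sample = {
        PREFIX + "statusCode": "404",
        PREFIX + "errorCode": "PathNotFound",
        PREFIX + "errorMessage": "The specified path does not exist.",
    }
    print(describe_failure(sample))
```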

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

Example Use Cases Involving Other Components:

Use Case:

Retrieve all files in an Azure Data Lake Storage directory

Keywords:

azure, datalake, adls, state, retrieve, fetch, all, stream

Components involved:

Component Type: org.apache.nifi.processors.azure.storage.ListAzureDataLakeStorage

Configuration:

The "Filesystem Name" property should be set to the name of the Azure Filesystem (also known as a Container) that files reside in. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_FILESYSTEM}.

Configure the "Directory Name" property to specify the name of the directory in the file system. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_DIRECTORY}.

The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.

The 'success' Relationship of this Processor is then connected to FetchAzureDataLakeStorage.



Component Type: org.apache.nifi.processors.azure.storage.FetchAzureDataLakeStorage

Configuration:

"Filesystem Name" = "${azure.filesystem}"

"Directory Name" = "${azure.directory}"

"File Name" = "${azure.filename}"

The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.
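
The effect of the three Expression Language references above can be sketched in Python. This is a deliberately simplified stand-in: NiFi's Expression Language supports functions and much more, whereas the hypothetical resolve helper below only substitutes plain ${name} references from a dictionary of flow file attributes. The attribute values are invented for illustration, and the attributes are assumed to have been written by the preceding listing step. The leading '/' check and the empty-string root directory follow the Directory Name property description.

```python
import re
from typing import Mapping

# Matches plain attribute references such as ${azure.filename}.
_REFERENCE = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")

# Property values as configured in the use case above.
PROPERTIES = {
    "Filesystem Name": "${azure.filesystem}",
    "Directory Name": "${azure.directory}",
    "File Name": "${azure.filename}",
}


def resolve(value: str, attributes: Mapping[str, str]) -> str:
    """Replace each ${name} with the matching attribute (empty if missing)."""
    return _REFERENCE.sub(lambda m: attributes.get(m.group(1), ""), value)


def target_path(properties: Mapping[str, str],
                attributes: Mapping[str, str]) -> str:
    """Build the filesystem/directory/file path the fetch would address."""
    filesystem = resolve(properties["Filesystem Name"], attributes)
    directory = resolve(properties["Directory Name"], attributes)
    filename = resolve(properties["File Name"], attributes)
    if directory.startswith("/"):
        raise ValueError("Directory Name cannot contain a leading '/'")
    # The empty string designates the root directory.
    parts = [filesystem, directory, filename] if directory else [filesystem, filename]
    return "/".join(parts)


if __name__ == "__main__":
    # Hypothetical attributes on a flow file produced by the listing step.
    listed = {
        "azure.filesystem": "landing",
        "azure.directory": "2024/01",
        "azure.filename": "data.csv",
    }
    print(target_path(PROPERTIES, listed))
```

Because each listed flow file carries its own attribute values, the same FetchAzureDataLakeStorage configuration addresses a different file for every flow file it receives.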





System Resource Considerations:

None specified.

See Also:

PutAzureDataLakeStorage, DeleteAzureDataLakeStorage, ListAzureDataLakeStorage