ListGoogleDrive 2.0.0

Bundle
org.apache.nifi | nifi-gcp-nar
Description
Lists the concrete files (shortcuts are ignored) in a Google Drive folder. If the 'Record Writer' property is set, a single output FlowFile is created, and each file in the listing is written to it as one record. Otherwise, an individual FlowFile is created for each file in the listing, with the file's metadata written as FlowFile attributes. This Processor is designed to run on the Primary Node only in a cluster. If the Primary Node changes, the new Primary Node will pick up where the previous node left off without duplicating all of the data. Please see Additional Details to set up access to Google Drive.
Tags
drive, google, storage
Input Requirement
FORBIDDEN
Supports Sensitive Dynamic Properties
false
  • Additional Details for ListGoogleDrive 2.0.0

    ListGoogleDrive

    Accessing Google Drive from NiFi

    This processor uses Google Cloud credentials to authenticate with Google Drive. The following steps are required to prepare the Google Cloud and Google Drive accounts for the processor:

    1. Enable the Google Drive API in Google Cloud
    2. Grant access to the Google Drive folder
      • In Google Cloud Console navigate to IAM & Admin -> Service Accounts.
      • Note the email address of the service account you are going to use.
      • Navigate to the folder to be listed in Google Drive.
      • Right-click the folder and select Share.
      • Enter the service account email.
    3. Find Folder ID
      • Navigate to the folder to be listed in Google Drive and open it. The Folder ID is the last segment of the URL shown in your browser. For example, if the URL were https://drive.google.com/drive/folders/1trTraPVCnX5_TNwO8d9P_bz278xWOmGm, the Folder ID would be 1trTraPVCnX5_TNwO8d9P_bz278xWOmGm.
    4. Set the Folder ID in the 'Folder ID' property
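
    Step 3 can be sketched in a few lines of Python. This is only an illustration of reading the ID from a copied browser URL, not part of NiFi; the helper name folder_id_from_url is hypothetical, and it assumes the folder URL has the form .../folders/<id>, as in the example above.

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the Folder ID, i.e. the path segment that follows 'folders'.

    Hypothetical helper: assumes a URL shaped like .../folders/<id>.
    """
    path = urlparse(url).path  # ignores any ?query or #fragment part
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[-2] != "folders":
        raise ValueError(f"not a Google Drive folder URL: {url!r}")
    return segments[-1]


example_url = "https://drive.google.com/drive/folders/1trTraPVCnX5_TNwO8d9P_bz278xWOmGm"
folder_id = folder_id_from_url(example_url)
print(folder_id)
```

    The resulting value is what goes into the 'Folder ID' property in step 4.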
Properties
State Management
Scopes Description
CLUSTER The processor stores the data it needs to keep track of which files have already been listed. What exactly is stored depends on the 'Listing Strategy'. State is stored across the cluster so that this Processor can run on the Primary Node only and, if a new Primary Node is selected, the new node can pick up where the previous node left off without duplicating the data.
Relationships
Name Description
success All FlowFiles that are created are routed to success
Writes Attributes
Name Description
drive.id The id of the file
filename The name of the file
mime.type The MIME type of the file
drive.size The size of the file
drive.timestamp The last modified time or created time of the file, whichever is later. Both are considered because a file's original modified date is preserved when it is uploaded to Google Drive, while its created time is set to the time of the upload. Uploaded files can still be modified afterwards, so either value may be the more recent one.
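
The "whichever is later" rule for drive.timestamp can be illustrated with a minimal Python sketch. It is not the processor's implementation; the function name listing_timestamp and the sample dates are made up for the example, and both times are assumed to be timezone-aware datetime values.

```python
from datetime import datetime, timezone


def listing_timestamp(created: datetime, modified: datetime) -> datetime:
    """Pick the later of the created and last modified times (illustrative only)."""
    return max(created, modified)


# Case 1: an old file uploaded recently. The original modified date is
# preserved, so the created (upload) time is the later of the two.
modified = datetime(2020, 5, 1, tzinfo=timezone.utc)
created = datetime(2024, 1, 15, tzinfo=timezone.utc)
uploaded_ts = listing_timestamp(created, modified)

# Case 2: the same file edited in Drive after the upload. Now the
# modified time is the later one.
edited = datetime(2024, 3, 2, tzinfo=timezone.utc)
edited_ts = listing_timestamp(created, edited)

print(uploaded_ts.isoformat(), edited_ts.isoformat())
```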
See Also