QueryAirtableTable 2.0.0

Bundle
org.apache.nifi | nifi-airtable-nar
Description
Query records from an Airtable table. Records are incrementally retrieved based on the last modified time of the records. Records can also be further filtered by setting the 'Custom Filter' property which supports the formulas provided by the Airtable API. This processor is intended to be run on the Primary Node only.
Tags
airtable, database, query
Input Requirement
FORBIDDEN
Supports Sensitive Dynamic Properties
false
  • Additional Details for QueryAirtableTable 2.0.0

    QueryAirtableTable

    Description

    Airtable is a spreadsheet-database hybrid. In Airtable, an application is called a base, and each base can have multiple tables. A table consists of records (rows), and each record can have multiple fields (columns). The QueryAirtableTable processor can query records from a single base and table via Airtable’s REST API. The processor uses streams so that it can handle a large number of records. It can also split large record sets into multiple FlowFiles, just like a database processor.

    Personal Access Token

    Please note that API Keys have been deprecated; Airtable now provides Personal Access Tokens (PATs) instead. Airtable REST API calls require a PAT, which must be passed in each request. An Airtable account is required to generate the PAT.
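
To illustrate how a PAT accompanies a request, the following minimal sketch builds (but does not send) a request to the Airtable REST API with the token in a Bearer Authorization header. The function name and the base ID, table name and token values are placeholders invented for this example; the processor itself only needs the token configured as a property.

```python
import urllib.request
from urllib.parse import quote

AIRTABLE_API_URL = "https://api.airtable.com/v0"  # public Airtable REST endpoint


def build_list_records_request(base_id: str, table: str, pat: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists the records of one table.

    Hypothetical helper for illustration; it is not part of NiFi.
    """
    # Path segments are percent-encoded so table names with spaces stay valid.
    url = f"{AIRTABLE_API_URL}/{quote(base_id, safe='')}/{quote(table, safe='')}"
    # The PAT is passed as a Bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {pat}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder values; substitute your own base ID, table name and token.
request = build_list_records_request("appXXXXXXXXXXXXXX", "My Table", "patXXXX.YYYY")
```

Passing the request to urllib.request.urlopen would perform the call; that step is left out so that the sketch runs without network access or a real token.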

    API rate limit

    The Airtable REST API limits the number of requests that can be sent on a per-base basis to avoid bottlenecks. Currently, this limit is 5 requests per second per base. If this limit is exceeded, you can’t make another request for 30 seconds. It’s your responsibility to handle this rate limit by configuring Yield Duration and Run Schedule properly. It is recommended to start with the default settings and to increase both parameters if rate limit issues occur.
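
The arithmetic behind the rate limit can be sketched as follows: 5 requests per second means at least 0.2 seconds between requests, and exceeding the limit costs a 30-second pause. The class below is a hypothetical client-side throttle written for illustration only; it is not NiFi code. In the processor, spacing between runs corresponds loosely to Run Schedule and the pause after a rejected request to Yield Duration. The sketch assumes the API signals a rate-limit violation with HTTP status 429.

```python
import time


class RequestThrottle:
    """Client-side pacing for a per-base request limit (illustrative sketch)."""

    def __init__(self, max_per_second=5, penalty_seconds=30.0, clock=time.monotonic):
        self.min_interval = 1.0 / max_per_second  # 0.2 s for 5 requests/second
        self.penalty_seconds = penalty_seconds    # lockout after exceeding the limit
        self.clock = clock                        # injectable for testing
        self.next_allowed = 0.0                   # earliest time the next request may go out

    def delay_before_next_request(self) -> float:
        """Return the seconds to wait before the next request and reserve its slot."""
        now = self.clock()
        wait = max(0.0, self.next_allowed - now)
        # The slot after this one opens one interval after this request is sent.
        self.next_allowed = max(now, self.next_allowed) + self.min_interval
        return wait

    def rate_limited(self) -> None:
        """Call when the API rejects a request for exceeding the limit (assumed HTTP 429)."""
        self.next_allowed = self.clock() + self.penalty_seconds
```

A caller would sleep for the returned delay before each request and call rate_limited() whenever a request is rejected.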

    Metadata API

    Currently, the Metadata API of Airtable is unstable, and we don’t provide a way to use it. Until it becomes stable, you can set up a ConvertRecord or MergeRecord processor with a JsonTreeReader to read the content and convert it into a Record with a schema.

Properties
State Management
Scopes Description
CLUSTER The last successful query's time is stored in order to enable incremental loading. The initial query returns all the records in the table and each subsequent query filters the records by their last modified time. In other words, only records that were updated after the last successful query are returned by the next query. State is stored across the cluster, so this Processor can run only on the Primary Node and if a new Primary Node is selected, the new node can pick up where the previous one left off without duplicating the data.
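
The incremental loading described above can be sketched as an Airtable filter formula built from the stored timestamp. This is a minimal illustration, not the processor's actual code: the function name is hypothetical, and the exact formula the processor sends is not shown in this document. The formula functions used (IS_AFTER, LAST_MODIFIED_TIME, DATETIME_PARSE, AND) are standard Airtable formula functions.

```python
from datetime import datetime, timezone
from typing import Optional


def build_filter_formula(last_query_time: Optional[datetime],
                         custom_filter: Optional[str] = None) -> Optional[str]:
    """Combine an optional custom filter with a last-modified-time condition.

    Hypothetical helper: returns None when no filtering is needed, which
    corresponds to the initial query that returns every record.
    """
    parts = []
    if custom_filter:
        parts.append(custom_filter)
    if last_query_time is not None:
        # Normalize to UTC; a naive datetime would be interpreted as local time.
        stamp = last_query_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
        parts.append(f'IS_AFTER(LAST_MODIFIED_TIME(), DATETIME_PARSE("{stamp}"))')
    if not parts:
        return None
    if len(parts) == 1:
        return parts[0]
    return "AND(" + ", ".join(parts) + ")"
```

On the first run there is no stored time, so no time condition is added; on later runs the stored time of the last successful query limits the result to records modified since then.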
Relationships
Name Description
success For FlowFiles created as a result of a successful query.
Writes Attributes
Name Description
record.count Sets the number of records in the FlowFile.
fragment.identifier If 'Max Records Per FlowFile' is set then all FlowFiles from the same query result set will have the same value for the fragment.identifier attribute. This can then be used to correlate the results.
fragment.count If 'Max Records Per FlowFile' is set then this is the total number of FlowFiles produced by a single query result set. This can be used in conjunction with the fragment.identifier attribute in order to know how many FlowFiles belonged to the same query result set.
fragment.index If 'Max Records Per FlowFile' is set then this is the position of this FlowFile in the list of outgoing FlowFiles that were all derived from the same query result set. This can be used in conjunction with the fragment.identifier attribute to know which FlowFiles originated from the same query result set and in what order the FlowFiles were produced.
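
As an illustration of how the three fragment attributes work together, the following sketch regroups FlowFile attribute maps by fragment.identifier, uses fragment.count to decide whether a set is complete, and orders each set by fragment.index. FlowFiles are modeled here as plain dictionaries of string attributes, and the function name is hypothetical; inside NiFi a downstream processor would normally do this work.

```python
from collections import defaultdict


def reassemble(flowfiles):
    """Return complete fragment sets, keyed by fragment.identifier and ordered by fragment.index.

    `flowfiles` is an iterable of attribute dictionaries (an assumed stand-in
    for real FlowFiles). Sets with missing fragments are left out.
    """
    groups = defaultdict(list)
    for attrs in flowfiles:
        groups[attrs["fragment.identifier"]].append(attrs)

    complete = {}
    for identifier, members in groups.items():
        # Attribute values are strings, so convert before comparing or sorting.
        expected = int(members[0]["fragment.count"])
        if len(members) == expected:
            complete[identifier] = sorted(members, key=lambda a: int(a["fragment.index"]))
    return complete
```

A set whose fragments have not all arrived is simply omitted from the result, so the function can be called again once more FlowFiles are available.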