ListDatabaseTables

Description:

Generates a set of flow files, each containing attributes corresponding to metadata about a table from a database connection. Once metadata about a table has been fetched, it will not be fetched again until the Refresh Interval (if set) has elapsed, or until state has been manually cleared.

Tags:

sql, list, jdbc, table, database

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Display NameAPI NameDefault ValueAllowable ValuesDescription
Database Connection Pooling Servicelist-db-tables-db-connectionController Service API:
DBCPService
Implementations: HadoopDBCPConnectionPool
DBCPConnectionPool
DBCPConnectionPoolLookup
HikariCPConnectionPool
The Controller Service that is used to obtain connection to database
Cataloglist-db-tables-catalogThe name of a catalog from which to list database tables. The name must match the catalog name as it is stored in the database. If the property is not set, the catalog name will not be used to narrow the search for tables. If the property is set to an empty string, tables without a catalog will be listed.
Schema Patternlist-db-tables-schema-patternA pattern for matching schemas in the database. Within a pattern, "%" means match any substring of 0 or more characters, and "_" means match any one character. The pattern must match the schema name as it is stored in the database. If the property is not set, the schema name will not be used to narrow the search for tables. If the property is set to an empty string, tables without a schema will be listed.
Table Name Patternlist-db-tables-name-patternA pattern for matching tables in the database. Within a pattern, "%" means match any substring of 0 or more characters, and "_" means match any one character. The pattern must match the table name as it is stored in the database. If the property is not set, all tables will be retrieved.
Table Typeslist-db-tables-typesTABLEA comma-separated list of table types to include. For example, some databases support TABLE and VIEW types. If the property is not set, tables of all types will be returned.
Include Countlist-db-include-countfalse
  • true
  • false
Whether to include the table's row count as a flow file attribute. This affects performance as a database query will be generated for each table in the retrieved list.
Record Writerrecord-writerController Service API:
RecordSetWriterFactory
Implementations: ScriptedRecordSetWriter
ParquetRecordSetWriter
RecordSetWriterLookup
CSVRecordSetWriter
FreeFormTextRecordSetWriter
JsonRecordSetWriter
AvroRecordSetWriter
XMLRecordSetWriter
Specifies the Record Writer to use for creating the listing. If not specified, one FlowFile will be created for each entity that is listed. If the Record Writer is specified, all entities will be written to a single FlowFile instead of adding attributes to individual FlowFiles.
Refresh Intervallist-db-refresh-interval0 secThe amount of time to elapse before resetting the processor state, thereby causing all current tables to be listed. During this interval, the processor may continue to run, but tables that have already been listed will not be re-listed. However new/added tables will be listed as the processor runs. A value of zero means the state will never be automatically reset, the user must Clear State manually.

Relationships:

NameDescription
successAll FlowFiles that are received are routed to success

Reads Attributes:

None specified.

Writes Attributes:

NameDescription
db.table.nameContains the name of a database table from the connection
db.table.catalogContains the name of the catalog to which the table belongs (may be null)
db.table.schemaContains the name of the schema to which the table belongs (may be null)
db.table.fullnameContains the fully-qualifed table name (possibly including catalog, schema, etc.)
db.table.typeContains the type of the database table from the connection. Typical types are "TABLE", "VIEW", "SYSTEM TABLE", "GLOBAL TEMPORARY", "LOCAL TEMPORARY", "ALIAS", "SYNONYM"
db.table.remarksContains the name of a database table from the connection
db.table.countContains the number of rows in the table

State management:

ScopeDescription
CLUSTERAfter performing a listing of tables, the timestamp of the query is stored. This allows the Processor to not re-list tables the next time that the Processor is run. Specifying the refresh interval in the processor properties will indicate that when the processor detects the interval has elapsed, the state will be reset and tables will be re-listed as a result. This processor is meant to be run on the primary node only.

Restricted:

This component is not restricted.

Input requirement:

This component does not allow an incoming relationship.

System Resource Considerations:

None specified.