Streams tweets from Twitter's streaming API v2. The stream provides a sample stream or a search stream based on previously uploaded rules. This processor also provides a pass through for certain fields of the tweet to be returned as part of the response. See https://developer.twitter.com/en/docs/twitter-api/data-dictionary/introduction for more information regarding the Tweet object model.
twitter, tweets, social media, status, json
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
Stream Endpoint | stream-endpoint | Sample Stream | Sample Stream, Search Stream | The source from which the processor will consume Tweets. |
Base Path | base-path | https://api.twitter.com | | The base path that the processor will use for making HTTP requests. The default value should be sufficient for most use cases. |
Bearer Token | bearer-token | | | The Bearer Token provided by Twitter. Sensitive Property: true |
Queue Size | queue-size | 10000 | | Maximum size of the internal queue for streamed messages. |
Batch Size | batch-size | 1000 | | The maximum number of Tweets to be written to a single FlowFile. Fewer Tweets are written if fewer are available in the queue at the time of processor invocation. |
Backoff Attempts | backoff-attempts | 5 | | The number of reconnection attempts the processor will make if the stream disconnects for any reason, before throwing an exception. To start a stream after this exception occurs and the connection is fixed, stop and restart the processor. If the value of this property is 0, then backoff will never occur and the processor will always need to be restarted if the stream fails. |
Backoff Time | backoff-time | 1 mins | | The duration to back off before requesting a new stream if the current one fails for any reason. Increases by a factor of 2 every time a restart fails. |
Maximum Backoff Time | maximum-backoff-time | 5 mins | | The maximum duration to back off before attempting a new stream. It is recommended that this value be much higher than the 'Backoff Time' property. |
Connect Timeout | connect-timeout | 10 secs | | The maximum time in which the client should establish a connection with the Twitter API before timing out. Setting the value to 0 disables connection timeouts. |
Read Timeout | read-timeout | 10 secs | | The maximum time of inactivity between receiving tweets from Twitter through the API before a timeout. Setting the value to 0 disables read timeouts. |
Backfill Minutes | backfill-minutes | 0 | | The number of minutes (up to 5 minutes) of streaming data to be requested after a disconnect. Only available for projects with academic research access. See https://developer.twitter.com/en/docs/twitter-api/tweets/filtered-stream/integrate/recovery-and-redundancy-features |
Tweet Fields | tweet-fields | | | A comma-separated list of tweet fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet for proper usage. Possible field values include: attachments, author_id, context_annotations, conversation_id, created_at, entities, geo, id, in_reply_to_user_id, lang, non_public_metrics, organic_metrics, possibly_sensitive, promoted_metrics, public_metrics, referenced_tweets, reply_settings, source, text, withheld |
User Fields | user-fields | | | A comma-separated list of user fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user for proper usage. Possible field values include: created_at, description, entities, id, location, name, pinned_tweet_id, profile_image_url, protected, public_metrics, url, username, verified, withheld |
Media Fields | media-fields | | | A comma-separated list of media fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/media for proper usage. Possible field values include: alt_text, duration_ms, height, media_key, non_public_metrics, organic_metrics, preview_image_url, promoted_metrics, public_metrics, type, url, width |
Poll Fields | poll-fields | | | A comma-separated list of poll fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/poll for proper usage. Possible field values include: duration_minutes, end_datetime, id, options, voting_status |
Place Fields | place-fields | | | A comma-separated list of place fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/place for proper usage. Possible field values include: contained_within, country, country_code, full_name, geo, id, name, place_type |
Expansions | expansions | | | A comma-separated list of expansions for objects in the returned tweet. See https://developer.twitter.com/en/docs/twitter-api/expansions for proper usage. Possible field values include: author_id, referenced_tweets.id, referenced_tweets.id.author_id, entities.mentions.username, attachments.poll_ids, attachments.media_keys, in_reply_to_user_id, geo.place_id |
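
The interaction between Backoff Attempts, Backoff Time, and Maximum Backoff Time can be illustrated with a minimal Python sketch. It assumes the wait doubles after each failed reconnection and is capped at the maximum; the function name `backoff_schedule` is hypothetical, and the processor's actual rounding and capping behavior may differ.

```python
from datetime import timedelta
from typing import List


def backoff_schedule(attempts: int, initial: timedelta, maximum: timedelta) -> List[timedelta]:
    """Return the wait applied before each reconnection attempt.

    Assumption: the wait starts at `initial`, doubles after every failed
    attempt, and never exceeds `maximum`.
    """
    waits = []
    wait = min(initial, maximum)
    for _ in range(attempts):
        waits.append(wait)
        wait = min(wait * 2, maximum)
    return waits


# Default property values: 5 attempts, 1 min backoff, 5 mins maximum.
schedule = backoff_schedule(5, timedelta(minutes=1), timedelta(minutes=5))
print([int(w.total_seconds() // 60) for w in schedule])  # [1, 2, 4, 5, 5]
```

Under these assumptions the defaults reach the cap on the fourth attempt, which is one way to see why a Maximum Backoff Time well above the Backoff Time is recommended.
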
Name | Description |
---|---|
success | FlowFiles containing an array of one or more Tweets |
Name | Description |
---|---|
mime.type | The MIME Type set to application/json |
tweets | The number of Tweets in the FlowFile |
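
How Queue Size, Batch Size, and the written attributes relate can be sketched roughly as follows. This is an illustrative Python model only, not the processor's implementation: the function `drain_batch` and the sample tweet payloads are hypothetical, and it assumes each queued message is a single JSON-encoded Tweet.

```python
import json
import queue
from typing import Dict, Optional, Tuple


def drain_batch(tweets: "queue.Queue[str]", batch_size: int) -> Optional[Tuple[str, Dict[str, str]]]:
    """Take up to `batch_size` queued tweet JSON strings without blocking.

    Returns (content, attributes), or None when the queue is empty, since
    a FlowFile is described as holding one or more Tweets.
    """
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(json.loads(tweets.get_nowait()))
        except queue.Empty:
            break  # fewer tweets available than the batch size allows
    if not batch:
        return None
    content = json.dumps(batch)  # a JSON array of Tweets
    attributes = {"mime.type": "application/json", "tweets": str(len(batch))}
    return content, attributes


# Hypothetical usage with the default Queue Size and Batch Size.
q = queue.Queue(maxsize=10000)
for i in range(3):
    q.put(json.dumps({"data": {"id": str(i), "text": "hello"}}))

content, attributes = drain_batch(q, batch_size=1000)
print(attributes["tweets"])  # 3
```

Only three Tweets are queued here, so the batch holds three entries even though the batch size permits 1000.
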