ConsumeTwitter

Description:

Streams Tweets from Twitter's streaming API v2. The processor consumes either the sample stream or the search stream, which is filtered by previously uploaded rules. It can also request that selected Tweet fields be returned as part of the response. See https://developer.twitter.com/en/docs/twitter-api/data-dictionary/introduction for more information regarding the Tweet object model.
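
As a rough illustration of how an endpoint choice and the requested fields could translate into a streaming request, the sketch below composes a URL for either stream. The endpoint paths and query parameter names (tweet.fields, expansions, backfill_minutes) follow the Twitter API v2 reference linked in this document. The helper build_stream_url and the way it assembles the request are assumptions made for illustration, not the processor's actual implementation.

```python
from urllib.parse import urlencode

# Endpoint paths as given in the Twitter API v2 reference linked above.
ENDPOINT_PATHS = {
    "Sample Stream": "/2/tweets/sample/stream",
    "Search Stream": "/2/tweets/search/stream",
}


def build_stream_url(base_path, endpoint, fields=None, expansions=None,
                     backfill_minutes=0):
    """Compose a streaming URL (hypothetical helper, not processor code).

    fields maps a query parameter name such as "tweet.fields" to a
    comma-separated list of field names.
    """
    params = {}
    for name, value in (fields or {}).items():
        if value:  # skip properties that were left empty
            params[name] = value
    if expansions:
        params["expansions"] = expansions
    if backfill_minutes > 0:
        params["backfill_minutes"] = str(backfill_minutes)

    url = base_path.rstrip("/") + ENDPOINT_PATHS[endpoint]
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return url + ("?" + query if query else "")


url = build_stream_url(
    "https://api.twitter.com",
    "Sample Stream",
    fields={"tweet.fields": "created_at,lang"},
)
```

A real request would also carry the bearer token, typically in an Authorization: Bearer header.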

Tags:

twitter, tweets, social media, status, json

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Stream Endpoint
  API Name: stream-endpoint
  Default Value: Sample Stream
  Allowable Values:
    • Sample Stream: Streams about one percent of all Tweets. https://developer.twitter.com/en/docs/twitter-api/tweets/volume-streams/api-reference/get-tweets-sample-stream
    • Search Stream: Produces Tweets that match filtering rules configured on Twitter services. At least one well-formed filtering rule must be configured. https://developer.twitter.com/en/docs/twitter-api/tweets/filtered-stream/api-reference/get-tweets-search-stream
  Description: The source from which the processor will consume Tweets.

Base Path
  API Name: base-path
  Default Value: https://api.twitter.com
  Description: The base path that the processor will use for making HTTP requests. The default value should be sufficient for most use cases.

Bearer Token
  API Name: bearer-token
  Description: The Bearer Token provided by Twitter.
  Sensitive Property: true

Queue Size
  API Name: queue-size
  Default Value: 10000
  Description: The maximum size of the internal queue for streamed messages.

Batch Size
  API Name: batch-size
  Default Value: 1000
  Description: The maximum number of Tweets to be written to a single FlowFile. Fewer Tweets are written if fewer are available in the queue at the time of processor invocation.

Backoff Attempts
  API Name: backoff-attempts
  Default Value: 5
  Description: The number of reconnection attempts the processor will make after the stream disconnects for any reason, before throwing an exception. To start a stream after this exception occurs and the connection is fixed, stop and restart the processor. If the value of this property is 0, backoff will never occur and the processor will always need to be restarted if the stream fails.

Backoff Time
  API Name: backoff-time
  Default Value: 1 mins
  Description: The duration to back off before requesting a new stream if the current one fails for any reason. Increases by a factor of 2 every time a restart fails.

Maximum Backoff Time
  API Name: maximum-backoff-time
  Default Value: 5 mins
  Description: The maximum duration to back off before attempting a new stream. It is recommended that this value be much higher than the 'Backoff Time' property.

Connect Timeout
  API Name: connect-timeout
  Default Value: 10 secs
  Description: The maximum time in which the client should establish a connection with the Twitter API before timing out. Setting the value to 0 disables connection timeouts.

Read Timeout
  API Name: read-timeout
  Default Value: 10 secs
  Description: The maximum time of inactivity between receiving Tweets from Twitter through the API before a timeout. Setting the value to 0 disables read timeouts.

Backfill Minutes
  API Name: backfill-minutes
  Default Value: 0
  Description: The number of minutes (up to 5 minutes) of streaming data to be requested after a disconnect. Only available for projects with academic research access. See https://developer.twitter.com/en/docs/twitter-api/tweets/filtered-stream/integrate/recovery-and-redundancy-features

Tweet Fields
  API Name: tweet-fields
  Description: A comma-separated list of tweet fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet for proper usage. Possible field values include: attachments, author_id, context_annotations, conversation_id, created_at, entities, geo, id, in_reply_to_user_id, lang, non_public_metrics, organic_metrics, possibly_sensitive, promoted_metrics, public_metrics, referenced_tweets, reply_settings, source, text, withheld

User Fields
  API Name: user-fields
  Description: A comma-separated list of user fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user for proper usage. Possible field values include: created_at, description, entities, id, location, name, pinned_tweet_id, profile_image_url, protected, public_metrics, url, username, verified, withheld

Media Fields
  API Name: media-fields
  Description: A comma-separated list of media fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/media for proper usage. Possible field values include: alt_text, duration_ms, height, media_key, non_public_metrics, organic_metrics, preview_image_url, promoted_metrics, public_metrics, type, url, width

Poll Fields
  API Name: poll-fields
  Description: A comma-separated list of poll fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/poll for proper usage. Possible field values include: duration_minutes, end_datetime, id, options, voting_status

Place Fields
  API Name: place-fields
  Description: A comma-separated list of place fields to be returned as part of the tweet. Refer to https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/place for proper usage. Possible field values include: contained_within, country, country_code, full_name, geo, id, name, place_type

Expansions
  API Name: expansions
  Description: A comma-separated list of expansions for objects in the returned tweet. See https://developer.twitter.com/en/docs/twitter-api/expansions for proper usage. Possible field values include: author_id, referenced_tweets.id, referenced_tweets.id.author_id, entities.mentions.username, attachments.poll_ids, attachments.media_keys, in_reply_to_user_id, geo.place_id
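
To make the interaction between Backoff Attempts, Backoff Time, and Maximum Backoff Time concrete, the sketch below computes the wait before each reconnection attempt. It assumes the first wait equals Backoff Time, that each later wait doubles, and that every wait is capped at Maximum Backoff Time. The function backoff_schedule is hypothetical, and the processor's exact timing may differ.

```python
def backoff_schedule(attempts=5, backoff_secs=60, max_backoff_secs=300):
    """Return the wait in seconds before each reconnection attempt.

    Defaults mirror the documented defaults: 5 attempts, 1 min, 5 mins.
    Assumption: doubling starts from the configured Backoff Time.
    """
    waits = []
    wait = backoff_secs
    for _ in range(attempts):
        waits.append(min(wait, max_backoff_secs))  # cap at the maximum
        wait *= 2  # double after every failed restart
    return waits


print(backoff_schedule())  # [60, 120, 240, 300, 300]
```

Under these assumptions the defaults reach the 5 minute cap on the fourth attempt, which illustrates why a Maximum Backoff Time well above Backoff Time is recommended.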

Relationships:

success
  Description: FlowFiles containing an array of one or more Tweets
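
The batching behaviour implied by the Queue Size and Batch Size properties can be sketched as follows: on each invocation, take whatever is queued, up to the batch size, and serialize it as one JSON array. The function drain_batch and the use of Python's queue module are illustrative assumptions, not the processor's own code.

```python
import json
import queue


def drain_batch(tweet_queue, batch_size=1000):
    """Take up to batch_size Tweets without blocking (hypothetical helper).

    Returns a JSON array string, or None when the queue is empty, since a
    FlowFile is described as holding one or more Tweets.
    """
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(tweet_queue.get_nowait())
        except queue.Empty:
            break  # fewer Tweets available than the batch size
    if not batch:
        return None
    return json.dumps(batch)


# Bounded queue standing in for the internal queue (Queue Size = 10000).
tweets = queue.Queue(maxsize=10000)
for i in range(3):
    tweets.put({"id": str(i), "text": "example"})

first = drain_batch(tweets, batch_size=2)   # two Tweets
second = drain_batch(tweets, batch_size=2)  # the remaining one
third = drain_batch(tweets, batch_size=2)   # None, queue is empty
```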

Reads Attributes:

None specified.

Writes Attributes:

mime.type
  Description: The MIME Type set to application/json
tweets
  Description: The number of Tweets in the FlowFile
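
A downstream consumer could use these attributes as a sanity check on the content it receives. The sketch below assumes the FlowFile content and attributes have already been exported as a string and a dictionary. The function check_flowfile is hypothetical and does not use any NiFi API.

```python
import json


def check_flowfile(content, attributes):
    """Return True when the attributes agree with the JSON content."""
    tweets = json.loads(content)
    if not isinstance(tweets, list):
        raise ValueError("expected a JSON array of Tweets")
    # Attribute values are strings, so convert the count before comparing.
    count_matches = int(attributes.get("tweets", -1)) == len(tweets)
    type_matches = attributes.get("mime.type") == "application/json"
    return count_matches and type_matches


ok = check_flowfile(
    '[{"id": "1", "text": "a"}, {"id": "2", "text": "b"}]',
    {"mime.type": "application/json", "tweets": "2"},
)
```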

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component does not allow an incoming relationship.

System Resource Considerations:

None specified.