Overview
With the S3 connector, Birdie can import data in Parquet format from AWS S3 or from any storage service that implements the S3 API, such as Google Cloud Storage. Once a day, the connector checks for new objects (files) and, if it finds any, imports the records they contain.
Requirements
Dedicated bucket for Birdie Integration
The Birdie integration requires a service account with read-only access. Write access is optional and can be granted to let support teams perform initial or manual dataset uploads.
AWS Docs
See docs on creating a user and generating an access key for the user.
If you do not wish to provide a service account, see the docs to create a role, then create an IAM Policy that allows Birdie to assume that role (see the docs). Reach out and we'll provide the ID for the Birdie Account.
Create an IAM Policy that gives the user/role S3 Read or S3 Read/Write Access to the specific bucket. See the docs on how to do this.
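For the role-based option, the policy that lets another AWS account assume the role is attached to the role as its trust policy. The sketch below builds one with Python; it is an illustration under stated assumptions, not Birdie's required configuration. The account ID and external ID are hypothetical placeholders (Birdie provides the real account ID on request), and the external ID condition only applies if you use the optional External ID parameter.

```python
import json

# Hypothetical placeholders: replace with the account ID Birdie gives you
# and the external ID you configure in the connector (if any).
BIRDIE_ACCOUNT_ID = "123456789012"
EXTERNAL_ID = "example-external-id"


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Return a trust policy that lets `account_id` assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trust principals from the other AWS account.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # Require the caller to present the agreed external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


trust_policy = build_trust_policy(BIRDIE_ACCOUNT_ID, EXTERNAL_ID)
print(json.dumps(trust_policy, indent=2))
```

The printed JSON can be pasted into the role's trust relationship in the IAM console.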
GCP Docs
See the docs for how to enable HMAC access to a bucket.
For more information on how this works, read about the GCS XML API, which works with S3-compatible tools.
For more information about HMAC keys, read about HMAC keys for GCS.
IAM Policy example with read-write access:
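A minimal sketch of such a policy, generated with Python so the bucket name can be swapped in. The bucket name is a hypothetical placeholder, and the action list (`s3:ListBucket`, `s3:GetObject`, `s3:PutObject`) is an assumption about what read-write access to a single bucket typically needs; adjust it to your own security requirements.

```python
import json

BUCKET = "example-birdie-bucket"  # hypothetical bucket name


def build_bucket_policy(bucket: str, allow_write: bool = True) -> dict:
    """Return an IAM policy scoped to one bucket, read-only or read-write."""
    object_actions = ["s3:GetObject"]
    if allow_write:
        object_actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading and writing apply to the objects inside the bucket.
                "Sid": "AccessObjects",
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = build_bucket_policy(BUCKET, allow_write=True)
print(json.dumps(policy, indent=2))
```

Calling `build_bucket_policy(BUCKET, allow_write=False)` yields the read-only variant.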
Parameters
Region: The region, e.g. "us-west-2" (AWS) or "us-central1" (GCP).
Bucket: The bucket name.
Prefix: A prefix for the object keys. We suggest organizing it based on the kind of data (e.g. `birdie/tickets`, `birdie/nps`).
Format: The data format to use. Currently only supports `parquet`.
Kind: The kind of data you're trying to import. This defines what schema Birdie expects when reading rows from your file. Supported values are:
`review`
`nps`
`csat`
`support_ticket`
`social_media_post`
`issue`
Credentials for S3
Access Key ID / HMAC Access ID
Secret Access Key / HMAC Secret
External ID (optional, AWS specific)
Role ARN (optional, AWS specific)
The S3 endpoint to use. Only needed if not using AWS S3.
Start Date: Date used to filter objects by their last-modified time.
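To make the interaction between Prefix and Start Date concrete, here is a small sketch of how objects might be selected. It is not Birdie's implementation: the function name is hypothetical, the "on or after" comparison is an assumption, and the dictionaries only mimic the `Key` and `LastModified` fields of an S3 object listing.

```python
from datetime import datetime, timezone


def select_objects(objects, prefix, start_date):
    """Keep keys under `prefix` whose last-modified time is on or after `start_date`."""
    return [
        obj["Key"]
        for obj in objects
        if obj["Key"].startswith(prefix) and obj["LastModified"] >= start_date
    ]


# Sample listing with made-up keys and timestamps.
objects = [
    {"Key": "birdie/tickets/2024-01-01.parquet",
     "LastModified": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"Key": "birdie/tickets/2024-03-01.parquet",
     "LastModified": datetime(2024, 3, 1, tzinfo=timezone.utc)},
    {"Key": "birdie/nps/2024-03-01.parquet",
     "LastModified": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]

selected = select_objects(
    objects,
    prefix="birdie/tickets",
    start_date=datetime(2024, 2, 1, tzinfo=timezone.utc),
)
print(selected)  # ['birdie/tickets/2024-03-01.parquet']
```

Keeping one prefix per kind of data means each connector configuration only ever sees files that match its schema.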
S3 Schemas
Each row of the file must fit within one of the following schemas. The schema must match the kind selected when configuring the parameters.
See the official Parquet spec for more information on the supported types and logical types.
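Several columns in the schemas below are described as RFC 3339 timestamps. If you produce those values as strings, a sketch like the following formats a Python `datetime` accordingly. The helper name is hypothetical, and normalizing to UTC is a choice made for the example, not a stated requirement.

```python
from datetime import datetime, timezone


def to_rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 timestamp in UTC."""
    if moment.tzinfo is None:
        # A naive datetime is ambiguous, so refuse to guess its offset.
        raise ValueError("naive datetime: attach a timezone before formatting")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


posted_at = to_rfc3339(datetime(2024, 3, 1, 14, 30, 0, tzinfo=timezone.utc))
print(posted_at)  # 2024-03-01T14:30:00Z
```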
Feedbacks // Review
Column Name | Type | Optional | Description |
|
| Required | Unique identifier for each review. |
|
| Required | Text posted by user |
|
| Required | When the feedback was posted (RFC 3339 timestamp) |
|
| Optional | Identifier for the author of the record. |
|
| Optional | Identifier for the account the record belongs to. |
|
| Optional | Language of the record as BCP 47 code. |
|
| Optional | The title of the feedback given by the author. |
|
| Optional | A rating or score of the feedback. |
|
| Optional | The category the review belongs to. |
|
| Optional |
|
Feedbacks // NPS and CSAT
Column Name | Type | Optional | Description |
|
| Required | Unique identifier for each answer. |
|
| Optional | Text posted by user |
|
| Required | When the feedback was posted (RFC 3339 timestamp) |
|
| Optional | Identifier for the author of the record. |
|
| Optional | Identifier for the account the record belongs to. |
|
| Optional | Language of the record as BCP 47 code. |
|
| Optional | The title of the survey. |
|
| Optional | A rating or score of the feedback. |
|
| Optional | The name of the author. |
Conversations // Support tickets
Column Name | Type | Optional | Description |
|
| Required | Unique identifier for each conversation. |
|
| Required | Unique identifier for each message. |
|
| Optional | Identifier for the author of the message. |
|
| Optional | Identifier for the account the message belongs to. |
|
| Required | Text of the message |
|
| Required | When the message was posted (RFC 3339 timestamp) |
|
| Optional | Language of the message as BCP 47 code. |
|
| Optional | Subject of the ticket. |
|
| Optional | Status of the ticket, e.g |
|
| Optional | Priority assigned to the ticket. |
|
| Optional | Source channel of the ticket, e.g |
|
| Optional | Array of tags applied to the ticket. |
|
| Optional |
|
|
| Optional | The name of the author of the message. |
Conversation // Social Media Post
Column Name | Type | Optional | Description |
|
| Required | Unique identifier for each conversation. |
|
| Required | Unique identifier for each message. |
|
| Optional | Identifier for the author of the message. |
|
| Optional | Identifier for the account the message belongs to. |
|
| Required | Text of the message |
|
| Required | When the message was posted (RFC 3339 timestamp) |
|
| Optional | Language of the message as BCP 47 code. |
|
| Optional | Title of the post. |
|
| Optional |
|
|
| Optional | The category the post was under, e.g. a subreddit name. |
|
| Optional | URL of the post. |
|
| Optional | Source channel of the post, e.g |
|
| Optional | Array of tags applied to the post. |
|
| Optional |
|
|
| Optional | The name of the author of the message. |
|
| Optional | The number of upvotes the message has. |
Conversation // Issue
Column Name | Type | Optional | Description |
|
| Required | Unique identifier for each conversation. |
|
| Required | Unique identifier for each message. |
|
| Optional | Identifier for the author of the message. |
|
| Optional | Identifier for the account the message belongs to. |
|
| Required | Text of the message |
|
| Required | When the message was posted (RFC 3339 timestamp) |
|
| Optional | Language of the message as BCP 47 code. |
|
| Optional | Project identifier |
|
| Optional | Project Name |
|
| Optional | Issue title |
|
| Optional | Issue status |
|
| Optional | The name of the author of the message. |
Custom Fields
Any columns that don't fit under the previously listed schemas may become custom fields.
The column name in the Parquet schema must be configured as the key/source of the custom field inside the Birdie app.