How to import data from a secure cloud object storage (GCP, AWS, Azure)

Import data objects from your GCS or S3 buckets into your Birdie account

Written by Product Team
Updated over a month ago

Overview

With the S3 connector, Birdie can import data in Parquet format from AWS S3 or from any storage service that implements the S3 API, such as Google Cloud Storage. Once a day, the connector checks for new objects (files) and, if it finds any, imports the records they contain.

Requirements

  • Dedicated bucket for Birdie Integration

  • Birdie integration requires a service account with read-only access. Write access is optional and can be granted so that support teams can perform initial or manual dataset uploads.

    • AWS Docs

      • See the docs on creating a user and generating an access key for that user.

      • If you do not wish to provide a service account, see the docs on creating a role, then create an IAM policy that allows Birdie to assume that role (see the docs). Reach out and we'll provide the ID of the Birdie account.

      • Create an IAM policy that gives the user or role S3 read or S3 read/write access to the specific bucket. See the docs on how to do this.

    • GCP Docs

      • See the docs on how to enable HMAC access to a bucket.

      • For more information on how this works, read about the GCS XML API, which works with S3-compatible tools.

      • For more information about HMAC keys, read about HMAC keys for GCS.

  • IAM Policy example with read-write access:
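
As an illustrative sketch only (not Birdie's official policy), a policy scoped to a single bucket could be generated as shown below. The bucket name `my-birdie-bucket` and the helper `build_bucket_policy` are placeholders, and the exact set of actions Birdie needs is an assumption; leave `allow_write` off for read-only access.

```python
import json


def build_bucket_policy(bucket: str, allow_write: bool = False) -> dict:
    """Build an IAM policy document limited to one S3 bucket.

    The action list is an assumption for illustration: listing the bucket,
    reading objects, and optionally writing objects.
    """
    object_actions = ["s3:GetObject"]
    if allow_write:
        object_actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading and writing are granted on the objects inside it.
                "Sid": "ObjectAccess",
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print the read-write variant as JSON, ready to paste into the IAM console.
    print(json.dumps(build_bucket_policy("my-birdie-bucket", allow_write=True), indent=2))
```

Keeping the bucket-level and object-level statements separate mirrors how S3 distinguishes the bucket ARN from the ARNs of the objects under it.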

Parameters

  • Region: The region, e.g. "us-west-2" (AWS) or "us-central1" (GCP).

  • Bucket: The bucket name.

  • Prefix: A prefix for the object keys. We suggest organizing it by the kind of data (e.g. `birdie/tickets`, `birdie/nps`).

  • Format: The data format to use. Currently only supports `parquet`.

  • Kind: The kind of data you're trying to import. This defines what schema Birdie expects when reading rows from your file. Supported values are:

    • `review`

    • `nps`

    • `csat`

    • `support_ticket`

    • `social_media_post`

    • `issue`

  • Credentials for S3

    • Access Key ID / HMAC Access ID

    • Secret Key ID / HMAC Secret

    • External ID (optional, AWS specific)

    • Role ARN (optional, AWS specific)

    • The S3 endpoint to use. Only needed if not using AWS S3.

  • Start Date: Date used to filter objects by their last-modified time.
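
To make the parameters above concrete, here is a minimal sketch of how they fit together. The `ConnectorParams` class and the `select_objects` helper are hypothetical and are not part of Birdie's product; the sketch also assumes that Start Date keeps objects modified on or after the given date, which the parameter description does not state explicitly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple

# Kinds listed in the Parameters section.
SUPPORTED_KINDS = {
    "review", "nps", "csat", "support_ticket", "social_media_post", "issue",
}


@dataclass
class ConnectorParams:
    """Hypothetical container for the connector parameters."""
    region: str
    bucket: str
    prefix: str
    kind: str
    start_date: datetime
    data_format: str = "parquet"
    endpoint: Optional[str] = None  # only needed when not using AWS S3

    def validate(self) -> None:
        if self.data_format != "parquet":
            raise ValueError(f"unsupported format: {self.data_format}")
        if self.kind not in SUPPORTED_KINDS:
            raise ValueError(f"unsupported kind: {self.kind}")


def select_objects(
    objects: Iterable[Tuple[str, datetime]], params: ConnectorParams
) -> List[str]:
    """Return keys under the prefix that were modified on or after start_date.

    `objects` is an iterable of (key, last_modified) pairs. The "on or after"
    comparison is an assumption made for this sketch.
    """
    return [
        key
        for key, modified in objects
        if key.startswith(params.prefix) and modified >= params.start_date
    ]


if __name__ == "__main__":
    params = ConnectorParams(
        region="us-west-2",
        bucket="my-birdie-bucket",  # placeholder name
        prefix="birdie/tickets",
        kind="support_ticket",
        start_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
    )
    params.validate()
    listing = [
        ("birdie/tickets/2024-02.parquet", datetime(2024, 2, 1, tzinfo=timezone.utc)),
        ("birdie/tickets/2023-12.parquet", datetime(2023, 12, 1, tzinfo=timezone.utc)),
    ]
    print(select_objects(listing, params))
```

For Google Cloud Storage, the `endpoint` value would be the XML API endpoint, `https://storage.googleapis.com`.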

S3 Schemas

Each row of the file must fit within one of the following schemas. The schema must match the kind selected when configuring the parameters.

See the official Parquet specification for more information on the supported types and logical types.

Feedbacks // Review

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| feedback_id | STRING | Required | Unique identifier for each review. |
| text | STRING | Required | Text posted by the user. |
| posted_at | STRING | Required | When the feedback was posted (RFC 3339 timestamp). |
| author_id | STRING | Optional | Identifier for the author of the record. |
| account_id | STRING | Optional | Identifier for the account the record belongs to. |
| language | STRING | Optional | Language of the record as a BCP 47 code. |
| title | STRING | Optional | The title of the feedback given by the author. |
| rating | FLOAT | Optional | A rating or score of the feedback. |
| category | STRING | Optional | The category the review belongs to. |
| owner | STRING | Optional | One of: Owner, Competitor. |
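
As a hedged sketch, rows matching the review schema could be built and written to Parquet as follows. The helper names `make_review_row` and `write_reviews` are hypothetical, the sketch covers only a subset of the optional columns, and it assumes the third-party `pyarrow` package is installed (it is imported only when the file is actually written).

```python
from datetime import datetime, timezone
from typing import List


def make_review_row(feedback_id: str, text: str, posted_at: datetime, **optional) -> dict:
    """Build one review row; posted_at is serialized as an RFC 3339 string."""
    if posted_at.tzinfo is None:
        # RFC 3339 timestamps carry an explicit offset, so naive datetimes are rejected.
        raise ValueError("posted_at must be timezone-aware")
    row = {
        "feedback_id": feedback_id,
        "text": text,
        "posted_at": posted_at.isoformat(),
    }
    row.update(optional)  # e.g. language="en", rating=4.0
    return row


def write_reviews(rows: List[dict], path: str) -> None:
    """Write rows to a Parquet file with an explicit schema (requires pyarrow)."""
    import pyarrow as pa
    import pyarrow.parquet as pq

    schema = pa.schema([
        pa.field("feedback_id", pa.string(), nullable=False),
        pa.field("text", pa.string(), nullable=False),
        pa.field("posted_at", pa.string(), nullable=False),
        pa.field("language", pa.string()),
        pa.field("rating", pa.float32()),  # 32-bit float, i.e. Parquet FLOAT
    ])
    # Keys missing from a row become nulls in the corresponding column.
    table = pa.Table.from_pylist(rows, schema=schema)
    pq.write_table(table, path)


if __name__ == "__main__":
    rows = [
        make_review_row(
            "r-1",
            "Great app, but login is slow.",
            datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc),
            language="en",
            rating=4.0,
        )
    ]
    write_reviews(rows, "reviews.parquet")
```

Declaring the schema explicitly, rather than letting it be inferred, keeps the column types aligned with the table above even when an optional column is empty in a given file.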

Feedbacks // NPS and CSAT

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| feedback_id | STRING | Required | Unique identifier for each answer. |
| text | STRING | Optional | Text posted by the user. |
| posted_at | STRING | Required | When the feedback was posted (RFC 3339 timestamp). |
| author_id | STRING | Optional | Identifier for the author of the record. |
| account_id | STRING | Optional | Identifier for the account the record belongs to. |
| language | STRING | Optional | Language of the record as a BCP 47 code. |
| title | STRING | Optional | The title of the survey. |
| rating | FLOAT | Optional | A rating or score of the feedback. |
| author_name | STRING | Optional | The name of the author. |

Conversations // Support tickets

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message. |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| subject | STRING | Optional | Subject of the ticket. |
| status | STRING | Optional | Status of the ticket, e.g. open. |
| priority | STRING | Optional | Priority assigned to the ticket. |
| channel | STRING | Optional | Source channel of the ticket, e.g. web. |
| tags | REPEATED STRING | Optional | Array of tags applied to the ticket. |
| author_type | STRING | Optional | One of: Internal Person, User, Bot. |
| author_name | STRING | Optional | The name of the author of the message. |
| survey_title | STRING | Optional | Title of the survey that closes the ticket. |
| survey_type | STRING | Optional | Type of the survey that closes the ticket. One of: csat or nps. |
| rating | FLOAT | Optional | Rating that the client gave to the support ticket experience. |
| solved | STRING | Optional | Flag that indicates whether the ticket was solved. One of: true or false. |
| source | STRING | Optional | A user-customizable label for grouping feedbacks. |

Note 1: To ensure consistency, please upload only one row per conversation containing the survey response fields (such as survey_type, survey_title, rating, etc.). This message should be the final one for that ticket, reflecting the client’s closing thoughts on the service provided.

Note 2: To upload multiple messages per ticket, make sure the "ticket" fields (such as subject, status, priority, channel and tags) are consistent across all messages.
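
The two notes above can be checked before upload. The following is a minimal sketch rather than a Birdie tool: the function `check_conversations` is hypothetical, and the survey field list covers only the fields the notes name explicitly.

```python
from collections import defaultdict
from typing import Dict, List

# "Ticket" fields that Note 2 says must be consistent across messages.
TICKET_FIELDS = ("subject", "status", "priority", "channel", "tags")
# Survey fields named in Note 1 (the note's list ends with "etc.").
SURVEY_FIELDS = ("survey_type", "survey_title", "rating")


def check_conversations(rows: List[dict]) -> List[str]:
    """Return a list of problems found in support-ticket rows."""
    by_conversation: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        by_conversation[row["conversation_id"]].append(row)

    problems = []
    for conv_id, messages in by_conversation.items():
        # Note 2: ticket-level fields must match on every message.
        for field in TICKET_FIELDS:
            # repr() makes list values such as tags comparable in a set.
            distinct = {repr(m.get(field)) for m in messages}
            if len(distinct) > 1:
                problems.append(f"{conv_id}: inconsistent {field}")
        # Note 1: at most one row per conversation carries survey fields.
        with_survey = [
            m for m in messages
            if any(m.get(f) is not None for f in SURVEY_FIELDS)
        ]
        if len(with_survey) > 1:
            problems.append(f"{conv_id}: {len(with_survey)} rows carry survey fields")
    return problems


if __name__ == "__main__":
    rows = [
        {"conversation_id": "t1", "message_id": "m1", "text": "Help", "subject": "Login", "status": "open"},
        {"conversation_id": "t1", "message_id": "m2", "text": "Thanks", "subject": "Login", "status": "solved",
         "survey_type": "csat", "rating": 4.0},
    ]
    for problem in check_conversations(rows):
        print(problem)
```

In the sample data the second message changes `status`, so the check reports an inconsistency for ticket `t1`; it does not verify that the survey row is the final message of the ticket.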

Conversation // Social Media Post

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message. |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| title | STRING | Optional | Title of the post. |
| owner | STRING | Optional | One of: Owner, Competitor. |
| category | STRING | Optional | The category the post was under, e.g. a subreddit name. |
| url | STRING | Optional | URL of the post. |
| channel | STRING | Optional | Source channel of the post, e.g. facebook. |
| tags | REPEATED STRING | Optional | Array of tags applied to the post. |
| author_type | STRING | Optional | One of: Internal Person, User, Bot. |
| author_name | STRING | Optional | The name of the author of the message. |
| upvotes | INTEGER | Optional | The number of upvotes the message has. |

Conversation // Issue

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| conversation_id | STRING | Required | Unique identifier for each conversation. |
| message_id | STRING | Required | Unique identifier for each message. |
| author_id | STRING | Optional | Identifier for the author of the message. |
| account_id | STRING | Optional | Identifier for the account the message belongs to. |
| text | STRING | Required | Text of the message. |
| posted_at | STRING | Required | When the message was posted (RFC 3339 timestamp). |
| language | STRING | Optional | Language of the message as a BCP 47 code. |
| project_id | STRING | Optional | Project identifier. |
| project_name | STRING | Optional | Project name. |
| title | STRING | Optional | Issue title. |
| status | STRING | Optional | Issue status. |
| author_name | STRING | Optional | The name of the author of the message. |

Accounts

| Column Name | Type | Required / Optional | Description |
| --- | --- | --- | --- |
| id | STRING | Required | Unique identifier for the account. |
| source | STRING | Required | Source of the account data (e.g., website, CRM, etc.). |
| name | STRING | Optional | The name of the account. |
| website | STRING | Optional | The website URL of the account. |
| industry | STRING | Optional | The industry to which the account belongs (e.g., Manufacturing, Technology). |
| customer_type | STRING | Optional | The type of customer, either b2b (business-to-business) or b2c (business-to-consumer). |
| sales_pipeline_stage | STRING | Optional | The current stage of the account in the sales pipeline (e.g., Lead, Prospect, Closed). |
| plan_name | STRING | Optional | The name of the plan associated with the account. |
| plan_start_date | STRING | Optional | The start date of the account's plan (RFC 3339 timestamp). |
| plan_end_date | STRING | Optional | The end date of the account's plan (RFC 3339 timestamp). |
| employees | STRING | Optional | The number of employees working for the account. |
| annual_revenue_currency | STRING | Optional | The currency used for the account's annual revenue (e.g., USD, EUR). |
| annual_revenue_amount | STRING | Optional | The annual revenue of the account. |
| spending_period | STRING | Optional | The period over which the account's spending occurs (daily, monthly, yearly, etc.). |
| spending_currency | STRING | Optional | The currency used for the account's spending (e.g., USD, EUR). |
| spending_amount | STRING | Optional | The amount of money the account spends during the specified period. |

Custom Fields

Any columns that don't fit under the previously listed schemas may become custom fields.

The name of the column in the Parquet Schema must be configured as the key/source of the custom field inside the Birdie App.
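
As a small illustration, the columns of a file that would need to be configured as custom fields can be listed by comparing them with the schema's known columns. The helper `custom_field_candidates` is hypothetical, and the column names `store_region` and `app_version` are invented for the example.

```python
from typing import Iterable, List, Set

# Columns of the review schema listed earlier in this article.
REVIEW_COLUMNS: Set[str] = {
    "feedback_id", "text", "posted_at", "author_id", "account_id",
    "language", "title", "rating", "category", "owner",
}


def custom_field_candidates(
    column_names: Iterable[str], known: Set[str] = REVIEW_COLUMNS
) -> List[str]:
    """Return the column names that are not part of the known schema, sorted."""
    return sorted(set(column_names) - known)


if __name__ == "__main__":
    columns = ["feedback_id", "text", "posted_at", "store_region", "app_version"]
    # Each name printed here would be used as the key/source of a custom field.
    print(custom_field_candidates(columns))
```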
