S3 Parquet

target-s3-parquet (gupy-io variant)

S3 Parquet is a loader that writes data to Amazon S3 as Parquet files.

Parquet is a columnar storage format. Its compression and encoding schemes make it well suited to storing and processing large amounts of data in a distributed computing environment. Parquet files stored on Amazon S3 can be read by big data processing tools such as Apache Spark and Hadoop. The columnar layout generally allows faster data processing and analysis and lower storage costs, which makes it a popular choice for big data applications.

Settings

S3 Path

The S3 path (bucket and key prefix) where the Parquet data is written.

AWS Access Key Id

The access key ID for the AWS account that has permission to access the S3 bucket.

AWS Secret Access Key

The secret access key for the AWS account that has permission to access the S3 bucket.

Athena Database

The name of the Athena database where the Parquet data will be queried.

Add Record Metadata

Whether or not to add metadata to each record in the Parquet data.

Stringify Schema

Whether or not to convert the schema of the Parquet data to string types, so that fields are written as strings rather than with their original types.
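As a minimal sketch of how the six settings above might be assembled into a configuration, the following Python helper builds a settings dictionary. The snake_case keys are assumptions inferred from the setting labels, the choice of which keys are required is also an assumption, and the bucket and database names are placeholders. Credentials are read from environment variables rather than hard-coded.

```python
import json
import os

# Assumption: setting keys are the snake_case forms of the labels above.
# Assumption: only the S3 path and the Athena database are mandatory.
REQUIRED_KEYS = ("s3_path", "athena_database")


def build_config(s3_path, athena_database, env=None,
                 add_record_metadata=False, stringify_schema=False):
    """Assemble a settings dict, taking AWS credentials from the environment."""
    env = os.environ if env is None else env
    config = {
        "s3_path": s3_path,
        "athena_database": athena_database,
        "add_record_metadata": add_record_metadata,
        "stringify_schema": stringify_schema,
    }
    # Include credentials only when they are set, so nothing empty is written.
    for key, var in (("aws_access_key_id", "AWS_ACCESS_KEY_ID"),
                     ("aws_secret_access_key", "AWS_SECRET_ACCESS_KEY")):
        if env.get(var):
            config[key] = env[var]
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return config


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(build_config("s3://example-bucket/raw", "example_db"), indent=2))
```

Keeping the secret access key out of the dictionary literal means the same helper can be used in version-controlled scripts without exposing credentials.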

Stream Maps

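Configuration for stream maps, which transform streams inline as they are loaded, for example by adding, removing, or renaming properties, or by filtering records.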

Stream Map Config

User-defined values that can be referenced from within stream map expressions.
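The two stream map settings are typically used together. The sketch below shows one possible shape for them as a Python dictionary, following the Meltano SDK stream maps convention in which each stream name maps property names to expressions and a null value removes a property. The stream name, the property names, and the hash_seed value are hypothetical, and the keys stream_maps and stream_map_config are assumed from the setting labels.

```python
import json

settings = {
    # Assumed key for the "Stream Maps" setting.
    "stream_maps": {
        "customers": {  # hypothetical stream name
            # None (null in JSON) removes the property from the stream.
            "phone": None,
            # A string is an expression evaluated for each record.
            "email_domain": "email.split('@')[-1]",
            # Expressions can read values from stream_map_config via `config`.
            "email_hash": "md5(config['hash_seed'] + email)",
        },
    },
    # Assumed key for the "Stream Map Config" setting.
    "stream_map_config": {
        "hash_seed": "example-seed",  # hypothetical value
    },
}

# Serialize to JSON, the form in which settings are usually passed to a connector.
settings_json = json.dumps(settings, indent=2)

if __name__ == "__main__":
    print(settings_json)
```

Putting the seed in the stream map config rather than inside the expression keeps the expression reusable across environments.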

Flattening Enabled

Whether or not to flatten nested structures in the Parquet data.

Flattening Max Depth

The maximum depth to which nested structures will be flattened.
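To illustrate what the two flattening settings control, here is a small Python sketch of depth-limited flattening. It is an illustration of the idea and not the connector's actual implementation: the double-underscore separator and the choice to serialize objects below the maximum depth as JSON strings are both assumptions, and the sample record is made up.

```python
import json


def flatten(record, max_depth, separator="__", _parent="", _level=0):
    """Flatten nested dicts up to max_depth levels into a single-level dict."""
    flat = {}
    for key, value in record.items():
        name = f"{_parent}{separator}{key}" if _parent else key
        if isinstance(value, dict) and _level < max_depth:
            # Still within the allowed depth: expand into prefixed columns.
            flat.update(flatten(value, max_depth, separator, name, _level + 1))
        elif isinstance(value, dict):
            # Too deep to expand: keep the object as a JSON string (assumption).
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    sample = {"id": 1, "address": {"city": "Recife", "geo": {"lat": -8.05, "lon": -34.9}}}
    print(flatten(sample, max_depth=1))
    print(flatten(sample, max_depth=2))
```

With a maximum depth of 1, only the first level of nesting under address is expanded into columns; raising it to 2 also expands geo into separate lat and lon columns.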

Meltano Community Connector

The S3 Parquet connector is available as a Meltano Community connector. It is built by our growing community of more than 5,000 developers. Refer to the Install section below to verify the readiness of this connector.

Why Meltano?
Access to Meltano Slack community

Join 5,500+ data engineers and analytics practitioners. The community is active, helpful, and always on. Good for quick questions, sharing patterns, and learning what others are building.