Overview
Amazon Simple Storage Service (S3) is an object storage service offering industry-leading scalability, availability, and durability. Datazone provides native integration with AWS S3 to read data files directly from your S3 buckets.
Connection Parameters
| Parameter | Required | Description |
|---|---|---|
| Name | Yes | A unique identifier for your AWS S3 source |
| AWS Access Key ID | Yes | The access key ID from your AWS credentials |
| AWS Secret Access Key | Yes | The secret access key from your AWS credentials |
| AWS Region | Yes | The AWS region where your S3 bucket is located (e.g., us-east-1) |
| Bucket Name | Yes | The name of the S3 bucket containing your data files |
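Before entering these values, it can help to sanity-check them locally. The following is a minimal sketch, not part of Datazone: the `S3SourceParams` class and `find_problems` helper are hypothetical names, the bucket pattern follows the general-purpose S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit), and the region pattern is only a loose shape check.

```python
import re
from dataclasses import asdict, dataclass


# Hypothetical container mirroring the connection parameters; not a Datazone API.
@dataclass(frozen=True)
class S3SourceParams:
    name: str
    aws_access_key_id: str
    aws_secret_access_key: str
    aws_region: str
    bucket_name: str


# General-purpose bucket names: 3-63 characters, lowercase letters, digits,
# dots, and hyphens, beginning and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

# Loose shape check for region codes such as us-east-1 (an assumption,
# not an authoritative list of regions).
_REGION_RE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def find_problems(params: S3SourceParams) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    values = asdict(params)

    # Every parameter is marked as required.
    for field_name, value in values.items():
        if not value.strip():
            problems.append(f"{field_name} is required")

    # Only check the format of fields that were actually provided.
    region = values["aws_region"]
    if region.strip() and not _REGION_RE.match(region):
        problems.append(f"aws_region {region!r} does not look like a region code")

    bucket = values["bucket_name"]
    if bucket.strip() and not _BUCKET_RE.match(bucket):
        problems.append(f"bucket_name {bucket!r} violates S3 naming rules")

    return problems
```

Passing these checks does not prove that the credentials are valid or that the bucket exists; it only catches typos before you save the source.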
Required Permissions
The AWS IAM user needs the following permissions on the specified S3 bucket:
- s3:GetObject - For reading files from the bucket
- s3:ListBucket - For listing contents of the bucket
- s3:GetBucketLocation - For determining the bucket’s region
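The three permissions above can be granted with a read-only IAM policy. The sketch below builds one such policy document in Python; the function name `build_read_only_policy` and the example bucket name are illustrative assumptions. Note that `s3:GetObject` applies to objects (the `/*` resource), while `s3:ListBucket` and `s3:GetBucketLocation` apply to the bucket itself.

```python
import json


def build_read_only_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket."""
    # Assumes the standard "aws" partition; GovCloud and China regions
    # use different partition names in their ARNs.
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions target the bucket ARN.
                "Sid": "ListAndLocateBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions target the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-data-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(build_read_only_policy("my-data-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the user whose access key you provide to Datazone.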
Limitations
Be aware of the following limitations when working with AWS S3 sources:
- CSV, TXT, Parquet, and JSON files are supported
- UTF-8 encoding is recommended
- Individual file size limits apply based on your AWS S3 configuration
- The S3 bucket and Datazone instance should ideally be in the same region for optimal performance
- Cross-region access may incur additional AWS charges
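The format and encoding limitations can be checked before files are uploaded. This is a rough sketch under stated assumptions: the extension-to-format mapping (`.csv`, `.txt`, `.parquet`, `.json`) and the helper names are illustrative, and Datazone may identify file types differently.

```python
from pathlib import PurePosixPath

# Assumed file extensions for the supported formats; this mapping is an
# illustration, not a documented Datazone rule.
SUPPORTED_EXTENSIONS = {".csv", ".txt", ".parquet", ".json"}


def is_supported_key(key: str) -> bool:
    """Return True if the S3 object key ends in one of the assumed extensions."""
    # S3 keys use "/" as a separator, so PurePosixPath parses them on any OS.
    return PurePosixPath(key).suffix.lower() in SUPPORTED_EXTENSIONS


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8.

    Only meaningful for text formats (CSV, TXT, JSON); Parquet is binary.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, `is_supported_key("exports/2024/orders.CSV")` accepts the key because the comparison is case-insensitive, while a Latin-1 encoded file containing accented characters fails `is_valid_utf8`.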
Next Steps
After configuring your AWS S3 source:
- Create extracts to specify which files to ingest
- Configure scheduling for recurring extracts
- Integrate the source into your data pipelines