Azure Blob Storage
Azure Blob Storage is Microsoft’s object storage solution for the cloud, designed to store massive amounts of unstructured data.
Overview
Datazone integrates natively with Azure Blob Storage, so it can read data files directly from your containers.
Connection Parameters
Parameter | Required | Description |
---|---|---|
Name | Yes | A unique identifier for your Azure Blob source |
Account URL | Yes | The URL endpoint for your Azure Storage account |
Token | Yes | The access token or SAS token for authentication |
Container Name | Yes | The name of the container holding your data files |
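As an illustration of how the four required parameters fit together, the following is a minimal sketch that collects them and runs a few sanity checks before they are entered into Datazone. The class name `AzureBlobSourceConfig`, its field names, and the `validate` method are hypothetical and are not part of Datazone's API. The https check is an assumption based on the usual Azure Blob endpoint form `https://<account>.blob.core.windows.net`, and the sample values are placeholders.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AzureBlobSourceConfig:
    """Hypothetical holder for the four required connection parameters."""

    name: str            # unique identifier for the source
    account_url: str     # storage account endpoint
    token: str           # access token or SAS token
    container_name: str  # container holding the data files

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means nothing obvious is wrong."""
        problems = []
        # Every parameter in the table is marked as required.
        for field_name in ("name", "account_url", "token", "container_name"):
            if not getattr(self, field_name).strip():
                problems.append(f"{field_name} is required")
        # Assumption: the endpoint is expected to be an https URL with a host.
        parsed = urlparse(self.account_url)
        if self.account_url.strip() and (parsed.scheme != "https" or not parsed.netloc):
            problems.append("account_url should be an https URL")
        return problems


if __name__ == "__main__":
    config = AzureBlobSourceConfig(
        name="sales-blobs",
        account_url="https://examplestorage.blob.core.windows.net",
        token="sv=placeholder",
        container_name="raw-data",
    )
    print(config.validate())
```

A check like this only catches missing or malformed values. It does not confirm that the token is valid or that the container exists.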
Required Permissions
The credential used to access the Azure Storage account needs the following permissions:
- Storage Blob Data Reader: for reading blob data
- Storage Blob Data List: for listing blobs in containers
- Storage Account List: for accessing storage account properties
Limitations
Be aware of the following limitations when working with Azure Blob sources:
- Supported file formats are CSV, TXT, Parquet, and JSON
- UTF-8 encoding is recommended
- Individual file size limits apply based on your Azure Storage configuration
- The Storage account and Datazone instance should ideally be in the same region for optimal performance
- Cross-region access may incur additional Azure charges
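The format and encoding limitations above can be checked before files are uploaded to a container. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the exact file extensions Datazone accepts for each format are an assumption (one conventional suffix per listed format).

```python
from pathlib import PurePosixPath

# Assumption: one conventional extension per supported format listed above.
SUPPORTED_EXTENSIONS = {".csv", ".txt", ".parquet", ".json"}


def is_supported_blob(blob_name: str) -> bool:
    """True if the blob name ends with one of the assumed supported extensions."""
    # Blob names use "/" as a virtual directory separator, hence PurePosixPath.
    return PurePosixPath(blob_name).suffix.lower() in SUPPORTED_EXTENSIONS


def is_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8 (relevant for text formats only)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for blob in ("exports/2024/orders.CSV", "exports/archive.zip"):
        print(blob, is_supported_blob(blob))
    print(is_utf8("café".encode("utf-8")))
    print(is_utf8("café".encode("latin-1")))
```

The UTF-8 check applies to CSV, TXT, and JSON content. Parquet is a binary format, so decoding it as text is not meaningful.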
Next Steps
After configuring your Azure Blob source:
- Create extracts to specify which files to ingest
- Configure scheduling for recurring extracts
- Integrate the source into your data pipelines
For more information about working with extracts and pipelines, refer to their respective documentation sections.