Execution

Overview

The Execution entity represents a single run of a data job within the data platform. Each execution carries out the job's data extraction, transformation, and loading tasks, moving and processing data from sources to datasets. An execution can be triggered manually by a user or automatically by a scheduled event.

Properties

ID: A unique identifier for the execution instance.
Type: The nature of the execution process (e.g., batch, real-time, streaming).
Status: Current status of the execution (e.g., running, completed, failed).
Trigger: The method by which the execution is initiated (e.g., manual trigger, scheduled event).
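
As an illustration only, the four properties above could be modeled as a small Python data class. This is a minimal sketch, not the platform's actual schema: the class and enum names (Execution, ExecutionType, ExecutionStatus, ExecutionTrigger), the use of a UUID string for the ID, and the default status of "running" are all assumptions. The enum values mirror the examples listed above and may not be exhaustive.

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class ExecutionType(Enum):
    # Values taken from the examples in the Type property.
    BATCH = "batch"
    REAL_TIME = "real-time"
    STREAMING = "streaming"


class ExecutionStatus(Enum):
    # Values taken from the examples in the Status property.
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class ExecutionTrigger(Enum):
    # Values taken from the examples in the Trigger property.
    MANUAL = "manual"
    SCHEDULED = "scheduled"


@dataclass
class Execution:
    """Hypothetical in-memory model of one execution instance."""
    type: ExecutionType
    trigger: ExecutionTrigger
    # Assumption: a new execution starts in the "running" state.
    status: ExecutionStatus = ExecutionStatus.RUNNING
    # Assumption: the unique ID is a random UUID rendered as a string.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


# Example: a manually triggered batch execution.
execution = Execution(type=ExecutionType.BATCH, trigger=ExecutionTrigger.MANUAL)
```

Using enums rather than free-form strings keeps the set of allowed types, statuses, and triggers explicit, which suits properties that take one of a few known values.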

Usage

Data Movement: Executes processes that move data from sources, through extracts, and into datasets.
Data Transformation: Runs the PySpark transformation scripts defined in the transforms associated with the execution.
Scheduling and Automation: Supports the scheduling of data jobs for regular, automated execution.
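
The relationship between these uses and the execution status can be sketched as follows. This is a simplified illustration under stated assumptions, not the platform's implementation: run_execution is a hypothetical helper, each step is modeled as a plain Python callable standing in for a data movement or transformation task, and the status strings reuse the examples from the Status property.

```python
from typing import Callable, Iterable


def run_execution(steps: Iterable[Callable[[], None]]) -> str:
    """Run each step in order and return the final status.

    Hypothetical helper: returns "completed" if every step succeeds,
    or "failed" as soon as any step raises an exception.
    """
    try:
        for step in steps:
            step()
    except Exception:
        # Remaining steps are skipped once one step fails.
        return "failed"
    return "completed"


# Example: stand-in steps that only record the order in which they ran.
log = []
status = run_execution([
    lambda: log.append("move data from source to dataset"),
    lambda: log.append("apply transformation script"),
])
```

In a real pipeline the steps would invoke the platform's extract and transform logic, and a scheduler or a user action would decide when run_execution is called.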

Pipeline Schedule