Job Monitoring

The Job Monitoring page provides a centralised view of all data pipeline jobs — their current status, execution history, source and destination, and scheduling details. It is accessible from the notification bell icon in the top-right toolbar of any page in Dataworkz.

The page is organised into four tabs:

| Tab | Description |
| --- | --- |
| Job Status | All jobs, both one-time and recurring, with their most recent run details and status. |
| Scheduled Jobs | All recurring jobs with their next scheduled run time and active/inactive state. |
| Continuous Jobs | Always-on streaming jobs grouped by source system (e.g., MongoDB, Kafka). |
| Composite Jobs | Multi-step composite Dataflow jobs powered by Airflow. |

Job Status

The Job Status tab shows a unified list of all jobs across job types, providing a snapshot of the most recent execution for each.

Columns

| Column | Description |
| --- | --- |
| Schedule | Whether the job runs Recurring (on a schedule) or One-time. Clickable to view schedule details. |
| Job type | The type of operation (e.g., Dashboard, Join, preparation). |
| Created by | The user who created the job. |
| Last run | The timestamp of the most recent execution. |
| Source | The source Dataset(s) used by the job. Multiple sources are shown as dataset+N. |
| Destination | The target Dataset the job writes results to. |
| Status | Current execution status: Running, Completed, or Error. |
| Actions | For completed one-time jobs, a Publish button is available to save the job as a reusable Dataflow. |
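
The dataset+N notation in the Source column can be illustrated with a short sketch. It assumes the label is the first source Dataset's name followed by the count of additional sources; that reading, the function name `format_sources`, and the dataset names are illustrative assumptions, not taken from the product.

```python
def format_sources(sources: list[str]) -> str:
    """Build a compact label: the first dataset name, then "+N" for N further sources.

    Assumption: this mirrors how the Source column abbreviates multiple sources.
    """
    if not sources:
        return ""
    extra = len(sources) - 1
    return sources[0] if extra == 0 else f"{sources[0]}+{extra}"


# Hypothetical dataset names, for illustration only.
print(format_sources(["orders"]))                          # orders
print(format_sources(["orders", "customers", "regions"]))  # orders+2
```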

AI Filter

Use the AI Filter input at the top of the list to search or filter jobs using natural language (e.g., type "join jobs from last week" to find matching entries).

Publishing a Completed Job as a Dataflow

One-time jobs that have completed successfully display a Publish button in the Actions column. Clicking Publish opens the Publish Steps dialog:

| Field | Description |
| --- | --- |
| Steps Name | Enter a name for the Dataflow to be saved. |
| Tags | Optionally add tags to categorise the Dataflow in the Dataflows list. |

Click Publish to save the job's transformation steps as a named, reusable Dataflow. Click Cancel to discard.

💡 Note: Published Dataflows appear in Data Studio > Dataflows and can be cloned, edited, scheduled, or applied to new Datasets. See Dataflows for details.
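
The condition under which the Publish button appears can be written as a simple predicate. This is a minimal sketch of the rule stated in this section (one-time jobs that completed successfully); the `Job` class and its field names are hypothetical and do not correspond to a documented Dataworkz API.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical fields mirroring the Schedule and Status columns.
    schedule: str  # "One-time" or "Recurring"
    status: str    # "Running", "Completed", or "Error"


def can_publish(job: Job) -> bool:
    """Return True only for one-time jobs that finished successfully."""
    return job.schedule == "One-time" and job.status == "Completed"


print(can_publish(Job(schedule="One-time", status="Completed")))   # True
print(can_publish(Job(schedule="Recurring", status="Completed")))  # False
print(can_publish(Job(schedule="One-time", status="Error")))       # False
```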

Job Status Values

| Status | Meaning |
| --- | --- |
| Running | The job is currently running or is scheduled and actively monitored. |
| Completed | The job finished successfully. |
| Error | The job encountered an error during execution. Review the source and destination configuration. |

Scheduled Jobs

The Scheduled Jobs tab lists all recurring jobs with their next scheduled run time and current active state.

Columns

| Column | Description |
| --- | --- |
| Job name | The unique name of the scheduled job. Clickable to view job details. |
| Job type | The operation type (e.g., join, dashboard). |
| Created by | The user who created the job. |
| Source | The source Dataset(s). Multiple sources are shown as dataset+N. |
| Destination | The target Dataset. |
| Frequency | How often the job runs (e.g., Daily). |
| Next run | The timestamp of the next scheduled execution. Click the icon for additional schedule details. |
| Status | Active or Inactive. |
| Actions | Pause (⏸), Edit (✎), or Delete (🗑) the scheduled job. Inactive jobs show a Play (▷) button to resume. |

Managing Scheduled Jobs

  • Pause — Click the pause icon (⏸) to temporarily suspend a recurring job without deleting it. The status changes to Inactive.

  • Resume — Click the play icon (▷) on an inactive job to re-activate it.

  • Edit — Click the edit icon (✎) to modify the job's schedule, source, or target configuration.

  • Delete — Click the delete icon (🗑) to permanently remove the scheduled job.
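
The pause, resume, and delete actions above amount to a small set of state transitions, sketched below. The `REMOVED` state is a hypothetical marker for a job that no longer exists, and the action names are illustrative; Edit is left out because it changes the job's configuration, not its active state.

```python
from enum import Enum


class ScheduleState(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    REMOVED = "Removed"  # hypothetical marker: the job has been deleted


# (current state, action) -> resulting state, following the list above.
TRANSITIONS = {
    (ScheduleState.ACTIVE, "pause"): ScheduleState.INACTIVE,
    (ScheduleState.INACTIVE, "resume"): ScheduleState.ACTIVE,
    (ScheduleState.ACTIVE, "delete"): ScheduleState.REMOVED,
    (ScheduleState.INACTIVE, "delete"): ScheduleState.REMOVED,
}


def apply_action(state: ScheduleState, action: str) -> ScheduleState:
    """Return the state after an action, or raise if the action is unavailable."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"{action!r} is not available for a job that is {state.value}"
        ) from None


state = apply_action(ScheduleState.ACTIVE, "pause")
print(state.value)                           # Inactive
print(apply_action(state, "resume").value)   # Active
```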

Continuous Jobs

The Continuous Jobs tab lists always-on streaming jobs that process data in real time from connected streaming sources. Jobs are grouped by source system type.

Supported Source Systems

Continuous jobs are organised under their source system section, for example:

  • MONGODB — Jobs streaming from MongoDB collections

  • KAFKA — Jobs streaming from Kafka topics
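
The grouping by source system can be pictured with a short sketch. The job records, their field names (`name`, `source_system`), and the job names are hypothetical; only the section labels MONGODB and KAFKA come from this page.

```python
from collections import defaultdict


def group_by_source_system(jobs: list[dict]) -> dict[str, list[str]]:
    """Group job names under an upper-cased source system label."""
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["source_system"].upper()].append(job["name"])
    return dict(grouped)


# Hypothetical continuous jobs, for illustration only.
jobs = [
    {"name": "orders_stream", "source_system": "mongodb"},
    {"name": "clicks_stream", "source_system": "kafka"},
    {"name": "users_stream", "source_system": "mongodb"},
]

for system, names in group_by_source_system(jobs).items():
    print(system, names)
```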

Columns

| Column | Description |
| --- | --- |
| Job name | The unique job identifier. |
| Job type | The DAG type used for the stream (e.g., Continuous_Dashboard_dag). |
| Created by | The user who created the job. |
| Source | The source Dataset or collection being streamed from. |
| Destination | The target Dataset receiving the streamed data. |
| Created Date | The date and time the job was created. |
| Status | Current state of the continuous job. |
| Actions | Delete (🗑) or Resume (▷) the job. |

Continuous Job Status Values

| Status | Meaning |
| --- | --- |
| Suspended | The job is paused and not currently streaming. |
| Deleted | The job has been removed. No actions are available. |
| Running | The job is actively streaming data. |

Composite Jobs

The Composite Jobs tab lists multi-step pipeline jobs that combine multiple individual Dataflows into end-to-end workflows, orchestrated via Apache Airflow.

Columns

| Column | Description |
| --- | --- |
| Schedule | Whether the job is Recurring or One-time. |
| Job name | The unique name of the composite job. |
| Job type | The orchestration engine used (e.g., Airflow). |
| Created by | The user who created the composite job. |
| Created Date | The date and time the composite job was created. |
| Status | Current state of the composite job. |
| Actions | Pause (⏸), Resume (▷), or Delete (🗑) the job. |

Composite Job Status Values

| Status | Meaning |
| --- | --- |
| Running | The composite pipeline is actively executing. |
| Inactive | The job is paused. Use the play icon (▷) to resume. |
| Deleted | The job has been removed. No actions are available. |

💡 Note: Composite jobs correspond to Composite Dataflows created in Data Studio > Dataflows. See Dataflows for details on building and managing multi-step pipelines.
