
Built-in Jobs

Craft Easy ships with several built-in jobs that cover common maintenance and data-movement tasks. Some are part of the craft-easy-api framework and auto-register when JOBS_ENABLED=true; others are part of the standalone craft-easy-jobs package and become available when you install it. This page is the catalogue for both.

Summary

| Job | Package | Timeout | Retries | Scope |
|---|---|---|---|---|
| token-cleanup | craft-easy-jobs | 120s | 0 | system |
| audit-log-archive | craft-easy-jobs | 600s | 0 | system |
| bi-export | craft-easy-jobs | 3600s | 2 | per_tenant |
| cleanup_expired_tokens | craft-easy-api | 300s | 2 | system |
| cleanup_audit_log | craft-easy-api | 600s | 2 | system |
| cleanup_job_runs | craft-easy-api | 300s | 2 | system |
| purge_soft_deleted | craft-easy-api | 600s | 2 | system |
| calculate_next_schedules | craft-easy-api | 120s | 1 | system |

Both packages ship a token-cleanup-style job (token-cleanup in the CLI package, cleanup_expired_tokens in the API framework) because they target different lifecycles: the CLI version is meant for Cloud Run cron triggers, and the API version is meant for the in-process scheduler. Pick the one that matches your deployment.


craft-easy-jobs (standalone CLI)

These jobs live in craft_easy_jobs.builtin and require the craft-easy-jobs[full] extra (which pulls in craft-easy-api so the jobs can access Beanie models).

token-cleanup

Purpose: Delete expired authentication sessions (UserAuthentication documents).

Timeout: 120 seconds
Retries: 0
Scope: system (no tenant)
Parameters: none

Returns:

{"deleted": 42}

Usage:

craft-easy-job run token-cleanup

Schedule daily via Cloud Scheduler or a Kubernetes CronJob:

schedule: "0 3 * * *"
command: ["craft-easy-job", "run", "token-cleanup"]
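For Kubernetes, those two lines slot into a full CronJob manifest roughly like this. The image name, namespace-free metadata, and the craft-easy-env secret (assumed to carry the MongoDB connection string and related settings) are placeholders for your own deployment:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: token-cleanup
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Forbid          # don't start a new run while one is still going
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never       # the job itself declares 0 retries
          containers:
            - name: token-cleanup
              image: registry.example.com/craft-easy-jobs:latest   # placeholder image
              command: ["craft-easy-job", "run", "token-cleanup"]
              envFrom:
                - secretRef:
                    name: craft-easy-env   # placeholder: DB URI and friends
```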

audit-log-archive

Purpose: Archive old audit log entries past the retention window.

Timeout: 600 seconds
Retries: 0
Scope: system
Parameters: none

Returns:

{"archived": 0}

The concrete archival strategy (cold storage, S3, delete) is left to the implementation — the standalone version is a placeholder that can be customised per deployment. For a ready-made deletion-based cleanup, use the API framework's cleanup_audit_log instead.

bi-export

Purpose: Export configured resources to BigQuery or Azure SQL with GDPR filtering.

Timeout: 3600 seconds (1 hour)
Retries: 2
Scope: per_tenant (runs once per tenant)

Parameters:

{
  "target_type": "bigquery",
  "configs": [
    {
      "resource": "invoices",
      "target_table": "invoices_fact",
      "gdpr_mode": "exclude",
      "flatten_depth": 1,
      "flatten_fields": {},
      "exclude_fields": ["internal_notes"],
      "include_fields": null
    }
  ],

  "project_id": "my-gcp-project",
  "dataset_id": "analytics",
  "location": "europe-north1",
  "credentials_path": "/secrets/bq-sa.json",

  "connection_string": "Driver={ODBC Driver 18 for SQL Server};Server=...",

  "batch_size": 500
}

| Field | Target | Purpose |
|---|---|---|
| target_type | both | "bigquery" or "azure_sql" |
| configs | both | List of BIExportConfig dicts, one per resource |
| project_id, dataset_id, location, credentials_path | BigQuery | Dataset and credentials |
| connection_string | Azure SQL | ODBC connection string |
| batch_size | both | Upsert batch size (default 500) |
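batch_size only controls how many rows go into each upsert call. The chunking can be pictured as the following generic Python (an illustration, not the framework's actual code):

```python
from typing import Any, Iterable, Iterator


def batched(rows: Iterable[dict[str, Any]], batch_size: int = 500) -> Iterator[list[dict[str, Any]]]:
    """Yield rows in lists of at most batch_size, one list per upsert call."""
    batch: list[dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:            # final partial batch
        yield batch
```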

Each item in configs matches BIExportConfig from BI Export:

| Field | Default | Purpose |
|---|---|---|
| resource | required | MongoDB collection name |
| target_table | required | Destination table name |
| gdpr_mode | "exclude" | exclude, anonymize, or hash |
| flatten_depth | 1 | Auto-flatten nested dicts up to this depth |
| flatten_fields | {} | Explicit dot-path → column mapping |
| exclude_fields | [] | Fields to drop |
| include_fields | null | Whitelist; if set, only these fields are exported |
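The gdpr_mode and flatten_depth behaviours can be illustrated with a standalone sketch. The exact semantics below (SHA-256 for hash, a fixed placeholder for anonymize, underscore-joined column names) are assumptions for illustration; the framework determines PII fields from its own metadata, approximated here by an explicit pii_fields argument:

```python
import hashlib
from typing import Any


def apply_gdpr(doc: dict[str, Any], pii_fields: list[str], mode: str = "exclude") -> dict[str, Any]:
    """Apply one of the three assumed GDPR modes to the listed fields."""
    out = dict(doc)
    for field in pii_fields:
        if field not in out:
            continue
        if mode == "exclude":
            del out[field]                                            # drop entirely
        elif mode == "hash":
            out[field] = hashlib.sha256(str(out[field]).encode()).hexdigest()
        elif mode == "anonymize":
            out[field] = "REDACTED"                                   # fixed placeholder
    return out


def flatten(doc: dict[str, Any], depth: int = 1, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts up to `depth` levels, joining keys with underscores."""
    out: dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict) and depth > 0:
            out.update(flatten(value, depth - 1, f"{name}_"))
        else:
            out[name] = value
    return out
```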

Returns:

{
  "target_type": "bigquery",
  "resources_processed": 2,
  "total_exported": 2450,
  "total_errors": 0,
  "results": [
    {"resource": "invoices", "table": "invoices_fact", "exported": 1800, "skipped": 0, "errors": []},
    {"resource": "payments", "table": "payments_fact", "exported": 650, "skipped": 0, "errors": []}
  ]
}

Usage:

craft-easy-job run bi-export \
  --tenant-id tenant_664abc \
  --params "$(cat bi-params.json)"

craft-easy-api (in-process framework)

These jobs register themselves in the job_registry when the API boots with JOBS_ENABLED=true. Run them manually with POST /jobs/run/{job_name} or schedule them with POST /jobs/schedules.

cleanup_expired_tokens

Purpose: Delete expired AuthenticationCode and UserAuthentication documents.

Timeout: 300 seconds
Retries: 2 (max_attempts=2)
Parameters: none

Returns:

{"codes_deleted": 128}

Recommended schedule:

{"cron_expression": "0 3 * * *"}

cleanup_audit_log

Purpose: Delete audit log entries older than JOBS_AUDIT_RETENTION_DAYS (default 365).

Timeout: 600 seconds
Retries: 2
Parameters: none (cutoff comes from settings)

Returns:

{
  "deleted": 54321,
  "cutoff_date": "2025-04-05T00:00:00+00:00",
  "retention_days": 365
}

Recommended schedule:

{"cron_expression": "0 4 * * 0"}

cleanup_job_runs

Purpose: Delete JobRun documents older than JOBS_CLEANUP_DAYS (default 90). Keeps the job history table bounded.

Timeout: 300 seconds
Retries: 2
Parameters: none

Returns:

{
  "deleted": 2450,
  "cutoff_date": "2026-01-05T00:00:00+00:00",
  "cleanup_days": 90
}

Recommended schedule:

{"cron_expression": "0 5 * * *"}

purge_soft_deleted

Purpose: Hard-delete documents where is_deleted=True and deleted_at is older than SOFT_DELETE_RETENTION_DAYS. Runs against every BaseDocument subclass when SOFT_DELETE_ENABLED=true.

Timeout: 600 seconds
Retries: 2
Parameters: none

Returns:

{
  "total_purged": 1850,
  "cutoff_date": "2025-01-05T00:00:00+00:00",
  "retention_days": 30
}

This job permanently deletes data. Set SOFT_DELETE_RETENTION_DAYS carefully — once purged, records cannot be recovered from the application database (check your MongoDB backups if you need them back).

Recommended schedule:

{"cron_expression": "0 2 * * *"}

calculate_next_schedules

Purpose: Recompute next_run_at on every enabled JobSchedule. Useful as a self-healing fallback if a deploy, clock change, or manual PATCH leaves a schedule in an inconsistent state.

Timeout: 120 seconds
Retries: 1
Parameters: none

Returns:

{"total_schedules": 12, "updated": 12}

Recommended schedule:

{"cron_expression": "*/30 * * * *"}
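For intuition, here is what the next_run_at recomputation boils down to for the common daily case. Real cron parsing is more involved and is typically delegated to a library such as croniter; this stdlib-only sketch handles only "M H * * *" schedules:

```python
from datetime import datetime, timedelta


def next_daily_run(hour: int, minute: int, after: datetime) -> datetime:
    """Next fire time of a daily "M H * * *" cron schedule, strictly after `after`."""
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= after:            # today's slot has already passed
        candidate += timedelta(days=1)
    return candidate
```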

Running built-in jobs from the API

# Manually trigger the in-process built-in
curl -X POST "http://localhost:5001/jobs/run/cleanup_expired_tokens" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{}'

# Check result
curl "http://localhost:5001/jobs/runs/{run_id}" \
  -H "Authorization: Bearer $TOKEN"

To schedule:

curl -X POST "http://localhost:5001/jobs/schedules" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "job_name": "nightly-token-cleanup",
    "job_type": "cleanup_expired_tokens",
    "cron_expression": "0 3 * * *",
    "is_enabled": true
  }'

Running built-in jobs from the CLI

# Install the extra that pulls in craft-easy-api (needed for Beanie models)
pip install craft-easy-jobs[full]

# Run the standalone BI export
craft-easy-job run bi-export \
  --tenant-id tenant_664abc \
  --params '{
    "target_type": "bigquery",
    "project_id": "my-gcp-project",
    "dataset_id": "analytics",
    "credentials_path": "/secrets/bq-sa.json",
    "configs": [
      {"resource": "invoices", "target_table": "invoices_fact", "gdpr_mode": "exclude"},
      {"resource": "payments", "target_table": "payments_fact", "gdpr_mode": "hash"}
    ]
  }'

See Scheduling for cron syntax and timezone handling, and API Job Framework for the full REST API.