API Import Engine

Craft Easy ships with a built-in file-import engine that lives inside craft-easy-api itself. It turns a CSV, JSON, XML, or fixed-width file plus an ImportTemplate into real resources via the normal CRUD hooks, with full multi-tenant routing, validation, preview, retry of rejected rows, and soft-delete revert.

For bank reconciliation, scheduled SFTP polling, and stand-alone container deployment, see the craft-easy-file-import package instead. The two solve different problems — see that page for a decision matrix.

Concepts

An import in Craft Easy has three moving parts:

| Concept | What it is |
| --- | --- |
| `ImportTemplate` | The blueprint: what the source file looks like, what resource to create, how fields map, what validation to apply |
| `ImportRun` | One execution of a template against one file — holds progress, counts, errors, and the parsed preview |
| `RejectedRecord` | A single row that failed validation — the user can correct and retry it from the admin UI |

The engine is driven by ImportEngine (core/file_import/engine.py). Templates are stored in the import_templates collection, runs in import_runs, and rejections in rejected_records.

The two-phase workflow

Every import goes through validate → execute, with an optional preview step in between. This lets users see exactly what will happen before committing to changes.

```text
  upload file
      │
      ▼
┌──────────┐   preview?   ┌───────────┐
│ validate │─────────────▶│  preview  │────────────▶ diff
└──────────┘              └───────────┘
      │
      ▼
┌──────────┐
│ execute  │────────────▶ create / update / delete
└──────────┘
```

validate() parses the file, checks required columns, runs the field transformer on the first N rows (FILE_IMPORT_PREVIEW_ROWS, default 10), and persists the ImportRun with status="validated", preview rows, and any early errors. The file itself is stored in run.file_content so execution can happen later without re-uploading.

preview() computes the full diff without touching the database: how many rows would be created, updated, deleted, or skipped. Returns to_create, to_update, to_delete, to_skip, and sample rows.

execute() runs the actual import. It processes rows one at a time, calls the configured CRUD hooks for each, saves progress every 100 rows, and ends with status="completed" (or failed).
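The validate phase described above can be sketched in plain Python. This is an illustrative stand-in for `ImportEngine.validate()`, not the real implementation: the status values, the preview-row limit, and the `file_content` field come from the text; everything else (the function shape, the run dict keys) is an assumption.

```python
import csv
import io

PREVIEW_ROWS = 10  # mirrors the FILE_IMPORT_PREVIEW_ROWS default

def validate(file_bytes: bytes, required_columns: list[str]) -> dict:
    """Sketch of the validate phase: parse, check columns, build a preview."""
    rows = list(csv.DictReader(io.StringIO(file_bytes.decode("utf-8"))))
    header = rows[0].keys() if rows else []
    errors = [f"missing required column: {c}" for c in required_columns if c not in header]
    return {
        "status": "failed" if errors else "validated",
        "total_rows": len(rows),
        "preview_rows": rows[:PREVIEW_ROWS],
        "errors": errors,
        "file_content": file_bytes,  # stored so execute() can run later without re-upload
    }

run = validate(b"SKU,Name\nA1,Widget\nA2,Gadget\n", ["SKU", "Name"])
bad = validate(b"SKU\nA1\n", ["SKU", "Name"])
```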

Import modes

ImportTemplate.import_mode picks the behavior:

| Mode | Behavior | Requires |
| --- | --- | --- |
| `append` / `create` | Every row becomes a new document. | — |
| `upsert` | Match existing documents by `upsert_key`; create if not found, update if found. | `upsert_key` |
| `update` | Match by `upsert_key`; update if found, skip if not. | `upsert_key` |
| `sync` | Treat the file as the source of truth: create new, update existing, soft-delete anything in the scope that is not in the file. | `upsert_key`, `scope_filter` |

sync is the destructive one — it's how "the master spreadsheet is always right" workflows should be modeled. Always set a tight scope_filter so you don't accidentally soft-delete records from other areas.
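In data-flow terms, each mode is a keyed diff between the file and the documents matched by `scope_filter`. A minimal sketch of the `sync` plan, using plain dicts in place of real documents (the `sync_plan` helper and everything except `upsert_key` semantics are illustrative):

```python
def sync_plan(existing: list[dict], incoming: list[dict], upsert_key: str) -> dict:
    """Illustrative sync diff: create new, update matched, delete the rest."""
    existing_keys = {doc[upsert_key] for doc in existing}
    incoming_keys = {row[upsert_key] for row in incoming}
    return {
        "to_create": [r for r in incoming if r[upsert_key] not in existing_keys],
        "to_update": [r for r in incoming if r[upsert_key] in existing_keys],
        # sync soft-deletes anything in scope that the file no longer contains
        "to_delete": [d for d in existing if d[upsert_key] not in incoming_keys],
    }

plan = sync_plan(
    existing=[{"sku": "A1"}, {"sku": "OLD"}],
    incoming=[{"sku": "A1"}, {"sku": "NEW"}],
    upsert_key="sku",
)
```

Note that `existing` here stands for the documents inside the scope only; a tight `scope_filter` is what keeps `to_delete` from reaching unrelated records.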

The delete_column flag (upsert only)

In upsert mode, you can designate one column as a delete flag. Any row where that column contains "true", "1", "yes", or "delete" will soft-delete the matched record instead of updating it. This is what the admin UI uses to let users edit and delete rows in the same spreadsheet.
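The flag check itself is small. A sketch of the per-row decision, with a hypothetical `row_action` helper (the four accepted flag values come from the text):

```python
DELETE_VALUES = {"true", "1", "yes", "delete"}

def row_action(row: dict, delete_column: str) -> str:
    """Illustrative upsert-mode decision: soft-delete flagged rows, upsert the rest."""
    flag = str(row.get(delete_column, "")).strip().lower()
    return "soft_delete" if flag in DELETE_VALUES else "upsert"
```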

Defining a template

```python
from craft_easy.models.file_import import ImportTemplate, FieldMapping

template = ImportTemplate(
    name="products-sync",
    description="Daily product catalogue sync from ERP",
    file_type="csv",
    delimiter=",",
    encoding="utf-8",
    has_header=True,

    target_resource="products",
    import_mode="sync",
    upsert_key="sku",
    scope_filter={"source": "erp"},

    field_mapping=[
        FieldMapping(source="SKU", target="sku", required=True),
        FieldMapping(source="Name", target="name", required=True, transform="trim"),
        FieldMapping(source="Price", target="price", transform="decimal:2"),
        FieldMapping(source="Active", target="is_active", transform="boolean"),
        FieldMapping(source="Updated", target="updated_at", transform="date:YYYY-MM-DD"),
    ],

    default_values={"source": "erp"},
    required_columns=["SKU", "Name"],
)

await template.insert()
```

Field transforms

FieldMapping.transform accepts these expressions (see core/file_import/transformer.py):

| Transform | Example | Result |
| --- | --- | --- |
| `date:FORMAT` | `date:YYYY-MM-DD` | ISO date |
| `decimal:N` | `decimal:2` | `Decimal` with N decimals; accepts European commas |
| `integer` | — | `int(value)` |
| `boolean` | `true` / `1` / `yes` / `on` / `ja` / `sant` | `True` |
| `constant:VALUE` | `constant:active` | Literal value, source ignored |
| `concat:f1,f2` | `concat:FirstName,LastName` | Space-joined |
| `trim` | — | `value.strip()` |
| `upper` / `lower` | — | Case conversion |

Uploading and running an import

The REST surface lives under /imports:

```bash
# 1. Upload a file, get a validated run
curl -X POST "http://localhost:5001/imports/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "template_id=664a..." \
  -F "file=@products.csv"

# Response: ImportRun in status "validated" with preview_rows

# 2. Optional — compute full diff
curl -X POST "http://localhost:5001/imports/preview" \
  -H "Authorization: Bearer $TOKEN" \
  -F "template_id=664a..." \
  -F "file=@products.csv"

# 3. Execute the validated run
curl -X POST "http://localhost:5001/imports/runs/664b.../execute" \
  -H "Authorization: Bearer $TOKEN"

# Response: ImportRun in status "completed" with created_count etc.
```

Dry run

POST /imports/dry-run validates a file without creating a run, so you can iterate on the template without polluting the run history.

Cancelling and reverting

```bash
# Cancel a run that is still in "pending" or "validated"
curl -X POST "http://localhost:5001/imports/runs/664b.../cancel"

# Revert a "sync" run — restores soft-deleted records
curl -X POST "http://localhost:5001/imports/runs/664b.../revert"
```

revert() only works for runs that actually soft-deleted something (deleted_count > 0). It clears is_deleted and deleted_at on every record listed in run.deleted_ids.
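A sketch of that revert logic over plain dicts. The `is_deleted`, `deleted_at`, `deleted_ids`, and `deleted_count` names come from the text above; the in-memory `records` store and the `revert` function shape are assumptions:

```python
def revert(run: dict, records: dict[str, dict]) -> int:
    """Illustrative revert: restore every record this run soft-deleted."""
    if run.get("deleted_count", 0) == 0:
        raise ValueError("nothing to revert: run deleted no records")
    restored = 0
    for record_id in run["deleted_ids"]:
        record = records[record_id]
        record["is_deleted"] = False
        record["deleted_at"] = None
        restored += 1
    return restored

run = {"deleted_count": 2, "deleted_ids": ["a", "b"]}
records = {
    "a": {"is_deleted": True, "deleted_at": "2024-06-01"},
    "b": {"is_deleted": True, "deleted_at": "2024-06-01"},
}
restored = revert(run, records)
```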

Rejected records and retry

When a row fails validation or transformation, it is stored as a RejectedRecord with its original parsed data and the list of errors. Users can inspect and correct rejected rows from the admin UI, then retry them individually or in bulk:

```bash
# List all rejected records for a run
curl "http://localhost:5001/imports/rejected?run_id=664b..."

# Submit corrections
curl -X PUT "http://localhost:5001/imports/rejected/664c..." \
  -H "Content-Type: application/json" \
  -d '{"corrected_data": {"sku": "ABC123", "price": "199.00"}}'

# Retry a single corrected record
curl -X POST "http://localhost:5001/imports/rejected/664c.../retry"

# Retry everything that has been corrected
curl -X POST "http://localhost:5001/imports/runs/664b.../retry-all"

# Or give up on a record
curl -X POST "http://localhost:5001/imports/rejected/664c.../discard"
```

Rejected records never block execution — an import completes with whatever it could process and leaves the rest in rejected_records for later human review.
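Conceptually, retry overlays `corrected_data` on the originally parsed row and re-validates the result. A sketch where the `original_data` field, the status strings, and the injected `validate_row` callback are all assumptions (only `corrected_data` appears in the API above):

```python
def retry_rejected(rejected: dict, validate_row) -> dict:
    """Illustrative retry: merge corrections over the original row, re-validate."""
    merged = {**rejected["original_data"], **rejected.get("corrected_data", {})}
    errors = validate_row(merged)
    status = "rejected" if errors else "retried"
    return {**rejected, "data": merged, "status": status, "errors": errors}

def validate_row(row: dict) -> list[str]:
    # Toy validator standing in for the template's field validation
    return [] if row.get("price") else ["price is required"]

rejected = {
    "original_data": {"sku": "ABC123", "price": ""},
    "corrected_data": {"price": "199.00"},
    "status": "rejected",
}
result = retry_rejected(rejected, validate_row)
```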

Parsers

Under the hood, parsers translate raw bytes into a list[dict[str, str]] for the transformer:

| `file_type` | Parser | Notes |
| --- | --- | --- |
| `csv` | `CsvParser` | Respects `delimiter`, `encoding`, `skip_rows`, `has_header` |
| `json` | `JsonParser` | Accepts arrays, single objects, or `{"data": [...]}` / `items` / `rows` / `records` / `results` wrappers |
| `xml` | `XmlParser` | Root children become rows; element tags become columns |
| `fixed_width` | `FixedWidthParser` | Uses `template.column_widths` |
| `excel` | — | Not included by default; install `openpyxl` and register your own parser |

You can register a custom parser by subclassing FileParser and wiring it through get_parser().
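A sketch of what such a subclass could look like. The `FileParser` base shown here is an assumed minimal shape (bytes in, list of string dicts out), not the actual class from `core/file_import/parsers`, and the TSV parser is hypothetical:

```python
from abc import ABC, abstractmethod

class FileParser(ABC):
    """Assumed shape of the parser interface: raw bytes in, rows out."""
    @abstractmethod
    def parse(self, content: bytes) -> list[dict[str, str]]: ...

class TsvParser(FileParser):
    """Hypothetical custom parser: tab-separated values with a header row."""
    def parse(self, content: bytes) -> list[dict[str, str]]:
        lines = content.decode("utf-8").splitlines()
        header = lines[0].split("\t")
        return [dict(zip(header, line.split("\t"))) for line in lines[1:] if line]

rows = TsvParser().parse(b"sku\tname\nA1\tWidget\n")
```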

Multi-tenant imports

Templates can be shared from a parent tenant (shared_from_tenant_id) so sub-tenants can use them read-only, and individual import runs can target a sub-tenant via target_tenant_id. This is how property managers run the same import template across every property they administer.

```bash
curl -X POST "http://localhost:5001/imports/upload" \
  -F "template_id=664a..." \
  -F "target_tenant_id=664b..." \
  -F "file=@products.csv"
```

The API validates that the current user has access to the target tenant before routing the import.

File sources

In addition to HTTP upload, the engine supports polled sources for automated imports (core/file_import/sources.py):

| Source | When to use |
| --- | --- |
| `SFTPSource` | Bank files, partner data drops, ERP exports |
| `WatchedDirectorySource` | Files appearing in a local/mounted directory |

Both return {name, content, size} dicts that can be fed directly into ImportEngine.validate(). Polling orchestration itself is up to your application — for a ready-made scheduler, see the standalone craft-easy-file-import package.
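A `WatchedDirectorySource`-style poll can be approximated with stdlib Python. The `poll_directory` helper below is illustrative; only the `{name, content, size}` dict shape comes from the text:

```python
import tempfile
from pathlib import Path

def poll_directory(directory: str, pattern: str = "*.csv") -> list[dict]:
    """Illustrative polled source: one {name, content, size} dict per matching file."""
    results = []
    for path in sorted(Path(directory).glob(pattern)):
        content = path.read_bytes()
        results.append({"name": path.name, "content": content, "size": len(content)})
    return results

# Demo against a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "products.csv").write_bytes(b"SKU,Name\nA1,Widget\n")
    files = poll_directory(tmp)
```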

Configuration reference

| Setting | Default | Purpose |
| --- | --- | --- |
| `FILE_IMPORT_PREVIEW_ROWS` | 10 | Number of rows to transform for the preview |
| `FILE_IMPORT_MAX_ERRORS` | 1000 | Stop collecting errors after N |