API Import Engine¶
Craft Easy ships with a built-in file-import engine that lives inside craft-easy-api itself. It turns a CSV, JSON, XML, or fixed-width file plus an ImportTemplate into real resources via the normal CRUD hooks, with full multi-tenant routing, validation, preview, retry of rejected rows, and soft-delete revert.
For bank reconciliation, scheduled SFTP polling, and stand-alone container deployment, see the craft-easy-file-import package instead. The two solve different problems — see that page for a decision matrix.
Concepts¶
An import in Craft Easy has three moving parts:
| Concept | What it is |
|---|---|
| `ImportTemplate` | The blueprint: what the source file looks like, what resource to create, how fields map, what validation to apply |
| `ImportRun` | One execution of a template against one file — holds progress, counts, errors, and the parsed preview |
| `RejectedRecord` | A single row that failed validation — the user can correct and retry it from the admin UI |
The engine is driven by ImportEngine (core/file_import/engine.py). Templates are stored in the import_templates collection, runs in import_runs, and rejections in rejected_records.
The two-phase workflow¶
Every import goes through validate → execute, with an optional preview step in between. This lets users see exactly what will happen before committing to changes.
```
 upload file                  preview?
      │                          │
      ▼                          ▼
 ┌──────────┐             ┌───────────┐
 │ validate │────────────▶│  preview  │────────────▶ diff
 └──────────┘             └───────────┘
      │
      ▼
 ┌──────────┐
 │ execute  │────────────▶ create / update / delete
 └──────────┘
```
validate() parses the file, checks required columns, runs the field transformer on the first N rows (FILE_IMPORT_PREVIEW_ROWS, default 10), and persists the ImportRun with status="validated", preview rows, and any early errors. The file itself is stored in run.file_content so execution can happen later without re-uploading.
preview() computes the full diff without touching the database: how many rows would be created, updated, deleted, or skipped. Returns to_create, to_update, to_delete, to_skip, and sample rows.
execute() runs the actual import. It processes rows one at a time, calls the configured CRUD hooks for each, saves progress every 100 rows, and ends with status="completed" (or failed).
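The row loop described above can be pictured with a small self-contained sketch. Nothing here is the real engine: `apply_row`, `save_progress`, and the two constants are stand-ins for the configured CRUD hook, the progress checkpoint, and the `FILE_IMPORT_MAX_ERRORS` setting.

```python
# Illustrative sketch of the execute() row loop, NOT the real ImportEngine.
# apply_row and save_progress are hypothetical callables standing in for the
# configured CRUD hook and the progress checkpoint.

MAX_ERRORS = 1000      # mirrors FILE_IMPORT_MAX_ERRORS
PROGRESS_EVERY = 100   # the engine saves progress every 100 rows


def execute(rows, apply_row, save_progress):
    """Process rows one at a time; a bad row never aborts the run."""
    counts = {"processed": 0, "rejected": 0}
    errors = []
    for i, row in enumerate(rows, start=1):
        try:
            apply_row(row)                  # create / update / delete one row
            counts["processed"] += 1
        except ValueError as exc:           # a failed row becomes a rejection
            counts["rejected"] += 1
            if len(errors) < MAX_ERRORS:    # stop collecting after the cap
                errors.append({"row": i, "error": str(exc)})
        if i % PROGRESS_EVERY == 0:         # periodic checkpoint
            save_progress(i)
    return counts, errors
```

The key property this models is that execution is best-effort per row: failures are collected, not raised, which is why a run can complete with a non-zero rejected count.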
Import modes¶
ImportTemplate.import_mode picks the behavior:
| Mode | Behavior | Requires |
|---|---|---|
| `append` / `create` | Every row becomes a new document. | — |
| `upsert` | Match existing documents by `upsert_key`; create if not found, update if found. | `upsert_key` |
| `update` | Match by `upsert_key`; update if found, skip if not. | `upsert_key` |
| `sync` | Treat the file as the source of truth: create new, update existing, soft-delete anything in the scope that is not in the file. | `upsert_key`, `scope_filter` |
sync is the destructive one — it's how "the master spreadsheet is always right" workflows should be modeled. Always set a tight scope_filter so you don't accidentally soft-delete records from other areas.
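The sync diff can be reduced to a set comparison keyed on `upsert_key`. This is an assumed sketch of the logic, not the actual `preview()` implementation:

```python
# Assumed sketch of a sync-mode diff: partition incoming rows against the
# documents currently inside the scope_filter. Keys only; real previews also
# carry sample rows.

def sync_diff(file_rows, existing_docs, upsert_key):
    """Return which keys a sync run would create, update, or soft-delete."""
    incoming = {row[upsert_key] for row in file_rows}
    current = {doc[upsert_key] for doc in existing_docs}
    return {
        "to_create": sorted(incoming - current),
        "to_update": sorted(incoming & current),
        # Everything in scope but absent from the file gets soft-deleted,
        # which is why a tight scope_filter matters.
        "to_delete": sorted(current - incoming),
    }
```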
The delete_column flag (upsert only)¶
In upsert mode, you can designate one column as a delete flag. Any row where that column contains "true", "1", "yes", or "delete" will soft-delete the matched record instead of updating it. This is what the admin UI uses to let users edit and delete rows in the same spreadsheet.
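The accepted flag values above are the whole contract; as a predicate (helper name is ours, the value set comes from the docs):

```python
# The documented truthy delete-flag values. Matching is case-insensitive and
# whitespace-tolerant; the helper name is illustrative.

DELETE_FLAG_VALUES = {"true", "1", "yes", "delete"}


def is_delete_row(value):
    """True when an upsert row should soft-delete its matched record."""
    return str(value).strip().lower() in DELETE_FLAG_VALUES
```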
Defining a template¶
```python
from craft_easy.models.file_import import ImportTemplate, FieldMapping

template = ImportTemplate(
    name="products-sync",
    description="Daily product catalogue sync from ERP",
    file_type="csv",
    delimiter=",",
    encoding="utf-8",
    has_header=True,
    target_resource="products",
    import_mode="sync",
    upsert_key="sku",
    scope_filter={"source": "erp"},
    field_mapping=[
        FieldMapping(source="SKU", target="sku", required=True),
        FieldMapping(source="Name", target="name", required=True, transform="trim"),
        FieldMapping(source="Price", target="price", transform="decimal:2"),
        FieldMapping(source="Active", target="is_active", transform="boolean"),
        FieldMapping(source="Updated", target="updated_at", transform="date:YYYY-MM-DD"),
    ],
    default_values={"source": "erp"},
    required_columns=["SKU", "Name"],
)
await template.insert()
```
Field transforms¶
FieldMapping.transform accepts these expressions (see core/file_import/transformer.py):
| Transform | Example | Result |
|---|---|---|
| `date:FORMAT` | `date:YYYY-MM-DD` | ISO date |
| `decimal:N` | `decimal:2` | Decimal with N decimals; accepts European commas |
| `integer` | — | `int(value)` |
| `boolean` | — | `true`/`1`/`yes`/`on`/`ja`/`sant` → `True` |
| `constant:VALUE` | `constant:active` | Literal value, source ignored |
| `concat:f1,f2` | `concat:FirstName,LastName` | Space-joined |
| `trim` | — | `value.strip()` |
| `upper` / `lower` | — | Case conversion |
Uploading and running an import¶
The REST surface lives under /imports:
```bash
# 1. Upload a file, get a validated run
curl -X POST "http://localhost:5001/imports/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "template_id=664a..." \
  -F "file=@products.csv"
# Response: ImportRun in status "validated" with preview_rows

# 2. Optional — compute full diff
curl -X POST "http://localhost:5001/imports/preview" \
  -H "Authorization: Bearer $TOKEN" \
  -F "template_id=664a..." \
  -F "file=@products.csv"

# 3. Execute the validated run
curl -X POST "http://localhost:5001/imports/runs/664b.../execute" \
  -H "Authorization: Bearer $TOKEN"
# Response: ImportRun in status "completed" with created_count etc.
```
Dry run¶
POST /imports/dry-run validates a file without creating a run, so you can iterate on the template without polluting the run history.
Cancelling and reverting¶
```bash
# Cancel a run that is still in "pending" or "validated"
curl -X POST "http://localhost:5001/imports/runs/664b.../cancel"

# Revert a "sync" run — restores soft-deleted records
curl -X POST "http://localhost:5001/imports/runs/664b.../revert"
```
revert() only works for runs that actually soft-deleted something (deleted_count > 0). It clears is_deleted and deleted_at on every record listed in run.deleted_ids.
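Conceptually the revert is just un-setting the two soft-delete markers on every record the run deleted. A sketch over plain dicts (the real code updates documents in the target collection):

```python
# Assumed sketch of revert() semantics: refuse when nothing was soft-deleted,
# otherwise clear is_deleted and deleted_at on every id in run.deleted_ids.

def revert(run, records_by_id):
    """Restore soft-deleted records; returns how many were restored."""
    if run.get("deleted_count", 0) == 0:
        raise ValueError("nothing to revert: run soft-deleted no records")
    restored = 0
    for record_id in run["deleted_ids"]:
        record = records_by_id[record_id]
        record["is_deleted"] = False
        record["deleted_at"] = None
        restored += 1
    return restored
```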
Rejected records and retry¶
When a row fails validation or transformation, it is stored as a RejectedRecord with its original parsed data and the list of errors. Users can inspect and correct rejected rows from the admin UI, then retry them individually or in bulk:
```bash
# List all rejected records for a run
curl "http://localhost:5001/imports/rejected?run_id=664b..."

# Submit corrections
curl -X PUT "http://localhost:5001/imports/rejected/664c..." \
  -H "Content-Type: application/json" \
  -d '{"corrected_data": {"sku": "ABC123", "price": "199.00"}}'

# Retry a single corrected record
curl -X POST "http://localhost:5001/imports/rejected/664c.../retry"

# Retry everything that has been corrected
curl -X POST "http://localhost:5001/imports/runs/664b.../retry-all"

# Or give up on a record
curl -X POST "http://localhost:5001/imports/rejected/664c.../discard"
```
Rejected records never block execution — an import completes with whatever it could process and leaves the rest in rejected_records for later human review.
Parsers¶
Under the hood, parsers translate raw bytes into a list[dict[str, str]] for the transformer:
| `file_type` | Parser | Notes |
|---|---|---|
| `csv` | `CsvParser` | Respects `delimiter`, `encoding`, `skip_rows`, `has_header` |
| `json` | `JsonParser` | Accepts arrays, single objects, or `{"data": [...]}` / `items` / `rows` / `records` / `results` wrappers |
| `xml` | `XmlParser` | Root children become rows; element tags become columns |
| `fixed_width` | `FixedWidthParser` | Uses `template.column_widths` |
| `excel` | — | Not included by default; install `openpyxl` and register your own parser |
You can register a custom parser by subclassing FileParser and wiring it through get_parser().
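As an illustration of the shape a custom parser takes, here is a tab-separated parser. The real `FileParser` base and `get_parser()` registry live in the library and their exact signatures may differ; the stub base class below only makes the example self-contained:

```python
# Sketch of a custom parser producing list[dict[str, str]], the shape the
# transformer expects. The FileParser stub stands in for the library's base.

class FileParser:  # stand-in for the library's base class
    def parse(self, content: bytes, template=None) -> list[dict[str, str]]:
        raise NotImplementedError


class TsvParser(FileParser):
    """Tab-separated values with a header row, mirroring CsvParser's output."""

    def parse(self, content, template=None):
        lines = content.decode("utf-8").splitlines()
        header = lines[0].split("\t")
        return [dict(zip(header, line.split("\t"))) for line in lines[1:]]
```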
Multi-tenant imports¶
Templates can be shared from a parent tenant (shared_from_tenant_id) so sub-tenants can use them read-only, and individual import runs can target a sub-tenant via target_tenant_id. This is how property managers run the same import template across every property they administer.
```bash
curl -X POST "http://localhost:5001/imports/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "template_id=664a..." \
  -F "target_tenant_id=664b..." \
  -F "file=@products.csv"
```
The API validates that the current user has access to the target tenant before routing the import.
File sources¶
In addition to HTTP upload, the engine supports polled sources for automated imports (core/file_import/sources.py):
| Source | When to use |
|---|---|
| `SFTPSource` | Bank files, partner data drops, ERP exports |
| `WatchedDirectorySource` | Files appearing in a local/mounted directory |
Both return {name, content, size} dicts that can be fed directly into ImportEngine.validate(). Polling orchestration itself is up to your application — for a ready-made scheduler, see the standalone craft-easy-file-import package.
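A minimal directory poll producing that dict shape might look like this. It is a sketch, not `WatchedDirectorySource` itself; in particular, the real source presumably tracks already-seen files, which this version does not:

```python
# Assumed sketch: read every file in a directory and emit the
# {name, content, size} dicts that ImportEngine.validate() consumes.
from pathlib import Path


def poll_directory(path):
    results = []
    for entry in sorted(Path(path).iterdir()):
        if entry.is_file():
            content = entry.read_bytes()
            results.append(
                {"name": entry.name, "content": content, "size": len(content)}
            )
    return results
```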
Configuration reference¶
| Setting | Default | Purpose |
|---|---|---|
| `FILE_IMPORT_PREVIEW_ROWS` | `10` | Number of rows to transform for the preview |
| `FILE_IMPORT_MAX_ERRORS` | `1000` | Stop collecting errors after N |