GDPR¶
Craft Easy has first-class GDPR support built into the core framework. Fields are tagged as PII at model definition time, and the framework uses those tags to power:
- Data portability — export everything a user has, across every collection
- Right to erasure — depersonalise user data on every tagged field
- Consent tracking — append-only log of consent decisions
- GDPR filtering in BI exports — exclude, anonymise, or hash PII on the way out
Enable the GDPR endpoints in settings.py.
Tagging fields as PII¶
Every model marks its PII fields with json_schema_extra={"gdpr": True, "gdpr_category": "..."}. The gdpr_category decides how a field is depersonalised:
```python
from beanie import Document
from pydantic import Field
from typing import Optional


class User(BaseDocument):
    name: str
    email: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "email"},
    )
    phone: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "phone"},
    )
    personal_id: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "identity"},
    )
    free_notes: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "free_text"},
    )
```
Supported categories¶
| Category | Depersonalisation value |
|---|---|
| `identity` | `"DEPERSONALIZED"` |
| `contact` | `"***"` |
| `email` | `"depersonalized@removed.invalid"` |
| `phone` | `"+00000000000"` |
| `address` | `"Address removed"` |
| `personal` | `"DEPERSONALIZED"` |
| `free_text` | `"[Content removed per GDPR]"` |
The default category (if only gdpr: True is set) is personal. Extend DEPERSONALIZATION_RULES in core/gdpr/service.py to add new categories.
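To see how the mapping drives depersonalisation, here is a minimal sketch of the category-to-replacement lookup, mirroring the table above. The real DEPERSONALIZATION_RULES lives in core/gdpr/service.py; this standalone copy and the helper name are illustrative:

```python
# Illustrative mirror of DEPERSONALIZATION_RULES from core/gdpr/service.py.
DEPERSONALIZATION_RULES = {
    "identity": "DEPERSONALIZED",
    "contact": "***",
    "email": "depersonalized@removed.invalid",
    "phone": "+00000000000",
    "address": "Address removed",
    "personal": "DEPERSONALIZED",
    "free_text": "[Content removed per GDPR]",
}


def depersonalized_value(category: str) -> str:
    # Unknown or unset categories fall back to the default category, "personal".
    return DEPERSONALIZATION_RULES.get(category, DEPERSONALIZATION_RULES["personal"])
```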
Discovering tagged fields¶
Every BaseDocument subclass exposes:
```python
User.gdpr_fields()
# → {"email": "email", "phone": "phone", "personal_id": "identity", "free_notes": "free_text"}
```
The GET /gdpr/schema endpoint walks every registered model and returns the same information as JSON so an admin UI can show the user what is stored about them.
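Conceptually, gdpr_fields() is just a filter over the model's field metadata. A self-contained sketch, with a plain dict standing in for the json_schema_extra each Pydantic field carries:

```python
# Hypothetical metadata mirroring the User model above; in the framework this
# comes from each Pydantic field's json_schema_extra.
USER_FIELD_EXTRA = {
    "name": {},
    "email": {"gdpr": True, "gdpr_category": "email"},
    "phone": {"gdpr": True, "gdpr_category": "phone"},
    "personal_id": {"gdpr": True, "gdpr_category": "identity"},
    "free_notes": {"gdpr": True, "gdpr_category": "free_text"},
}


def gdpr_fields(field_extra: dict) -> dict:
    """Map each GDPR-tagged field to its category ("personal" when unset)."""
    return {
        name: extra.get("gdpr_category", "personal")
        for name, extra in field_extra.items()
        if extra.get("gdpr")
    }


gdpr_fields(USER_FIELD_EXTRA)
# → {"email": "email", "phone": "phone", "personal_id": "identity", "free_notes": "free_text"}
```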
Data portability¶
GET /gdpr/user-data/{user_id} returns every document tied to that user across every collection. The service iterates ALL_MODELS, finds documents with user_id == target, and collects them into a dict keyed by collection name.
Response:
```json
{
  "user_id": "664a1234...",
  "collections": {
    "users": [{"id": "...", "name": "Alice", "email": "alice@example.com", ...}],
    "bookings": [{"id": "...", "user_id": "664a1234...", "location": "...", ...}],
    "payments": [{"id": "...", "user_id": "664a1234...", "amount": "199.00", ...}]
  }
}
```
Soft-deleted documents are excluded automatically.
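The collection loop behind this endpoint can be sketched without a database. Here `collections` stands in for iterating ALL_MODELS, documents are plain dicts, and the `is_deleted` flag name is an assumption for the soft-delete marker:

```python
def collect_user_data(user_id: str, collections: dict) -> dict:
    """Group a user's documents by collection, excluding soft-deleted ones.

    `collections` maps a collection name to its documents (plain dicts here,
    standing in for the queries the real service runs per model).
    """
    out = {}
    for name, docs in collections.items():
        matched = [
            d for d in docs
            if d.get("user_id") == user_id and not d.get("is_deleted")
        ]
        if matched:
            out[name] = matched
    return {"user_id": user_id, "collections": out}
```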
Downloadable export¶
For the actual data portability deliverable (what you hand the data subject), use the export endpoint:
```bash
# JSON
curl "http://localhost:5001/gdpr/export/664a1234...?format=json" \
  -H "Authorization: Bearer $TOKEN" \
  -o export.json

# CSV (multi-section, one block per collection)
curl "http://localhost:5001/gdpr/export/664a1234...?format=csv" \
  -H "Authorization: Bearer $TOKEN" \
  -o export.csv
```
The JSON export includes an export_date timestamp and is the recommended format — GDPR requires a "commonly used, machine-readable format" and JSON satisfies that.
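A multi-section CSV of this kind can be assembled with nothing but the standard library. The exact section layout below (a `# <collection>` title row and a blank separator line) is an assumption, not the framework's documented format:

```python
import csv
import io


def export_csv(collections: dict) -> str:
    """Write one CSV block per collection: title row, header row, data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for name, docs in collections.items():
        writer.writerow([f"# {name}"])
        if docs:
            headers = list(docs[0].keys())
            writer.writerow(headers)
            for doc in docs:
                writer.writerow([doc.get(h, "") for h in headers])
        writer.writerow([])  # blank separator line between sections
    return buf.getvalue()
```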
Right to erasure¶
POST /gdpr/depersonalize/{user_id} runs the depersonalisation workflow across every collection:
- Iterate `ALL_MODELS`.
- For every model with GDPR-tagged fields, find all documents for the user.
- Skip documents that already have `is_depersonalized=True`.
- Apply the `DEPERSONALIZATION_RULES` for each tagged field.
- Set `is_depersonalized=True` and `depersonalized_at=now()`.
- Save the document.
- Write an audit entry with the operation `depersonalize`.
```bash
curl -X POST "http://localhost:5001/gdpr/depersonalize/664a1234..." \
  -H "Authorization: Bearer $TOKEN"
```
Response:
```json
{
  "user_id": "664a1234...",
  "status": "depersonalized",
  "collections": {
    "users": {"documents_depersonalized": 1, "fields": ["email", "phone", "personal_id"]},
    "bookings": {"documents_depersonalized": 5, "fields": ["free_notes"]}
  }
}
```
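The skip/apply/flag steps of the workflow can be sketched in a few lines, using plain dicts instead of documents and omitting saving and audit logging. The rules dict here is an illustrative subset of DEPERSONALIZATION_RULES:

```python
from datetime import datetime, timezone

# Illustrative subset of DEPERSONALIZATION_RULES from core/gdpr/service.py.
RULES = {
    "email": "depersonalized@removed.invalid",
    "phone": "+00000000000",
    "identity": "DEPERSONALIZED",
    "free_text": "[Content removed per GDPR]",
}


def depersonalize_documents(docs: list, tagged_fields: dict) -> int:
    """Overwrite each tagged field in place; return how many docs changed.

    `tagged_fields` maps field name to category, as Model.gdpr_fields() does.
    """
    changed = 0
    for doc in docs:
        if doc.get("is_depersonalized"):
            continue  # already processed on a previous run
        for field, category in tagged_fields.items():
            if doc.get(field) is not None:
                doc[field] = RULES[category]
        doc["is_depersonalized"] = True
        doc["depersonalized_at"] = datetime.now(timezone.utc)
        changed += 1
    return changed
```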
Depersonalisation is destructive. The original values are overwritten in place — not hidden, not encrypted, not archived. A depersonalised document keeps its links (foreign keys, booking IDs) so the business record survives, but every PII field is gone.
Single-document depersonalisation¶
If you only need to depersonalise a specific document — for example, a single support ticket that contains PII by mistake — use:
```bash
curl -X POST "http://localhost:5001/gdpr/depersonalize/users/664a1234..." \
  -H "Authorization: Bearer $TOKEN"
```
Full erasure (depersonalise + disable + revoke consents)¶
For the full "right to be forgotten" flow, the erasure-execute endpoint cascades:
```bash
curl -X POST "http://localhost:5001/gdpr/erasure-execute/664a1234..." \
  -H "Authorization: Bearer $TOKEN"
```
This depersonalises every GDPR-tagged field, revokes all active consents for the user, disables the user account, and writes an audit entry per step. Use this when the legal process is complete and the user must be fully erased from the live system.
For a two-phase workflow where legal review has to happen first, call POST /gdpr/erasure-request/{user_id} to record the request, review it in the admin UI, and then call erasure-execute when approved.
Consent tracking¶
Consent decisions are stored as ConsentRecord documents in an append-only log. Every grant and every revocation is a new record — nothing is updated or deleted, so you always have the full history.
```python
from datetime import datetime

from beanie import PydanticObjectId


class ConsentRecord(BaseDocument):
    user_id: PydanticObjectId
    consent_type: str        # "data_processing" | "marketing" | "analytics" | "third_party"
    is_granted: bool
    granted_at: datetime | None
    revoked_at: datetime | None
    source: str              # "app" | "portal" | "admin"
    ip_address: str | None
    version: str | None      # Version of the consent text
```
Recording consent¶
```bash
curl -X POST "http://localhost:5001/gdpr/consent" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "664a1234...",
    "consent_type": "marketing",
    "is_granted": true,
    "source": "app",
    "ip_address": "192.168.1.10",
    "version": "v1.0"
  }'
```
Getting the current consent state¶
GET /gdpr/consents/{user_id}/status returns the current state per consent type:
```json
{
  "user_id": "664a1234...",
  "consents": {
    "data_processing": {
      "granted": true,
      "last_updated": "2026-04-05T12:00:00+00:00",
      "source": "app",
      "version": "v1.0"
    },
    "marketing": {
      "granted": false,
      "last_updated": "2026-04-04T15:30:00+00:00",
      "source": "portal",
      "version": "v1.0"
    }
  }
}
```
The response reflects the most recent record per consent type — that's the current state.
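Reducing the append-only log to the current state is a simple fold: keep the most recent record per consent_type. A sketch over plain dicts, where the `created_at` ordering key is an assumed BaseDocument timestamp:

```python
def current_consent_state(records: list) -> dict:
    """Latest record per consent_type from an append-only consent log."""
    latest = {}
    for rec in sorted(records, key=lambda r: r["created_at"]):
        latest[rec["consent_type"]] = rec
    return {
        ctype: {"granted": rec["is_granted"], "last_updated": rec["created_at"]}
        for ctype, rec in latest.items()
    }
```

Because every grant and revocation is a new record, this fold can be re-run at any time and always reflects the full history.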
Withdrawing consent¶
An authenticated user can withdraw their own consent:
```bash
curl -X POST "http://localhost:5001/gdpr/consents/withdraw" \
  -H "Authorization: Bearer $USER_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"consent_type": "marketing", "source": "api"}'
```
Requiring consent in your code¶
Use require_consent() as a guard at the top of any endpoint that processes personal data:
```python
from fastapi import Depends

from craft_easy.core.gdpr.service import require_consent


@router.get("/marketing/newsletter-signup")
async def signup(user_id: str = Depends(current_user_id)):
    await require_consent(user_id, "marketing")
    # If we're here, the user has granted marketing consent
    ...
```
If the user does not have active consent, the guard raises HTTPException(403) with a structured body:
```json
{
  "error": "consent_required",
  "consent_type": "marketing",
  "message": "Active consent for 'marketing' is required."
}
```
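What such a guard amounts to can be sketched without FastAPI: a plain exception stands in for HTTPException(403), and the signature is simplified to take a consent-state map directly rather than looking the user up:

```python
class ConsentRequired(Exception):
    """Stand-in for FastAPI's HTTPException(403) with the structured body."""

    def __init__(self, consent_type: str):
        self.status_code = 403
        self.detail = {
            "error": "consent_required",
            "consent_type": consent_type,
            "message": f"Active consent for '{consent_type}' is required.",
        }
        super().__init__(self.detail["message"])


def require_consent(state: dict, consent_type: str) -> None:
    """Raise unless the current consent state shows an active grant."""
    entry = state.get(consent_type)
    if not entry or not entry.get("granted"):
        raise ConsentRequired(consent_type)
```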
GDPR filtering in BI exports¶
The BI export pipeline honours GDPR tags when exporting to external warehouses. Each BIExportConfig has a gdpr_mode:
| Mode | Effect on GDPR-tagged fields |
|---|---|
| `exclude` | Removed from the exported row |
| `anonymize` | Replaced with `"***"` |
| `hash` | Replaced with the first 16 chars of the SHA-256 hash — deterministic, so joins still work without leaking values |
See BI Export for the full flow. The key point is that you never have to rewrite your warehouse queries to comply with GDPR — the filtering happens before rows leave MongoDB.
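The hash mode transformation described above (first 16 hex characters of SHA-256) is easy to reproduce for verification; the helper name here is illustrative:

```python
import hashlib


def gdpr_hash(value: str) -> str:
    """Deterministic 16-char token: same input, same token, so joins line up."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]
```

The same email always hashes to the same token across exports, so warehouse joins on hashed columns keep working even though the raw value never leaves MongoDB.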
BaseDocument flags¶
Two fields are inherited by every BaseDocument subclass and used by the GDPR subsystem:
| Field | Purpose |
|---|---|
| `is_depersonalized` | Set to `True` after depersonalisation; used to skip already-processed documents |
| `depersonalized_at` | Timestamp of the depersonalisation |

Both are `exclude=True` in the Pydantic schema, so they never appear in API responses — they are internal bookkeeping.
Endpoint reference¶
| Method | Path | Purpose |
|---|---|---|
| GET | `/gdpr/schema` | List every GDPR-tagged field across all models |
| GET | `/gdpr/user-data/{user_id}` | Collect all user data for portability |
| GET | `/gdpr/export/{user_id}?format=json\|csv` | Download data subject export |
| POST | `/gdpr/depersonalize/{user_id}` | Depersonalise all user data |
| POST | `/gdpr/depersonalize/{collection}/{item_id}` | Depersonalise a single document |
| POST | `/gdpr/erasure-request/{user_id}` | Record an erasure request (awaits approval) |
| POST | `/gdpr/erasure-execute/{user_id}` | Execute full erasure (depersonalise + revoke + disable) |
| GET | `/gdpr/consents/{user_id}` | List all consent records |
| GET | `/gdpr/consents/{user_id}/status` | Current consent state per type |
| GET | `/gdpr/consent-log/{user_id}` | Full consent history |
| POST | `/gdpr/consent` | Record a consent decision |
| POST | `/gdpr/consents/withdraw` | Withdraw consent (authenticated user) |
All endpoints require authentication and produce Audit entries automatically.