GDPR

Craft Easy has first-class GDPR support built into the core framework. Fields are tagged as PII at model definition time, and the framework uses those tags to power:

  • Data portability — export everything a user has, across every collection
  • Right to erasure — depersonalise user data on every tagged field
  • Consent tracking — append-only log of consent decisions
  • GDPR filtering in BI exports — exclude, anonymise, or hash PII on the way out

Enable GDPR endpoints in settings.py:

GDPR_ENABLED = True

Tagging fields as PII

Every model marks its PII fields with json_schema_extra={"gdpr": True, "gdpr_category": "..."}. The gdpr_category decides how a field is depersonalised:

from typing import Optional

from pydantic import Field

# BaseDocument is Craft Easy's shared Beanie document base class
class User(BaseDocument):
    name: str
    email: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "email"},
    )
    phone: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "phone"},
    )
    personal_id: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "identity"},
    )
    free_notes: Optional[str] = Field(
        default=None,
        json_schema_extra={"gdpr": True, "gdpr_category": "free_text"},
    )

Supported categories

Category    Depersonalisation value
identity    "DEPERSONALIZED"
contact     "***"
email       "depersonalized@removed.invalid"
phone       "+00000000000"
address     "Address removed"
personal    "DEPERSONALIZED"
free_text   "[Content removed per GDPR]"

The default category (if only gdpr: True is set) is personal. Extend DEPERSONALIZATION_RULES in core/gdpr/service.py to add new categories.
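The mapping can be pictured as a plain dict keyed by category. This is a stdlib-only sketch, not the framework's code: the DEPERSONALIZATION_RULES values match the table above, but apply_rules is a hypothetical helper name.

```python
# Category → replacement value, mirroring the table above.
DEPERSONALIZATION_RULES = {
    "identity": "DEPERSONALIZED",
    "contact": "***",
    "email": "depersonalized@removed.invalid",
    "phone": "+00000000000",
    "address": "Address removed",
    "personal": "DEPERSONALIZED",
    "free_text": "[Content removed per GDPR]",
}

def apply_rules(doc: dict, gdpr_fields: dict) -> dict:
    """Overwrite each tagged field with its category's replacement.

    gdpr_fields maps field name -> gdpr_category. Unknown categories
    fall back to "personal", mirroring the documented default.
    """
    for field, category in gdpr_fields.items():
        if field in doc and doc[field] is not None:
            doc[field] = DEPERSONALIZATION_RULES.get(
                category, DEPERSONALIZATION_RULES["personal"]
            )
    return doc
```

Untagged fields pass through untouched, so the business record survives with only the PII rewritten.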

Discovering tagged fields

Every BaseDocument subclass exposes:

User.gdpr_fields()
# → {"email": "email", "phone": "phone", "personal_id": "identity", "free_notes": "free_text"}

The GET /gdpr/schema endpoint walks every registered model and returns the same information as JSON so an admin UI can show the user what is stored about them.
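Discovery boils down to scanning each field's json_schema_extra for the "gdpr" flag. The sketch below illustrates the idea with a minimal stand-in for Pydantic's field metadata; it is not the real gdpr_fields() implementation.

```python
from dataclasses import dataclass, field

@dataclass
class FieldInfo:
    """Minimal stand-in for Pydantic's per-field metadata."""
    json_schema_extra: dict = field(default_factory=dict)

def gdpr_fields(model_fields: dict) -> dict:
    """Return {field_name: gdpr_category} for every tagged field.

    Untagged fields are skipped; a tag without an explicit category
    falls back to "personal", the documented default.
    """
    out = {}
    for name, info in model_fields.items():
        extra = info.json_schema_extra or {}
        if extra.get("gdpr"):
            out[name] = extra.get("gdpr_category", "personal")
    return out
```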

Data portability

GET /gdpr/user-data/{user_id} returns every document tied to that user across every collection. The service iterates ALL_MODELS, finds documents with user_id == target, and collects them into a dict keyed by collection name.

curl "http://localhost:5001/gdpr/user-data/664a1234..." \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "user_id": "664a1234...",
  "collections": {
    "users": [{"id": "...", "name": "Alice", "email": "alice@example.com", ...}],
    "bookings": [{"id": "...", "user_id": "664a1234...", "location": "...", ...}],
    "payments": [{"id": "...", "user_id": "664a1234...", "amount": "199.00", ...}]
  }
}

Soft-deleted documents are excluded automatically.
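The collection walk can be sketched on in-memory lists standing in for the real Beanie registry and MongoDB queries; collect_user_data and the is_deleted flag are illustrative assumptions, not the service's actual names.

```python
def collect_user_data(user_id: str, collections: dict) -> dict:
    """Gather a user's documents across collections, keyed by collection
    name, skipping soft-deleted documents."""
    result = {"user_id": user_id, "collections": {}}
    for name, docs in collections.items():
        matched = [
            d for d in docs
            if d.get("user_id") == user_id and not d.get("is_deleted")
        ]
        if matched:  # omit empty collections from the payload
            result["collections"][name] = matched
    return result
```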

Downloadable export

For the actual data portability deliverable (what you hand the data subject), use the export endpoint:

# JSON
curl "http://localhost:5001/gdpr/export/664a1234...?format=json" \
  -H "Authorization: Bearer $TOKEN" \
  -o export.json

# CSV (multi-section, one block per collection)
curl "http://localhost:5001/gdpr/export/664a1234...?format=csv" \
  -H "Authorization: Bearer $TOKEN" \
  -o export.csv

The JSON export includes an export_date timestamp and is the recommended format — GDPR (Article 20) requires a "structured, commonly used and machine-readable format", and JSON satisfies that.
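The multi-section CSV layout can be sketched with the standard library. The section marker (a `# collection_name` line) and the column ordering here are assumptions about the shape of the output, not the exact format the endpoint emits.

```python
import csv
import io

def export_csv(collections: dict) -> str:
    """Write one header + row block per collection, blank-line separated."""
    buf = io.StringIO()
    for name, docs in collections.items():
        if not docs:
            continue
        buf.write(f"# {name}\n")  # assumed section marker
        writer = csv.DictWriter(buf, fieldnames=list(docs[0].keys()))
        writer.writeheader()
        writer.writerows(docs)
        buf.write("\n")
    return buf.getvalue()
```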

Right to erasure

POST /gdpr/depersonalize/{user_id} runs the depersonalisation workflow across every collection:

  1. Iterate ALL_MODELS.
  2. For every model with GDPR-tagged fields, find all documents for the user.
  3. Skip documents that already have is_depersonalized=True.
  4. Apply the DEPERSONALIZATION_RULES for each tagged field.
  5. Set is_depersonalized=True and depersonalized_at=now().
  6. Save the document.
  7. Write an audit entry with the operation depersonalize.

curl -X POST "http://localhost:5001/gdpr/depersonalize/664a1234..." \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "user_id": "664a1234...",
  "status": "depersonalized",
  "collections": {
    "users": {"documents_depersonalized": 1, "fields": ["email", "phone", "personal_id"]},
    "bookings": {"documents_depersonalized": 5, "fields": ["free_notes"]}
  }
}
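Steps 2–6 of the workflow above can be sketched on plain dicts. This is a stdlib illustration under assumed names, not the framework's implementation; `rules` stands in for DEPERSONALIZATION_RULES.

```python
from datetime import datetime, timezone

def depersonalize_docs(docs: list, gdpr_fields: dict, rules: dict) -> int:
    """Apply the rules to every not-yet-processed document; return count."""
    changed = 0
    for doc in docs:
        if doc.get("is_depersonalized"):
            continue  # step 3: skip already-processed documents
        for field, category in gdpr_fields.items():
            if field in doc:
                doc[field] = rules[category]  # step 4: overwrite PII
        doc["is_depersonalized"] = True  # step 5: bookkeeping flags
        doc["depersonalized_at"] = datetime.now(timezone.utc).isoformat()
        changed += 1
    return changed
```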

Depersonalisation is destructive. The original values are overwritten in place — not hidden, not encrypted, not archived. A depersonalised document keeps its links (foreign keys, booking IDs) so the business record survives, but every PII field is gone.

Single-document depersonalisation

If you only need to depersonalise a specific document — for example, a single support ticket that contains PII by mistake — use the per-document endpoint, which takes the collection name and the document ID:

curl -X POST "http://localhost:5001/gdpr/depersonalize/users/664a1234..." \
  -H "Authorization: Bearer $TOKEN"

Full erasure (depersonalise + disable + revoke consents)

For the full "right to be forgotten" flow, the erasure-execute endpoint cascades:

curl -X POST "http://localhost:5001/gdpr/erasure-execute/664a1234..." \
  -H "Authorization: Bearer $TOKEN"

This depersonalises every GDPR-tagged field, revokes all active consents for the user, disables the user account, and writes an audit entry per step. Use this when the legal process is complete and the user must be fully erased from the live system.

For a two-phase workflow where legal review has to happen first, call POST /gdpr/erasure-request/{user_id} to record the request, review it in the admin UI, and then call erasure-execute when approved.

Consent tracking

Consent decisions are stored as ConsentRecord documents in an append-only log. Every grant and every revocation is a new record — nothing is updated or deleted, so you always have the full history.

class ConsentRecord(BaseDocument):
    user_id: PydanticObjectId
    consent_type: str                   # "data_processing" | "marketing" | "analytics" | "third_party"
    is_granted: bool
    granted_at: datetime | None
    revoked_at: datetime | None
    source: str                         # "app" | "portal" | "admin"
    ip_address: str | None
    version: str | None                 # Version of the consent text

Record a consent decision:

curl -X POST "http://localhost:5001/gdpr/consent" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "664a1234...",
    "consent_type": "marketing",
    "is_granted": true,
    "source": "app",
    "ip_address": "192.168.1.10",
    "version": "v1.0"
  }'

Check the current state per consent type:

curl "http://localhost:5001/gdpr/consents/664a1234.../status" \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "user_id": "664a1234...",
  "consents": {
    "data_processing": {
      "granted": true,
      "last_updated": "2026-04-05T12:00:00+00:00",
      "source": "app",
      "version": "v1.0"
    },
    "marketing": {
      "granted": false,
      "last_updated": "2026-04-04T15:30:00+00:00",
      "source": "portal",
      "version": "v1.0"
    }
  }
}

The response reflects the most recent record per consent type — that's the current state.
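Deriving that per-type state from the append-only log can be sketched as below, assuming each record carries some monotonically increasing ordering key (called created_at here); the record shape loosely mirrors ConsentRecord.

```python
def consent_status(records: list) -> dict:
    """Reduce an append-only log to the latest record per consent type."""
    latest = {}
    for rec in sorted(records, key=lambda r: r["created_at"]):
        latest[rec["consent_type"]] = rec  # later records win
    return {
        ctype: {"granted": rec["is_granted"], "source": rec["source"]}
        for ctype, rec in latest.items()
    }
```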

An authenticated user can withdraw their own consent:

curl -X POST "http://localhost:5001/gdpr/consents/withdraw" \
  -H "Authorization: Bearer $USER_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"consent_type": "marketing", "source": "app"}'

Use require_consent() as a guard at the top of any endpoint that processes personal data:

from craft_easy.core.gdpr.service import require_consent

@router.get("/marketing/newsletter-signup")
async def signup(user_id: str = Depends(current_user_id)):
    await require_consent(user_id, "marketing")
    # If we're here, the user has granted marketing consent
    ...

If the user does not have active consent, the guard raises HTTPException(403) with a structured body:

{
  "error": "consent_required",
  "consent_type": "marketing",
  "message": "Active consent for 'marketing' is required."
}
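The guard's behaviour can be sketched as a stdlib stand-in with a simplified signature (a consent map instead of a user lookup); ConsentRequired plays the role of FastAPI's HTTPException(403).

```python
class ConsentRequired(Exception):
    """403-style error carrying the structured body shown above."""
    def __init__(self, consent_type: str):
        self.status_code = 403
        self.detail = {
            "error": "consent_required",
            "consent_type": consent_type,
            "message": f"Active consent for '{consent_type}' is required.",
        }
        super().__init__(self.detail["message"])

def require_consent(current: dict, consent_type: str) -> None:
    """Raise unless the user's current state grants the consent type."""
    if not current.get(consent_type):
        raise ConsentRequired(consent_type)
```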

GDPR filtering in BI exports

The BI export pipeline honours GDPR tags when exporting to external warehouses. Each BIExportConfig has a gdpr_mode:

Mode       Effect on GDPR-tagged fields
exclude    Removed from the exported row
anonymize  Replaced with "***"
hash       Replaced with the first 16 chars of the SHA-256 hash — deterministic, so joins still work without leaking values

See BI Export for the full flow. The key point is that you never have to rewrite your warehouse queries to comply with GDPR — the filtering happens before rows leave MongoDB.
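The three modes applied to a single row can be sketched as follows; the 16-char SHA-256 prefix matches the hash behaviour described above, while apply_gdpr_mode itself is a hypothetical helper name.

```python
import hashlib

def apply_gdpr_mode(row: dict, tagged: set, mode: str) -> dict:
    """Filter one export row according to the configured gdpr_mode."""
    out = {}
    for key, value in row.items():
        if key not in tagged:
            out[key] = value
        elif mode == "exclude":
            continue  # drop the column entirely
        elif mode == "anonymize":
            out[key] = "***"
        elif mode == "hash":
            # deterministic prefix: identical inputs hash identically,
            # so joins across tables still line up
            out[key] = hashlib.sha256(str(value).encode()).hexdigest()[:16]
    return out
```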

BaseDocument flags

Two fields are inherited by every BaseDocument subclass and used by the GDPR subsystem:

Field              Purpose
is_depersonalized  Set to True after depersonalisation; used to skip already-processed documents
depersonalized_at  Timestamp of the depersonalisation

Both fields are declared with exclude=True in their Pydantic field definitions, so they never appear in API responses — they are internal bookkeeping.

Endpoint reference

Method  Path                                          Purpose
GET     /gdpr/schema                                  List every GDPR-tagged field across all models
GET     /gdpr/user-data/{user_id}                     Collect all user data for portability
GET     /gdpr/export/{user_id}?format=json|csv        Download data subject export
POST    /gdpr/depersonalize/{user_id}                 Depersonalise all user data
POST    /gdpr/depersonalize/{collection}/{item_id}    Depersonalise a single document
POST    /gdpr/erasure-request/{user_id}               Record an erasure request (awaits approval)
POST    /gdpr/erasure-execute/{user_id}               Execute full erasure (depersonalise + revoke + disable)
GET     /gdpr/consents/{user_id}                      List all consent records
GET     /gdpr/consents/{user_id}/status               Current consent state per type
GET     /gdpr/consent-log/{user_id}                   Full consent history
POST    /gdpr/consent                                 Record a consent decision
POST    /gdpr/consents/withdraw                       Withdraw consent (authenticated user)

All endpoints require authentication and produce Audit entries automatically.