Redactor

The Redactor scans for and masks personally identifiable information (PII) and secrets in real time, before data leaves the agent boundary. Detection uses regex patterns backed by validation steps (Luhn checks, entropy filters) to reduce false positives, and covers 7 PII categories and 11 secret types.

Quick Start

from enforcecore import Redactor

redactor = Redactor(categories=["email", "phone", "ssn"])

result = redactor.redact("Contact john@example.com or call 555-123-4567")
print(result.text)
# "Contact [EMAIL] or call [PHONE]"

print(result.entities)
# [DetectedEntity(category="email", text="john@example.com", ...),
#  DetectedEntity(category="phone", text="555-123-4567", ...)]
print(result.count)         # 2
print(result.was_redacted)  # True

Note: When you use the @enforce decorator with pii_redaction.enabled: true in your policy, redaction is applied automatically to both inputs and outputs.

`Redactor` Class

Constructor

Redactor(
    categories: list[str] | None = None,
    strategy: RedactionStrategy = RedactionStrategy.PLACEHOLDER,
)

Parameter	Type	Default	Description
`categories`	`list[str] | None`	all built-in	PII categories to detect. `None` enables all.
`strategy`	`RedactionStrategy`	`PLACEHOLDER`	How to redact detected entities.

Redaction Strategies

Strategy	Behavior	Example Output
`PLACEHOLDER`	Replace with category tag	`"Contact [EMAIL]"`
`MASK`	Replace characters with `*`	`"Contact **@*.com"`
`HASH`	Replace with SHA-256 hash	`"Contact [sha256:a3f2...]"`
`REMOVE`	Remove entirely	`"Contact or call "`

from enforcecore import Redactor
from enforcecore.core.types import RedactionStrategy

redactor = Redactor(strategy=RedactionStrategy.MASK)
result = redactor.redact("Call 555-123-4567")
print(result.text)
# "Call ***-***-****"
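
To make the strategy table concrete, here is a minimal standalone sketch of how the four strategies could transform a single matched span. It uses only the standard library and does not call EnforceCore. The function name `apply_strategy`, the local `Strategy` enum, the choice to mask only letters and digits, and the 8-character hash prefix are all assumptions made for illustration, not the library's actual behavior.

```python
import hashlib
from enum import Enum


class Strategy(Enum):
    """Hypothetical local mirror of the four documented strategies."""
    PLACEHOLDER = "placeholder"
    MASK = "mask"
    HASH = "hash"
    REMOVE = "remove"


def apply_strategy(matched: str, category: str, strategy: Strategy) -> str:
    """Return the replacement for one matched span (illustrative only)."""
    if strategy is Strategy.PLACEHOLDER:
        return f"[{category.upper()}]"
    if strategy is Strategy.MASK:
        # Assumption: mask letters and digits, keep separators such as "-".
        return "".join("*" if ch.isalnum() else ch for ch in matched)
    if strategy is Strategy.HASH:
        # Assumption: expose only a short prefix of the SHA-256 hex digest.
        digest = hashlib.sha256(matched.encode("utf-8")).hexdigest()
        return f"[sha256:{digest[:8]}]"
    return ""  # Strategy.REMOVE drops the span entirely


phone = "555-123-4567"
print(apply_strategy(phone, "phone", Strategy.PLACEHOLDER))  # [PHONE]
print(apply_strategy(phone, "phone", Strategy.MASK))         # ***-***-****
```

Note that HASH gives the same output for the same input, which lets downstream systems correlate records without seeing the raw value, while REMOVE leaves no trace that a value was present.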

Methods

`redact(text) -> RedactionResult`

Scans text and applies the configured redaction strategy.

result = redactor.redact("My SSN is 123-45-6789 and email is jane@corp.com")
print(result.text)            # "My SSN is [SSN] and email is [EMAIL]"
print(result.original_text)   # "My SSN is 123-45-6789 and email is jane@corp.com"
print(result.was_redacted)    # True
print(result.count)           # 2

Returns: RedactionResult

Data Structures

`RedactionResult`

Attribute	Type	Description
`text`	`str`	The redacted output.
`original_text`	`str`	The original input.
`entities`	`list[DetectedEntity]`	All detected PII entities.
`events`	`list[RedactionEvent]`	Detailed redaction events.
`count`	`int`	Number of entities detected (property).
`was_redacted`	`bool`	Whether any redactions occurred (property).
`redacted_text`	`str`	Alias for `text` (property).

`DetectedEntity`

Attribute	Type	Description
`category`	`str`	PII category (e.g., `"email"`, `"phone"`).
`start`	`int`	Start index in original text.
`end`	`int`	End index in original text.
`text`	`str`	The matched text.
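
The `start` and `end` offsets are what make it possible to rebuild redacted text from the original. The sketch below illustrates that idea with a hypothetical `Entity` dataclass that mirrors the documented fields; it assumes `end` is exclusive (Python slice convention), which the table above does not state. The helpers `locate` and `apply_placeholders` are illustrative names, not part of EnforceCore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in with the same fields as DetectedEntity."""
    category: str
    start: int
    end: int  # assumed exclusive, like a Python slice
    text: str


def locate(original: str, category: str, needle: str) -> Entity:
    """Build an Entity for the first occurrence of `needle`."""
    start = original.index(needle)
    return Entity(category, start, start + len(needle), needle)


def apply_placeholders(original: str, entities: list[Entity]) -> str:
    """Replace each span with a [CATEGORY] tag."""
    out = original
    # Work right to left so earlier offsets remain valid after each edit.
    for e in sorted(entities, key=lambda e: e.start, reverse=True):
        out = out[:e.start] + f"[{e.category.upper()}]" + out[e.end:]
    return out


original = "Contact john@example.com or call 555-123-4567"
entities = [
    locate(original, "email", "john@example.com"),
    locate(original, "phone", "555-123-4567"),
]
redacted = apply_placeholders(original, entities)
print(redacted)  # Contact [EMAIL] or call [PHONE]
```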

Built-in PII Categories

Category	Examples	Validation
`email`	`user@example.com`	Regex
`phone`	`+1 555-123-4567`, `(555) 123-4567`	Regex
`ssn`	`123-45-6789`	Regex + format validation
`credit_card`	`4111-1111-1111-1111`	Regex + Luhn checksum
`ip_address`	`192.168.1.1`	Regex
`passport`	`A12345678`, `AB123456`	ICAO Doc 9303 format
`person_name`	`John Smith`	Title Case pattern matching
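
The Luhn checksum listed for `credit_card` is a standard algorithm, so it can be sketched independently of EnforceCore. The function below is a minimal illustration of how a regex candidate could be validated before it is reported; the name `luhn_valid` and the 13 to 19 digit length bound are assumptions, and the library's own implementation may differ.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch in "0123456789"]
    # Assumption: accept only typical payment card lengths.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # True
print(luhn_valid("4111-1111-1111-1112"))  # False
```

A checksum like this lets a detector discard 16-digit strings that merely look like card numbers, such as order IDs or timestamps.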

Secret Detection

EnforceCore includes a dedicated SecretScanner for detecting leaked credentials:

from enforcecore.redactor.secrets import SecretScanner

scanner = SecretScanner()
secrets = scanner.scan("Use key AKIAIOSFODNN7EXAMPLE to access S3")

for secret in secrets:
    # Print the category and at most the first 20 characters of the match
    print(f"{secret.category}: {secret.text[:20]}...")
    # aws_access_key: AKIAIOSFODNN7EXAMPLE...

Detected Secret Types

Category	Pattern
`aws_access_key`	`AKIA...` (20 chars)
`aws_secret_key`	40-char base64 strings
`github_token`	`ghp_`, `gho_`, `ghs_`, `ghr_` prefixed
`generic_api_key`	Key-value pairs with high Shannon entropy
`bearer_token`	`Bearer` / JWT tokens
`private_key`	PEM-encoded RSA, EC, DSA private keys
`password_in_url`	`user:pass@host` patterns
`gcp_service_account`	Google Cloud service account JSON
`azure_connection_string`	Azure storage/service bus strings
`database_connection_string`	postgres://, mysql://, mongodb://, redis://
`ssh_private_key`	OpenSSH private key markers
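
The `generic_api_key` row relies on Shannon entropy, which measures how evenly characters are distributed in a string. The following sketch shows the calculation and a simple filter built on it. The function names, the 3.5 bits-per-character threshold, and the 16-character minimum length are assumptions chosen for illustration; the values EnforceCore uses are not stated here. The sample secret is the example key from AWS documentation, not a real credential.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Entropy in bits per character of `value`."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    # Sum p * log2(1/p) over the distinct characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def looks_like_secret(value: str, threshold: float = 3.5, min_length: int = 16) -> bool:
    """Hypothetical filter: long enough and random-looking enough."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold


print(shannon_entropy("aaaaaaaa"))  # 0.0
print(shannon_entropy("abcd"))      # 2.0
print(looks_like_secret("password"))
print(looks_like_secret("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"))
```

An entropy filter trades recall for precision: a short or repetitive secret will slip through, but ordinary configuration values are far less likely to be flagged.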

Custom Patterns

from enforcecore import Redactor
from enforcecore.redactor.patterns import PatternRegistry

# Register a custom category with an optional validator
PatternRegistry.register(
    category="employee_id",
    regex=r"EMP-\d{6}",
    placeholder="[EMPLOYEE_ID]",
    validator=lambda text: text.startswith("EMP-"),
)

# Now the Redactor will detect employee IDs
redactor = Redactor(categories=["email", "employee_id"])
result = redactor.redact("Employee EMP-123456 reported an issue")
print(result.text)
# "Employee [EMPLOYEE_ID] reported an issue"

Parameter	Type	Description
`category`	`str`	Category name (used in DetectedEntity).
`regex`	`str`	Regex pattern to match.
`placeholder`	`str`	Replacement text (for PLACEHOLDER strategy).
`mask`	`str`	Mask character (for MASK strategy).
`validator`	`Callable | None`	Optional function for additional validation.
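
One way to picture how `regex`, `placeholder`, and `validator` interact is the standalone sketch below. It assumes that a match is redacted only when the validator accepts it, which is a plausible reading of "additional validation" but is not confirmed by this page. The `Pattern` dataclass and `redact_with` function are hypothetical and do not reflect how `PatternRegistry` is implemented.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Pattern:
    """Hypothetical record holding the documented registration fields."""
    category: str
    regex: re.Pattern
    placeholder: str
    validator: Optional[Callable[[str], bool]] = None


def redact_with(patterns: list[Pattern], text: str) -> str:
    """Apply each pattern; keep matches that the validator rejects."""
    for p in patterns:
        def replace(m: re.Match, p: Pattern = p) -> str:
            candidate = m.group(0)
            if p.validator is not None and not p.validator(candidate):
                return candidate  # validator said no: leave text untouched
            return p.placeholder
        text = p.regex.sub(replace, text)
    return text


employee_id = Pattern(
    category="employee_id",
    regex=re.compile(r"EMP-\d{6}"),
    placeholder="[EMPLOYEE_ID]",
    # Illustrative rule: the all-zero ID is a test value, not a real ID.
    validator=lambda s: s != "EMP-000000",
)

print(redact_with([employee_id], "EMP-123456 and EMP-000000"))
# [EMPLOYEE_ID] and EMP-000000
```

Splitting detection into a broad regex and a narrow validator keeps the pattern readable while letting checks that regex cannot express (checksums, lookups, ranges) live in ordinary code.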

Policy Integration

PII redaction is configured in your YAML policy:

rules:
  pii_redaction:
    enabled: true
    categories:
      - email
      - phone
      - ssn
      - credit_card
      - ip_address
    strategy: "placeholder"      # placeholder | mask | hash | remove
  redact_output: true            # Redact PII from function return values

Unicode Normalization

The Redactor includes Unicode normalization utilities to handle homoglyph-based evasion attacks (e.g., using Cyrillic characters that look like Latin letters).
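
As a rough illustration of the idea, the sketch below normalizes text before it would be scanned. It combines NFKC normalization from the standard library, which folds compatibility forms such as fullwidth letters, with a tiny hand-written confusables table for Cyrillic look-alikes, because NFKC alone does not map Cyrillic letters to Latin ones. The function name `normalize_for_scan` and the six-entry table are assumptions; this page does not describe the utilities EnforceCore actually ships.

```python
import unicodedata

# Tiny illustrative table; real confusables data is far larger.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
})


def normalize_for_scan(text: str) -> str:
    """Fold compatibility forms, then map known look-alikes to Latin."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.translate(CONFUSABLES)


# "john@example.com" written with a Cyrillic "o" and a Cyrillic "a"
evasive = "j\u043ehn@ex\u0430mple.com"
print(normalize_for_scan(evasive))  # john@example.com

# Fullwidth Latin letters are handled by NFKC alone
print(normalize_for_scan("\uff4a\uff4f\uff48\uff4e"))  # john
```

Because NFKC can change the length of a string, offsets found in normalized text do not always line up with the original, so a real implementation would need to track that mapping.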

Error Handling

Exception	When
`RedactionError`	General redaction failure.
