Redactor
The Redactor scans for and masks personally identifiable information (PII) and secrets in real time, before data leaves the agent boundary. It pairs regex patterns with validation steps (Luhn checks, entropy filters) to reduce false positives across 7 PII categories and 11 secret types.
Quick Start
from enforcecore import Redactor
redactor = Redactor(categories=["email", "phone", "ssn"])
result = redactor.redact("Contact john@example.com or call 555-123-4567")
print(result.text)
# "Contact [EMAIL] or call [PHONE]"
print(result.entities)
# [DetectedEntity(category="email", text="john@example.com", ...),
# DetectedEntity(category="phone", text="555-123-4567", ...)]
print(result.count) # 2
print(result.was_redacted) # True

Note: When using the @enforce decorator with pii_redaction.enabled: true in your policy, redaction happens automatically on both inputs and outputs.
Redactor Class
Constructor
Redactor(
    categories: list[str] | None = None,
    strategy: RedactionStrategy = RedactionStrategy.PLACEHOLDER,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| categories | list[str] \| None | all built-in | PII categories to detect. None enables all. |
| strategy | RedactionStrategy | PLACEHOLDER | How to redact detected entities. |
Redaction Strategies
| Strategy | Behavior | Example Output |
|---|---|---|
| PLACEHOLDER | Replace with category tag | "Contact [EMAIL]" |
| MASK | Replace characters with * | "Contact ****@***.com" |
| HASH | Replace with SHA-256 hash | "Contact [sha256:a3f2...]" |
| REMOVE | Remove entirely | "Contact or call " |
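The four strategies in the table can be sketched in plain Python to show what each one does to a single matched value. This is an illustrative sketch, not EnforceCore's implementation: the `Strategy` enum and `apply_strategy` function are hypothetical names, the 8-character hash prefix is an assumption, and the masking rule here (mask every letter and digit, keep separators) may differ from the library's exact output.

```python
import hashlib
import re
from enum import Enum


class Strategy(Enum):  # hypothetical stand-in for RedactionStrategy
    PLACEHOLDER = "placeholder"
    MASK = "mask"
    HASH = "hash"
    REMOVE = "remove"


def apply_strategy(value: str, category: str, strategy: Strategy) -> str:
    """Return the replacement text for one detected value."""
    if strategy is Strategy.PLACEHOLDER:
        return f"[{category.upper()}]"
    if strategy is Strategy.MASK:
        # Mask letters and digits; keep separators such as "-" or "@".
        return re.sub(r"[A-Za-z0-9]", "*", value)
    if strategy is Strategy.HASH:
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return f"[sha256:{digest[:8]}]"  # assumed truncation length
    return ""  # REMOVE drops the value entirely


phone = "555-123-4567"
print(apply_strategy(phone, "phone", Strategy.PLACEHOLDER))  # [PHONE]
print(apply_strategy(phone, "phone", Strategy.MASK))         # ***-***-****
```

Hashing keeps equal values linkable (the same input always yields the same tag) without exposing them, while PLACEHOLDER and REMOVE discard that link.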
from enforcecore import Redactor
from enforcecore.core.types import RedactionStrategy
redactor = Redactor(strategy=RedactionStrategy.MASK)
result = redactor.redact("Call 555-123-4567")
print(result.text)
# "Call ***-***-****"
Methods
redact(text) -> RedactionResult
Scans text and applies the configured redaction strategy.
result = redactor.redact("My SSN is 123-45-6789 and email is jane@corp.com")
print(result.text) # "My SSN is [SSN] and email is [EMAIL]"
print(result.original_text) # "My SSN is 123-45-6789 and email is jane@corp.com"
print(result.was_redacted) # True
print(result.count) # 2

Returns: RedactionResult
Data Structures
RedactionResult
| Attribute | Type | Description |
|---|---|---|
| text | str | The redacted output. |
| original_text | str | The original input. |
| entities | list[DetectedEntity] | All detected PII entities. |
| events | list[RedactionEvent] | Detailed redaction events. |
| count | int | Number of entities detected (property). |
| was_redacted | bool | Whether any redactions occurred (property). |
| redacted_text | str | Alias for text (property). |
DetectedEntity
| Attribute | Type | Description |
|---|---|---|
| category | str | PII category (e.g., "email", "phone"). |
| start | int | Start index in original text. |
| end | int | End index in original text. |
| text | str | The matched text. |
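The start and end offsets are what make it possible to rebuild a redacted string from the original. The sketch below assumes that `end` is exclusive (Python slice semantics), which the table does not state; `Span` and `replace_spans` are hypothetical names that mirror the DetectedEntity fields rather than the library's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:  # hypothetical stand-in mirroring the DetectedEntity fields
    category: str
    start: int
    end: int  # assumed exclusive, as in Python slicing
    text: str


def replace_spans(original: str, spans: list[Span]) -> str:
    """Replace each span with a category tag, working right to left."""
    out = original
    # Right-to-left order keeps earlier offsets valid after each replacement.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        out = out[:span.start] + f"[{span.category.upper()}]" + out[span.end:]
    return out


source = "Mail jane@corp.com now"
spans = [Span("email", 5, 18, "jane@corp.com")]
print(replace_spans(source, spans))  # Mail [EMAIL] now
```

Processing spans from the end of the string backwards avoids recomputing offsets when a replacement is longer or shorter than the text it replaces.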
Built-in PII Categories
| Category | Examples | Validation |
|---|---|---|
| email | user@example.com | Regex |
| phone | +1 555-123-4567, (555) 123-4567 | Regex |
| ssn | 123-45-6789 | Regex + format validation |
| credit_card | 4111-1111-1111-1111 | Regex + Luhn checksum |
| ip_address | 192.168.1.1 | Regex |
| passport | A12345678, AB123456 | ICAO Doc 9303 format |
| person_name | John Smith | Title Case pattern matching |
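The Luhn checksum listed for credit_card is a standard algorithm, so it can be shown independently of the library. The function name `luhn_valid` is illustrative; how EnforceCore wires the check into its regex pipeline is not described here.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the ASCII digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # True
print(luhn_valid("4111-1111-1111-1112"))  # False
```

A regex alone would accept any 16-digit number; the checksum rejects most random digit strings, which is why the two are combined.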
Secret Detection
EnforceCore includes a dedicated SecretScanner for detecting leaked credentials:
from enforcecore.redactor.secrets import SecretScanner
scanner = SecretScanner()
secrets = scanner.scan("Use key AKIAIOSFODNN7EXAMPLE to access S3")
for secret in secrets:
    print(f"{secret.category}: {secret.text[:20]}...")
# aws_access_key: AKIAIOSFODNN7EXAMPLE...
Detected Secret Types
| Category | Pattern |
|---|---|
| aws_access_key | AKIA... (20 chars) |
| aws_secret_key | 40-char base64 strings |
| github_token | ghp_, gho_, ghs_, ghr_ prefixed |
| generic_api_key | Key-value pairs with high Shannon entropy |
| bearer_token | Bearer / JWT tokens |
| private_key | PEM-encoded RSA, EC, DSA private keys |
| password_in_url | user:pass@host patterns |
| gcp_service_account | Google Cloud service account JSON |
| azure_connection_string | Azure storage/service bus strings |
| database_connection_string | postgres://, mysql://, mongodb://, redis:// |
| ssh_private_key | OpenSSH private key markers |
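The Shannon entropy filter mentioned for generic_api_key can be sketched as follows. The formula is standard; the 3.5 bits-per-character threshold, the minimum length of 16, and the names `shannon_entropy` and `looks_like_secret` are assumptions chosen for illustration, not values taken from EnforceCore.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of `value` in bits per character."""
    if not value:
        return 0.0
    length = len(value)
    counts = Counter(value)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def looks_like_secret(value: str, threshold: float = 3.5, min_length: int = 16) -> bool:
    """Flag long, random-looking strings (hypothetical threshold and length)."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold


print(round(shannon_entropy("password"), 2))  # 2.75
print(looks_like_secret("password"))          # False
# AWS's documented example secret key: long and high-entropy.
print(looks_like_secret("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"))  # True
```

Dictionary words and repeated characters score low, so an entropy cutoff filters out most ordinary key-value pairs that merely happen to be named like a credential.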
Custom Patterns
Register custom PII patterns via the PatternRegistry:
from enforcecore.redactor.patterns import PatternRegistry
PatternRegistry.register(
    category="employee_id",
    regex=r"EMP-\d{6}",
    placeholder="[EMPLOYEE_ID]",
    validator=lambda text: text.startswith("EMP-"),
)
# Now the Redactor will detect employee IDs
redactor = Redactor(categories=["email", "employee_id"])
result = redactor.redact("Employee EMP-123456 reported an issue")
# "Employee [EMPLOYEE_ID] reported an issue"

| Parameter | Type | Description |
|---|---|---|
| category | str | Category name (used in DetectedEntity). |
| regex | str | Regex pattern to match. |
| placeholder | str | Replacement text (for PLACEHOLDER strategy). |
| mask | str | Mask character (for MASK strategy). |
| validator | Callable \| None | Optional function for additional validation. |
Policy Integration
PII redaction is configured in your YAML policy:
rules:
  pii_redaction:
    enabled: true
    categories:
      - email
      - phone
      - ssn
      - credit_card
      - ip_address
    strategy: "placeholder" # placeholder | mask | hash | remove
    redact_output: true # Redact PII from function return values
Unicode Normalization
The Redactor includes Unicode normalization utilities to handle homoglyph-based evasion attacks (e.g., using Cyrillic characters that look like Latin letters).
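To illustrate the idea, the sketch below normalizes text before scanning using Python's standard `unicodedata` module plus a small look-alike table. It is not the library's utility: `normalize_for_scan` and `HOMOGLYPHS` are hypothetical names, and the table covers only a handful of Cyrillic letters where a real implementation would need a much larger confusables list.

```python
import unicodedata

# Illustrative subset of Cyrillic letters that resemble Latin ones (assumption:
# a production table would be far more complete).
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
})


def normalize_for_scan(text: str) -> str:
    """Fold compatibility forms with NFKC, then map known look-alikes to ASCII."""
    return unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)


spoofed = "j\u043ehn@ex\u0430mple.com"  # contains Cyrillic o and a
print(normalize_for_scan(spoofed))      # john@example.com

fullwidth = "\uff4a\uff4f\uff48\uff4e"  # fullwidth "john"
print(normalize_for_scan(fullwidth))    # john
```

NFKC alone handles compatibility characters such as fullwidth letters but leaves Cyrillic look-alikes untouched, which is why the explicit mapping step is needed. Normalization can also change string length, so offsets found in the normalized text may need mapping back to the original.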
Error Handling
| Exception | When |
|---|---|
| RedactionError | General redaction failure. |
See Also
- Enforcer API — How the redactor integrates with enforcement.
- Policy Engine — Configuring PII rules in YAML.
- GDPR Compliance — How redaction supports GDPR Article 25.