Redactor

The Redactor scans for and masks personally identifiable information (PII) and secrets in real time, before data leaves the agent boundary. Detection uses regex patterns backed by validation steps (Luhn checksums, entropy filters) to reduce false positives, and covers 7 PII categories and 11 secret types.

Quick Start

from enforcecore import Redactor

redactor = Redactor(categories=["email", "phone", "ssn"])

result = redactor.redact("Contact john@example.com or call 555-123-4567")
print(result.text)
# "Contact [EMAIL] or call [PHONE]"

print(result.entities)
# [DetectedEntity(category="email", text="john@example.com", ...),
#  DetectedEntity(category="phone", text="555-123-4567", ...)]
print(result.count)         # 2
print(result.was_redacted)  # True

Note: When using the @enforce decorator with pii_redaction.enabled: true in your policy, redaction happens automatically on both inputs and outputs.

Redactor Class

Constructor

Redactor(
    categories: list[str] | None = None,
    strategy: RedactionStrategy = RedactionStrategy.PLACEHOLDER,
)

Parameter    Type                Default               Description
categories   list[str] | None    None (all built-in)   PII categories to detect. None enables all built-in categories.
strategy     RedactionStrategy   PLACEHOLDER           How to redact detected entities.

Redaction Strategies

Strategy      Behavior                    Example output
PLACEHOLDER   Replace with category tag   "Contact [EMAIL]"
MASK          Replace characters with *   "Contact ****@***.com"
HASH          Replace with SHA-256 hash   "Contact [sha256:a3f2...]"
REMOVE        Remove entirely             "Contact or call "

from enforcecore.core.types import RedactionStrategy

redactor = Redactor(strategy=RedactionStrategy.MASK)
result = redactor.redact("Call 555-123-4567")
# "Call ***-***-****"
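
To make the four strategies concrete, the sketch below shows one way a matched span could be turned into its replacement. It is illustrative only: `Strategy` and `replacement_for` are hypothetical stand-ins written with the standard library, not EnforceCore's `RedactionStrategy` or its internal logic, and the library's exact mask and hash formats may differ (for example, the email mask in the table keeps the top-level domain, which this sketch does not).

```python
import hashlib
from enum import Enum


class Strategy(Enum):  # hypothetical stand-in, not enforcecore's RedactionStrategy
    PLACEHOLDER = "placeholder"
    MASK = "mask"
    HASH = "hash"
    REMOVE = "remove"


def replacement_for(matched: str, category: str, strategy: Strategy) -> str:
    """Return the text that would replace `matched` under `strategy`."""
    if strategy is Strategy.PLACEHOLDER:
        return f"[{category.upper()}]"
    if strategy is Strategy.MASK:
        # Mask letters and digits, keep separators so the shape stays visible.
        return "".join("*" if ch.isalnum() else ch for ch in matched)
    if strategy is Strategy.HASH:
        digest = hashlib.sha256(matched.encode("utf-8")).hexdigest()
        # Truncated digest for readability; the length is an assumption.
        return f"[sha256:{digest[:8]}]"
    return ""  # REMOVE drops the matched text entirely


print(replacement_for("555-123-4567", "phone", Strategy.MASK))         # ***-***-****
print(replacement_for("555-123-4567", "phone", Strategy.PLACEHOLDER))  # [PHONE]
```

HASH is the only strategy here that lets two occurrences of the same value be correlated after redaction, which is the usual reason to prefer it over PLACEHOLDER.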

Methods

redact(text) -> RedactionResult

Scans text and applies the configured redaction strategy.

result = redactor.redact("My SSN is 123-45-6789 and email is jane@corp.com")
print(result.text)            # "My SSN is [SSN] and email is [EMAIL]"
print(result.original_text)   # "My SSN is 123-45-6789 and email is jane@corp.com"
print(result.was_redacted)    # True
print(result.count)           # 2

Returns: RedactionResult

Data Structures

RedactionResult

Attribute       Type                   Description
text            str                    The redacted output.
original_text   str                    The original input.
entities        list[DetectedEntity]   All detected PII entities.
events          list[RedactionEvent]   Detailed redaction events.
count           int                    Number of entities detected (property).
was_redacted    bool                   Whether any redactions occurred (property).
redacted_text   str                    Alias for text (property).

DetectedEntity

Attribute   Type   Description
category    str    PII category (e.g., "email", "phone").
start       int    Start index in original text.
end         int    End index in original text.
text        str    The matched text.
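
The start and end offsets are what make it possible to rebuild the redacted string from the original. The sketch below shows the usual technique: replace spans from right to left so that earlier offsets remain valid. `Span` is a hypothetical stand-in that mirrors the documented fields, and the sketch assumes `end` is exclusive (Python slice style), which this page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:  # hypothetical stand-in mirroring DetectedEntity's documented fields
    category: str
    start: int
    end: int  # assumed exclusive, as in Python slicing
    text: str


def apply_placeholders(original: str, spans: list[Span]) -> str:
    """Replace each span with a [CATEGORY] tag."""
    redacted = original
    # Work from the last span to the first so earlier offsets stay valid.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        tag = f"[{span.category.upper()}]"
        redacted = redacted[:span.start] + tag + redacted[span.end:]
    return redacted


text = "Mail jane@corp.com, SSN 123-45-6789"
spans = [
    Span("email", 5, 18, "jane@corp.com"),
    Span("ssn", 24, 35, "123-45-6789"),
]
print(apply_placeholders(text, spans))  # Mail [EMAIL], SSN [SSN]
```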

Built-in PII Categories

Category      Examples                          Validation
email         user@example.com                  Regex
phone         +1 555-123-4567, (555) 123-4567   Regex
ssn           123-45-6789                       Regex + format validation
credit_card   4111-1111-1111-1111               Regex + Luhn checksum
ip_address    192.168.1.1                       Regex
passport      A12345678, AB123456               ICAO Doc 9303 format
person_name   John Smith                        Title Case pattern matching
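
The Luhn checksum listed for credit_card is a standard, public algorithm, and a short sketch shows why it filters out most digit strings that merely look like card numbers. The function name `luhn_valid` is chosen for illustration and is not part of EnforceCore's API.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # True
print(luhn_valid("4111-1111-1111-1112"))  # False
```

A regex match that fails this check can be discarded as a likely order number or identifier rather than a card number.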

Secret Detection

EnforceCore includes a dedicated SecretScanner for detecting leaked credentials:

from enforcecore.redactor.secrets import SecretScanner

scanner = SecretScanner()
secrets = scanner.scan("Use key AKIAIOSFODNN7EXAMPLE to access S3")

for secret in secrets:
    # Print only the first 20 characters so a full secret never reaches the logs
    print(f"{secret.category}: {secret.text[:20]}...")
    # aws_access_key: AKIAIOSFODNN7EXAMPLE...

Detected Secret Types

Category                     Pattern
aws_access_key               AKIA... (20 chars)
aws_secret_key               40-char base64 strings
github_token                 ghp_, gho_, ghs_, ghr_ prefixed
generic_api_key              Key-value pairs with high Shannon entropy
bearer_token                 Bearer / JWT tokens
private_key                  PEM-encoded RSA, EC, DSA private keys
password_in_url              user:pass@host patterns
gcp_service_account          Google Cloud service account JSON
azure_connection_string      Azure storage/service bus strings
database_connection_string   postgres://, mysql://, mongodb://, redis://
ssh_private_key              OpenSSH private key markers
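
The "high Shannon entropy" test used for generic_api_key can be sketched in a few lines. This is a minimal illustration of the idea, not the scanner's actual code: the helper names, the 16-character minimum, and the 3.5 bits-per-character threshold are all assumptions.

```python
import math
from collections import Counter

# Hypothetical cutoffs; a real scanner would tune these per character set.
ENTROPY_THRESHOLD = 3.5
MIN_LENGTH = 16


def shannon_entropy(value: str) -> float:
    """Shannon entropy of `value` in bits per character."""
    if not value:
        return 0.0
    n = len(value)
    counts = Counter(value)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_secret(value: str) -> bool:
    """Flag long strings whose characters are close to uniformly distributed."""
    return len(value) >= MIN_LENGTH and shannon_entropy(value) >= ENTROPY_THRESHOLD


print(looks_like_secret("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"))  # True
print(looks_like_secret("password_password"))                         # False
```

Random keys spread their characters evenly and score high, while words and repeated patterns score low, which is why an entropy filter cuts false positives on ordinary key-value pairs.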

Custom Patterns

Register custom PII patterns via the PatternRegistry:

from enforcecore.redactor.patterns import PatternRegistry

PatternRegistry.register(
    category="employee_id",
    regex=r"EMP-\d{6}",
    placeholder="[EMPLOYEE_ID]",
    validator=lambda text: text.startswith("EMP-"),
)

# Now the Redactor will detect employee IDs
redactor = Redactor(categories=["email", "employee_id"])
result = redactor.redact("Employee EMP-123456 reported an issue")
# "Employee [EMPLOYEE_ID] reported an issue"

Parameter     Type              Description
category      str               Category name (used in DetectedEntity).
regex         str               Regex pattern to match.
placeholder   str               Replacement text (for PLACEHOLDER strategy).
mask          str               Mask character (for MASK strategy).
validator     Callable | None   Optional function for additional validation.

Policy Integration

PII redaction is configured in your YAML policy:

rules:
  pii_redaction:
    enabled: true
    categories:
      - email
      - phone
      - ssn
      - credit_card
      - ip_address
    strategy: "placeholder"      # placeholder | mask | hash | remove
  redact_output: true            # Redact PII from function return values

Unicode Normalization

The Redactor includes Unicode normalization utilities to handle homoglyph-based evasion attacks (e.g., using Cyrillic characters that look like Latin letters).
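
As a rough illustration of what such normalization involves, the sketch below combines NFKC folding from Python's standard `unicodedata` module with a small look-alike map. It is an assumption-laden sketch, not EnforceCore's utilities: `normalize_for_scanning` and the `HOMOGLYPHS` table are hypothetical, and the table covers only a handful of Cyrillic letters.

```python
import unicodedata

# Tiny illustrative map of Cyrillic look-alikes to Latin letters (not exhaustive).
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
})


def normalize_for_scanning(text: str) -> str:
    """Fold compatibility forms and known look-alikes before pattern matching."""
    # NFKC folds compatibility forms such as fullwidth letters and digits.
    folded = unicodedata.normalize("NFKC", text)
    # NFKC does not merge scripts, so look-alikes need an explicit mapping.
    return folded.translate(HOMOGLYPHS)


evasive_email = "j\u043ehn@\u0435xample.com"  # Cyrillic o and ie
fullwidth_phone = "\uff15\uff15\uff15-\uff11\uff12\uff13-\uff14\uff15\uff16\uff17"

print(normalize_for_scanning(evasive_email))    # john@example.com
print(normalize_for_scanning(fullwidth_phone))  # 555-123-4567
```

Because NFKC can change the length of a string, offsets found in the normalized text do not always line up with the original, so a scanner that normalizes first has to map match positions back before redacting.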

Error Handling

Exception        When
RedactionError   General redaction failure.

See Also