Safe Observability: A Framework for Automated PII Redaction from LLM Prompts in OpenTelemetry Pipelines
Keywords:
Large Language Models, OpenTelemetry, Safe Observability, PII-Redaction
Abstract
The proliferation of Large Language Models (LLMs) within enterprise applications has introduced a critical conflict between the goals of modern observability and the mandates of data privacy. While observability platforms provide essential, deep visibility into complex distributed systems, they inadvertently become repositories for Personally Identifiable Information (PII) when they ingest the unstructured, information-rich prompts users submit to LLMs. This leakage of sensitive data into telemetry pipelines constitutes a significant security liability and a compliance risk under regulations such as GDPR and CCPA. This paper introduces the concept of "Safe Observability," a paradigm that reconciles the need for comprehensive system insight with robust privacy protection. I propose a novel framework for achieving this through the automated redaction of PII within the OpenTelemetry (OTel) ecosystem. The core of this framework is a custom, configurable PII-Redaction Processor for the OpenTelemetry Collector, designed to act as a strategic control point for sanitizing telemetry data in transit. The architecture employs a hybrid PII detection methodology that combines the speed of regular expressions with the contextual accuracy of Named Entity Recognition (NER) models, with the NER component implemented as a decoupled microservice. This paper details the architectural design, provides a comprehensive implementation guide for building and deploying the custom processor using the OpenTelemetry Collector Builder (OCB) and Go, and presents a rigorous evaluation of its efficacy and performance impact. The findings demonstrate that this approach offers a practical and architecturally sound solution for preventing PII leakage, enabling organizations to leverage the power of observability and LLMs without compromising user privacy or regulatory compliance.
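The hybrid detection idea summarized above can be illustrated with a minimal Go sketch. It is not the paper's processor: the names `Span`, `NERDetector`, `Redact`, and `nameList` are hypothetical, the two regular expressions are simplified examples, and the NER stage is represented by a local stub where the described architecture would call a separate microservice. The sketch runs a fast regex pass first and then asks the pluggable detector for any remaining entities.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Span marks a byte range that a detector considers PII (hypothetical type).
type Span struct {
	Start, End int
	Label      string
}

// NERDetector is an assumed interface standing in for the decoupled NER
// microservice; a real client would issue a network call here.
type NERDetector interface {
	Detect(text string) []Span
}

// regexRules holds simplified, illustrative patterns for well-structured PII.
var regexRules = []struct {
	label string
	re    *regexp.Regexp
}{
	{"EMAIL", regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)},
	{"SSN", regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)},
}

// Redact applies the regex pass, then replaces any spans reported by ner.
// A nil ner skips the second stage.
func Redact(text string, ner NERDetector) string {
	for _, r := range regexRules {
		text = r.re.ReplaceAllString(text, "["+r.label+"]")
	}
	if ner == nil {
		return text
	}
	spans := ner.Detect(text)
	// Replace from the end of the string so earlier offsets stay valid.
	sort.Slice(spans, func(i, j int) bool { return spans[i].Start > spans[j].Start })
	limit := len(text)
	for _, s := range spans {
		if s.Start < 0 || s.Start >= s.End || s.End > limit {
			continue // skip invalid or overlapping spans
		}
		text = text[:s.Start] + "[" + s.Label + "]" + text[s.End:]
		limit = s.Start
	}
	return text
}

// nameList is a stub detector that flags the first occurrence of known names.
type nameList []string

func (n nameList) Detect(text string) []Span {
	var spans []Span
	for _, name := range n {
		if i := strings.Index(text, name); i >= 0 {
			spans = append(spans, Span{Start: i, End: i + len(name), Label: "PERSON"})
		}
	}
	return spans
}

func main() {
	prompt := "Jane Doe wrote from jane@example.com about SSN 123-45-6789"
	fmt.Println(Redact(prompt, nameList{"Jane Doe"}))
}
```

Running the regex stage first keeps the cheap, deterministic checks on the hot path and reduces the text the slower contextual stage has to examine; in a Collector processor the same function would be applied to the span or log attributes that carry prompt text.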
License
Copyright (c) 2025 S. Melnyk

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.