Observability Stack

Plugged.in implements a comprehensive observability stack for OAuth 2.1 operations, enabling real-time monitoring, security event detection, and performance analysis.

Architecture

The observability stack consists of three pillars:

Logs (Loki)

Structured JSON logs for all OAuth operations, security events, and errors

Metrics (Prometheus)

Real-time counters, histograms, and gauges for performance tracking

Dashboards (Grafana)

Unified visualization combining logs and metrics for insights

System Diagram

┌──────────────────────────────────────────────────────────────┐
│                        pluggedin-app                         │
│                                                              │
│  ┌───────────────────┐         ┌──────────────────┐          │
│  │ OAuth Operations  │────────▶│ Structured Logs  │          │
│  │ (token refresh,   │         │ (Pino + JSON)    │────┐     │
│  │  PKCE, etc.)      │         └──────────────────┘    │     │
│  └───────────────────┘                                 │     │
│            │                                           │     │
│            ▼                                           │     │
│  ┌───────────────────┐                                 │     │
│  │ Prometheus Metrics│──────────┐                      │     │
│  │ (prom-client)     │          │                      │     │
│  └───────────────────┘          │                      │     │
└─────────────────────────────────┼──────────────────────┼─────┘
                                  │                      │
                        ┌─────────▼─────────┐   ┌────────▼────────┐
                        │    Prometheus     │   │    Promtail     │
                        │  (Metrics Store)  │   │  (Log Shipper)  │
                        └─────────┬─────────┘   └────────┬────────┘
                                  │                      │
                                  │           ┌──────────▼──────────┐
                                  │           │        Loki         │
                                  │           │  (Log Aggregation)  │
                                  │           └──────────┬──────────┘
                                  │                      │
                        ┌─────────▼──────────────────────▼─────────┐
                        │                 Grafana                  │
                        │     (Unified Dashboards & Alerting)      │
                        └──────────────────────────────────────────┘

Key Features

📊 Comprehensive Metrics

17 Prometheus metrics track all OAuth operations:
  • OAuth Flows: Initiation, completion, duration (counters + histograms)
  • Token Operations: Refresh attempts, success rate, rotation tracking
  • PKCE Security: State creation, validation, cleanup metrics
  • Security Events: Code injection attempts, token reuse detection, integrity violations
  • Discovery: RFC 9728 metadata discovery, success rates
  • Registration: Dynamic client registration (RFC 7591)

πŸ“ Structured Logging

All logs use JSON format for Loki compatibility:
  • OAuth Events: Flow tracking, state transitions, token operations
  • Security Events: Suspicious activity, attack detection, compliance violations
  • Performance: Timing, duration, resource usage
  • Errors: Detailed error context with stack traces

πŸ” Automatic Redaction

Sensitive data is automatically redacted from logs:
  • Access tokens, refresh tokens
  • PKCE code verifiers
  • Client secrets
  • Authorization codes

Quick Start

Prerequisites

# Ensure pluggedin-observability stack is running
cd /path/to/pluggedin-observability
docker-compose up -d
This starts the observability services shown in the system diagram: Prometheus, Promtail, Loki, and Grafana.

Environment Variables

Add to pluggedin-app/.env:
# Observability Configuration
SERVICE_NAME=pluggedin-app
APP_VERSION=2.14.0
LOG_LEVEL=info  # trace, debug, info, warn, error

# Optional: Prometheus Push Gateway
PROMETHEUS_PUSH_GATEWAY=http://localhost:9091
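
One way an application might read these variables is sketched below. The loadObservabilityConfig function, the ObservabilityConfig shape, and the fallback values ("pluggedin-app", "unknown", "info") are assumptions for illustration; only the variable names and the five log levels come from the configuration above.

```typescript
type LogLevel = "trace" | "debug" | "info" | "warn" | "error";
const LOG_LEVELS: LogLevel[] = ["trace", "debug", "info", "warn", "error"];

interface ObservabilityConfig {
  serviceName: string;
  appVersion: string;
  logLevel: LogLevel;
  pushGatewayUrl: string | undefined; // optional in .env
}

// Parse and validate the observability settings from an env-like object.
// The defaults are assumed, not documented behavior.
function loadObservabilityConfig(env: { [name: string]: string | undefined }): ObservabilityConfig {
  const rawLevel = (env["LOG_LEVEL"] ?? "info").toLowerCase();
  let logLevel: LogLevel | undefined;
  for (const candidate of LOG_LEVELS) {
    if (candidate === rawLevel) {
      logLevel = candidate;
    }
  }
  if (logLevel === undefined) {
    throw new Error(`Unsupported LOG_LEVEL: ${rawLevel}`);
  }
  return {
    serviceName: env["SERVICE_NAME"] ?? "pluggedin-app",
    appVersion: env["APP_VERSION"] ?? "unknown",
    logLevel: logLevel,
    pushGatewayUrl: env["PROMETHEUS_PUSH_GATEWAY"],
  };
}

const exampleConfig = loadObservabilityConfig({ SERVICE_NAME: "pluggedin-app", LOG_LEVEL: "WARN" });
```

In a Node process the argument would typically be process.env. Failing fast on an unknown level avoids silently running with the wrong verbosity.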

Verify Setup

  1. Check Logs:
# View JSON-formatted OAuth logs
docker logs pluggedin-app | grep oauth | jq .
  2. Check Metrics:
# Prometheus metrics endpoint
curl http://localhost:12005/metrics | grep oauth
  3. Query Loki:
# Query OAuth events from the last hour (the -d flag requires GNU date)
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={service_name="pluggedin-app"} |= "oauth"' \
  --data-urlencode "start=$(date -u -d '1 hour ago' +%s)000000000" \
  --data-urlencode "end=$(date -u +%s)000000000" | jq .
  4. Access Grafana:
Open http://localhost:3000
Default credentials: admin/admin
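
The Loki query in the verification steps relies on GNU date to compute nanosecond timestamps. For scripts that need to build the same request portably, the following sketch assembles the URL in TypeScript. The lokiQueryRangeUrl helper and its parameters are hypothetical; the endpoint path, the query, start, and end parameters, and the LogQL expression are taken from the curl command.

```typescript
// Build a Loki query_range URL for the window ending at `endMs`.
// Loki takes start/end as Unix epoch nanoseconds; appending six zeros to a
// millisecond value avoids floating-point precision loss.
function lokiQueryRangeUrl(baseUrl: string, logql: string, endMs: number, windowMs: number): string {
  const toNanos = (ms: number): string => `${Math.floor(ms)}000000`;
  const params: Array<[string, string]> = [
    ["query", logql],
    ["start", toNanos(endMs - windowMs)],
    ["end", toNanos(endMs)],
  ];
  const queryString = params
    .map((pair) => `${encodeURIComponent(pair[0])}=${encodeURIComponent(pair[1])}`)
    .join("&");
  return `${baseUrl}/loki/api/v1/query_range?${queryString}`;
}

// One-hour window ending at a fixed instant, so the result is reproducible.
const lokiUrl = lokiQueryRangeUrl(
  "http://localhost:3100",
  '{service_name="pluggedin-app"} |= "oauth"',
  1700003600000,
  60 * 60 * 1000
);
```

In real use, endMs would be Date.now() and the resulting URL would be passed to an HTTP client.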

What to Monitor

Critical Metrics

Token Reuse Detection

Metric: oauth_token_refresh_total{status="reuse_detected"}
Alert when: > 0
Action: Immediate security review - indicates a replay attack or race condition

Code Injection Attempts

Metric: oauth_code_injection_attempts_total
Alert when: > 0
Action: Review security logs, block attacker IP, audit user accounts

OAuth Flow Success Rate

Metric: oauth_flows_total{status="success"} / oauth_flows_total
Alert when: < 95%
Action: Investigate failures, check auth server connectivity

Token Refresh Duration

Metric: oauth_token_refresh_duration_seconds
Alert when: p99 > 5s
Action: Check network latency to auth servers, database performance
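
The four alert conditions above can be summarized as a small decision function. This is only a sketch of the threshold logic: in a real deployment these checks would normally live in Prometheus or Grafana alert rules, and the OAuthSnapshot shape, the firingAlerts function, and the alert names are hypothetical.

```typescript
// Point-in-time values for the four critical metrics (hypothetical shape).
interface OAuthSnapshot {
  tokenReuseDetected: number;    // oauth_token_refresh_total{status="reuse_detected"}
  codeInjectionAttempts: number; // oauth_code_injection_attempts_total
  flowsSucceeded: number;        // oauth_flows_total{status="success"}
  flowsTotal: number;            // oauth_flows_total
  refreshP99Seconds: number;     // p99 of oauth_token_refresh_duration_seconds
}

// Return the names of all alerts whose documented threshold is crossed.
function firingAlerts(snapshot: OAuthSnapshot): string[] {
  const alerts: string[] = [];
  if (snapshot.tokenReuseDetected > 0) {
    alerts.push("token_reuse_detected");
  }
  if (snapshot.codeInjectionAttempts > 0) {
    alerts.push("code_injection_attempt");
  }
  // Skip the ratio when there is no traffic, to avoid dividing by zero.
  if (snapshot.flowsTotal > 0 && snapshot.flowsSucceeded / snapshot.flowsTotal < 0.95) {
    alerts.push("low_flow_success_rate");
  }
  if (snapshot.refreshP99Seconds > 5) {
    alerts.push("slow_token_refresh");
  }
  return alerts;
}

const fired = firingAlerts({
  tokenReuseDetected: 0,
  codeInjectionAttempts: 0,
  flowsSucceeded: 90,
  flowsTotal: 100,
  refreshP99Seconds: 6,
});
```

Here the 90% success rate and the 6-second p99 both cross their thresholds, while the two security counters stay quiet.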

Security Events

Monitor these log events continuously:
  • oauth_refresh_token_reuse_detected (P0 - Critical)
  • oauth_code_injection_attempt (P0 - Critical)
  • oauth_integrity_violation (P1 - High)
  • oauth_ownership_violation (P1 - High)
  • pkce_replay_detected (P1 - High)

Log Levels

Configure based on environment:
# Development
LOG_LEVEL=debug  # Verbose logging for debugging

# Staging
LOG_LEVEL=info   # Standard operational logging

# Production
LOG_LEVEL=warn   # Errors and warnings only (reduces volume)
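
The effect of LOG_LEVEL is a threshold: a message is emitted only if its level is at or above the configured one. The sketch below shows that comparison using the five levels named in the environment configuration. The shouldLog helper is hypothetical; in the application this filtering would be done by the logging library itself.

```typescript
// Levels in increasing severity, matching the documented LOG_LEVEL values.
const LEVEL_ORDER = ["trace", "debug", "info", "warn", "error"];

// True when a message at `messageLevel` passes the configured threshold.
function shouldLog(messageLevel: string, configuredLevel: string): boolean {
  const messageRank = LEVEL_ORDER.indexOf(messageLevel);
  const thresholdRank = LEVEL_ORDER.indexOf(configuredLevel);
  if (messageRank < 0 || thresholdRank < 0) {
    throw new Error(`Unknown log level: ${messageRank < 0 ? messageLevel : configuredLevel}`);
  }
  return messageRank >= thresholdRank;
}

const infoInProduction = shouldLog("info", "warn");   // false: dropped at warn
const errorInProduction = shouldLog("error", "warn"); // true: emitted at warn
```

One consequence worth keeping in mind: with LOG_LEVEL=warn, anything logged at info or below never reaches Loki, so events that dashboards depend on need to be logged at warn or higher in production.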

Performance Impact

The observability stack is designed for minimal overhead:
  • Logging: ~1-2ms per operation (async I/O)
  • Metrics: ~0.1ms per increment (in-memory counters)
  • Total: < 0.5% CPU overhead in production

Troubleshooting

Metrics not appearing in Prometheus

  1. Check metrics endpoint: curl http://localhost:12005/metrics
  2. Verify Prometheus config targets pluggedin-app
  3. Check Prometheus logs: docker logs prometheus
  4. Ensure app is generating OAuth traffic

Logs not appearing in Loki

  1. Verify JSON log format: docker logs pluggedin-app | head -1 | jq .
  2. Check Promtail config includes app log path
  3. Review Promtail logs: docker logs promtail
  4. Test Loki API: curl http://localhost:3100/ready

Grafana cannot reach its data sources

  1. Verify Prometheus URL: http://prometheus:9090 (Docker network)
  2. Verify Loki URL: http://loki:3100 (Docker network)
  3. Test connectivity: docker exec grafana curl http://prometheus:9090/api/v1/status/config

High log volume

  1. Raise LOG_LEVEL to warn in production
  2. Configure log sampling in Promtail
  3. Set Loki retention policy (default: 30 days)
  4. Archive old logs to S3/GCS