15 Commits

Author SHA1 Message Date
ca190b0085 docs: add v0.5.2 release notes
Add changelog entries for recent commits:
- Health package: Kubernetes health probes
- Logger package: runtime level control and fixes
- Database package: config file override support
2025-09-21 12:10:05 -07:00
10864363e2 feat(health): enhance server with probe-specific handlers
- Add separate handlers for liveness (/healthz), readiness (/readyz),
  and startup (/startupz) probes
- Implement WithLivenessHandler, WithReadinessHandler, WithStartupHandler,
  and WithServiceName options
- Add probe-specific JSON response formats
- Add comprehensive package documentation with usage examples
- Maintain backward compatibility for /__health and / endpoints
- Add tests for all probe types and fallback scenarios

Enables proper Kubernetes health monitoring with different probe types.
2025-09-21 10:52:29 -07:00
66b51df2af feat(logger): add runtime log level control API
Add independent log level control for stderr and OTLP loggers.
Both can be configured via environment variables or programmatically
at runtime.

- Add SetLevel() and SetOTLPLevel() for runtime control
- Add ParseLevel() to convert strings to slog.Level
- Support LOG_LEVEL and OTLP_LOG_LEVEL env vars
- Maintain backward compatibility with DEBUG env var
- Add comprehensive test coverage
2025-09-06 05:21:33 -07:00
28d05d1d0e feat(database): add DATABASE_CONFIG_FILE env override
Allow overriding the default database.yaml paths via the DATABASE_CONFIG_FILE
environment variable. When set, the single specified file is used instead of the
default ["database.yaml", "/vault/secrets/database.yaml"] search paths.

Maintains backward compatibility when the env var is not set.
2025-08-03 12:20:35 -07:00
a774f92bf7 fix(logger): prevent mutex crash in bufferingExporter
Remove sync.Once reset that caused "unlock of unlocked mutex" panic.
Redesign initialization to use only checkReadiness goroutine for
retry attempts, eliminating race condition while preserving retry
functionality for TLS/tracing setup delays.
2025-08-02 22:55:57 -07:00
0b9769dc39 Prepare v0.5.1 2025-08-02 11:04:13 -07:00
9dadd9edc3 feat(version): add Unix epoch support for buildTime
Support both Unix epoch timestamps and RFC3339 format for build time
injection via ldflags. Unix epoch format provides simpler build
commands: $(date +%s) vs $(date -u +%Y-%m-%dT%H:%M:%SZ).

- Add parseBuildTime() to convert epoch to RFC3339
- Maintain backward compatibility with existing RFC3339 format
- Ensure consistent RFC3339 output regardless of input format
- Fix build date priority over git commit time
2025-08-02 10:16:41 -07:00
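To illustrate the dual-format parsing described above, a small sketch follows. The function name `normalizeBuildTime` is hypothetical (the commit's `parseBuildTime()` is not shown in this diff), and normalizing RFC3339 input to UTC is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// normalizeBuildTime accepts either Unix epoch seconds or an RFC3339 timestamp
// and always returns RFC3339 in UTC. Hypothetical helper for illustration.
func normalizeBuildTime(s string) (string, error) {
	// An all-digit value is treated as seconds since the Unix epoch.
	if secs, err := strconv.ParseInt(s, 10, 64); err == nil {
		return time.Unix(secs, 0).UTC().Format(time.RFC3339), nil
	}
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return "", fmt.Errorf("build time %q is neither Unix epoch nor RFC3339: %w", s, err)
	}
	return t.UTC().Format(time.RFC3339), nil
}

func main() {
	for _, in := range []string{"0", "2025-08-02T10:16:41-07:00", "yesterday"} {
		out, err := normalizeBuildTime(in)
		fmt.Println(in, "->", out, err)
	}
}
```

With this shape, a build command can pass `$(date +%s)` through ldflags and the binary still reports a human-readable timestamp.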
c6230be91e feat(metrics): add OTLP metrics support with centralized config
- Create new metrics/ package for OpenTelemetry-native metrics with OTLP export
- Refactor OTLP configuration to internal/tracerconfig/ to eliminate code duplication
- Add consistent retry configuration across all HTTP OTLP exporters
- Add configuration validation and improved error messages
- Include test coverage for all new functionality
- Make OpenTelemetry metrics dependencies explicit in go.mod

Designed for new applications requiring structured metrics export to
observability backends via OTLP protocol.
2025-08-02 09:29:27 -07:00
796b2a8412 tracing: enable retrying otlp requests when using http 2025-07-27 17:13:06 -07:00
6a3bc7bab3 feat(logger): add buffering exporter with TLS support for OTLP logs
Add buffering exporter to queue OTLP logs until tracing is configured.
Support TLS configuration for OpenTelemetry log export with client
certificate authentication. Improve logfmt formatting and tracing setup.
2025-07-27 16:36:18 -07:00
da13a371b4 feat(database): add shared transaction helpers
Add transaction base utilities with Begin, Commit, and Rollback
functions supporting both sql.DB and sql.Tx interfaces.
2025-07-12 23:52:48 -07:00
a1a5a6b8be database: create shared database package
Extract common database functionality from api/ntpdb and monitor/ntpdb
into shared common/database package:

- Dynamic connector pattern with configuration loading
- Configurable connection pool management (API: 25/10, Monitor: 10/5)
- Optional Prometheus metrics integration
- Generic transaction helpers with proper error handling
- Unified interfaces compatible with SQLC-generated code

Foundation for migration to eliminate ~200 lines of duplicate code.
2025-07-12 17:59:28 -07:00
96afb77844 database: create shared database package with configurable patterns
Extract ~200 lines of duplicate database connection code from api/ntpdb/
and monitor/ntpdb/ into common/database/ package. Creates foundation for
database consolidation while maintaining zero breaking changes.

Files added:
- config.go: Unified configuration with package-specific defaults
- connector.go: Dynamic connector pattern from Boostport
- pool.go: Configurable connection pool management
- metrics.go: Optional Prometheus metrics integration
- interfaces.go: Shared database interfaces for consistent patterns

Key features:
- Configuration-driven approach (API: 25/10 connections + metrics,
  Monitor: 10/5 connections, no metrics)
- Optional Prometheus metrics when registerer provided
- Backward compatibility via convenience functions
- Flexible config file loading (explicit paths + search-based)

Dependencies: Added mysql driver and yaml parsing for database configuration.
2025-07-12 16:54:24 -07:00
c372d79d1d build: goreleaser 2.11.0 and download script tweaks 2025-07-12 16:51:10 -07:00
b5141d6a70 Add database transaction helpers 2025-07-12 13:57:27 -07:00
30 changed files with 3444 additions and 208 deletions

53
CHANGELOG.md Normal file

@@ -0,0 +1,53 @@
# Release Notes - v0.5.2
## Health Package
- **Kubernetes-native health probes** - Added dedicated handlers for liveness (`/healthz`), readiness (`/readyz`), and startup (`/startupz`) probes
- **Flexible configuration options** - New `WithLivenessHandler`, `WithReadinessHandler`, `WithStartupHandler`, and `WithServiceName` options
- **JSON response formats** - Structured probe responses with service identification
- **Backward compatibility** - Maintains existing `/__health` and `/` endpoints
## Logger Package
- **Runtime log level control** - Independent level management for stderr and OTLP loggers via `SetLevel()` and `SetOTLPLevel()`
- **Environment variable support** - Configure levels with `LOG_LEVEL` and `OTLP_LOG_LEVEL` env vars
- **String parsing utility** - New `ParseLevel()` function for converting string levels to `slog.Level`
- **Buffering exporter fix** - Resolved "unlock of unlocked mutex" panic in `bufferingExporter`
- **Initialization redesign** - Eliminated race conditions in TLS/tracing setup retry logic
## Database Package
- **Configuration file override** - Added `DATABASE_CONFIG_FILE` environment variable to specify custom database configuration file paths
- **Flexible path configuration** - Override default `["database.yaml", "/vault/secrets/database.yaml"]` search paths when needed
# Release Notes - v0.5.1
## Observability Enhancements
### OTLP Metrics Support
- **New `metrics/` package** - OpenTelemetry-native metrics with OTLP export support for structured metrics collection
- **Centralized OTLP configuration** - Refactored configuration to `internal/tracerconfig/` to eliminate code duplication across tracing, logging, and metrics
- **HTTP retry support** - Added consistent retry configuration for all HTTP OTLP exporters to improve reliability
### Enhanced Logging
- **Buffering exporter** - Added OTLP log buffering to queue logs until tracing configuration is available
- **TLS support for logs** - Client certificate authentication support for secure OTLP log export
- **Improved logfmt formatting** - Better structured output for log messages
### Tracing Improvements
- **HTTP retry support** - OTLP trace requests now automatically retry on failure when using HTTP transport
## Build System
### Version Package Enhancements
- **Unix epoch build time support** - Build time can now be injected as Unix timestamps (`$(date +%s)`) in addition to RFC3339 format
- **Simplified build commands** - Reduces complexity of ldflags injection while maintaining backward compatibility
- **Consistent output format** - All build times normalize to RFC3339 format regardless of input
## API Changes
### New Public Interfaces
- `metrics.NewMeterProvider()` - Create OTLP metrics provider with centralized configuration
- `metrics.Shutdown()` - Graceful shutdown for metrics exporters
- `internal/tracerconfig` - Shared OTLP configuration utilities (internal package)
### Dependencies
- Added explicit OpenTelemetry metrics dependencies to `go.mod`
- Updated tracing dependencies for retry support

72
database/config.go Normal file

@@ -0,0 +1,72 @@
package database
import (
"os"
"time"
"github.com/prometheus/client_golang/prometheus"
)
// Config represents the database configuration structure
type Config struct {
MySQL DBConfig `yaml:"mysql"`
}
// DBConfig represents the MySQL database configuration
type DBConfig struct {
DSN string `default:"" flag:"dsn" usage:"Database DSN"`
User string `default:"" flag:"user"`
Pass string `default:"" flag:"pass"`
DBName string // Optional database name override
}
// ConfigOptions allows customization of database opening behavior
type ConfigOptions struct {
// ConfigFiles is a list of config file paths to search for database configuration
ConfigFiles []string
// EnablePoolMonitoring enables connection pool metrics collection
EnablePoolMonitoring bool
// PrometheusRegisterer for metrics collection. If nil, no metrics are collected.
PrometheusRegisterer prometheus.Registerer
// Connection pool settings
MaxOpenConns int
MaxIdleConns int
ConnMaxLifetime time.Duration
}
// getConfigFiles returns the list of config files to search for database configuration.
// If DATABASE_CONFIG_FILE environment variable is set, it returns that single file.
// Otherwise, it returns the default paths.
func getConfigFiles() []string {
if configFile := os.Getenv("DATABASE_CONFIG_FILE"); configFile != "" {
return []string{configFile}
}
return []string{"database.yaml", "/vault/secrets/database.yaml"}
}
// DefaultConfigOptions returns the standard configuration options used by the API package
func DefaultConfigOptions() ConfigOptions {
return ConfigOptions{
ConfigFiles: getConfigFiles(),
EnablePoolMonitoring: true,
PrometheusRegisterer: prometheus.DefaultRegisterer,
MaxOpenConns: 25,
MaxIdleConns: 10,
ConnMaxLifetime: 3 * time.Minute,
}
}
// MonitorConfigOptions returns configuration options tuned for the Monitor package
func MonitorConfigOptions() ConfigOptions {
return ConfigOptions{
ConfigFiles: getConfigFiles(),
EnablePoolMonitoring: false, // Monitor doesn't need metrics
PrometheusRegisterer: nil, // No Prometheus dependency
MaxOpenConns: 10,
MaxIdleConns: 5,
ConnMaxLifetime: 3 * time.Minute,
}
}

81
database/config_test.go Normal file

@@ -0,0 +1,81 @@
package database
import (
"testing"
"time"
"github.com/prometheus/client_golang/prometheus"
)
func TestDefaultConfigOptions(t *testing.T) {
opts := DefaultConfigOptions()
// Verify expected defaults for API package
if opts.MaxOpenConns != 25 {
t.Errorf("Expected MaxOpenConns=25, got %d", opts.MaxOpenConns)
}
if opts.MaxIdleConns != 10 {
t.Errorf("Expected MaxIdleConns=10, got %d", opts.MaxIdleConns)
}
if opts.ConnMaxLifetime != 3*time.Minute {
t.Errorf("Expected ConnMaxLifetime=3m, got %v", opts.ConnMaxLifetime)
}
if !opts.EnablePoolMonitoring {
t.Error("Expected EnablePoolMonitoring=true")
}
if opts.PrometheusRegisterer != prometheus.DefaultRegisterer {
t.Error("Expected PrometheusRegisterer to be DefaultRegisterer")
}
if len(opts.ConfigFiles) == 0 {
t.Error("Expected ConfigFiles to be non-empty")
}
}
func TestMonitorConfigOptions(t *testing.T) {
opts := MonitorConfigOptions()
// Verify expected defaults for Monitor package
if opts.MaxOpenConns != 10 {
t.Errorf("Expected MaxOpenConns=10, got %d", opts.MaxOpenConns)
}
if opts.MaxIdleConns != 5 {
t.Errorf("Expected MaxIdleConns=5, got %d", opts.MaxIdleConns)
}
if opts.ConnMaxLifetime != 3*time.Minute {
t.Errorf("Expected ConnMaxLifetime=3m, got %v", opts.ConnMaxLifetime)
}
if opts.EnablePoolMonitoring {
t.Error("Expected EnablePoolMonitoring=false")
}
if opts.PrometheusRegisterer != nil {
t.Error("Expected PrometheusRegisterer to be nil")
}
if len(opts.ConfigFiles) == 0 {
t.Error("Expected ConfigFiles to be non-empty")
}
}
func TestConfigStructures(t *testing.T) {
// Test that configuration structures can be created and populated
config := Config{
MySQL: DBConfig{
DSN: "user:pass@tcp(localhost:3306)/dbname",
User: "testuser",
Pass: "testpass",
DBName: "testdb",
},
}
if config.MySQL.DSN == "" {
t.Error("Expected DSN to be set")
}
if config.MySQL.User != "testuser" {
t.Errorf("Expected User='testuser', got '%s'", config.MySQL.User)
}
if config.MySQL.Pass != "testpass" {
t.Errorf("Expected Pass='testpass', got '%s'", config.MySQL.Pass)
}
if config.MySQL.DBName != "testdb" {
t.Errorf("Expected DBName='testdb', got '%s'", config.MySQL.DBName)
}
}

88
database/connector.go Normal file

@@ -0,0 +1,88 @@
package database
import (
"context"
"database/sql/driver"
"errors"
"fmt"
"os"
"github.com/go-sql-driver/mysql"
"gopkg.in/yaml.v3"
)
// from https://github.com/Boostport/dynamic-database-config
// CreateConnectorFunc is a function that creates a database connector
type CreateConnectorFunc func() (driver.Connector, error)
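// Driver implements both driver.Connector and driver.Driver. It builds a fresh
// connector through CreateConnectorFunc on every new connection, so configuration
// changes on disk are picked up without reopening the pool.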
type Driver struct {
CreateConnectorFunc CreateConnectorFunc
}
// Driver returns the driver instance
func (d Driver) Driver() driver.Driver {
return d
}
// Connect creates a new database connection using the dynamic connector
func (d Driver) Connect(ctx context.Context) (driver.Conn, error) {
connector, err := d.CreateConnectorFunc()
if err != nil {
return nil, fmt.Errorf("error creating connector from function: %w", err)
}
return connector.Connect(ctx)
}
// Open is not supported for dynamic configuration
func (d Driver) Open(name string) (driver.Conn, error) {
return nil, errors.New("open is not supported")
}
// createConnector creates a connector function that reads configuration from a file
func createConnector(configFile string) CreateConnectorFunc {
return func() (driver.Connector, error) {
dbFile, err := os.Open(configFile)
if err != nil {
return nil, err
}
defer dbFile.Close()
dec := yaml.NewDecoder(dbFile)
cfg := Config{}
err = dec.Decode(&cfg)
if err != nil {
return nil, err
}
dsn := cfg.MySQL.DSN
if len(dsn) == 0 {
dsn = os.Getenv("DATABASE_DSN")
if len(dsn) == 0 {
return nil, fmt.Errorf("dsn config in %s or DATABASE_DSN environment variable required", configFile)
}
}
dbcfg, err := mysql.ParseDSN(dsn)
if err != nil {
return nil, err
}
if user := cfg.MySQL.User; len(user) > 0 {
dbcfg.User = user
}
if pass := cfg.MySQL.Pass; len(pass) > 0 {
dbcfg.Passwd = pass
}
if name := cfg.MySQL.DBName; len(name) > 0 {
dbcfg.DBName = name
}
return mysql.NewConnector(dbcfg)
}
}


@@ -0,0 +1,117 @@
package database
import (
"context"
"database/sql"
"testing"
)
// Mock types for testing SQLC integration patterns
type mockQueries struct {
db DBTX
}
type mockQueriesTx struct {
*mockQueries
tx *sql.Tx
}
// Mock the Begin method pattern that SQLC generates
func (q *mockQueries) Begin(ctx context.Context) (*mockQueriesTx, error) {
// This would normally be: tx, err := q.db.(*sql.DB).BeginTx(ctx, nil)
// For our test, we return a mock
return &mockQueriesTx{mockQueries: q, tx: nil}, nil
}
func (qtx *mockQueriesTx) Commit(ctx context.Context) error {
return nil // Mock implementation
}
func (qtx *mockQueriesTx) Rollback(ctx context.Context) error {
return nil // Mock implementation
}
// This test verifies that our common database interfaces are compatible with SQLC-generated code
func TestSQLCIntegration(t *testing.T) {
// Test that SQLC's DBTX interface matches our DBTX interface
t.Run("DBTX Interface Compatibility", func(t *testing.T) {
// Test interface compatibility by assignment without execution
var ourDBTX DBTX
// Test with sql.DB (should implement DBTX)
var db *sql.DB
ourDBTX = db // This will compile only if interfaces are compatible
_ = ourDBTX // Use the variable to avoid "unused" warning
// Test with sql.Tx (should implement DBTX)
var tx *sql.Tx
ourDBTX = tx // This will compile only if interfaces are compatible
_ = ourDBTX // Use the variable to avoid "unused" warning
// If we reach here, interfaces are compatible
t.Log("DBTX interface is compatible with sql.DB and sql.Tx")
})
t.Run("Transaction Interface Compatibility", func(t *testing.T) {
// This test verifies our transaction interfaces work with SQLC patterns
// We can't define methods inside a function, so we test interface compatibility
// Verify our DB interface is compatible with what SQLC expects
var dbInterface DB[*mockQueriesTx]
var mockDB *mockQueries = &mockQueries{}
dbInterface = mockDB
// Test that our transaction helper can work with this pattern
err := WithTransaction(context.Background(), dbInterface, func(ctx context.Context, qtx *mockQueriesTx) error {
// This would be where you'd call SQLC-generated query methods
return nil
})
if err != nil {
t.Errorf("Transaction helper failed: %v", err)
}
})
}
// Test that demonstrates how the common package would be used with real SQLC patterns
func TestRealWorldUsagePattern(t *testing.T) {
// This test shows how a package would typically use our common database code
t.Run("Database Opening Pattern", func(t *testing.T) {
// Test that our configuration options work as expected
opts := DefaultConfigOptions()
// Modify for test environment (no actual database connection)
opts.ConfigFiles = []string{} // No config files for unit test
opts.PrometheusRegisterer = nil // No metrics for unit test
// This would normally open a database: db, err := OpenDB(ctx, opts)
// For our unit test, we just verify the options are reasonable
if opts.MaxOpenConns <= 0 {
t.Error("MaxOpenConns should be positive")
}
if opts.MaxIdleConns <= 0 {
t.Error("MaxIdleConns should be positive")
}
if opts.ConnMaxLifetime <= 0 {
t.Error("ConnMaxLifetime should be positive")
}
})
t.Run("Monitor Package Configuration", func(t *testing.T) {
opts := MonitorConfigOptions()
// Verify monitor-specific settings
if opts.EnablePoolMonitoring {
t.Error("Monitor package should not enable pool monitoring")
}
if opts.PrometheusRegisterer != nil {
t.Error("Monitor package should not have Prometheus registerer")
}
if opts.MaxOpenConns != 10 {
t.Errorf("Expected MaxOpenConns=10 for monitor, got %d", opts.MaxOpenConns)
}
if opts.MaxIdleConns != 5 {
t.Errorf("Expected MaxIdleConns=5 for monitor, got %d", opts.MaxIdleConns)
}
})
}

34
database/interfaces.go Normal file

@@ -0,0 +1,34 @@
package database
import (
"context"
"database/sql"
)
// DBTX matches the interface expected by SQLC-generated code
// This interface is implemented by both *sql.DB and *sql.Tx
type DBTX interface {
ExecContext(context.Context, string, ...interface{}) (sql.Result, error)
PrepareContext(context.Context, string) (*sql.Stmt, error)
QueryContext(context.Context, string, ...interface{}) (*sql.Rows, error)
QueryRowContext(context.Context, string, ...interface{}) *sql.Row
}
// BaseQuerier provides basic query functionality
// This interface should be implemented by package-specific Queries types
type BaseQuerier interface {
WithTx(tx *sql.Tx) BaseQuerier
}
// BaseQuerierTx provides transaction functionality
// This interface should be implemented by package-specific Queries types
type BaseQuerierTx interface {
BaseQuerier
Begin(ctx context.Context) (BaseQuerierTx, error)
Commit(ctx context.Context) error
Rollback(ctx context.Context) error
}
// TransactionFunc represents a function that operates within a database transaction
// This is used by the shared transaction helpers in transaction.go
type TransactionFunc[Q any] func(ctx context.Context, q Q) error

93
database/metrics.go Normal file

@@ -0,0 +1,93 @@
package database
import (
"context"
"database/sql"
"fmt"
"time"
"github.com/prometheus/client_golang/prometheus"
)
// DatabaseMetrics holds the Prometheus metrics for database connection pool monitoring
type DatabaseMetrics struct {
ConnectionsOpen prometheus.Gauge
ConnectionsIdle prometheus.Gauge
ConnectionsInUse prometheus.Gauge
ConnectionsWaitCount prometheus.Counter
ConnectionsWaitDuration prometheus.Histogram
}
// NewDatabaseMetrics creates a new set of database metrics and registers them
func NewDatabaseMetrics(registerer prometheus.Registerer) *DatabaseMetrics {
metrics := &DatabaseMetrics{
ConnectionsOpen: prometheus.NewGauge(prometheus.GaugeOpts{
Name: "database_connections_open",
Help: "Number of open database connections",
}),
ConnectionsIdle: prometheus.NewGauge(prometheus.GaugeOpts{
Name: "database_connections_idle",
Help: "Number of idle database connections",
}),
ConnectionsInUse: prometheus.NewGauge(prometheus.GaugeOpts{
Name: "database_connections_in_use",
Help: "Number of database connections in use",
}),
ConnectionsWaitCount: prometheus.NewCounter(prometheus.CounterOpts{
Name: "database_connections_wait_count_total",
Help: "Total number of times a connection had to wait",
}),
ConnectionsWaitDuration: prometheus.NewHistogram(prometheus.HistogramOpts{
Name: "database_connections_wait_duration_seconds",
Help: "Time spent waiting for a database connection",
Buckets: prometheus.DefBuckets,
}),
}
if registerer != nil {
registerer.MustRegister(
metrics.ConnectionsOpen,
metrics.ConnectionsIdle,
metrics.ConnectionsInUse,
metrics.ConnectionsWaitCount,
metrics.ConnectionsWaitDuration,
)
}
return metrics
}
// monitorConnectionPool runs a background goroutine to collect connection pool metrics
func monitorConnectionPool(ctx context.Context, db *sql.DB, registerer prometheus.Registerer) {
if registerer == nil {
return // No metrics collection if no registerer provided
}
metrics := NewDatabaseMetrics(registerer)
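ticker := time.NewTicker(30 * time.Second)
defer ticker.Stop()
// sql.DBStats reports WaitCount and WaitDuration as cumulative totals,
// so remember the previous sample and record only the change per tick.
var lastWaitCount int64
var lastWaitDuration time.Duration
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
stats := db.Stats()
metrics.ConnectionsOpen.Set(float64(stats.OpenConnections))
metrics.ConnectionsIdle.Set(float64(stats.Idle))
metrics.ConnectionsInUse.Set(float64(stats.InUse))
if delta := stats.WaitCount - lastWaitCount; delta > 0 {
metrics.ConnectionsWaitCount.Add(float64(delta))
}
if delta := stats.WaitDuration - lastWaitDuration; delta > 0 {
metrics.ConnectionsWaitDuration.Observe(delta.Seconds())
}
lastWaitCount, lastWaitDuration = stats.WaitCount, stats.WaitDuration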
// Log connection pool stats for high usage or waiting
if stats.OpenConnections > 20 || stats.WaitCount > 0 {
fmt.Printf("Connection pool stats: open=%d idle=%d in_use=%d wait_count=%d wait_duration=%s\n",
stats.OpenConnections, stats.Idle, stats.InUse, stats.WaitCount, stats.WaitDuration)
}
}
}
}

78
database/pool.go Normal file

@@ -0,0 +1,78 @@
package database
import (
"context"
"database/sql"
"fmt"
"os"
"go.ntppool.org/common/logger"
)
// OpenDB opens a database connection with the specified configuration options
func OpenDB(ctx context.Context, options ConfigOptions) (*sql.DB, error) {
log := logger.Setup()
configFile, err := findConfigFile(options.ConfigFiles)
if err != nil {
return nil, err
}
dbconn := sql.OpenDB(Driver{
CreateConnectorFunc: createConnector(configFile),
})
// Set connection pool parameters
dbconn.SetConnMaxLifetime(options.ConnMaxLifetime)
dbconn.SetMaxOpenConns(options.MaxOpenConns)
dbconn.SetMaxIdleConns(options.MaxIdleConns)
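// Use the caller's context so cancellation and deadlines apply to the initial ping
err = dbconn.PingContext(ctx)
if err != nil {
log.Error("could not connect to database", "err", err)
dbconn.Close() // release the pool; the caller never receives this handle
return nil, err
}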
// Start optional connection pool monitoring
if options.EnablePoolMonitoring && options.PrometheusRegisterer != nil {
go monitorConnectionPool(ctx, dbconn, options.PrometheusRegisterer)
}
return dbconn, nil
}
// OpenDBWithConfigFile opens a database connection using an explicit config file path
// This is a convenience function for API package compatibility
func OpenDBWithConfigFile(ctx context.Context, configFile string) (*sql.DB, error) {
options := DefaultConfigOptions()
options.ConfigFiles = []string{configFile}
return OpenDB(ctx, options)
}
// OpenDBMonitor opens a database connection with monitor-specific defaults
// This is a convenience function for Monitor package compatibility
func OpenDBMonitor() (*sql.DB, error) {
options := MonitorConfigOptions()
return OpenDB(context.Background(), options)
}
// findConfigFile searches for the first existing config file from the list
func findConfigFile(configFiles []string) (string, error) {
var firstErr error
for _, configFile := range configFiles {
if configFile == "" {
continue
}
if _, err := os.Stat(configFile); err == nil {
return configFile, nil
} else if firstErr == nil {
firstErr = err
}
}
if firstErr != nil {
return "", fmt.Errorf("no config file found: %w", firstErr)
}
return "", fmt.Errorf("no valid config files provided")
}

69
database/transaction.go Normal file

@@ -0,0 +1,69 @@
package database
import (
"context"
"fmt"
"go.ntppool.org/common/logger"
)
// DB interface for database operations that can begin transactions
type DB[Q any] interface {
Begin(ctx context.Context) (Q, error)
}
// TX interface for transaction operations
type TX interface {
Commit(ctx context.Context) error
Rollback(ctx context.Context) error
}
// WithTransaction executes a function within a database transaction
// Handles proper rollback on error and commit on success
func WithTransaction[Q TX](ctx context.Context, db DB[Q], fn func(ctx context.Context, q Q) error) error {
tx, err := db.Begin(ctx)
if err != nil {
return fmt.Errorf("failed to begin transaction: %w", err)
}
var committed bool
defer func() {
if !committed {
if rbErr := tx.Rollback(ctx); rbErr != nil {
// Log rollback error but don't override original error
log := logger.FromContext(ctx)
log.ErrorContext(ctx, "failed to rollback transaction", "error", rbErr)
}
}
}()
if err := fn(ctx, tx); err != nil {
return err
}
err = tx.Commit(ctx)
committed = true // Mark as committed regardless of commit success/failure
if err != nil {
return fmt.Errorf("failed to commit transaction: %w", err)
}
return nil
}
// WithReadOnlyTransaction executes a read-only function within a transaction
// Always rolls back at the end (for consistent read isolation)
func WithReadOnlyTransaction[Q TX](ctx context.Context, db DB[Q], fn func(ctx context.Context, q Q) error) error {
tx, err := db.Begin(ctx)
if err != nil {
return fmt.Errorf("failed to begin read-only transaction: %w", err)
}
defer func() {
if rbErr := tx.Rollback(ctx); rbErr != nil {
log := logger.FromContext(ctx)
log.ErrorContext(ctx, "failed to rollback read-only transaction", "error", rbErr)
}
}()
return fn(ctx, tx)
}


@@ -0,0 +1,69 @@
package database
import (
"context"
"database/sql"
"fmt"
"go.ntppool.org/common/logger"
)
// Shared interface definitions that both packages use identically
type BaseBeginner interface {
// Begin returns a pointer: sql.Tx holds internal locks and must not be copied.
Begin(context.Context) (*sql.Tx, error)
}
type BaseTx interface {
BaseBeginner
Commit(ctx context.Context) error
Rollback(ctx context.Context) error
}
// BeginTransactionForQuerier contains the shared Begin() logic from both packages
func BeginTransactionForQuerier(ctx context.Context, db DBTX) (DBTX, error) {
if sqlDB, ok := db.(*sql.DB); ok {
tx, err := sqlDB.BeginTx(ctx, &sql.TxOptions{})
if err != nil {
return nil, err
}
return tx, nil
}
// Otherwise db must be a wrapper that knows how to begin a transaction itself
if beginner, ok := db.(BaseBeginner); ok {
tx, err := beginner.Begin(ctx)
if err != nil {
return nil, err
}
return tx, nil
}
return nil, fmt.Errorf("database connection does not support transactions")
}
// CommitTransactionForQuerier contains the shared Commit() logic from both packages
func CommitTransactionForQuerier(ctx context.Context, db DBTX) error {
if sqlTx, ok := db.(*sql.Tx); ok {
return sqlTx.Commit()
}
tx, ok := db.(BaseTx)
if !ok {
log := logger.FromContext(ctx)
log.ErrorContext(ctx, "could not get a Tx", "type", fmt.Sprintf("%T", db))
return sql.ErrTxDone
}
return tx.Commit(ctx)
}
// RollbackTransactionForQuerier contains the shared Rollback() logic from both packages
func RollbackTransactionForQuerier(ctx context.Context, db DBTX) error {
if sqlTx, ok := db.(*sql.Tx); ok {
return sqlTx.Rollback()
}
tx, ok := db.(BaseTx)
if !ok {
return sql.ErrTxDone
}
return tx.Rollback(ctx)
}


@@ -0,0 +1,157 @@
package database
import (
"context"
"errors"
"testing"
)
// Mock implementations for testing
type mockDB struct {
beginError error
txMock *mockTX
}
func (m *mockDB) Begin(ctx context.Context) (*mockTX, error) {
if m.beginError != nil {
return nil, m.beginError
}
return m.txMock, nil
}
type mockTX struct {
commitError error
rollbackError error
commitCalled bool
rollbackCalled bool
}
func (m *mockTX) Commit(ctx context.Context) error {
m.commitCalled = true
return m.commitError
}
func (m *mockTX) Rollback(ctx context.Context) error {
m.rollbackCalled = true
return m.rollbackError
}
func TestWithTransaction_Success(t *testing.T) {
tx := &mockTX{}
db := &mockDB{txMock: tx}
var functionCalled bool
err := WithTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
functionCalled = true
if q != tx {
t.Error("Expected transaction to be passed to function")
}
return nil
})
if err != nil {
t.Errorf("Expected no error, got %v", err)
}
if !functionCalled {
t.Error("Expected function to be called")
}
if !tx.commitCalled {
t.Error("Expected commit to be called")
}
if tx.rollbackCalled {
t.Error("Expected rollback NOT to be called on success")
}
}
func TestWithTransaction_FunctionError(t *testing.T) {
tx := &mockTX{}
db := &mockDB{txMock: tx}
expectedError := errors.New("function error")
err := WithTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
return expectedError
})
if err != expectedError {
t.Errorf("Expected error %v, got %v", expectedError, err)
}
if tx.commitCalled {
t.Error("Expected commit NOT to be called on function error")
}
if !tx.rollbackCalled {
t.Error("Expected rollback to be called on function error")
}
}
func TestWithTransaction_BeginError(t *testing.T) {
expectedError := errors.New("begin error")
db := &mockDB{beginError: expectedError}
err := WithTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
t.Error("Function should not be called when Begin fails")
return nil
})
if err == nil || !errors.Is(err, expectedError) {
t.Errorf("Expected wrapped begin error, got %v", err)
}
}
func TestWithTransaction_CommitError(t *testing.T) {
commitError := errors.New("commit error")
tx := &mockTX{commitError: commitError}
db := &mockDB{txMock: tx}
err := WithTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
return nil
})
if err == nil || !errors.Is(err, commitError) {
t.Errorf("Expected wrapped commit error, got %v", err)
}
if !tx.commitCalled {
t.Error("Expected commit to be called")
}
if tx.rollbackCalled {
t.Error("Expected rollback NOT to be called when commit fails")
}
}
func TestWithReadOnlyTransaction_Success(t *testing.T) {
tx := &mockTX{}
db := &mockDB{txMock: tx}
var functionCalled bool
err := WithReadOnlyTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
functionCalled = true
return nil
})
if err != nil {
t.Errorf("Expected no error, got %v", err)
}
if !functionCalled {
t.Error("Expected function to be called")
}
if tx.commitCalled {
t.Error("Expected commit NOT to be called in read-only transaction")
}
if !tx.rollbackCalled {
t.Error("Expected rollback to be called in read-only transaction")
}
}
func TestWithReadOnlyTransaction_FunctionError(t *testing.T) {
tx := &mockTX{}
db := &mockDB{txMock: tx}
expectedError := errors.New("function error")
err := WithReadOnlyTransaction(context.Background(), db, func(ctx context.Context, q *mockTX) error {
return expectedError
})
if err != expectedError {
t.Errorf("Expected error %v, got %v", expectedError, err)
}
if !tx.rollbackCalled {
t.Error("Expected rollback to be called")
}
}

17
go.mod

@@ -4,10 +4,12 @@ go 1.23.5
require (
github.com/abh/certman v0.4.0
github.com/go-sql-driver/mysql v1.9.3
github.com/labstack/echo-contrib v0.17.2
github.com/labstack/echo/v4 v4.13.3
github.com/oklog/ulid/v2 v2.1.0
github.com/prometheus/client_golang v1.20.5
github.com/prometheus/client_model v0.6.1
github.com/remychantenay/slog-otel v1.3.2
github.com/samber/slog-echo v1.14.8
github.com/samber/slog-multi v1.2.4
@@ -17,20 +19,28 @@ require (
go.opentelemetry.io/contrib/exporters/autoexport v0.58.0
go.opentelemetry.io/contrib/instrumentation/github.com/labstack/echo/otelecho v0.58.0
go.opentelemetry.io/otel v1.33.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.9.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.9.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.33.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.33.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.33.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.33.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.33.0
go.opentelemetry.io/otel/log v0.9.0
go.opentelemetry.io/otel/metric v1.33.0
go.opentelemetry.io/otel/sdk v1.33.0
go.opentelemetry.io/otel/sdk/log v0.9.0
go.opentelemetry.io/otel/sdk/metric v1.33.0
go.opentelemetry.io/otel/trace v1.33.0
golang.org/x/mod v0.22.0
golang.org/x/net v0.33.0
golang.org/x/sync v0.10.0
google.golang.org/grpc v1.69.2
gopkg.in/yaml.v3 v3.0.1
)
require (
filippo.io/edwards25519 v1.1.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
@@ -47,7 +57,6 @@ require (
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/pierrec/lz4/v4 v4.1.22 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.61.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/samber/lo v1.47.0 // indirect
@@ -56,16 +65,10 @@ require (
github.com/valyala/fasttemplate v1.2.2 // indirect
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/contrib/bridges/prometheus v0.58.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.9.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.9.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.33.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.33.0 // indirect
go.opentelemetry.io/otel/exporters/prometheus v0.55.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdoutlog v0.9.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.33.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.33.0 // indirect
go.opentelemetry.io/otel/metric v1.33.0 // indirect
go.opentelemetry.io/otel/sdk/metric v1.33.0 // indirect
go.opentelemetry.io/proto/otlp v1.4.0 // indirect
golang.org/x/crypto v0.31.0 // indirect
golang.org/x/sys v0.28.0 // indirect

go.sum

@@ -1,3 +1,5 @@
filippo.io/edwards25519 v1.1.0 h1:FNf4tywRC1HmFuKW5xopWpigGjJKiJSV0Cqo0cJWDaA=
filippo.io/edwards25519 v1.1.0/go.mod h1:BxyFTGdWcka3PhytdK4V28tE5sGfRvvvRV7EaN4VDT4=
github.com/abh/certman v0.4.0 h1:XHoDtb0YyRQPclaHMrBDlKTVZpNjTK6vhB0S3Bd/Sbs=
github.com/abh/certman v0.4.0/go.mod h1:x8QhpKVZifmV1Hdiwdg9gLo2GMPAxezz1s3zrVnPs+I=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
@@ -17,6 +19,8 @@ github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-sql-driver/mysql v1.9.3 h1:U/N249h2WzJ3Ukj8SowVFjdtZKfu9vlLZxjPXV1aweo=
github.com/go-sql-driver/mysql v1.9.3/go.mod h1:qn46aNg1333BRMNU69Lq93t8du/dwxI64Gl8i5p1WMU=
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
@@ -30,6 +34,10 @@ github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLf
github.com/klauspost/compress v1.15.9/go.mod h1:PhcZ0MbTNciWF3rruxRgKxI5NkcHHrHUDtV4Yw2GlzU=
github.com/klauspost/compress v1.17.11 h1:In6xLpyWOi1+C7tXUUWv2ot1QvBjxevKAaI6IXrJmUc=
github.com/klauspost/compress v1.17.11/go.mod h1:pMDklpSncoRMuLFrf1W9Ss9KT+0rH90U12bZKk7uwG0=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0SNc=
github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
github.com/labstack/echo-contrib v0.17.2 h1:K1zivqmtcC70X9VdBFdLomjPDEVHlrcAObqmuFj1c6w=
@@ -65,6 +73,8 @@ github.com/prometheus/procfs v0.15.1 h1:YagwOFzUgYfKKHX6Dr+sHT7km/hxC76UB0leargg
github.com/prometheus/procfs v0.15.1/go.mod h1:fB45yRUv8NstnjriLhBQLuOUt+WW4BsoGhij/e3PBqk=
github.com/remychantenay/slog-otel v1.3.2 h1:ZBx8qnwfLJ6e18Vba4e9Xp9B7khTmpIwFsU1sAmActw=
github.com/remychantenay/slog-otel v1.3.2/go.mod h1:gKW4tQ8cGOKoA+bi7wtYba/tcJ6Tc9XyQ/EW8gHA/2E=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/samber/lo v1.47.0 h1:z7RynLwP5nbyRscyvcD043DWYoOcYRv3mV8lBeqOCLc=
github.com/samber/lo v1.47.0/go.mod h1:RmDH9Ct32Qy3gduHQuKJ3gW1fMHAnE/fAzQuf6He5cU=
@@ -211,6 +221,8 @@ google.golang.org/grpc v1.69.2/go.mod h1:vyjdE6jLBI76dgpDojsFGNaHlxdjXN9ghpnd2o7
google.golang.org/protobuf v1.36.1 h1:yBPeRvTftaleIgM3PZ/WBIZ7XM/eEYAaEyCwvyjq/gk=
google.golang.org/protobuf v1.36.1/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -1,13 +1,71 @@
// Package health provides a standalone HTTP server for health checks.
//
// This package implements a simple health check server that can be used
// to expose health status endpoints for monitoring and load balancing.
// It supports custom health check handlers and provides structured logging
// with graceful shutdown capabilities.
// This package implements a flexible health check server that supports
// different handlers for Kubernetes probe types (liveness, readiness, startup).
// It provides structured logging, graceful shutdown, and standard HTTP endpoints
// for monitoring and load balancing.
//
// # Kubernetes Probe Types
//
// Liveness Probe: Detects when a container is "dead" and needs restarting.
// Should be a lightweight check that verifies the process is still running
// and not in an unrecoverable state.
//
// Readiness Probe: Determines when a container is ready to accept traffic.
// Controls which Pods are used as backends for Services. Should verify
// the application can handle requests properly.
//
// Startup Probe: Verifies when a container application has successfully started.
// Delays liveness and readiness probes until startup succeeds. Useful for
// slow-starting applications.
//
// # Usage Examples
//
// Basic usage with a single handler for all probes:
//
// srv := health.NewServer(myHealthHandler)
// srv.Listen(ctx, 9091)
//
// Advanced usage with separate handlers for each probe type:
//
// srv := health.NewServer(nil,
// health.WithLivenessHandler(func(w http.ResponseWriter, r *http.Request) {
// // Simple alive check
// w.WriteHeader(http.StatusOK)
// }),
// health.WithReadinessHandler(func(w http.ResponseWriter, r *http.Request) {
// // Check if ready to serve traffic
// if err := checkDatabase(); err != nil {
// w.WriteHeader(http.StatusServiceUnavailable)
// return
// }
// w.WriteHeader(http.StatusOK)
// }),
// health.WithStartupHandler(func(w http.ResponseWriter, r *http.Request) {
// // Check if startup is complete
// if !applicationReady() {
// w.WriteHeader(http.StatusServiceUnavailable)
// return
// }
// w.WriteHeader(http.StatusOK)
// }),
// health.WithServiceName("my-service"),
// )
// srv.Listen(ctx, 9091)
//
// # Standard Endpoints
//
// The server exposes these endpoints:
// - /healthz - liveness probe (or general health if no specific handler)
// - /readyz - readiness probe (or general health if no specific handler)
// - /startupz - startup probe (or general health if no specific handler)
// - /__health - general health endpoint (backward compatibility)
// - / - general health endpoint (root path)
package health
import (
"context"
"encoding/json"
"log/slog"
"net/http"
"strconv"
@@ -21,23 +79,74 @@ import (
// It runs separately from the main application server to ensure health
// checks remain available even if the main server is experiencing issues.
//
// The server includes built-in timeouts, graceful shutdown, and structured
// logging for monitoring and debugging health check behavior.
// The server supports separate handlers for different Kubernetes probe types
// (liveness, readiness, startup) and includes built-in timeouts, graceful
// shutdown, and structured logging.
type Server struct {
log *slog.Logger
healthFn http.HandlerFunc
log *slog.Logger
livenessHandler http.HandlerFunc
readinessHandler http.HandlerFunc
startupHandler http.HandlerFunc
generalHandler http.HandlerFunc // fallback for /__health and / paths
serviceName string
}
// NewServer creates a new health check server with the specified health handler.
// If healthFn is nil, a default handler that returns HTTP 200 "ok" is used.
func NewServer(healthFn http.HandlerFunc) *Server {
// Option represents a configuration option for the health server.
type Option func(*Server)
// WithLivenessHandler sets a specific handler for the /healthz endpoint.
// Liveness probes determine if a container should be restarted.
func WithLivenessHandler(handler http.HandlerFunc) Option {
return func(s *Server) {
s.livenessHandler = handler
}
}
// WithReadinessHandler sets a specific handler for the /readyz endpoint.
// Readiness probes determine if a container can receive traffic.
func WithReadinessHandler(handler http.HandlerFunc) Option {
return func(s *Server) {
s.readinessHandler = handler
}
}
// WithStartupHandler sets a specific handler for the /startupz endpoint.
// Startup probes determine if a container has finished initializing.
func WithStartupHandler(handler http.HandlerFunc) Option {
return func(s *Server) {
s.startupHandler = handler
}
}
// WithServiceName sets the service name for JSON responses and logging.
func WithServiceName(serviceName string) Option {
return func(s *Server) {
s.serviceName = serviceName
}
}
// NewServer creates a new health check server with optional probe-specific handlers.
//
// If healthFn is provided, it will be used as a fallback for any probe endpoints
// that don't have specific handlers configured. If healthFn is nil, a default
// handler that returns HTTP 200 "ok" is used as the fallback.
//
// Use the With* option functions to configure specific handlers for different
// probe types (liveness, readiness, startup).
func NewServer(healthFn http.HandlerFunc, opts ...Option) *Server {
if healthFn == nil {
healthFn = basicHealth
}
srv := &Server{
log: logger.Setup(),
healthFn: healthFn,
log: logger.Setup(),
generalHandler: healthFn,
}
for _, opt := range opts {
opt(srv)
}
return srv
}
@@ -47,13 +156,27 @@ func (srv *Server) SetLogger(log *slog.Logger) {
}
// Listen starts the health server on the specified port and blocks until ctx is cancelled.
// The server exposes the health handler at "/__health" with graceful shutdown support.
// The server exposes health check endpoints with graceful shutdown support.
//
// Standard endpoints exposed:
// - /healthz - liveness probe (uses livenessHandler or falls back to generalHandler)
// - /readyz - readiness probe (uses readinessHandler or falls back to generalHandler)
// - /startupz - startup probe (uses startupHandler or falls back to generalHandler)
// - /__health - general health endpoint (uses generalHandler)
// - / - root health endpoint (uses generalHandler)
func (srv *Server) Listen(ctx context.Context, port int) error {
srv.log.Info("starting health listener", "port", port)
serveMux := http.NewServeMux()
serveMux.HandleFunc("/__health", srv.healthFn)
// Register probe-specific handlers
serveMux.HandleFunc("/healthz", srv.createProbeHandler("liveness"))
serveMux.HandleFunc("/readyz", srv.createProbeHandler("readiness"))
serveMux.HandleFunc("/startupz", srv.createProbeHandler("startup"))
// Register general health endpoints for backward compatibility
serveMux.HandleFunc("/__health", srv.createGeneralHandler())
serveMux.HandleFunc("/", srv.createGeneralHandler())
hsrv := &http.Server{
Addr: ":" + strconv.Itoa(port),
@@ -89,6 +212,121 @@ func (srv *Server) Listen(ctx context.Context, port int) error {
return g.Wait()
}
// createProbeHandler creates a handler for a specific probe type that provides
// appropriate JSON responses and falls back to the general handler if no specific
// handler is configured.
func (srv *Server) createProbeHandler(probeType string) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
var handler http.HandlerFunc
// Select the appropriate handler
switch probeType {
case "liveness":
handler = srv.livenessHandler
case "readiness":
handler = srv.readinessHandler
case "startup":
handler = srv.startupHandler
}
// Fall back to general handler if no specific handler is configured
if handler == nil {
handler = srv.generalHandler
}
// Create a response recorder to capture the handler's status code.
// The recorder defers sending the header so Content-Type can still be set below.
recorder := &statusRecorder{ResponseWriter: w, statusCode: http.StatusOK}
handler(recorder, r)
// If the handler already wrote a response, we're done
if recorder.written {
return
}
// Otherwise, provide a standard JSON response based on the status code.
// Headers must be set before WriteHeader; net/http ignores later changes.
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(recorder.statusCode)
if recorder.statusCode >= 400 {
// Handler indicated unhealthy
switch probeType {
case "liveness":
json.NewEncoder(w).Encode(map[string]string{"status": "unhealthy"})
case "readiness":
json.NewEncoder(w).Encode(map[string]bool{"ready": false})
case "startup":
json.NewEncoder(w).Encode(map[string]bool{"started": false})
}
} else {
// Handler indicated healthy
switch probeType {
case "liveness":
json.NewEncoder(w).Encode(map[string]string{"status": "alive"})
case "readiness":
json.NewEncoder(w).Encode(map[string]bool{"ready": true})
case "startup":
json.NewEncoder(w).Encode(map[string]bool{"started": true})
}
}
}
}
// createGeneralHandler creates a handler for general health endpoints that reports
// the overall status and, if configured, the service name as JSON.
func (srv *Server) createGeneralHandler() http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
// Create a response recorder to capture the handler's status code.
// It wraps a discardWriter so the handler's own body never reaches the actual response.
recorder := &statusRecorder{ResponseWriter: &discardWriter{}, statusCode: http.StatusOK}
srv.generalHandler(recorder, r)
// Always provide a JSON response for general endpoints
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(recorder.statusCode)
status := "healthy"
if recorder.statusCode >= 400 {
status = "unhealthy"
}
response := map[string]interface{}{
"status": status,
}
if srv.serviceName != "" {
response["service"] = srv.serviceName
}
json.NewEncoder(w).Encode(response)
}
}
// statusRecorder captures the response status code from handlers while allowing
// them to write their own response content if needed.
type statusRecorder struct {
http.ResponseWriter
statusCode int
written bool
}
// WriteHeader records the status code without sending it. The header is sent
// either on the handler's first Write or by the caller once the handler returns.
func (r *statusRecorder) WriteHeader(code int) {
if !r.written {
r.statusCode = code
}
}
// Write sends the recorded status code before the first body write and then
// passes the data through to the underlying ResponseWriter.
func (r *statusRecorder) Write(data []byte) (int, error) {
if !r.written {
r.written = true
r.ResponseWriter.WriteHeader(r.statusCode)
}
return r.ResponseWriter.Write(data)
}
// discardWriter implements http.ResponseWriter but discards all writes.
// Used to capture status codes without writing response content.
type discardWriter struct{}
func (d *discardWriter) Header() http.Header {
return make(http.Header)
}
func (d *discardWriter) Write([]byte) (int, error) {
return 0, nil
}
func (d *discardWriter) WriteHeader(int) {}
// HealthCheckListener runs a simple HTTP server on the specified port for health check probes.
func HealthCheckListener(ctx context.Context, port int, log *slog.Logger) error {
srv := NewServer(nil)


@@ -1,13 +1,14 @@
package health
import (
"fmt"
"io"
"net/http"
"net/http/httptest"
"testing"
)
func TestHealthHandler(t *testing.T) {
func TestBasicHealthHandler(t *testing.T) {
req := httptest.NewRequest(http.MethodGet, "/__health", nil)
w := httptest.NewRecorder()
@@ -24,3 +25,129 @@ func TestHealthHandler(t *testing.T) {
t.Errorf("expected ok got %q", string(data))
}
}
func TestProbeHandlers(t *testing.T) {
// Test with separate handlers for each probe type
srv := NewServer(nil,
WithLivenessHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}),
WithReadinessHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}),
WithStartupHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}),
WithServiceName("test-service"),
)
tests := []struct {
handler http.HandlerFunc
expectedStatus int
expectedBody string
}{
{srv.createProbeHandler("liveness"), 200, `{"status":"alive"}`},
{srv.createProbeHandler("readiness"), 200, `{"ready":true}`},
{srv.createProbeHandler("startup"), 200, `{"started":true}`},
{srv.createGeneralHandler(), 200, `{"service":"test-service","status":"healthy"}`},
}
for i, tt := range tests {
t.Run(fmt.Sprintf("test_%d", i), func(t *testing.T) {
req := httptest.NewRequest(http.MethodGet, "/", nil)
w := httptest.NewRecorder()
tt.handler(w, req)
if w.Code != tt.expectedStatus {
t.Errorf("expected status %d, got %d", tt.expectedStatus, w.Code)
}
body := w.Body.String()
if body != tt.expectedBody+"\n" { // json.Encoder adds newline
t.Errorf("expected body %q, got %q", tt.expectedBody, body)
}
})
}
}
func TestProbeHandlerFallback(t *testing.T) {
// Test fallback to general handler when no specific handler is configured
generalHandler := func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}
srv := NewServer(generalHandler, WithServiceName("test-service"))
tests := []struct {
handler http.HandlerFunc
expectedStatus int
expectedBody string
}{
{srv.createProbeHandler("liveness"), 200, `{"status":"alive"}`},
{srv.createProbeHandler("readiness"), 200, `{"ready":true}`},
{srv.createProbeHandler("startup"), 200, `{"started":true}`},
}
for i, tt := range tests {
t.Run(fmt.Sprintf("fallback_%d", i), func(t *testing.T) {
req := httptest.NewRequest(http.MethodGet, "/", nil)
w := httptest.NewRecorder()
tt.handler(w, req)
if w.Code != tt.expectedStatus {
t.Errorf("expected status %d, got %d", tt.expectedStatus, w.Code)
}
body := w.Body.String()
if body != tt.expectedBody+"\n" { // json.Encoder adds newline
t.Errorf("expected body %q, got %q", tt.expectedBody, body)
}
})
}
}
func TestUnhealthyProbeHandlers(t *testing.T) {
// Test with handlers that return unhealthy status
srv := NewServer(nil,
WithLivenessHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
}),
WithReadinessHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
}),
WithStartupHandler(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
}),
WithServiceName("test-service"),
)
tests := []struct {
handler http.HandlerFunc
expectedStatus int
expectedBody string
}{
{srv.createProbeHandler("liveness"), 503, `{"status":"unhealthy"}`},
{srv.createProbeHandler("readiness"), 503, `{"ready":false}`},
{srv.createProbeHandler("startup"), 503, `{"started":false}`},
}
for i, tt := range tests {
t.Run(fmt.Sprintf("unhealthy_%d", i), func(t *testing.T) {
req := httptest.NewRequest(http.MethodGet, "/", nil)
w := httptest.NewRecorder()
tt.handler(w, req)
if w.Code != tt.expectedStatus {
t.Errorf("expected status %d, got %d", tt.expectedStatus, w.Code)
}
body := w.Body.String()
if body != tt.expectedBody+"\n" { // json.Encoder adds newline
t.Errorf("expected body %q, got %q", tt.expectedBody, body)
}
})
}
}


@@ -0,0 +1,378 @@
// Package tracerconfig provides a bridge to eliminate circular dependencies between
// the logger and tracing packages. It stores tracer configuration and provides
// factory functions that can be used by the logger package without importing tracing.
package tracerconfig
import (
"context"
"crypto/tls"
"crypto/x509"
"errors"
"fmt"
"net/url"
"os"
"strings"
"sync"
"time"
"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc"
"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
sdklog "go.opentelemetry.io/otel/sdk/log"
sdkmetric "go.opentelemetry.io/otel/sdk/metric"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
"google.golang.org/grpc/credentials"
)
const (
otelExporterOTLPProtoEnvKey = "OTEL_EXPORTER_OTLP_PROTOCOL"
otelExporterOTLPTracesProtoEnvKey = "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL"
otelExporterOTLPLogsProtoEnvKey = "OTEL_EXPORTER_OTLP_LOGS_PROTOCOL"
otelExporterOTLPMetricsProtoEnvKey = "OTEL_EXPORTER_OTLP_METRICS_PROTOCOL"
)
var errInvalidOTLPProtocol = errors.New("invalid OTLP protocol - should be one of ['grpc', 'http/protobuf', 'http/json']")
// newInvalidProtocolError creates a specific error message for invalid protocols
func newInvalidProtocolError(protocol, signalType string) error {
return fmt.Errorf("invalid OTLP protocol '%s' for %s - should be one of ['grpc', 'http/protobuf', 'http/json']", protocol, signalType)
}
// Validate checks the configuration for common errors and inconsistencies
func (c *Config) Validate() error {
var errs []error
// Endpoint and EndpointURL are mutually exclusive
if c.Endpoint != "" && c.EndpointURL != "" {
errs = append(errs, errors.New("cannot specify both Endpoint and EndpointURL - use one or the other"))
}
// Validate EndpointURL format if specified
if c.EndpointURL != "" {
if _, err := url.Parse(c.EndpointURL); err != nil {
errs = append(errs, fmt.Errorf("invalid EndpointURL format: %w", err))
}
}
// Validate Endpoint format if specified
if c.Endpoint != "" {
// Basic validation - should not contain protocol scheme
if strings.Contains(c.Endpoint, "://") {
errs = append(errs, errors.New("Endpoint should not include protocol scheme (use EndpointURL for full URLs)"))
}
// Should not be empty after trimming whitespace
if strings.TrimSpace(c.Endpoint) == "" {
errs = append(errs, errors.New("Endpoint cannot be empty or whitespace"))
}
}
// TLS configuration: a CertificateProvider without custom RootCAs is valid
// (crypto/tls then verifies the server against the host's root CA set),
// so it is intentionally not reported as an error here.
// Validate service name if specified
if c.ServiceName != "" && strings.TrimSpace(c.ServiceName) == "" {
errs = append(errs, errors.New("ServiceName cannot be empty or whitespace"))
}
// Combine all errors
if len(errs) > 0 {
var errMsgs []string
for _, err := range errs {
errMsgs = append(errMsgs, err.Error())
}
return fmt.Errorf("configuration validation failed: %s", strings.Join(errMsgs, "; "))
}
return nil
}
// ValidateAndStore validates the configuration before storing it
func ValidateAndStore(ctx context.Context, cfg *Config, logFactory LogExporterFactory, metricFactory MetricExporterFactory, traceFactory TraceExporterFactory) error {
if cfg != nil {
if err := cfg.Validate(); err != nil {
return err
}
}
Store(ctx, cfg, logFactory, metricFactory, traceFactory)
return nil
}
// GetClientCertificate defines a function type for providing client certificates for mutual TLS.
// This is used when exporting telemetry data to secured OTLP endpoints that require
// client certificate authentication.
type GetClientCertificate func(*tls.CertificateRequestInfo) (*tls.Certificate, error)
// Config provides configuration options for OpenTelemetry tracing setup.
// It supplements standard OpenTelemetry environment variables with additional
// NTP Pool-specific configuration including TLS settings for secure OTLP export.
type Config struct {
ServiceName string // Service name for resource identification (overrides OTEL_SERVICE_NAME)
Environment string // Deployment environment (development, staging, production)
Endpoint string // OTLP endpoint hostname/port (e.g., "otlp.example.com:4317")
EndpointURL string // Complete OTLP endpoint URL (e.g., "https://otlp.example.com:4317/v1/traces")
CertificateProvider GetClientCertificate // Client certificate provider for mutual TLS
RootCAs *x509.CertPool // CA certificate pool for server verification
}
// LogExporterFactory creates an OTLP log exporter using the provided configuration.
// This allows the logger package to create exporters without importing the tracing package.
type LogExporterFactory func(context.Context, *Config) (sdklog.Exporter, error)
// MetricExporterFactory creates an OTLP metric exporter using the provided configuration.
// This allows the metrics package to create exporters without importing the tracing package.
type MetricExporterFactory func(context.Context, *Config) (sdkmetric.Exporter, error)
// TraceExporterFactory creates an OTLP trace exporter using the provided configuration.
// This allows for consistent trace exporter creation across packages.
type TraceExporterFactory func(context.Context, *Config) (sdktrace.SpanExporter, error)
// Global state for sharing configuration between packages
var (
globalConfig *Config
globalContext context.Context
logExporterFactory LogExporterFactory
metricExporterFactory MetricExporterFactory
traceExporterFactory TraceExporterFactory
configMu sync.RWMutex
)
// Store saves the tracer configuration and exporter factories for use by other packages.
// This should be called by the tracing package during initialization.
func Store(ctx context.Context, cfg *Config, logFactory LogExporterFactory, metricFactory MetricExporterFactory, traceFactory TraceExporterFactory) {
configMu.Lock()
defer configMu.Unlock()
globalConfig = cfg
globalContext = ctx
logExporterFactory = logFactory
metricExporterFactory = metricFactory
traceExporterFactory = traceFactory
}
// GetLogExporter returns the stored configuration, context, and log exporter factory.
// Returns nil values if no configuration has been stored yet.
func GetLogExporter() (*Config, context.Context, LogExporterFactory) {
configMu.RLock()
defer configMu.RUnlock()
return globalConfig, globalContext, logExporterFactory
}
// GetMetricExporter returns the stored configuration, context, and metric exporter factory.
// Returns nil values if no configuration has been stored yet.
func GetMetricExporter() (*Config, context.Context, MetricExporterFactory) {
configMu.RLock()
defer configMu.RUnlock()
return globalConfig, globalContext, metricExporterFactory
}
// GetTraceExporter returns the stored configuration, context, and trace exporter factory.
// Returns nil values if no configuration has been stored yet.
func GetTraceExporter() (*Config, context.Context, TraceExporterFactory) {
configMu.RLock()
defer configMu.RUnlock()
return globalConfig, globalContext, traceExporterFactory
}
// Get returns the stored tracer configuration, context, and log exporter factory.
// This maintains backward compatibility for the logger package.
// Returns nil values if no configuration has been stored yet.
func Get() (*Config, context.Context, LogExporterFactory) {
return GetLogExporter()
}
// IsConfigured returns true if tracer configuration has been stored.
func IsConfigured() bool {
configMu.RLock()
defer configMu.RUnlock()
return globalConfig != nil && globalContext != nil
}
// Clear removes the stored configuration. This is primarily useful for testing.
func Clear() {
configMu.Lock()
defer configMu.Unlock()
globalConfig = nil
globalContext = nil
logExporterFactory = nil
metricExporterFactory = nil
traceExporterFactory = nil
}
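// getTLSConfig creates a TLS configuration from the provided Config.
// It returns nil when neither a client certificate provider nor custom root CAs
// are set, so that custom RootCAs are not silently dropped.
func getTLSConfig(cfg *Config) *tls.Config {
if cfg.CertificateProvider == nil && cfg.RootCAs == nil {
return nil
}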
return &tls.Config{
GetClientCertificate: cfg.CertificateProvider,
RootCAs: cfg.RootCAs,
}
}
// getProtocol determines the OTLP protocol to use for the given signal type.
// It follows OpenTelemetry environment variable precedence.
func getProtocol(signalSpecificEnv string) string {
proto := os.Getenv(signalSpecificEnv)
if proto == "" {
proto = os.Getenv(otelExporterOTLPProtoEnvKey)
}
// Fallback to default, http/protobuf.
if proto == "" {
proto = "http/protobuf"
}
return proto
}
// CreateOTLPLogExporter creates an OTLP log exporter using the provided configuration.
func CreateOTLPLogExporter(ctx context.Context, cfg *Config) (sdklog.Exporter, error) {
tlsConfig := getTLSConfig(cfg)
proto := getProtocol(otelExporterOTLPLogsProtoEnvKey)
switch proto {
case "grpc":
opts := []otlploggrpc.Option{
otlploggrpc.WithCompressor("gzip"),
}
if tlsConfig != nil {
opts = append(opts, otlploggrpc.WithTLSCredentials(credentials.NewTLS(tlsConfig)))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlploggrpc.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlploggrpc.WithEndpointURL(cfg.EndpointURL))
}
return otlploggrpc.New(ctx, opts...)
case "http/protobuf", "http/json":
opts := []otlploghttp.Option{
otlploghttp.WithCompression(otlploghttp.GzipCompression),
}
if tlsConfig != nil {
opts = append(opts, otlploghttp.WithTLSClientConfig(tlsConfig))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlploghttp.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlploghttp.WithEndpointURL(cfg.EndpointURL))
}
opts = append(opts, otlploghttp.WithRetry(otlploghttp.RetryConfig{
Enabled: true,
InitialInterval: 3 * time.Second,
MaxInterval: 60 * time.Second,
MaxElapsedTime: 5 * time.Minute,
}))
return otlploghttp.New(ctx, opts...)
default:
return nil, newInvalidProtocolError(proto, "logs")
}
}
// CreateOTLPMetricExporter creates an OTLP metric exporter using the provided configuration.
func CreateOTLPMetricExporter(ctx context.Context, cfg *Config) (sdkmetric.Exporter, error) {
tlsConfig := getTLSConfig(cfg)
proto := getProtocol(otelExporterOTLPMetricsProtoEnvKey)
switch proto {
case "grpc":
opts := []otlpmetricgrpc.Option{
otlpmetricgrpc.WithCompressor("gzip"),
}
if tlsConfig != nil {
opts = append(opts, otlpmetricgrpc.WithTLSCredentials(credentials.NewTLS(tlsConfig)))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlpmetricgrpc.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlpmetricgrpc.WithEndpointURL(cfg.EndpointURL))
}
return otlpmetricgrpc.New(ctx, opts...)
case "http/protobuf", "http/json":
opts := []otlpmetrichttp.Option{
otlpmetrichttp.WithCompression(otlpmetrichttp.GzipCompression),
}
if tlsConfig != nil {
opts = append(opts, otlpmetrichttp.WithTLSClientConfig(tlsConfig))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlpmetrichttp.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlpmetrichttp.WithEndpointURL(cfg.EndpointURL))
}
opts = append(opts, otlpmetrichttp.WithRetry(otlpmetrichttp.RetryConfig{
Enabled: true,
InitialInterval: 3 * time.Second,
MaxInterval: 60 * time.Second,
MaxElapsedTime: 5 * time.Minute,
}))
return otlpmetrichttp.New(ctx, opts...)
default:
return nil, newInvalidProtocolError(proto, "metrics")
}
}
// CreateOTLPTraceExporter creates an OTLP trace exporter using the provided configuration.
func CreateOTLPTraceExporter(ctx context.Context, cfg *Config) (sdktrace.SpanExporter, error) {
tlsConfig := getTLSConfig(cfg)
proto := getProtocol(otelExporterOTLPTracesProtoEnvKey)
var client otlptrace.Client
switch proto {
case "grpc":
opts := []otlptracegrpc.Option{
otlptracegrpc.WithCompressor("gzip"),
}
if tlsConfig != nil {
opts = append(opts, otlptracegrpc.WithTLSCredentials(credentials.NewTLS(tlsConfig)))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlptracegrpc.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlptracegrpc.WithEndpointURL(cfg.EndpointURL))
}
client = otlptracegrpc.NewClient(opts...)
case "http/protobuf", "http/json":
opts := []otlptracehttp.Option{
otlptracehttp.WithCompression(otlptracehttp.GzipCompression),
}
if tlsConfig != nil {
opts = append(opts, otlptracehttp.WithTLSClientConfig(tlsConfig))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlptracehttp.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlptracehttp.WithEndpointURL(cfg.EndpointURL))
}
opts = append(opts, otlptracehttp.WithRetry(otlptracehttp.RetryConfig{
Enabled: true,
InitialInterval: 3 * time.Second,
MaxInterval: 60 * time.Second,
MaxElapsedTime: 5 * time.Minute,
}))
client = otlptracehttp.NewClient(opts...)
default:
return nil, newInvalidProtocolError(proto, "traces")
}
return otlptrace.New(ctx, client)
}
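
Review note: all three constructors above pick the wire protocol through getProtocol, whose body is not part of this diff. Judging only from the cases in TestGetProtocol, a signal-specific variable overrides the general one, and the fallback is "http/protobuf". The sketch below illustrates that precedence. It is not the package's implementation: resolveProtocol and its lookup parameter are hypothetical, and the two variable names are assumed to be the standard OpenTelemetry ones, since the diff shows only the constant identifiers and not their values.

```go
package main

import (
	"fmt"
	"os"
)

// Assumed values: the diff shows the constant names
// (otelExporterOTLPProtoEnvKey and friends) but not their contents.
const (
	generalProtoEnv = "OTEL_EXPORTER_OTLP_PROTOCOL"
	logsProtoEnv    = "OTEL_EXPORTER_OTLP_LOGS_PROTOCOL"
)

// resolveProtocol is a hypothetical stand-in for getProtocol. The lookup
// function is injected so the precedence can be exercised without
// modifying the process environment.
func resolveProtocol(signalEnv string, lookup func(string) string) string {
	if v := lookup(signalEnv); v != "" {
		return v // the signal-specific setting wins
	}
	if v := lookup(generalProtoEnv); v != "" {
		return v // the general setting applies to every signal
	}
	return "http/protobuf" // default expected by TestGetProtocol
}

func main() {
	// Resolve the logs protocol from the real environment.
	fmt.Println(resolveProtocol(logsProtoEnv, os.Getenv))
}
```

Passing the lookup as a parameter is a design choice of this sketch only; it avoids the save-and-restore dance around os.Setenv that the tests need.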


@@ -0,0 +1,474 @@
package tracerconfig
import (
"context"
"crypto/tls"
"crypto/x509"
"os"
"strings"
"sync"
"testing"
"time"
sdklog "go.opentelemetry.io/otel/sdk/log"
sdkmetric "go.opentelemetry.io/otel/sdk/metric"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
)
func TestStore_And_Retrieve(t *testing.T) {
// Clear any existing configuration
Clear()
ctx := context.Background()
config := &Config{
ServiceName: "test-service",
Environment: "test",
Endpoint: "localhost:4317",
}
// Create mock factories
logFactory := func(context.Context, *Config) (sdklog.Exporter, error) { return nil, nil }
metricFactory := func(context.Context, *Config) (sdkmetric.Exporter, error) { return nil, nil }
traceFactory := func(context.Context, *Config) (sdktrace.SpanExporter, error) { return nil, nil }
// Store configuration
Store(ctx, config, logFactory, metricFactory, traceFactory)
// Test IsConfigured
if !IsConfigured() {
t.Error("IsConfigured() should return true after Store()")
}
// Test GetLogExporter
cfg, ctx2, factory := GetLogExporter()
if cfg == nil || ctx2 == nil || factory == nil {
t.Error("GetLogExporter() should return non-nil values")
}
if cfg.ServiceName != "test-service" {
t.Errorf("Expected ServiceName 'test-service', got '%s'", cfg.ServiceName)
}
// Test GetMetricExporter
cfg, ctx3, metricFact := GetMetricExporter()
if cfg == nil || ctx3 == nil || metricFact == nil {
t.Error("GetMetricExporter() should return non-nil values")
}
// Test GetTraceExporter
cfg, ctx4, traceFact := GetTraceExporter()
if cfg == nil || ctx4 == nil || traceFact == nil {
t.Error("GetTraceExporter() should return non-nil values")
}
// Test backward compatibility Get()
cfg, ctx5, logFact := Get()
if cfg == nil || ctx5 == nil || logFact == nil {
t.Error("Get() should return non-nil values for backward compatibility")
}
}
func TestClear(t *testing.T) {
// Store some configuration first
ctx := context.Background()
config := &Config{ServiceName: "test"}
Store(ctx, config, nil, nil, nil)
if !IsConfigured() {
t.Error("Should be configured before Clear()")
}
// Clear configuration
Clear()
if IsConfigured() {
t.Error("Should not be configured after Clear()")
}
// All getters should return nil
cfg, ctx2, factory := GetLogExporter()
if cfg != nil || ctx2 != nil || factory != nil {
t.Error("GetLogExporter() should return nil values after Clear()")
}
}
func TestConcurrentAccess(t *testing.T) {
Clear()
ctx := context.Background()
config := &Config{ServiceName: "concurrent-test"}
var wg sync.WaitGroup
const numGoroutines = 10
// Test concurrent Store and Get operations
wg.Add(numGoroutines * 2)
// Concurrent Store operations
for i := 0; i < numGoroutines; i++ {
go func() {
defer wg.Done()
Store(ctx, config, nil, nil, nil)
}()
}
// Concurrent Get operations
for i := 0; i < numGoroutines; i++ {
go func() {
defer wg.Done()
IsConfigured()
GetLogExporter()
GetMetricExporter()
GetTraceExporter()
}()
}
wg.Wait()
// Should be configured after all operations
if !IsConfigured() {
t.Error("Should be configured after concurrent operations")
}
}
func TestGetTLSConfig(t *testing.T) {
tests := []struct {
name string
config *Config
expected bool // true when getTLSConfig should return nil
}{
{
name: "nil certificate provider",
config: &Config{},
expected: true, // should be nil
},
{
name: "with certificate provider",
config: &Config{
CertificateProvider: func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
return &tls.Certificate{}, nil
},
},
expected: false, // should not be nil
},
{
name: "with certificate provider and RootCAs",
config: &Config{
CertificateProvider: func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
return &tls.Certificate{}, nil
},
RootCAs: x509.NewCertPool(),
},
expected: false, // should not be nil
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tlsConfig := getTLSConfig(tt.config)
if tt.expected && tlsConfig != nil {
t.Errorf("Expected nil TLS config, got %v", tlsConfig)
}
if !tt.expected && tlsConfig == nil {
t.Error("Expected non-nil TLS config, got nil")
}
if !tt.expected && tlsConfig != nil {
if tlsConfig.GetClientCertificate == nil {
t.Error("Expected GetClientCertificate to be set")
}
if tt.config.RootCAs != nil && tlsConfig.RootCAs != tt.config.RootCAs {
t.Error("Expected RootCAs to be set correctly")
}
}
})
}
}
func TestGetProtocol(t *testing.T) {
// Save original env vars
originalGeneral := os.Getenv(otelExporterOTLPProtoEnvKey)
originalLogs := os.Getenv(otelExporterOTLPLogsProtoEnvKey)
defer func() {
// Restore original env vars
if originalGeneral != "" {
os.Setenv(otelExporterOTLPProtoEnvKey, originalGeneral)
} else {
os.Unsetenv(otelExporterOTLPProtoEnvKey)
}
if originalLogs != "" {
os.Setenv(otelExporterOTLPLogsProtoEnvKey, originalLogs)
} else {
os.Unsetenv(otelExporterOTLPLogsProtoEnvKey)
}
}()
tests := []struct {
name string
signalSpecific string
generalProto string
specificProto string
expectedResult string
}{
{
name: "no env vars set - default",
signalSpecific: otelExporterOTLPLogsProtoEnvKey,
expectedResult: "http/protobuf",
},
{
name: "general env var set",
signalSpecific: otelExporterOTLPLogsProtoEnvKey,
generalProto: "grpc",
expectedResult: "grpc",
},
{
name: "specific env var overrides general",
signalSpecific: otelExporterOTLPLogsProtoEnvKey,
generalProto: "grpc",
specificProto: "http/protobuf",
expectedResult: "http/protobuf",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Clear env vars
os.Unsetenv(otelExporterOTLPProtoEnvKey)
os.Unsetenv(otelExporterOTLPLogsProtoEnvKey)
// Set test env vars
if tt.generalProto != "" {
os.Setenv(otelExporterOTLPProtoEnvKey, tt.generalProto)
}
if tt.specificProto != "" {
os.Setenv(tt.signalSpecific, tt.specificProto)
}
result := getProtocol(tt.signalSpecific)
if result != tt.expectedResult {
t.Errorf("Expected protocol '%s', got '%s'", tt.expectedResult, result)
}
})
}
}
func TestCreateExporterErrors(t *testing.T) {
ctx := context.Background()
config := &Config{
ServiceName: "test-service",
Endpoint: "invalid-endpoint",
}
// Test with invalid protocol for logs
os.Setenv(otelExporterOTLPLogsProtoEnvKey, "invalid-protocol")
defer os.Unsetenv(otelExporterOTLPLogsProtoEnvKey)
_, err := CreateOTLPLogExporter(ctx, config)
if err == nil {
t.Fatal("Expected error for invalid protocol") // Fatal: err.Error() below would panic on a nil error
}
// Check that the error reports an invalid OTLP protocol
if !strings.Contains(err.Error(), "invalid OTLP protocol") {
t.Errorf("Expected protocol error, got %v", err)
}
// Test with invalid protocol for metrics
os.Setenv(otelExporterOTLPMetricsProtoEnvKey, "invalid-protocol")
defer os.Unsetenv(otelExporterOTLPMetricsProtoEnvKey)
_, err = CreateOTLPMetricExporter(ctx, config)
if err == nil {
t.Fatal("Expected error for invalid protocol") // Fatal: err.Error() below would panic on a nil error
}
if !strings.Contains(err.Error(), "invalid OTLP protocol") {
t.Errorf("Expected protocol error, got %v", err)
}
// Test with invalid protocol for traces
os.Setenv(otelExporterOTLPTracesProtoEnvKey, "invalid-protocol")
defer os.Unsetenv(otelExporterOTLPTracesProtoEnvKey)
_, err = CreateOTLPTraceExporter(ctx, config)
if err == nil {
t.Fatal("Expected error for invalid protocol") // Fatal: err.Error() below would panic on a nil error
}
if !strings.Contains(err.Error(), "invalid OTLP protocol") {
t.Errorf("Expected protocol error, got %v", err)
}
}
func TestCreateExporterValidProtocols(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
config := &Config{
ServiceName: "test-service",
Endpoint: "localhost:4317", // This will likely fail to connect, but should create exporter
}
protocols := []string{"grpc", "http/protobuf", "http/json"}
for _, proto := range protocols {
t.Run("logs_"+proto, func(t *testing.T) {
os.Setenv(otelExporterOTLPLogsProtoEnvKey, proto)
defer os.Unsetenv(otelExporterOTLPLogsProtoEnvKey)
exporter, err := CreateOTLPLogExporter(ctx, config)
if err != nil {
// Connection errors are expected since we're not running a real OTLP server
// but the exporter should be created successfully
t.Logf("Connection error expected: %v", err)
}
if exporter != nil {
exporter.Shutdown(ctx)
}
})
t.Run("metrics_"+proto, func(t *testing.T) {
os.Setenv(otelExporterOTLPMetricsProtoEnvKey, proto)
defer os.Unsetenv(otelExporterOTLPMetricsProtoEnvKey)
exporter, err := CreateOTLPMetricExporter(ctx, config)
if err != nil {
t.Logf("Connection error expected: %v", err)
}
if exporter != nil {
exporter.Shutdown(ctx)
}
})
t.Run("traces_"+proto, func(t *testing.T) {
os.Setenv(otelExporterOTLPTracesProtoEnvKey, proto)
defer os.Unsetenv(otelExporterOTLPTracesProtoEnvKey)
exporter, err := CreateOTLPTraceExporter(ctx, config)
if err != nil {
t.Logf("Connection error expected: %v", err)
}
if exporter != nil {
exporter.Shutdown(ctx)
}
})
}
}
func TestConfigValidation(t *testing.T) {
tests := []struct {
name string
config *Config
shouldErr bool
}{
{
name: "valid empty config",
config: &Config{},
shouldErr: false,
},
{
name: "valid config with endpoint",
config: &Config{
ServiceName: "test-service",
Endpoint: "localhost:4317",
},
shouldErr: false,
},
{
name: "valid config with endpoint URL",
config: &Config{
ServiceName: "test-service",
EndpointURL: "https://otlp.example.com:4317/v1/traces",
},
shouldErr: false,
},
{
name: "invalid - both endpoint and endpoint URL",
config: &Config{
ServiceName: "test-service",
Endpoint: "localhost:4317",
EndpointURL: "https://otlp.example.com:4317/v1/traces",
},
shouldErr: true,
},
{
name: "invalid - endpoint with protocol",
config: &Config{
ServiceName: "test-service",
Endpoint: "https://localhost:4317",
},
shouldErr: true,
},
{
name: "invalid - empty endpoint",
config: &Config{
ServiceName: "test-service",
Endpoint: " ",
},
shouldErr: true,
},
{
name: "invalid - malformed endpoint URL",
config: &Config{
ServiceName: "test-service",
EndpointURL: "://invalid-url-missing-scheme",
},
shouldErr: true,
},
{
name: "invalid - empty service name",
config: &Config{
ServiceName: " ",
Endpoint: "localhost:4317",
},
shouldErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := tt.config.Validate()
if tt.shouldErr && err == nil {
t.Error("Expected validation error, got nil")
}
if !tt.shouldErr && err != nil {
t.Errorf("Expected no validation error, got: %v", err)
}
})
}
}
func TestValidateAndStore(t *testing.T) {
Clear()
ctx := context.Background()
// Test with valid config
validConfig := &Config{
ServiceName: "test-service",
Endpoint: "localhost:4317",
}
err := ValidateAndStore(ctx, validConfig, nil, nil, nil)
if err != nil {
t.Errorf("ValidateAndStore with valid config should not error: %v", err)
}
if !IsConfigured() {
t.Error("Should be configured after ValidateAndStore")
}
Clear()
// Test with invalid config
invalidConfig := &Config{
ServiceName: "test-service",
Endpoint: "localhost:4317",
EndpointURL: "https://example.com:4317", // both specified - invalid
}
err = ValidateAndStore(ctx, invalidConfig, nil, nil, nil)
if err == nil {
t.Error("ValidateAndStore with invalid config should return error")
}
if IsConfigured() {
t.Error("Should not be configured after failed ValidateAndStore")
}
}
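
Review note: Config.Validate itself is not shown in this part of the diff, so the rules it enforces can only be inferred from TestConfigValidation. The sketch below is a hypothetical reconstruction that satisfies those eight cases. The config type, the validate method, and the error messages are all assumptions made for illustration; the real method may check more, or check differently.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// config mirrors only the fields that the tests above exercise.
type config struct {
	ServiceName string
	Endpoint    string
	EndpointURL string
}

// validate is a hypothetical reconstruction of Config.Validate, derived
// solely from the cases listed in TestConfigValidation.
func (c *config) validate() error {
	// An unset service name is accepted; a whitespace-only one is not.
	if c.ServiceName != "" && strings.TrimSpace(c.ServiceName) == "" {
		return errors.New("service name is blank")
	}
	if c.Endpoint != "" && c.EndpointURL != "" {
		return errors.New("endpoint and endpoint URL are mutually exclusive")
	}
	if c.Endpoint != "" {
		if strings.TrimSpace(c.Endpoint) == "" {
			return errors.New("endpoint is blank")
		}
		// Endpoint is host:port only; a scheme belongs in EndpointURL.
		if strings.Contains(c.Endpoint, "://") {
			return errors.New("endpoint must not include a scheme")
		}
	}
	if c.EndpointURL != "" {
		if _, err := url.Parse(c.EndpointURL); err != nil {
			return fmt.Errorf("invalid endpoint URL: %w", err)
		}
	}
	return nil
}

func main() {
	c := &config{ServiceName: "test-service", Endpoint: "https://localhost:4317"}
	fmt.Println(c.validate())
}
```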


@@ -0,0 +1,204 @@
package logger
import (
"context"
"errors"
"fmt"
"sync"
"time"
"go.ntppool.org/common/internal/tracerconfig"
otellog "go.opentelemetry.io/otel/sdk/log"
)
// bufferingExporter wraps an OTLP exporter and buffers logs until tracing is configured
type bufferingExporter struct {
mu sync.RWMutex
// Buffered records while waiting for tracing config
buffer [][]otellog.Record
bufferSize int
maxBuffSize int
// Real exporter (created when tracing is configured)
exporter otellog.Exporter
// Result of the most recent initialization attempt (guarded by mu; written by checkReadiness and Shutdown)
initErr error
// Background checker
stopChecker chan struct{}
checkerDone chan struct{}
}
// newBufferingExporter creates a new exporter that buffers logs until tracing is configured
func newBufferingExporter() *bufferingExporter {
e := &bufferingExporter{
maxBuffSize: 1000, // Max number of batches to buffer
stopChecker: make(chan struct{}),
checkerDone: make(chan struct{}),
}
// Start background readiness checker
go e.checkReadiness()
return e
}
// Export implements otellog.Exporter
func (e *bufferingExporter) Export(ctx context.Context, records []otellog.Record) error {
// Check if exporter is ready (initialization handled by checkReadiness goroutine)
e.mu.RLock()
exporter := e.exporter
e.mu.RUnlock()
if exporter != nil {
return exporter.Export(ctx, records)
}
// Not ready yet, buffer the records
return e.bufferRecords(records)
}
// initialize attempts to create the real OTLP exporter using tracing config
func (e *bufferingExporter) initialize() error {
cfg, ctx, factory := tracerconfig.Get()
if cfg == nil || ctx == nil || factory == nil {
return errors.New("tracer not configured yet")
}
// Add timeout for initialization
initCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
exporter, err := factory(initCtx, cfg)
if err != nil {
return fmt.Errorf("failed to create OTLP exporter: %w", err)
}
e.mu.Lock()
e.exporter = exporter
flushErr := e.flushBuffer(initCtx)
e.mu.Unlock()
if flushErr != nil {
// Log but don't fail initialization
Setup().Warn("buffer flush failed during initialization", "error", flushErr)
}
return nil
}
// bufferRecords adds records to the buffer for later processing
func (e *bufferingExporter) bufferRecords(records []otellog.Record) error {
e.mu.Lock()
defer e.mu.Unlock()
// Buffer the batch if there is space; batches beyond maxBuffSize are silently dropped
if e.bufferSize < e.maxBuffSize {
// Clone records to avoid retention issues
cloned := make([]otellog.Record, len(records))
for i, r := range records {
cloned[i] = r.Clone()
}
e.buffer = append(e.buffer, cloned)
e.bufferSize++
}
// Always return success to BatchProcessor
return nil
}
// checkReadiness periodically attempts initialization until successful
func (e *bufferingExporter) checkReadiness() {
defer close(e.checkerDone)
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
for {
select {
case <-ticker.C:
// Check if we already have a working exporter
e.mu.RLock()
hasExporter := e.exporter != nil
e.mu.RUnlock()
if hasExporter {
return // Exporter ready, checker no longer needed
}
// Try to initialize
err := e.initialize()
e.mu.Lock()
e.initErr = err
e.mu.Unlock()
case <-e.stopChecker:
return
}
}
}
// flushBuffer sends all buffered batches through the real exporter
func (e *bufferingExporter) flushBuffer(ctx context.Context) error {
if len(e.buffer) == 0 {
return nil
}
flushCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
var lastErr error
for _, batch := range e.buffer {
if err := e.exporter.Export(flushCtx, batch); err != nil {
lastErr = err
}
}
// Clear buffer after flush attempt
e.buffer = nil
e.bufferSize = 0
return lastErr
}
// ForceFlush implements otellog.Exporter
func (e *bufferingExporter) ForceFlush(ctx context.Context) error {
e.mu.RLock()
defer e.mu.RUnlock()
if e.exporter != nil {
return e.exporter.ForceFlush(ctx)
}
return nil
}
// Shutdown implements otellog.Exporter
func (e *bufferingExporter) Shutdown(ctx context.Context) error {
// Stop the readiness checker from continuing
close(e.stopChecker)
// Wait for readiness checker goroutine to complete
<-e.checkerDone
// Give one final chance for TLS/tracing to become ready for buffer flushing
e.mu.RLock()
hasExporter := e.exporter != nil
e.mu.RUnlock()
if !hasExporter {
err := e.initialize()
e.mu.Lock()
e.initErr = err
e.mu.Unlock()
}
e.mu.Lock()
defer e.mu.Unlock()
if e.exporter != nil {
return e.exporter.Shutdown(ctx)
}
return nil
}
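
Review note: Shutdown above calls close(e.stopChecker) unconditionally, and closing an already closed channel panics in Go, so a second Shutdown call on the same exporter would crash. One possible guard is sketched below. This is a suggestion, not part of the commit: the checker type, newChecker, and the stopOnce field are hypothetical, and the goroutine is reduced to the stop/done handshake only.

```go
package main

import (
	"fmt"
	"sync"
)

// checker is a hypothetical, stripped-down stand-in for the
// stopChecker/checkerDone channel pair in bufferingExporter.
type checker struct {
	stopOnce sync.Once
	stop     chan struct{}
	done     chan struct{}
}

func newChecker() *checker {
	c := &checker{stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(c.done)
		<-c.stop // the real loop would also select on a ticker
	}()
	return c
}

// Shutdown may be called any number of times. Only the first call closes
// the stop channel; every call waits for the goroutine to exit.
func (c *checker) Shutdown() {
	c.stopOnce.Do(func() { close(c.stop) })
	<-c.done
}

func main() {
	c := newChecker()
	c.Shutdown()
	c.Shutdown() // a bare close(c.stop) would panic here
	fmt.Println("second Shutdown returned")
}
```

The sync.Once here is never reset, so it does not reintroduce the reset-related panic described in the commit message for a774f92bf7.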

logger/level_test.go (new file, 235 lines)

@@ -0,0 +1,235 @@
package logger
import (
"context"
"log/slog"
"os"
"testing"
"time"
)
func TestParseLevel(t *testing.T) {
tests := []struct {
name string
input string
expected slog.Level
expectError bool
}{
{"empty string", "", slog.LevelInfo, false},
{"DEBUG upper", "DEBUG", slog.LevelDebug, false},
{"debug lower", "debug", slog.LevelDebug, false},
{"INFO upper", "INFO", slog.LevelInfo, false},
{"info lower", "info", slog.LevelInfo, false},
{"WARN upper", "WARN", slog.LevelWarn, false},
{"warn lower", "warn", slog.LevelWarn, false},
{"ERROR upper", "ERROR", slog.LevelError, false},
{"error lower", "error", slog.LevelError, false},
{"invalid level", "invalid", slog.LevelInfo, true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
level, err := ParseLevel(tt.input)
if tt.expectError {
if err == nil {
t.Errorf("expected error for input %q, got nil", tt.input)
}
} else {
if err != nil {
t.Errorf("unexpected error for input %q: %v", tt.input, err)
}
if level != tt.expected {
t.Errorf("expected level %v for input %q, got %v", tt.expected, tt.input, level)
}
}
})
}
}
func TestSetLevel(t *testing.T) {
// Store original level to restore later
originalLevel := Level.Level()
defer Level.Set(originalLevel)
SetLevel(slog.LevelDebug)
if Level.Level() != slog.LevelDebug {
t.Errorf("expected Level to be Debug, got %v", Level.Level())
}
SetLevel(slog.LevelError)
if Level.Level() != slog.LevelError {
t.Errorf("expected Level to be Error, got %v", Level.Level())
}
}
func TestSetOTLPLevel(t *testing.T) {
// Store original level to restore later
originalLevel := OTLPLevel.Level()
defer OTLPLevel.Set(originalLevel)
SetOTLPLevel(slog.LevelWarn)
if OTLPLevel.Level() != slog.LevelWarn {
t.Errorf("expected OTLPLevel to be Warn, got %v", OTLPLevel.Level())
}
SetOTLPLevel(slog.LevelDebug)
if OTLPLevel.Level() != slog.LevelDebug {
t.Errorf("expected OTLPLevel to be Debug, got %v", OTLPLevel.Level())
}
}
func TestOTLPLevelHandler(t *testing.T) {
// Create a mock handler that counts calls
callCount := 0
mockHandler := &mockHandler{
handleFunc: func(ctx context.Context, r slog.Record) error {
callCount++
return nil
},
}
// Set OTLP level to Warn
originalLevel := OTLPLevel.Level()
defer OTLPLevel.Set(originalLevel)
OTLPLevel.Set(slog.LevelWarn)
// Create OTLP level handler
handler := newOTLPLevelHandler(mockHandler)
ctx := context.Background()
// Test that Debug and Info are filtered out
if handler.Enabled(ctx, slog.LevelDebug) {
t.Error("Debug level should be disabled when OTLP level is Warn")
}
if handler.Enabled(ctx, slog.LevelInfo) {
t.Error("Info level should be disabled when OTLP level is Warn")
}
// Test that Warn and Error are enabled
if !handler.Enabled(ctx, slog.LevelWarn) {
t.Error("Warn level should be enabled when OTLP level is Warn")
}
if !handler.Enabled(ctx, slog.LevelError) {
t.Error("Error level should be enabled when OTLP level is Warn")
}
// Test that Handle respects level filtering
now := time.Now()
debugRecord := slog.NewRecord(now, slog.LevelDebug, "debug message", 0)
warnRecord := slog.NewRecord(now, slog.LevelWarn, "warn message", 0)
handler.Handle(ctx, debugRecord)
if callCount != 0 {
t.Error("Debug record should not be passed to underlying handler")
}
handler.Handle(ctx, warnRecord)
if callCount != 1 {
t.Error("Warn record should be passed to underlying handler")
}
}
func TestEnvironmentVariables(t *testing.T) {
tests := []struct {
name string
envVar string
envValue string
configPrefix string
testFunc func(t *testing.T)
}{
{
name: "LOG_LEVEL sets stderr level",
envVar: "LOG_LEVEL",
envValue: "ERROR",
testFunc: func(t *testing.T) {
// Reset the setup state
resetLoggerSetup()
// Call setupStdErrHandler which should read the env var
handler := setupStdErrHandler()
if handler == nil {
t.Fatal("setupStdErrHandler returned nil")
}
if Level.Level() != slog.LevelError {
t.Errorf("expected Level to be Error after setting LOG_LEVEL=ERROR, got %v", Level.Level())
}
},
},
{
name: "Prefixed LOG_LEVEL",
envVar: "TEST_LOG_LEVEL",
envValue: "DEBUG",
configPrefix: "TEST",
testFunc: func(t *testing.T) {
ConfigPrefix = "TEST"
defer func() { ConfigPrefix = "" }()
resetLoggerSetup()
handler := setupStdErrHandler()
if handler == nil {
t.Fatal("setupStdErrHandler returned nil")
}
if Level.Level() != slog.LevelDebug {
t.Errorf("expected Level to be Debug after setting TEST_LOG_LEVEL=DEBUG, got %v", Level.Level())
}
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
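// Store original env state and level
originalEnv, hadEnv := os.LookupEnv(tt.envVar)
originalLevel := Level.Level()
defer func() {
// Restore the variable exactly: unset it again if it was unset before
if hadEnv {
os.Setenv(tt.envVar, originalEnv)
} else {
os.Unsetenv(tt.envVar)
}
Level.Set(originalLevel)
}()
// Set test environment variable
os.Setenv(tt.envVar, tt.envValue)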
// Run the test
tt.testFunc(t)
})
}
}
// mockHandler is a simple mock implementation of slog.Handler for testing
type mockHandler struct {
handleFunc func(ctx context.Context, r slog.Record) error
}
func (m *mockHandler) Enabled(ctx context.Context, level slog.Level) bool {
return true
}
func (m *mockHandler) Handle(ctx context.Context, r slog.Record) error {
if m.handleFunc != nil {
return m.handleFunc(ctx, r)
}
return nil
}
func (m *mockHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
return m
}
func (m *mockHandler) WithGroup(name string) slog.Handler {
return m
}
// resetLoggerSetup clears the cached package-level loggers for testing.
// It does not reset the sync.Once instances.
func resetLoggerSetup() {
// Reset package-level variables
textLogger = nil
otlpLogger = nil
multiLogger = nil
// Note: the sync.Once instances can't easily be reset in tests.
// These tests only cover environment variable parsing, so they
// call setupStdErrHandler directly instead.
}


@@ -16,23 +16,28 @@ type logfmt struct {
 mu sync.Mutex
 }
+// createTextHandlerOptions creates the common slog.HandlerOptions used by all logfmt handlers
+func createTextHandlerOptions() *slog.HandlerOptions {
+return &slog.HandlerOptions{
+ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
+if a.Key == slog.TimeKey && len(groups) == 0 {
+return slog.Attr{}
+}
+if a.Key == slog.LevelKey && len(groups) == 0 {
+return slog.Attr{}
+}
+return a
+},
+}
+}
 func newLogFmtHandler(next slog.Handler) slog.Handler {
 buf := bytes.NewBuffer([]byte{})
 h := &logfmt{
 buf: buf,
 next: next,
-txt: slog.NewTextHandler(buf, &slog.HandlerOptions{
-ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
-if a.Key == slog.TimeKey && len(groups) == 0 {
-return slog.Attr{}
-}
-if a.Key == slog.LevelKey && len(groups) == 0 {
-return slog.Attr{}
-}
-return a
-},
-}),
+txt: slog.NewTextHandler(buf, createTextHandlerOptions()),
 }
 return h
@@ -43,10 +48,11 @@ func (h *logfmt) Enabled(ctx context.Context, lvl slog.Level) bool {
 }
 func (h *logfmt) WithAttrs(attrs []slog.Attr) slog.Handler {
+buf := bytes.NewBuffer([]byte{})
 return &logfmt{
-buf: bytes.NewBuffer([]byte{}),
+buf: buf,
 next: h.next.WithAttrs(slices.Clone(attrs)),
-txt: h.txt.WithAttrs(slices.Clone(attrs)),
+txt: slog.NewTextHandler(buf, createTextHandlerOptions()).WithAttrs(slices.Clone(attrs)),
 }
 }
@@ -54,10 +60,11 @@ func (h *logfmt) WithGroup(g string) slog.Handler {
 if g == "" {
 return h
 }
+buf := bytes.NewBuffer([]byte{})
 return &logfmt{
-buf: bytes.NewBuffer([]byte{}),
+buf: buf,
 next: h.next.WithGroup(g),
-txt: h.txt.WithGroup(g),
+txt: slog.NewTextHandler(buf, createTextHandlerOptions()).WithGroup(g),
 }
 }
@@ -69,10 +76,22 @@ func (h *logfmt) Handle(ctx context.Context, r slog.Record) error {
 panic("buffer wasn't empty")
 }
-h.txt.Handle(ctx, r)
-r.Message = h.buf.String()
-r.Message = strings.TrimSuffix(r.Message, "\n")
+// Format using text handler to get the formatted message
+err := h.txt.Handle(ctx, r)
+if err != nil {
+return err
+}
+formattedMessage := h.buf.String()
+formattedMessage = strings.TrimSuffix(formattedMessage, "\n")
 h.buf.Reset()
-return h.next.Handle(ctx, r)
+// Create a new record with the formatted message
+newRecord := slog.NewRecord(r.Time, r.Level, formattedMessage, r.PC)
+r.Attrs(func(a slog.Attr) bool {
+newRecord.AddAttrs(a)
+return true
+})
+return h.next.Handle(ctx, newRecord)
}


@@ -18,21 +18,27 @@
 // - Context propagation for request-scoped logging
 //
 // Environment variables:
-// - DEBUG: Enable debug level logging (configurable prefix via ConfigPrefix)
+// - LOG_LEVEL: Set stderr log level (DEBUG, INFO, WARN, ERROR) (configurable prefix via ConfigPrefix)
+// - OTLP_LOG_LEVEL: Set OTLP log level independently (configurable prefix via ConfigPrefix)
+// - DEBUG: Enable debug level logging for backward compatibility (configurable prefix via ConfigPrefix)
 // - INVOCATION_ID: Systemd detection for timestamp handling
 package logger
 import (
 "context"
+"fmt"
 "log"
 "log/slog"
 "os"
 "strconv"
 "sync"
+"time"
 slogtraceid "github.com/remychantenay/slog-otel"
 slogmulti "github.com/samber/slog-multi"
 "go.opentelemetry.io/contrib/bridges/otelslog"
+"go.opentelemetry.io/otel/log/global"
+otellog "go.opentelemetry.io/otel/sdk/log"
 )
 // ConfigPrefix allows customizing the environment variable prefix for configuration.
@@ -40,6 +46,16 @@ import (
 // This enables multiple services to have independent logging configuration.
 var ConfigPrefix = ""
+var (
+// Level controls the log level for the default stderr logger.
+// Can be changed at runtime to adjust logging verbosity.
+Level = new(slog.LevelVar) // Info by default
+// OTLPLevel controls the log level for OTLP output.
+// Can be changed independently from the stderr logger level.
+OTLPLevel = new(slog.LevelVar) // Info by default
+)
 var (
 textLogger *slog.Logger
 otlpLogger *slog.Logger
@@ -53,21 +69,64 @@ var (
 mu sync.Mutex
 )
-func setupStdErrHandler() slog.Handler {
-programLevel := new(slog.LevelVar) // Info by default
+// SetLevel sets the log level for the default stderr logger.
+// This affects the primary application logger returned by Setup().
+func SetLevel(level slog.Level) {
+Level.Set(level)
+}
-envVar := "DEBUG"
+// SetOTLPLevel sets the log level for OTLP output.
+// This affects the logger returned by SetupOLTP() and the OTLP portion of SetupMultiLogger().
+func SetOTLPLevel(level slog.Level) {
+OTLPLevel.Set(level)
+}
+// ParseLevel converts a string log level to slog.Level.
+// Supported levels: "DEBUG", "INFO", "WARN", "ERROR", written in all upper or all lower case.
+// An empty string yields Info. Returns an error for unrecognized level strings.
+func ParseLevel(level string) (slog.Level, error) {
+switch {
+case level == "":
+return slog.LevelInfo, nil
+case level == "DEBUG" || level == "debug":
+return slog.LevelDebug, nil
+case level == "INFO" || level == "info":
+return slog.LevelInfo, nil
+case level == "WARN" || level == "warn":
+return slog.LevelWarn, nil
+case level == "ERROR" || level == "error":
+return slog.LevelError, nil
+default:
+return slog.LevelInfo, fmt.Errorf("unknown log level: %s", level)
+}
+}
+func setupStdErrHandler() slog.Handler {
+// Parse LOG_LEVEL environment variable
+logLevelVar := "LOG_LEVEL"
 if len(ConfigPrefix) > 0 {
-envVar = ConfigPrefix + "_" + envVar
+logLevelVar = ConfigPrefix + "_" + logLevelVar
 }
-if opt := os.Getenv(envVar); len(opt) > 0 {
-if debug, _ := strconv.ParseBool(opt); debug {
-programLevel.Set(slog.LevelDebug)
+if levelStr := os.Getenv(logLevelVar); levelStr != "" {
+if level, err := ParseLevel(levelStr); err == nil {
+Level.Set(level)
 }
 }
-logOptions := &slog.HandlerOptions{Level: programLevel}
+// Maintain backward compatibility with DEBUG environment variable
+debugVar := "DEBUG"
+if len(ConfigPrefix) > 0 {
+debugVar = ConfigPrefix + "_" + debugVar
+}
+if opt := os.Getenv(debugVar); len(opt) > 0 {
+if debug, _ := strconv.ParseBool(opt); debug {
+Level.Set(slog.LevelDebug)
+}
+}
+logOptions := &slog.HandlerOptions{Level: Level}
 if len(os.Getenv("INVOCATION_ID")) > 0 {
 // don't add timestamps when running under systemd
@@ -85,9 +144,41 @@ func setupStdErrHandler() slog.Handler {
 func setupOtlpLogger() *slog.Logger {
 setupOtlp.Do(func() {
-otlpLogger = slog.New(
-newLogFmtHandler(otelslog.NewHandler("common")),
+// Parse OTLP_LOG_LEVEL environment variable
+otlpLevelVar := "OTLP_LOG_LEVEL"
+if len(ConfigPrefix) > 0 {
+otlpLevelVar = ConfigPrefix + "_" + otlpLevelVar
+}
+if levelStr := os.Getenv(otlpLevelVar); levelStr != "" {
+if level, err := ParseLevel(levelStr); err == nil {
+OTLPLevel.Set(level)
+}
+}
+// Create our buffering exporter
+// It will buffer until tracing is configured
+bufferingExp := newBufferingExporter()
+// Use BatchProcessor with our custom exporter
+processor := otellog.NewBatchProcessor(bufferingExp,
+otellog.WithExportInterval(10*time.Second),
+otellog.WithMaxQueueSize(2048),
+otellog.WithExportMaxBatchSize(512),
 )
+// Create logger provider
+provider := otellog.NewLoggerProvider(
+otellog.WithProcessor(processor),
+)
+// Set global provider
+global.SetLoggerProvider(provider)
+// Create slog handler with level control
+baseHandler := newLogFmtHandler(otelslog.NewHandler("common"))
+handler := newOTLPLevelHandler(baseHandler)
+otlpLogger = slog.New(handler)
 })
 return otlpLogger
 }

logger/otlp_handler.go (new file, 48 lines)

@@ -0,0 +1,48 @@
package logger
import (
"context"
"log/slog"
)
// otlpLevelHandler is a wrapper that enforces level checking for OTLP handlers.
// This allows independent level control for OTLP output separate from stderr logging.
type otlpLevelHandler struct {
next slog.Handler
}
// newOTLPLevelHandler creates a new OTLP level wrapper handler.
func newOTLPLevelHandler(next slog.Handler) slog.Handler {
return &otlpLevelHandler{
next: next,
}
}
// Enabled checks if the log level should be processed by the OTLP handler.
// It uses the OTLPLevel variable to determine if the record should be processed.
func (h *otlpLevelHandler) Enabled(ctx context.Context, level slog.Level) bool {
return level >= OTLPLevel.Level()
}
// Handle processes the log record if the level is enabled.
// If disabled by level checking, the record is silently dropped.
func (h *otlpLevelHandler) Handle(ctx context.Context, r slog.Record) error {
if !h.Enabled(ctx, r.Level) {
return nil
}
return h.next.Handle(ctx, r)
}
// WithAttrs returns a new handler with the specified attributes added.
func (h *otlpLevelHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
return &otlpLevelHandler{
next: h.next.WithAttrs(attrs),
}
}
// WithGroup returns a new handler with the specified group name.
func (h *otlpLevelHandler) WithGroup(name string) slog.Handler {
return &otlpLevelHandler{
next: h.next.WithGroup(name),
}
}

metrics/metrics.go (new file, 122 lines)

@@ -0,0 +1,122 @@
// Package metrics provides OpenTelemetry-native metrics with OTLP export support.
//
// This package implements a metrics system using the OpenTelemetry metrics data model
// with OTLP export capabilities. It's designed for new applications that want to use
// structured metrics export to observability backends.
//
// Key features:
// - OpenTelemetry native metric types (Counter, Histogram, Gauge, etc.)
// - OTLP export for sending metrics to observability backends
// - Resource detection and correlation with traces/logs
// - Graceful handling when OTLP configuration is not available
//
// Example usage:
//
//	// Initialize metrics along with tracing
//	shutdown, err := tracing.InitTracer(ctx, cfg)
//	if err != nil {
//		log.Fatal(err)
//	}
//	defer shutdown(ctx)
//
//	// Get a meter and create instruments
//	meter := metrics.GetMeter("my-service")
//	counter, _ := meter.Int64Counter("requests_total")
//	counter.Add(ctx, 1, metric.WithAttributes(attribute.String("method", "GET")))
package metrics
import (
"context"
"log/slog"
"sync"
"time"
"go.ntppool.org/common/internal/tracerconfig"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/metric"
sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)
var (
meterProvider metric.MeterProvider
setupOnce sync.Once
setupErr error
)
// Setup initializes the OpenTelemetry metrics provider with OTLP export.
// This function uses the configuration stored by the tracing package and
// creates a metrics provider that exports to the same OTLP endpoint.
//
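// The function is safe to call multiple times - it will only initialize once,
// and later calls return the result of the first call.
// If tracing configuration is not available, or the OTLP exporter cannot be
// created, it falls back to the global meter provider, which is a no-op unless
// one has been configured elsewhere, so metrics are simply not exported.
//
// Both conditions are logged (a warning and an error respectively) and nil is
// returned, so callers should not rely on the returned error to detect them.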
func Setup(ctx context.Context) error {
setupOnce.Do(func() {
setupErr = initializeMetrics(ctx)
})
return setupErr
}
// GetMeter returns a named meter for creating metric instruments.
// The meter uses the configured metrics provider, or the global provider
// if metrics haven't been set up yet.
//
// This is the primary entry point for creating metric instruments in your application.
func GetMeter(name string, opts ...metric.MeterOption) metric.Meter {
if meterProvider == nil {
// Return the global provider as fallback (no-op if not configured)
return otel.GetMeterProvider().Meter(name, opts...)
}
return meterProvider.Meter(name, opts...)
}
// initializeMetrics sets up the OpenTelemetry metrics provider with OTLP export.
func initializeMetrics(ctx context.Context) error {
log := slog.Default()
// Check if tracing configuration is available
cfg, configCtx, factory := tracerconfig.GetMetricExporter()
if cfg == nil || configCtx == nil || factory == nil {
log.Warn("metrics setup: tracing configuration not available, using no-op provider")
// Set the global provider as fallback - metrics just won't be exported
meterProvider = otel.GetMeterProvider()
return nil
}
// Create OTLP metrics exporter
exporter, err := factory(ctx, cfg)
if err != nil {
log.Error("metrics setup: failed to create OTLP exporter", "error", err)
// Fall back to global provider
meterProvider = otel.GetMeterProvider()
return nil
}
// Create metrics provider with the exporter
provider := sdkmetric.NewMeterProvider(
sdkmetric.WithReader(sdkmetric.NewPeriodicReader(
exporter,
sdkmetric.WithInterval(15*time.Second),
)),
)
// Set the global provider
otel.SetMeterProvider(provider)
meterProvider = provider
log.Info("metrics setup: OTLP metrics provider initialized")
return nil
}
// Shutdown gracefully shuts down the metrics provider.
// This should be called during application shutdown to ensure all metrics
// are properly flushed and exported.
func Shutdown(ctx context.Context) error {
if provider, ok := meterProvider.(*sdkmetric.MeterProvider); ok {
return provider.Shutdown(ctx)
}
return nil
}
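
One consequence of guarding Setup with `sync.Once` is that a configuration stored after the first call is never picked up. The following is a minimal sketch of that behaviour using only the standard library; `config`, `current`, `exporting`, and `setup` are hypothetical stand-ins, not the package's real types.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// config is a hypothetical stand-in for the stored tracing configuration.
type config struct{ endpoint string }

var (
	current   *config // nil means "nothing stored yet"
	setupOnce sync.Once
	setupErr  error
	exporting bool // true only if the first setup call found a usable config
)

var errBadConfig = errors.New("empty endpoint")

// setup mirrors the Once-guarded shape of Setup: the closure runs at most
// once, and every call returns the result recorded by that single run.
func setup() error {
	setupOnce.Do(func() {
		if current == nil {
			return // no configuration: fall back quietly
		}
		if current.endpoint == "" {
			setupErr = errBadConfig
			return
		}
		exporting = true
	})
	return setupErr
}

func main() {
	fmt.Println(setup(), exporting) // <nil> false: nothing was stored yet

	// Storing a configuration afterwards has no effect, because the
	// Once has already fired.
	current = &config{endpoint: "localhost:4317"}
	fmt.Println(setup(), exporting) // <nil> false
}
```

In practice this means the configuration has to be stored before the first Setup call in a process, and tests that want a fresh initialization need some way to reset the guard.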

296
metrics/metrics_test.go Normal file

@@ -0,0 +1,296 @@
package metrics
import (
"context"
"os"
"testing"
"time"
"go.ntppool.org/common/internal/tracerconfig"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
sdkmetric "go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/metric/metricdata"
)
func TestSetup_NoConfiguration(t *testing.T) {
// Clear any existing configuration
tracerconfig.Clear()
ctx := context.Background()
err := Setup(ctx)
// Should not return an error even when no configuration is available
if err != nil {
t.Errorf("Setup() returned unexpected error: %v", err)
}
// Should be able to get a meter (even if it's a no-op)
meter := GetMeter("test-meter")
if meter == nil {
t.Error("GetMeter() returned nil")
}
}
func TestGetMeter(t *testing.T) {
// Clear any existing configuration
tracerconfig.Clear()
ctx := context.Background()
_ = Setup(ctx)
meter := GetMeter("test-service")
if meter == nil {
t.Fatal("GetMeter() returned nil")
}
// Test creating a counter instrument
counter, err := meter.Int64Counter("test_counter")
if err != nil {
t.Errorf("Failed to create counter: %v", err)
}
// Test using the counter (should not error even with no-op provider)
counter.Add(ctx, 1, metric.WithAttributes(attribute.String("test", "value")))
}
func TestSetup_MultipleCallsSafe(t *testing.T) {
// Clear any existing configuration
tracerconfig.Clear()
ctx := context.Background()
// Call Setup multiple times
err1 := Setup(ctx)
err2 := Setup(ctx)
err3 := Setup(ctx)
if err1 != nil {
t.Errorf("First Setup() call returned error: %v", err1)
}
if err2 != nil {
t.Errorf("Second Setup() call returned error: %v", err2)
}
if err3 != nil {
t.Errorf("Third Setup() call returned error: %v", err3)
}
// Should still be able to get meters
meter := GetMeter("test-meter")
if meter == nil {
t.Error("GetMeter() returned nil after multiple Setup() calls")
}
}
func TestSetup_WithConfiguration(t *testing.T) {
// Clear any existing configuration
tracerconfig.Clear()
ctx := context.Background()
config := &tracerconfig.Config{
ServiceName: "test-metrics-service",
Environment: "test",
Endpoint: "localhost:4317", // Never dialed in this test: the mock factory below ignores the config
}
// Create a mock exporter factory that returns a working exporter
mockFactory := func(ctx context.Context, cfg *tracerconfig.Config) (sdkmetric.Exporter, error) {
// Create a simple in-memory exporter for testing
return &mockMetricExporter{}, nil
}
// Store configuration with mock factory
tracerconfig.Store(ctx, config, nil, mockFactory, nil)
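// Setup metrics. Setup is guarded by a package-level sync.Once, so if an
// earlier test in this run already called it, this call returns the cached
// result and the mock factory above is never invoked.
err := Setup(ctx)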
if err != nil {
t.Errorf("Setup() returned error: %v", err)
}
// Should be able to get a meter
meter := GetMeter("test-service")
if meter == nil {
t.Fatal("GetMeter() returned nil")
}
// Test creating and using instruments
counter, err := meter.Int64Counter("test_counter")
if err != nil {
t.Errorf("Failed to create counter: %v", err)
}
histogram, err := meter.Float64Histogram("test_histogram")
if err != nil {
t.Errorf("Failed to create histogram: %v", err)
}
gauge, err := meter.Int64UpDownCounter("test_gauge")
if err != nil {
t.Errorf("Failed to create gauge: %v", err)
}
// Use the instruments
counter.Add(ctx, 1, metric.WithAttributes(attribute.String("test", "value")))
histogram.Record(ctx, 1.5, metric.WithAttributes(attribute.String("test", "value")))
gauge.Add(ctx, 10, metric.WithAttributes(attribute.String("test", "value")))
// Test shutdown
err = Shutdown(ctx)
if err != nil {
t.Errorf("Shutdown() returned error: %v", err)
}
}
func TestSetup_WithRealOTLPConfig(t *testing.T) {
// Skip this test in short mode since it may try to make network connections
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
// Clear any existing configuration
tracerconfig.Clear()
// Set environment variables for OTLP configuration
originalEndpoint := os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT")
originalProtocol := os.Getenv("OTEL_EXPORTER_OTLP_PROTOCOL")
defer func() {
if originalEndpoint != "" {
os.Setenv("OTEL_EXPORTER_OTLP_ENDPOINT", originalEndpoint)
} else {
os.Unsetenv("OTEL_EXPORTER_OTLP_ENDPOINT")
}
if originalProtocol != "" {
os.Setenv("OTEL_EXPORTER_OTLP_PROTOCOL", originalProtocol)
} else {
os.Unsetenv("OTEL_EXPORTER_OTLP_PROTOCOL")
}
}()
os.Setenv("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318") // HTTP endpoint
os.Setenv("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf")
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
config := &tracerconfig.Config{
ServiceName: "test-metrics-e2e",
Environment: "test",
Endpoint: "localhost:4318",
}
// Store configuration with real factory
tracerconfig.Store(ctx, config, nil, tracerconfig.CreateOTLPMetricExporter, nil)
// Setup metrics - this may fail if no OTLP collector is running, which is okay
err := Setup(ctx)
if err != nil {
t.Logf("Setup() returned error (expected if no OTLP collector): %v", err)
}
// Should still be able to get a meter
meter := GetMeter("test-service-e2e")
if meter == nil {
t.Fatal("GetMeter() returned nil")
}
// Create and use instruments
counter, err := meter.Int64Counter("e2e_test_counter")
if err != nil {
t.Errorf("Failed to create counter: %v", err)
}
// Add some metrics
for i := 0; i < 5; i++ {
counter.Add(ctx, 1, metric.WithAttributes(
attribute.String("iteration", string(rune('0'+i))),
attribute.String("test_type", "e2e"),
))
}
// Give some time for export (if collector is running)
time.Sleep(100 * time.Millisecond)
// Test shutdown
err = Shutdown(ctx)
if err != nil {
t.Logf("Shutdown() returned error (may be expected): %v", err)
}
}
func TestConcurrentMetricUsage(t *testing.T) {
// Clear any existing configuration
tracerconfig.Clear()
ctx := context.Background()
config := &tracerconfig.Config{
ServiceName: "concurrent-test",
}
// Use mock factory
mockFactory := func(ctx context.Context, cfg *tracerconfig.Config) (sdkmetric.Exporter, error) {
return &mockMetricExporter{}, nil
}
tracerconfig.Store(ctx, config, nil, mockFactory, nil)
if err := Setup(ctx); err != nil {
t.Fatalf("Setup() returned error: %v", err)
}
meter := GetMeter("concurrent-test")
counter, err := meter.Int64Counter("concurrent_counter")
if err != nil {
t.Fatalf("Failed to create counter: %v", err)
}
// Test concurrent metric usage
const numGoroutines = 10
const metricsPerGoroutine = 100
done := make(chan bool, numGoroutines)
for i := 0; i < numGoroutines; i++ {
go func(goroutineID int) {
for j := 0; j < metricsPerGoroutine; j++ {
counter.Add(ctx, 1, metric.WithAttributes(
attribute.Int("goroutine", goroutineID),
attribute.Int("iteration", j),
))
}
done <- true
}(i)
}
// Wait for all goroutines to complete
for i := 0; i < numGoroutines; i++ {
<-done
}
// Shutdown
err = Shutdown(ctx)
if err != nil {
t.Errorf("Shutdown() returned error: %v", err)
}
}
// mockMetricExporter is a simple mock exporter for testing
type mockMetricExporter struct{}
func (m *mockMetricExporter) Export(ctx context.Context, rm *metricdata.ResourceMetrics) error {
// Just pretend to export
return nil
}
func (m *mockMetricExporter) ForceFlush(ctx context.Context) error {
return nil
}
func (m *mockMetricExporter) Shutdown(ctx context.Context) error {
return nil
}
func (m *mockMetricExporter) Temporality(kind sdkmetric.InstrumentKind) metricdata.Temporality {
return metricdata.CumulativeTemporality
}
func (m *mockMetricExporter) Aggregation(kind sdkmetric.InstrumentKind) sdkmetric.Aggregation {
return sdkmetric.DefaultAggregationSelector(kind)
}

View File

@@ -15,11 +15,11 @@ import (
func TestNew(t *testing.T) {
metrics := New()
if metrics == nil {
t.Fatal("New returned nil")
}
if metrics.r == nil {
t.Error("metrics registry is nil")
}
@@ -28,32 +28,32 @@ func TestNew(t *testing.T) {
func TestRegistry(t *testing.T) {
metrics := New()
registry := metrics.Registry()
if registry == nil {
t.Fatal("Registry() returned nil")
}
if registry != metrics.r {
t.Error("Registry() did not return the metrics registry")
}
// Test that we can register a metric
counter := prometheus.NewCounter(prometheus.CounterOpts{
Name: "test_counter",
Help: "A test counter",
})
err := registry.Register(counter)
if err != nil {
t.Errorf("failed to register metric: %v", err)
}
// Test that the metric is registered
metricFamilies, err := registry.Gather()
if err != nil {
t.Errorf("failed to gather metrics: %v", err)
}
found := false
for _, mf := range metricFamilies {
if mf.GetName() == "test_counter" {
@@ -61,7 +61,7 @@ func TestRegistry(t *testing.T) {
break
}
}
if !found {
t.Error("registered metric not found in registry")
}
@@ -69,7 +69,7 @@ func TestRegistry(t *testing.T) {
func TestHandler(t *testing.T) {
metrics := New()
// Register a test metric
counter := prometheus.NewCounterVec(
prometheus.CounterOpts{
@@ -80,40 +80,40 @@ func TestHandler(t *testing.T) {
)
metrics.Registry().MustRegister(counter)
counter.WithLabelValues("GET").Inc()
// Test the handler
handler := metrics.Handler()
if handler == nil {
t.Fatal("Handler() returned nil")
}
// Create a test request
req := httptest.NewRequest("GET", "/metrics", nil)
recorder := httptest.NewRecorder()
// Call the handler
handler.ServeHTTP(recorder, req)
// Check response
resp := recorder.Result()
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Errorf("expected status 200, got %d", resp.StatusCode)
}
body, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("failed to read response body: %v", err)
}
bodyStr := string(body)
// Check for our test metric
if !strings.Contains(bodyStr, "test_requests_total") {
t.Error("test metric not found in metrics output")
}
// Check for OpenMetrics format indicators
if !strings.Contains(bodyStr, "# TYPE") {
t.Error("metrics output missing TYPE comments")
@@ -122,7 +122,7 @@ func TestHandler(t *testing.T) {
func TestListenAndServe(t *testing.T) {
metrics := New()
// Register a test metric
counter := prometheus.NewCounterVec(
prometheus.CounterOpts{
@@ -133,46 +133,46 @@ func TestListenAndServe(t *testing.T) {
)
metrics.Registry().MustRegister(counter)
counter.WithLabelValues("GET").Inc()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Start server in a goroutine
errCh := make(chan error, 1)
go func() {
// Use a high port number to avoid conflicts
errCh <- metrics.ListenAndServe(ctx, 9999)
}()
// Give the server a moment to start
time.Sleep(100 * time.Millisecond)
// Test metrics endpoint
resp, err := http.Get("http://localhost:9999/metrics")
if err != nil {
t.Fatalf("failed to GET /metrics: %v", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Errorf("expected status 200, got %d", resp.StatusCode)
}
body, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("failed to read response body: %v", err)
}
bodyStr := string(body)
// Check for our test metric
if !strings.Contains(bodyStr, "test_requests_total") {
t.Error("test metric not found in metrics output")
}
// Cancel context to stop server
cancel()
// Wait for server to stop
select {
case err := <-errCh:
@@ -186,21 +186,21 @@ func TestListenAndServe(t *testing.T) {
func TestListenAndServeContextCancellation(t *testing.T) {
metrics := New()
ctx, cancel := context.WithCancel(context.Background())
// Start server
errCh := make(chan error, 1)
go func() {
errCh <- metrics.ListenAndServe(ctx, 9998)
}()
// Give server time to start
time.Sleep(100 * time.Millisecond)
// Cancel context
cancel()
// Server should stop gracefully
select {
case err := <-errCh:
@@ -215,7 +215,7 @@ func TestListenAndServeContextCancellation(t *testing.T) {
// Benchmark the metrics handler response time
func BenchmarkMetricsHandler(b *testing.B) {
metrics := New()
// Register some test metrics
for i := 0; i < 10; i++ {
counter := prometheus.NewCounter(prometheus.CounterOpts{
@@ -225,18 +225,18 @@ func BenchmarkMetricsHandler(b *testing.B) {
metrics.Registry().MustRegister(counter)
counter.Add(float64(i * 100))
}
handler := metrics.Handler()
b.ResetTimer()
for i := 0; i < b.N; i++ {
req := httptest.NewRequest("GET", "/metrics", nil)
recorder := httptest.NewRecorder()
handler.ServeHTTP(recorder, req)
if recorder.Code != http.StatusOK {
b.Fatalf("unexpected status code: %d", recorder.Code)
}
}
}
}

View File

@@ -15,7 +15,7 @@ mkdir -p $DIR
BASE=https://geodns.bitnames.com/${BASE}/builds/${BUILD}
files=`curl -sSf ${BASE}/checksums.txt | awk '{print $2}'`
files=`curl -sSf ${BASE}/checksums.txt | sed 's/^[a-f0-9]*[[:space:]]*//'`
metafiles="checksums.txt metadata.json CHANGELOG.md artifacts.json"
for f in $metafiles; do

View File

@@ -2,7 +2,7 @@
set -euo pipefail
go install github.com/goreleaser/goreleaser/v2@v2.8.2
go install github.com/goreleaser/goreleaser/v2@v2.11.0
if [ ! -z "${harbor_username:-}" ]; then
DOCKER_FILE=~/.docker/config.json

View File

@@ -38,26 +38,23 @@ package tracing
import (
"context"
"crypto/tls"
"crypto/x509"
"errors"
"log/slog"
"os"
"slices"
"time"
"go.ntppool.org/common/logger"
"go.ntppool.org/common/internal/tracerconfig"
"go.ntppool.org/common/version"
"google.golang.org/grpc/credentials"
"go.opentelemetry.io/contrib/exporters/autoexport"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
logglobal "go.opentelemetry.io/otel/log/global"
"go.opentelemetry.io/otel/log/global"
"go.opentelemetry.io/otel/propagation"
sdklog "go.opentelemetry.io/otel/sdk/log"
sdkmetric "go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/resource"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
@@ -67,12 +64,25 @@ import (
const (
// svcNameKey is the environment variable name that Service Name information will be read from.
svcNameKey = "OTEL_SERVICE_NAME"
otelExporterOTLPProtoEnvKey = "OTEL_EXPORTER_OTLP_PROTOCOL"
otelExporterOTLPTracesProtoEnvKey = "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL"
)
var errInvalidOTLPProtocol = errors.New("invalid OTLP protocol - should be one of ['grpc', 'http/protobuf']")
// createOTLPLogExporter creates an OTLP log exporter using the provided configuration.
// This function is used as the LogExporterFactory for the tracerconfig bridge.
func createOTLPLogExporter(ctx context.Context, cfg *tracerconfig.Config) (sdklog.Exporter, error) {
return tracerconfig.CreateOTLPLogExporter(ctx, cfg)
}
// createOTLPMetricExporter creates an OTLP metric exporter using the provided configuration.
// This function is used as the MetricExporterFactory for the tracerconfig bridge.
func createOTLPMetricExporter(ctx context.Context, cfg *tracerconfig.Config) (sdkmetric.Exporter, error) {
return tracerconfig.CreateOTLPMetricExporter(ctx, cfg)
}
// createOTLPTraceExporter creates an OTLP trace exporter using the provided configuration.
// This function is used as the TraceExporterFactory for the tracerconfig bridge.
func createOTLPTraceExporter(ctx context.Context, cfg *tracerconfig.Config) (sdktrace.SpanExporter, error) {
return tracerconfig.CreateOTLPTraceExporter(ctx, cfg)
}
// https://github.com/open-telemetry/opentelemetry-go/blob/main/exporters/otlp/otlptrace/otlptracehttp/example_test.go
@@ -98,10 +108,9 @@ func Start(ctx context.Context, spanName string, opts ...trace.SpanStartOption)
return Tracer().Start(ctx, spanName, opts...)
}
// GetClientCertificate defines a function type for providing client certificates for mutual TLS.
// This is used when exporting telemetry data to secured OTLP endpoints that require
// client certificate authentication.
type GetClientCertificate func(*tls.CertificateRequestInfo) (*tls.Certificate, error)
// GetClientCertificate is an alias for the type defined in tracerconfig.
// This maintains backward compatibility for existing code.
type GetClientCertificate = tracerconfig.GetClientCertificate
// TracerConfig provides configuration options for OpenTelemetry tracing setup.
// It supplements standard OpenTelemetry environment variables with additional
@@ -143,7 +152,18 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
cfg = &TracerConfig{}
}
log := logger.Setup()
// Store configuration for use by logger and metrics packages via bridge
bridgeConfig := &tracerconfig.Config{
ServiceName: cfg.ServiceName,
Environment: cfg.Environment,
Endpoint: cfg.Endpoint,
EndpointURL: cfg.EndpointURL,
CertificateProvider: cfg.CertificateProvider,
RootCAs: cfg.RootCAs,
}
tracerconfig.Store(ctx, bridgeConfig, createOTLPLogExporter, createOTLPMetricExporter, createOTLPTraceExporter)
log := slog.Default()
if serviceName := os.Getenv(svcNameKey); len(serviceName) == 0 {
if len(cfg.ServiceName) > 0 {
@@ -184,13 +204,21 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
var shutdownFuncs []func(context.Context) error
shutdown = func(ctx context.Context) error {
// Force flush the global logger provider before shutting down anything else
if loggerProvider := global.GetLoggerProvider(); loggerProvider != nil {
if sdkProvider, ok := loggerProvider.(*sdklog.LoggerProvider); ok {
if flushErr := sdkProvider.ForceFlush(ctx); flushErr != nil {
log.Warn("logger provider force flush failed", "err", flushErr)
}
}
}
var err error
// need to shutdown the providers first,
// exporters after which is the opposite
// order they are setup.
slices.Reverse(shutdownFuncs)
for _, fn := range shutdownFuncs {
// log.Warn("shutting down", "fn", fn)
err = errors.Join(err, fn(ctx))
}
shutdownFuncs = nil
@@ -212,9 +240,9 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
switch os.Getenv("OTEL_TRACES_EXPORTER") {
case "":
spanExporter, err = newOLTPExporter(ctx, cfg)
spanExporter, err = createOTLPTraceExporter(ctx, bridgeConfig)
case "otlp":
spanExporter, err = newOLTPExporter(ctx, cfg)
spanExporter, err = createOTLPTraceExporter(ctx, bridgeConfig)
default:
// log.Debug("OTEL_TRACES_EXPORTER", "fallback", os.Getenv("OTEL_TRACES_EXPORTER"))
spanExporter, err = autoexport.NewSpanExporter(ctx)
@@ -225,13 +253,6 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
}
shutdownFuncs = append(shutdownFuncs, spanExporter.Shutdown)
logExporter, err := autoexport.NewLogExporter(ctx)
if err != nil {
handleErr(err)
return
}
shutdownFuncs = append(shutdownFuncs, logExporter.Shutdown)
// Set up trace provider.
tracerProvider, err := newTraceProvider(spanExporter, res)
if err != nil {
@@ -241,19 +262,6 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
shutdownFuncs = append(shutdownFuncs, tracerProvider.Shutdown)
otel.SetTracerProvider(tracerProvider)
logProvider := sdklog.NewLoggerProvider(sdklog.WithResource(res),
sdklog.WithProcessor(
sdklog.NewBatchProcessor(logExporter, sdklog.WithExportBufferSize(10)),
),
)
logglobal.SetLoggerProvider(logProvider)
shutdownFuncs = append(shutdownFuncs, func(ctx context.Context) error {
logProvider.ForceFlush(ctx)
return logProvider.Shutdown(ctx)
},
)
if err != nil {
handleErr(err)
return
@@ -262,74 +270,6 @@ func SetupSDK(ctx context.Context, cfg *TracerConfig) (shutdown TpShutdownFunc,
return
}
func newOLTPExporter(ctx context.Context, cfg *TracerConfig) (sdktrace.SpanExporter, error) {
log := logger.Setup()
var tlsConfig *tls.Config
if cfg.CertificateProvider != nil {
tlsConfig = &tls.Config{
GetClientCertificate: cfg.CertificateProvider,
RootCAs: cfg.RootCAs,
}
}
proto := os.Getenv(otelExporterOTLPTracesProtoEnvKey)
if proto == "" {
proto = os.Getenv(otelExporterOTLPProtoEnvKey)
}
// Fallback to default, http/protobuf.
if proto == "" {
proto = "http/protobuf"
}
var client otlptrace.Client
switch proto {
case "grpc":
opts := []otlptracegrpc.Option{
otlptracegrpc.WithCompressor("gzip"),
}
if tlsConfig != nil {
opts = append(opts, otlptracegrpc.WithTLSCredentials(credentials.NewTLS(tlsConfig)))
}
if len(cfg.Endpoint) > 0 {
log.Info("adding option", "Endpoint", cfg.Endpoint)
opts = append(opts, otlptracegrpc.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
log.Info("adding option", "EndpointURL", cfg.EndpointURL)
opts = append(opts, otlptracegrpc.WithEndpointURL(cfg.EndpointURL))
}
client = otlptracegrpc.NewClient(opts...)
case "http/protobuf", "http/json":
opts := []otlptracehttp.Option{
otlptracehttp.WithCompression(otlptracehttp.GzipCompression),
}
if tlsConfig != nil {
opts = append(opts, otlptracehttp.WithTLSClientConfig(tlsConfig))
}
if len(cfg.Endpoint) > 0 {
opts = append(opts, otlptracehttp.WithEndpoint(cfg.Endpoint))
}
if len(cfg.EndpointURL) > 0 {
opts = append(opts, otlptracehttp.WithEndpointURL(cfg.EndpointURL))
}
client = otlptracehttp.NewClient(opts...)
default:
return nil, errInvalidOTLPProtocol
}
exporter, err := otlptrace.New(ctx, client)
if err != nil {
log.ErrorContext(ctx, "creating OTLP trace exporter", "err", err)
}
return exporter, err
}
func newTraceProvider(traceExporter sdktrace.SpanExporter, res *resource.Resource) (*sdktrace.TracerProvider, error) {
traceProvider := sdktrace.NewTracerProvider(
sdktrace.WithResource(res),

View File

@@ -10,8 +10,18 @@
// -X go.ntppool.org/common/version.buildTime=2023-01-01T00:00:00Z \
// -X go.ntppool.org/common/version.gitVersion=abc123"
//
// The package also automatically extracts build information from Go's debug.BuildInfo
// when available, providing fallback values for VCS time and revision.
// Build time supports both Unix epoch timestamps and RFC3339 format:
//
// # Unix epoch (simpler, recommended)
// go build -ldflags "-X go.ntppool.org/common/version.buildTime=$(date +%s)"
//
// # RFC3339 format
// go build -ldflags "-X go.ntppool.org/common/version.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
//
// Both formats are automatically converted to RFC3339 for consistent output. The buildTime
// parameter takes priority over Git commit time. If buildTime is not specified, the package
// automatically extracts build information from Go's debug.BuildInfo when available,
// providing fallback values for VCS time and revision.
package version
import (
@@ -19,7 +29,9 @@ import (
"log/slog"
"runtime"
"runtime/debug"
"strconv"
"strings"
"time"
"github.com/prometheus/client_golang/prometheus"
"github.com/spf13/cobra"
@@ -30,7 +42,7 @@ import (
// If not set, defaults to "dev-snapshot". The version should follow semantic versioning.
var (
VERSION string // Semantic version (e.g., "1.0.0" or "v1.0.0")
buildTime string // Build timestamp (RFC3339 format)
buildTime string // Build timestamp (Unix epoch or RFC3339, normalized to RFC3339)
gitVersion string // Git commit hash
gitModified bool // Whether the working tree was modified during build
)
@@ -38,6 +50,28 @@ var (
// info holds the consolidated version information extracted from build variables and debug.BuildInfo.
var info Info
// parseBuildTime converts a build time string to RFC3339 format.
// Supports both Unix epoch timestamps (numeric strings) and RFC3339 format.
// Returns the input unchanged if it cannot be parsed as either format.
func parseBuildTime(s string) string {
if s == "" {
return s
}
// Try parsing as Unix epoch timestamp (numeric string)
if epoch, err := strconv.ParseInt(s, 10, 64); err == nil {
return time.Unix(epoch, 0).UTC().Format(time.RFC3339)
}
// Try parsing as RFC3339 to validate format
if _, err := time.Parse(time.RFC3339, s); err == nil {
return s // Already in RFC3339 format
}
// Return original string if neither format works (graceful fallback)
return s
}
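
As a quick illustration, the function can be run as a standalone program. In the sketch below the body of `parseBuildTime` is copied from the hunk above, and the inputs and expected results are taken from the test table in this commit; only the `main` wrapper is new.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseBuildTime is copied from the hunk above so the example is self-contained.
func parseBuildTime(s string) string {
	if s == "" {
		return s
	}
	// Any string that parses as an integer is treated as Unix epoch seconds.
	if epoch, err := strconv.ParseInt(s, 10, 64); err == nil {
		return time.Unix(epoch, 0).UTC().Format(time.RFC3339)
	}
	// Valid RFC3339 input is returned as given, including its UTC offset.
	if _, err := time.Parse(time.RFC3339, s); err == nil {
		return s
	}
	// Anything else is passed through unchanged.
	return s
}

func main() {
	fmt.Println(parseBuildTime("1672531200"))                // 2023-01-01T00:00:00Z
	fmt.Println(parseBuildTime("2023-12-25T10:30:45-05:00")) // 2023-12-25T10:30:45-05:00
	fmt.Println(parseBuildTime("-1"))                        // 1969-12-31T23:59:59Z
	fmt.Println(parseBuildTime("not-a-date"))                // not-a-date
}
```

Two behaviours are worth keeping in mind: RFC3339 input is validated but not converted to UTC, and any all-digit string is interpreted as epoch seconds rather than as a compact date.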
// Info represents structured version and build information.
// This struct is used for JSON serialization and programmatic access to build metadata.
type Info struct {
@@ -48,6 +82,7 @@ type Info struct {
}
func init() {
buildTime = parseBuildTime(buildTime)
info.BuildTime = buildTime
info.GitRev = gitVersion
@@ -67,9 +102,9 @@ func init() {
switch h.Key {
case "vcs.time":
if len(buildTime) == 0 {
buildTime = h.Value
buildTime = parseBuildTime(h.Value)
info.BuildTime = buildTime
}
info.BuildTime = h.Value
case "vcs.revision":
// https://blog.carlmjohnson.net/post/2023/golang-git-hash-how-to/
// todo: use BuildInfo.Main.Version if revision is empty

View File

@@ -309,3 +309,106 @@ func BenchmarkCheckVersionDevSnapshot(b *testing.B) {
_ = CheckVersion(version, minimum)
}
}
func TestParseBuildTime(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{
name: "Unix epoch timestamp",
input: "1672531200", // 2023-01-01T00:00:00Z
expected: "2023-01-01T00:00:00Z",
},
{
name: "Unix epoch zero",
input: "0",
expected: "1970-01-01T00:00:00Z",
},
{
name: "Valid RFC3339 format",
input: "2023-12-25T15:30:45Z",
expected: "2023-12-25T15:30:45Z",
},
{
name: "RFC3339 with timezone",
input: "2023-12-25T10:30:45-05:00",
expected: "2023-12-25T10:30:45-05:00",
},
{
name: "Empty string",
input: "",
expected: "",
},
{
name: "Invalid format - return unchanged",
input: "not-a-date",
expected: "not-a-date",
},
{
name: "Invalid timestamp - return unchanged",
input: "invalid-timestamp",
expected: "invalid-timestamp",
},
{
name: "Partial date - return unchanged",
input: "2023-01-01",
expected: "2023-01-01",
},
{
name: "Negative epoch - should work",
input: "-1",
expected: "1969-12-31T23:59:59Z",
},
{
name: "Large epoch timestamp",
input: "4102444800", // 2100-01-01T00:00:00Z
expected: "2100-01-01T00:00:00Z",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := parseBuildTime(tt.input)
if result != tt.expected {
t.Errorf("parseBuildTime(%q) = %q, expected %q", tt.input, result, tt.expected)
}
})
}
}
func TestParseBuildTimeConsistency(t *testing.T) {
// Test that calling parseBuildTime multiple times with the same input returns the same result
testInputs := []string{
"1672531200",
"2023-01-01T00:00:00Z",
"invalid-date",
"",
}
for _, input := range testInputs {
result1 := parseBuildTime(input)
result2 := parseBuildTime(input)
if result1 != result2 {
t.Errorf("parseBuildTime(%q) not consistent: %q != %q", input, result1, result2)
}
}
}
func BenchmarkParseBuildTime(b *testing.B) {
inputs := []string{
"1672531200", // Unix epoch
"2023-01-01T00:00:00Z", // RFC3339
"invalid-timestamp", // Invalid
"", // Empty
}
for _, input := range inputs {
b.Run(input, func(b *testing.B) {
for i := 0; i < b.N; i++ {
_ = parseBuildTime(input)
}
})
}
}