network

package
v1.7.2 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 12, 2026 License: Apache-2.0 Imports: 28 Imported by: 0

Documentation

Overview

Package network provides network health monitoring capabilities.

Package network provides network-related health monitoring capabilities. It includes DNS resolution monitoring for cluster and external domains, latency measurement, and nameserver verification.

Constants

This section is empty.

Variables

This section is empty.

Functions

func AddCNIHealthToStatus

func AddCNIHealthToStatus(status *types.Status, result *CNIHealthResult)

AddCNIHealthToStatus adds CNI health check results to a Status object.

func NewCNIMonitor

func NewCNIMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)

NewCNIMonitor creates a new CNI connectivity monitor instance.

func NewConnectivityMonitor

func NewConnectivityMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)

NewConnectivityMonitor creates a new connectivity monitor instance.

func NewDNSMonitor

func NewDNSMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)

NewDNSMonitor creates a new DNS monitor instance.

func NewGatewayMonitor

func NewGatewayMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)

NewGatewayMonitor creates a new gateway monitor instance.

func NewIPForwardingMonitor added in v1.7.2

func NewIPForwardingMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)

NewIPForwardingMonitor creates a new IP forwarding monitor instance.

func ValidateCNIConfig

func ValidateCNIConfig(config types.MonitorConfig) error

ValidateCNIConfig validates the CNI monitor configuration.

func ValidateConnectivityConfig

func ValidateConnectivityConfig(config types.MonitorConfig) error

ValidateConnectivityConfig validates the connectivity monitor configuration.

func ValidateDNSConfig

func ValidateDNSConfig(config types.MonitorConfig) error

ValidateDNSConfig validates the DNS monitor configuration.

func ValidateGatewayConfig

func ValidateGatewayConfig(config types.MonitorConfig) error

ValidateGatewayConfig validates the gateway monitor configuration.

func ValidateIPForwardingConfig added in v1.7.2

func ValidateIPForwardingConfig(config types.MonitorConfig) error

ValidateIPForwardingConfig validates the IP forwarding monitor configuration.

Types

type CNIConfigResult

type CNIConfigResult struct {
	Healthy       bool
	ConfigPath    string
	ConfigFiles   []string
	ValidConfigs  int
	InvalidFiles  []string
	Errors        []string
	PrimaryConfig string
	CNIType       string
}

CNIConfigResult holds CNI configuration file check results.

type CNIHealthChecker

type CNIHealthChecker interface {
	// CheckHealth performs CNI health validation.
	CheckHealth() *CNIHealthResult
	// CheckConfigFiles validates CNI config files exist and are readable.
	CheckConfigFiles() *CNIConfigResult
	// CheckInterfaces validates expected network interfaces exist.
	CheckInterfaces() *CNIInterfaceResult
}

CNIHealthChecker validates CNI configuration and network interfaces.

func NewCNIHealthChecker

func NewCNIHealthChecker(config CNIHealthConfig) CNIHealthChecker

NewCNIHealthChecker creates a new CNI health checker.

type CNIHealthConfig

type CNIHealthConfig struct {
	// Enabled indicates whether CNI health checks are enabled.
	Enabled bool
	// ConfigPath is the path to CNI configuration directory.
	ConfigPath string
	// CheckInterfaces enables interface health checking.
	CheckInterfaces bool
	// ExpectedInterfaces is a list of expected CNI interfaces.
	ExpectedInterfaces []string
}

CNIHealthConfig holds CNI health check configuration.

type CNIHealthResult

type CNIHealthResult struct {
	Healthy      bool
	ConfigResult *CNIConfigResult
	IfaceResult  *CNIInterfaceResult
	Issues       []string
}

CNIHealthResult holds the overall CNI health status.

type CNIInterface

type CNIInterface struct {
	Name    string
	Type    string // e.g., "calico", "flannel", "weave", "bridge", "veth"
	Up      bool
	MTU     int
	HasIPv4 bool
	HasIPv6 bool
	Addrs   []string
}

CNIInterface represents a discovered CNI-related network interface.

type CNIInterfaceResult

type CNIInterfaceResult struct {
	Healthy            bool
	ExpectedInterfaces []string
	FoundInterfaces    []string
	MissingInterfaces  []string
	CNIInterfaces      []CNIInterface
	Errors             []string
}

CNIInterfaceResult holds network interface check results.

type CNIMonitor

type CNIMonitor struct {
	*monitors.BaseMonitor
	// contains filtered or unexported fields
}

CNIMonitor monitors CNI connectivity and cross-node health.

func (*CNIMonitor) GetPeerStatuses

func (m *CNIMonitor) GetPeerStatuses() map[string]*PeerStatus

GetPeerStatuses returns the current peer status map. This is useful for debugging and monitoring.

func (*CNIMonitor) Start

func (m *CNIMonitor) Start() (<-chan *types.Status, error)

Start starts the CNI monitor with peer discovery.

func (*CNIMonitor) Stop

func (m *CNIMonitor) Stop()

Stop stops the CNI monitor and peer discovery.

type CNIMonitorConfig

type CNIMonitorConfig struct {
	// Discovery configuration
	Discovery DiscoveryConfig

	// Connectivity configuration
	Connectivity ConnectivityConfig

	// CNI health check configuration
	CNIHealth CNIHealthConfig
}

CNIMonitorConfig holds the configuration for the CNI connectivity monitor.

type CheckResult added in v1.5.0

type CheckResult struct {
	Timestamp time.Time
	Success   bool
	Latency   time.Duration
}

CheckResult represents a single DNS check outcome for success rate tracking.

type ConnectivityConfig

type ConnectivityConfig struct {
	// PingCount is the number of pings to send per peer.
	PingCount int
	// PingTimeout is the timeout for each ping.
	PingTimeout time.Duration
	// WarningLatency is the latency threshold for warnings.
	WarningLatency time.Duration
	// CriticalLatency is the latency threshold for critical conditions.
	CriticalLatency time.Duration
	// FailureThreshold is consecutive failures before marking peer unreachable.
	FailureThreshold int
	// MinReachablePeers is the percentage of peers that must be reachable.
	MinReachablePeers int
	// ProbeMethod is the connectivity probe method: "http" or "icmp".
	// Empty string means auto-detect based on OverlayTestEnabled.
	ProbeMethod string
	// ProbePort is the HTTP probe port (default 8023).
	ProbePort int
	// ProbePath is the HTTP probe path (default "/healthz").
	ProbePath string
}

ConnectivityConfig holds connectivity check configuration.

type ConnectivityMonitor

type ConnectivityMonitor struct {
	*monitors.BaseMonitor
	// contains filtered or unexported fields
}

ConnectivityMonitor monitors external network connectivity by checking HTTP/HTTPS endpoints.

type ConnectivityMonitorConfig

type ConnectivityMonitorConfig struct {
	// Endpoints is the list of endpoints to check for connectivity.
	Endpoints []EndpointConfig
	// FailureThreshold is the number of consecutive failures before reporting NetworkUnreachable.
	FailureThreshold int
}

ConnectivityMonitorConfig holds the configuration for the connectivity monitor.

type ConsistencyCheckConfig added in v1.5.0

type ConsistencyCheckConfig struct {
	Enabled                bool          `json:"enabled"`
	QueriesPerCheck        int           `json:"queriesPerCheck"`        // Number of rapid queries (default: 5)
	IntervalBetweenQueries time.Duration `json:"intervalBetweenQueries"` // Delay between queries (default: 200ms)
}

ConsistencyCheckConfig holds configuration for DNS consistency checking. Consistency checking performs multiple rapid queries to detect intermittent DNS issues.

type DNSErrorType added in v1.5.0

type DNSErrorType string

DNSErrorType represents the classification of DNS errors for better diagnostics.

const (
	// DNSErrorTimeout indicates the DNS query timed out.
	// Suggests: Network issues, server overload, or firewall blocking.
	DNSErrorTimeout DNSErrorType = "Timeout"

	// DNSErrorNXDOMAIN indicates the domain does not exist.
	// Suggests: Typo in domain, missing DNS record, or DNS zone not configured.
	DNSErrorNXDOMAIN DNSErrorType = "NXDOMAIN"

	// DNSErrorSERVFAIL indicates the server failed to complete the query.
	// Suggests: Upstream DNS server error or DNSSEC validation failure.
	DNSErrorSERVFAIL DNSErrorType = "SERVFAIL"

	// DNSErrorRefused indicates the connection to the DNS server was refused.
	// Suggests: DNS server down, wrong port, or firewall blocking.
	DNSErrorRefused DNSErrorType = "Refused"

	// DNSErrorTemporary indicates a temporary/transient DNS failure.
	// Suggests: Retry may succeed.
	DNSErrorTemporary DNSErrorType = "Temporary"

	// DNSErrorUnknown indicates an unclassified DNS error.
	DNSErrorUnknown DNSErrorType = "Unknown"
)

type DNSMonitor

type DNSMonitor struct {

	// BaseMonitor for lifecycle management
	*monitors.BaseMonitor
	// contains filtered or unexported fields
}

DNSMonitor monitors DNS resolution health.

type DNSMonitorConfig

type DNSMonitorConfig struct {
	// ClusterDomains are Kubernetes cluster internal domains to test
	ClusterDomains []string `json:"clusterDomains"`

	// ExternalDomains are external domains to test for internet connectivity
	ExternalDomains []string `json:"externalDomains"`

	// CustomQueries are additional custom DNS queries to perform
	CustomQueries []DNSQuery `json:"customQueries"`

	// LatencyThreshold is the maximum acceptable DNS query latency
	LatencyThreshold time.Duration `json:"latencyThreshold"`

	// NameserverCheckEnabled enables checking nameserver reachability
	NameserverCheckEnabled bool `json:"nameserverCheckEnabled"`

	// ResolverPath is the path to the resolver configuration file
	ResolverPath string `json:"resolverPath"`

	// FailureCountThreshold is the number of consecutive failures before reporting NetworkUnreachable
	FailureCountThreshold int `json:"failureCountThreshold"`

	// SuccessRateTracking configures sliding window success rate tracking
	SuccessRateTracking *SuccessRateConfig `json:"successRateTracking"`

	// ConsistencyChecking configures DNS consistency verification via multiple rapid queries
	ConsistencyChecking *ConsistencyCheckConfig `json:"consistencyChecking"`
}

DNSMonitorConfig holds the configuration for the DNS monitor.

type DNSQuery

type DNSQuery struct {
	Domain             string
	RecordType         string // Currently only "A" is supported
	TestEachNameserver bool   // Test this domain against each nameserver individually
	ConsistencyCheck   bool   // Enable consistency checking with multiple rapid queries
}

DNSQuery represents a custom DNS query configuration.

type DiscoveryConfig

type DiscoveryConfig struct {
	// Method is the discovery method ("kubernetes" or "static").
	Method string
	// Namespace to search for peers (for kubernetes method).
	Namespace string
	// LabelSelector for filtering pods (for kubernetes method).
	LabelSelector string
	// RefreshInterval is how often to refresh the peer list.
	RefreshInterval time.Duration
	// StaticPeers is a list of static peer IPs (for static method).
	StaticPeers []string
	// OverlayTestEnabled enables overlay network testing mode.
	// When enabled, discovers overlay-test pods and pings their overlay IPs
	// instead of node IPs. This provides accurate CNI/overlay network testing.
	OverlayTestEnabled bool
	// OverlayTestLabelSelector is the label selector for overlay test pods.
	OverlayTestLabelSelector string
}

DiscoveryConfig holds peer discovery configuration.

type EndpointConfig

type EndpointConfig struct {
	// Name is a human-readable identifier for this endpoint.
	Name string
	// URL is the endpoint URL to check.
	URL string
	// Method is the HTTP method to use (GET, HEAD, POST, etc.).
	Method string
	// ExpectedStatusCode is the HTTP status code expected for a successful check.
	ExpectedStatusCode int
	// Timeout is the maximum duration for the HTTP request.
	Timeout time.Duration
	// Headers are optional HTTP headers to include in the request.
	Headers map[string]string
	// FollowRedirects determines whether to follow HTTP redirects.
	FollowRedirects bool
}

EndpointConfig defines the configuration for a connectivity endpoint check.

type EndpointResult

type EndpointResult struct {
	// Success indicates whether the endpoint check succeeded.
	Success bool
	// StatusCode is the HTTP response status code (0 if request failed).
	StatusCode int
	// ResponseTime is the duration of the HTTP request.
	ResponseTime time.Duration
	// Error contains any error that occurred during the check.
	Error error
	// ErrorType classifies the error (DNS, Connection, HTTP, Timeout, etc.).
	ErrorType string
}

EndpointResult contains the result of an endpoint connectivity check.

type GatewayMonitor

type GatewayMonitor struct {
	*monitors.BaseMonitor
	// contains filtered or unexported fields
}

GatewayMonitor monitors the default gateway's reachability and latency.

type GatewayMonitorConfig

type GatewayMonitorConfig struct {
	// PingCount is the number of pings to send per check cycle.
	PingCount int
	// PingTimeout is the timeout for each individual ping.
	PingTimeout time.Duration
	// LatencyThreshold is the threshold above which latency is considered high.
	LatencyThreshold time.Duration
	// AutoDetectGateway enables automatic gateway detection from route table.
	AutoDetectGateway bool
	// ManualGateway allows manual specification of gateway IP (overrides auto-detection).
	ManualGateway string
	// FailureCountThreshold is the number of consecutive failures before reporting NetworkUnreachable.
	FailureCountThreshold int
}

GatewayMonitorConfig holds the configuration for the gateway monitor.
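
The precedence between ManualGateway and AutoDetectGateway can be illustrated as follows. This is a sketch under assumptions: the documentation only says detection uses the route table, so reading a Linux /proc/net/route style table (little-endian hex addresses, as on common x86 and ARM hosts) is a guess, and resolveGateway and defaultGatewayFromRoutes are hypothetical names.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"net"
	"strings"
)

// gatewayMonitorConfig copies two documented fields; it is a local stand-in.
type gatewayMonitorConfig struct {
	AutoDetectGateway bool
	ManualGateway     string
}

// defaultGatewayFromRoutes scans a /proc/net/route style table for the
// default route (destination 00000000) and decodes its gateway column.
func defaultGatewayFromRoutes(table string) (string, bool) {
	for _, line := range strings.Split(table, "\n") {
		f := strings.Fields(line)
		if len(f) < 3 || f[1] != "00000000" || f[2] == "00000000" {
			continue // header, non-default route, or route without a gateway
		}
		raw, err := hex.DecodeString(f[2])
		if err != nil || len(raw) != 4 {
			continue
		}
		// Bytes are reversed because the table is assumed little-endian.
		return net.IPv4(raw[3], raw[2], raw[1], raw[0]).String(), true
	}
	return "", false
}

// resolveGateway prefers the manual setting, then falls back to detection.
func resolveGateway(cfg gatewayMonitorConfig, table string) (string, error) {
	if cfg.ManualGateway != "" {
		return cfg.ManualGateway, nil
	}
	if cfg.AutoDetectGateway {
		if gw, ok := defaultGatewayFromRoutes(table); ok {
			return gw, nil
		}
	}
	return "", errors.New("no gateway configured or detected")
}

// sampleRoutes is an illustrative route table, not real system output.
const sampleRoutes = "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\n" +
	"eth0\t0001A8C0\t00000000\t0001\t0\t0\t100\t00FFFFFF\n" +
	"eth0\t00000000\t0101A8C0\t0003\t0\t0\t100\t00000000\n"

func main() {
	fmt.Println(resolveGateway(gatewayMonitorConfig{AutoDetectGateway: true}, sampleRoutes))
	fmt.Println(resolveGateway(gatewayMonitorConfig{AutoDetectGateway: true, ManualGateway: "10.0.0.1"}, sampleRoutes))
}
```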

type HTTPClient

type HTTPClient interface {
	// CheckEndpoint performs a connectivity check against the given endpoint.
	CheckEndpoint(ctx context.Context, endpoint EndpointConfig) (*EndpointResult, error)
}

HTTPClient abstracts HTTP endpoint checking so that it can be mocked in tests.

type IPForwardingConfig added in v1.7.2

type IPForwardingConfig struct {
	// CheckIPv4 enables checking /proc/sys/net/ipv4/ip_forward.
	CheckIPv4 bool
	// CheckIPv6 enables checking /proc/sys/net/ipv6/conf/all/forwarding.
	CheckIPv6 bool
	// CheckPerInterface enables checking per-interface forwarding settings.
	CheckPerInterface bool
	// Interfaces limits per-interface checks to these specific interfaces.
	// Empty means check all interfaces found via glob.
	Interfaces []string
	// ProcPath is the base path for the proc filesystem.
	// Defaults to "/proc", but can be set to "/host/proc" for containerized deployments.
	ProcPath string
}

IPForwardingConfig holds the configuration for the IP forwarding monitor.

type IPForwardingMonitor added in v1.7.2

type IPForwardingMonitor struct {
	*monitors.BaseMonitor
	// contains filtered or unexported fields
}

IPForwardingMonitor monitors IP forwarding settings required for Kubernetes networking.

type NameserverDomainStatus added in v1.5.0

type NameserverDomainStatus struct {
	Nameserver   string
	Domain       string
	FailureCount int
	LastSuccess  time.Time
	LastLatency  time.Duration
}

NameserverDomainStatus tracks the status of DNS resolution for a specific nameserver and domain combination.

type Peer

type Peer struct {
	// Name is the pod name.
	Name string
	// NodeName is the Kubernetes node the peer is running on.
	NodeName string
	// NodeIP is the IP address of the node (used for pinging since hostNetwork=true).
	NodeIP string
	// PodIP is the pod IP (same as NodeIP when using hostNetwork).
	PodIP string
	// LastSeen is when this peer was last seen in discovery.
	LastSeen time.Time
}

Peer represents a discovered node-doctor peer.

type PeerDiscovery

type PeerDiscovery interface {
	// GetPeers returns the current list of discovered peers.
	GetPeers() []Peer
	// Refresh forces an immediate refresh of the peer list.
	Refresh(ctx context.Context) error
	// Start begins background peer discovery.
	Start(ctx context.Context) error
	// Stop stops background peer discovery.
	Stop()
}

PeerDiscovery handles discovery of node-doctor peer instances.

func NewKubernetesPeerDiscovery

func NewKubernetesPeerDiscovery(config *PeerDiscoveryConfig) (PeerDiscovery, error)

NewKubernetesPeerDiscovery creates a new peer discovery instance using Kubernetes API.

func NewKubernetesPeerDiscoveryWithClient

func NewKubernetesPeerDiscoveryWithClient(config *PeerDiscoveryConfig, client kubernetes.Interface) (PeerDiscovery, error)

NewKubernetesPeerDiscoveryWithClient creates a peer discovery instance with an existing client. This is useful for testing with mock clients.

func NewStaticPeerDiscovery

func NewStaticPeerDiscovery(peers []Peer) PeerDiscovery

NewStaticPeerDiscovery creates a peer discovery instance with a static list.

type PeerDiscoveryConfig

type PeerDiscoveryConfig struct {
	// Namespace to search for peers.
	Namespace string
	// LabelSelector for filtering pods (e.g., "app=node-doctor").
	LabelSelector string
	// RefreshInterval is how often to refresh the peer list.
	RefreshInterval time.Duration
	// SelfNodeName is the name of the current node (to exclude self).
	SelfNodeName string
	// Kubeconfig path (optional, uses in-cluster config if empty).
	Kubeconfig string
}

PeerDiscoveryConfig holds configuration for peer discovery.

type PeerStatus

type PeerStatus struct {
	Peer             Peer
	Reachable        bool
	LastLatency      time.Duration
	AvgLatency       time.Duration
	FailureCount     int
	LastCheck        time.Time
	LastSuccess      time.Time
	ConsecutiveFails int
}

PeerStatus tracks the status of a single peer.

type PingResult

type PingResult struct {
	// Success indicates whether the ping succeeded.
	Success bool
	// RTT is the round-trip time for successful pings.
	RTT time.Duration
	// Error contains the error if the ping failed.
	Error error
}

PingResult represents the result of a single ping operation.

type Pinger

type Pinger interface {
	// Ping sends connectivity probes to the target IP address.
	// It returns a slice of results, one for each probe attempt.
	Ping(ctx context.Context, target string, count int, timeout time.Duration) ([]PingResult, error)
}

Pinger is an interface for connectivity probe operations (ICMP or HTTP). This interface allows for mocking probe operations in tests.

type Resolver

type Resolver interface {
	// LookupHost looks up the given host using the local resolver.
	// It returns a slice of that host's addresses.
	LookupHost(ctx context.Context, host string) ([]string, error)

	// LookupAddr performs a reverse lookup for the given address,
	// returning a list of names mapping to that address.
	LookupAddr(ctx context.Context, addr string) ([]string, error)
}

Resolver is an interface for DNS resolution operations. This interface allows for mocking DNS resolution in tests.

type RingBuffer added in v1.5.0

type RingBuffer struct {
	// contains filtered or unexported fields
}

RingBuffer implements a fixed-size circular buffer for tracking check results. It provides O(1) insertions and O(n) success rate calculation.

func NewRingBuffer added in v1.5.0

func NewRingBuffer(size int) *RingBuffer

NewRingBuffer creates a new ring buffer with specified capacity.

func (*RingBuffer) Add added in v1.5.0

func (rb *RingBuffer) Add(result *CheckResult)

Add appends a result to the buffer, overwriting the oldest entry if full.

func (*RingBuffer) Count added in v1.5.0

func (rb *RingBuffer) Count() int

Count returns the number of valid entries in the buffer.

func (*RingBuffer) GetFailureRate added in v1.5.0

func (rb *RingBuffer) GetFailureRate() float64

GetFailureRate calculates the failure percentage in the buffer.

func (*RingBuffer) GetSuccessRate added in v1.5.0

func (rb *RingBuffer) GetSuccessRate() float64

GetSuccessRate calculates the success percentage in the buffer. Returns 0.0 if no entries exist.

func (*RingBuffer) Size added in v1.5.0

func (rb *RingBuffer) Size() int

Size returns the capacity of the buffer.
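
The behavior described for RingBuffer (O(1) insertion that overwrites the oldest entry, O(n) rate calculation) can be sketched as below. This is not the package's implementation: ring and checkResult are hypothetical local types, and returning the rate on a 0 to 100 scale is an assumption based on the word "percentage".

```go
package main

import "fmt"

// checkResult mirrors only the Success field of CheckResult.
type checkResult struct {
	Success bool
}

// ring is a hypothetical fixed-size circular buffer.
type ring struct {
	items []checkResult
	next  int // index where the next add writes
	count int // number of valid entries, at most len(items)
}

func newRing(size int) *ring {
	return &ring{items: make([]checkResult, size)}
}

// add writes in O(1), overwriting the oldest entry once the buffer is full.
func (r *ring) add(c checkResult) {
	if len(r.items) == 0 {
		return
	}
	r.items[r.next] = c
	r.next = (r.next + 1) % len(r.items)
	if r.count < len(r.items) {
		r.count++
	}
}

// successRate scans the valid entries in O(n) and returns a percentage.
// It returns 0.0 when the buffer is empty.
func (r *ring) successRate() float64 {
	if r.count == 0 {
		return 0.0
	}
	ok := 0
	for i := 0; i < r.count; i++ {
		if r.items[i].Success {
			ok++
		}
	}
	return float64(ok) / float64(r.count) * 100
}

func main() {
	r := newRing(3)
	// The fourth add overwrites the first (failed) entry.
	for _, s := range []bool{false, true, true, true} {
		r.add(checkResult{Success: s})
	}
	fmt.Println(r.count, r.successRate())
}
```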

type SuccessRateConfig added in v1.5.0

type SuccessRateConfig struct {
	Enabled              bool    `json:"enabled"`
	WindowSize           int     `json:"windowSize"`           // Number of checks to track (default: 10)
	FailureRateThreshold float64 `json:"failureRateThreshold"` // Alert if failure rate exceeds this (default: 0.3 = 30%)
	MinSamplesRequired   int     `json:"minSamplesRequired"`   // Minimum samples before alerting (default: 5)
}

SuccessRateConfig holds configuration for success rate tracking.
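
How these settings could gate an alert is sketched below. shouldAlert is a hypothetical helper and successRateConfig a local copy of the fields. Expressing the failure rate as a fraction between 0 and 1 follows the "0.3 = 30%" comment, and requiring the rate to strictly exceed the threshold follows the word "exceeds".

```go
package main

import "fmt"

// successRateConfig copies the documented fields; it is a local stand-in.
type successRateConfig struct {
	Enabled              bool
	WindowSize           int
	FailureRateThreshold float64
	MinSamplesRequired   int
}

// shouldAlert reports whether the failure rate in the current window exceeds
// the threshold. It stays quiet until MinSamplesRequired samples exist.
func shouldAlert(cfg successRateConfig, failures, samples int) bool {
	if !cfg.Enabled || samples == 0 || samples < cfg.MinSamplesRequired {
		return false
	}
	rate := float64(failures) / float64(samples)
	return rate > cfg.FailureRateThreshold
}

func main() {
	// These values match the defaults listed in the field comments.
	cfg := successRateConfig{Enabled: true, WindowSize: 10, FailureRateThreshold: 0.3, MinSamplesRequired: 5}
	fmt.Println(shouldAlert(cfg, 2, 4))  // too few samples to decide
	fmt.Println(shouldAlert(cfg, 3, 10)) // 30% does not exceed 30%
	fmt.Println(shouldAlert(cfg, 4, 10)) // 40% exceeds 30%
}
```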
