Documentation ¶
Overview ¶
Package network provides network-related health monitoring capabilities. It includes DNS resolution monitoring for cluster and external domains, latency measurement, and nameserver verification.
Index ¶
- func AddCNIHealthToStatus(status *types.Status, result *CNIHealthResult)
- func NewCNIMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
- func NewConnectivityMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
- func NewDNSMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
- func NewGatewayMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
- func NewIPForwardingMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
- func ValidateCNIConfig(config types.MonitorConfig) error
- func ValidateConnectivityConfig(config types.MonitorConfig) error
- func ValidateDNSConfig(config types.MonitorConfig) error
- func ValidateGatewayConfig(config types.MonitorConfig) error
- func ValidateIPForwardingConfig(config types.MonitorConfig) error
- type CNIConfigResult
- type CNIHealthChecker
- type CNIHealthConfig
- type CNIHealthResult
- type CNIInterface
- type CNIInterfaceResult
- type CNIMonitor
- type CNIMonitorConfig
- type CheckResult
- type ConnectivityConfig
- type ConnectivityMonitor
- type ConnectivityMonitorConfig
- type ConsistencyCheckConfig
- type DNSErrorType
- type DNSMonitor
- type DNSMonitorConfig
- type DNSQuery
- type DiscoveryConfig
- type EndpointConfig
- type EndpointResult
- type GatewayMonitor
- type GatewayMonitorConfig
- type HTTPClient
- type IPForwardingConfig
- type IPForwardingMonitor
- type NameserverDomainStatus
- type Peer
- type PeerDiscovery
- type PeerDiscoveryConfig
- type PeerStatus
- type PingResult
- type Pinger
- type Resolver
- type RingBuffer
- type SuccessRateConfig
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AddCNIHealthToStatus ¶
func AddCNIHealthToStatus(status *types.Status, result *CNIHealthResult)
AddCNIHealthToStatus adds CNI health check results to a Status object.
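func NewCNIMonitor ¶
func NewCNIMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
NewCNIMonitor creates a new CNI connectivity monitor instance.
func NewConnectivityMonitor ¶
func NewConnectivityMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
NewConnectivityMonitor creates a new connectivity monitor instance.
func NewDNSMonitor ¶
func NewDNSMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
NewDNSMonitor creates a new DNS monitor instance.
func NewGatewayMonitor ¶
func NewGatewayMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
NewGatewayMonitor creates a new gateway monitor instance.
func NewIPForwardingMonitor ¶ added in v1.7.2
func NewIPForwardingMonitor(ctx context.Context, config types.MonitorConfig) (types.Monitor, error)
NewIPForwardingMonitor creates a new IP forwarding monitor instance.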
func ValidateCNIConfig ¶
func ValidateCNIConfig(config types.MonitorConfig) error
ValidateCNIConfig validates the CNI monitor configuration.
func ValidateConnectivityConfig ¶
func ValidateConnectivityConfig(config types.MonitorConfig) error
ValidateConnectivityConfig validates the connectivity monitor configuration.
func ValidateDNSConfig ¶
func ValidateDNSConfig(config types.MonitorConfig) error
ValidateDNSConfig validates the DNS monitor configuration.
func ValidateGatewayConfig ¶
func ValidateGatewayConfig(config types.MonitorConfig) error
ValidateGatewayConfig validates the gateway monitor configuration.
func ValidateIPForwardingConfig ¶ added in v1.7.2
func ValidateIPForwardingConfig(config types.MonitorConfig) error
ValidateIPForwardingConfig validates the IP forwarding monitor configuration.
Types ¶
type CNIConfigResult ¶
type CNIConfigResult struct {
Healthy bool
ConfigPath string
ConfigFiles []string
ValidConfigs int
InvalidFiles []string
Errors []string
PrimaryConfig string
CNIType string
}
CNIConfigResult holds CNI configuration file check results.
type CNIHealthChecker ¶
type CNIHealthChecker interface {
// CheckHealth performs CNI health validation.
CheckHealth() *CNIHealthResult
// CheckConfigFiles validates CNI config files exist and are readable.
CheckConfigFiles() *CNIConfigResult
// CheckInterfaces validates expected network interfaces exist.
CheckInterfaces() *CNIInterfaceResult
}
CNIHealthChecker validates CNI configuration and network interfaces.
func NewCNIHealthChecker ¶
func NewCNIHealthChecker(config CNIHealthConfig) CNIHealthChecker
NewCNIHealthChecker creates a new CNI health checker.
type CNIHealthConfig ¶
type CNIHealthConfig struct {
// Enabled indicates whether CNI health checks are enabled.
Enabled bool
// ConfigPath is the path to CNI configuration directory.
ConfigPath string
// CheckInterfaces enables interface health checking.
CheckInterfaces bool
// ExpectedInterfaces is a list of expected CNI interfaces.
ExpectedInterfaces []string
}
CNIHealthConfig holds CNI health check configuration.
type CNIHealthResult ¶
type CNIHealthResult struct {
Healthy bool
ConfigResult *CNIConfigResult
IfaceResult *CNIInterfaceResult
Issues []string
}
CNIHealthResult holds the overall CNI health status.
type CNIInterface ¶
type CNIInterface struct {
Name string
Type string // e.g., "calico", "flannel", "weave", "bridge", "veth"
Up bool
MTU int
HasIPv4 bool
HasIPv6 bool
Addrs []string
}
CNIInterface represents a discovered CNI-related network interface.
type CNIInterfaceResult ¶
type CNIInterfaceResult struct {
Healthy bool
ExpectedInterfaces []string
FoundInterfaces []string
MissingInterfaces []string
CNIInterfaces []CNIInterface
Errors []string
}
CNIInterfaceResult holds network interface check results.
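The relationship between the expected, found, and missing interface lists can be sketched as a set difference. This is a minimal illustration, not the package's implementation: `interfaceResult` mirrors only a subset of the documented fields, and `compareInterfaces` is a hypothetical helper that takes the discovered interface names as input instead of enumerating them from the host.

```go
package main

import (
	"fmt"
	"sort"
)

// interfaceResult mirrors a subset of the documented CNIInterfaceResult fields.
type interfaceResult struct {
	Healthy            bool
	ExpectedInterfaces []string
	FoundInterfaces    []string
	MissingInterfaces  []string
}

// compareInterfaces is a hypothetical helper. It reports every expected
// interface name that does not appear in the found list.
func compareInterfaces(expected, found []string) interfaceResult {
	present := make(map[string]bool, len(found))
	for _, name := range found {
		present[name] = true
	}
	res := interfaceResult{ExpectedInterfaces: expected, FoundInterfaces: found}
	for _, name := range expected {
		if !present[name] {
			res.MissingInterfaces = append(res.MissingInterfaces, name)
		}
	}
	sort.Strings(res.MissingInterfaces)
	// Assumption: the check is healthy only when nothing is missing.
	res.Healthy = len(res.MissingInterfaces) == 0
	return res
}

func main() {
	res := compareInterfaces(
		[]string{"cni0", "flannel.1"},
		[]string{"lo", "eth0", "cni0"},
	)
	fmt.Println("healthy:", res.Healthy, "missing:", res.MissingInterfaces)
}
```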
type CNIMonitor ¶
type CNIMonitor struct {
*monitors.BaseMonitor
// contains filtered or unexported fields
}
CNIMonitor monitors CNI connectivity and cross-node health.
func (*CNIMonitor) GetPeerStatuses ¶
func (m *CNIMonitor) GetPeerStatuses() map[string]*PeerStatus
GetPeerStatuses returns the current peer status map. This is useful for debugging and monitoring.
func (*CNIMonitor) Start ¶
func (m *CNIMonitor) Start() (<-chan *types.Status, error)
Start starts the CNI monitor with peer discovery.
func (*CNIMonitor) Stop ¶
func (m *CNIMonitor) Stop()
Stop stops the CNI monitor and peer discovery.
type CNIMonitorConfig ¶
type CNIMonitorConfig struct {
// Discovery configuration
Discovery DiscoveryConfig
// Connectivity configuration
Connectivity ConnectivityConfig
// CNI health check configuration
CNIHealth CNIHealthConfig
}
CNIMonitorConfig holds the configuration for the CNI connectivity monitor.
type CheckResult ¶ added in v1.5.0
CheckResult represents a single DNS check outcome for success rate tracking.
type ConnectivityConfig ¶
type ConnectivityConfig struct {
// PingCount is the number of pings to send per peer.
PingCount int
// PingTimeout is the timeout for each ping.
PingTimeout time.Duration
// WarningLatency is the latency threshold for warnings.
WarningLatency time.Duration
// CriticalLatency is the latency threshold for critical conditions.
CriticalLatency time.Duration
// FailureThreshold is consecutive failures before marking peer unreachable.
FailureThreshold int
// MinReachablePeers is the percentage of peers that must be reachable.
MinReachablePeers int
// ProbeMethod is the connectivity probe method: "http" or "icmp".
// Empty string means auto-detect based on OverlayTestEnabled.
ProbeMethod string
// ProbePort is the HTTP probe port (default 8023).
ProbePort int
// ProbePath is the HTTP probe path (default "/healthz").
ProbePath string
}
ConnectivityConfig holds connectivity check configuration.
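The latency thresholds and the reachable-peer percentage might be applied roughly as follows. This is a sketch under assumptions: `connectivityThresholds`, `classifyLatency`, and `enoughPeersReachable` are hypothetical names, and the inclusive comparisons (`>=`) are a guess, since the documentation does not say whether a value equal to a threshold trips it.

```go
package main

import (
	"fmt"
	"time"
)

// connectivityThresholds mirrors three fields of the documented ConnectivityConfig.
type connectivityThresholds struct {
	WarningLatency    time.Duration
	CriticalLatency   time.Duration
	MinReachablePeers int // percentage of peers, 0-100
}

// classifyLatency maps a round-trip time onto "ok", "warning", or "critical".
// Assumption: a latency equal to a threshold counts as exceeding it.
func classifyLatency(rtt time.Duration, th connectivityThresholds) string {
	switch {
	case rtt >= th.CriticalLatency:
		return "critical"
	case rtt >= th.WarningLatency:
		return "warning"
	default:
		return "ok"
	}
}

// enoughPeersReachable compares the reachable share of peers with the
// configured minimum percentage, using integer math to avoid rounding.
func enoughPeersReachable(reachable, total int, th connectivityThresholds) bool {
	if total == 0 {
		return true // assumption: with no peers there is nothing to fail
	}
	return reachable*100 >= th.MinReachablePeers*total
}

func main() {
	th := connectivityThresholds{
		WarningLatency:    50 * time.Millisecond,
		CriticalLatency:   200 * time.Millisecond,
		MinReachablePeers: 80,
	}
	fmt.Println(classifyLatency(75*time.Millisecond, th))
	fmt.Println(enoughPeersReachable(3, 4, th)) // 75% reachable against an 80% minimum
}
```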
type ConnectivityMonitor ¶
type ConnectivityMonitor struct {
*monitors.BaseMonitor
// contains filtered or unexported fields
}
ConnectivityMonitor monitors external network connectivity by checking HTTP/HTTPS endpoints.
type ConnectivityMonitorConfig ¶
type ConnectivityMonitorConfig struct {
// Endpoints is the list of endpoints to check for connectivity.
Endpoints []EndpointConfig
// FailureThreshold is the number of consecutive failures before reporting NetworkUnreachable.
FailureThreshold int
}
ConnectivityMonitorConfig holds the configuration for the connectivity monitor.
type ConsistencyCheckConfig ¶ added in v1.5.0
type ConsistencyCheckConfig struct {
Enabled bool `json:"enabled"`
QueriesPerCheck int `json:"queriesPerCheck"` // Number of rapid queries (default: 5)
IntervalBetweenQueries time.Duration `json:"intervalBetweenQueries"` // Delay between queries (default: 200ms)
}
ConsistencyCheckConfig holds configuration for DNS consistency checking. Consistency checking performs multiple rapid queries to detect intermittent DNS issues.
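A consistency check of this kind can be sketched as a loop that issues `QueriesPerCheck` lookups separated by `IntervalBetweenQueries` and counts the outcomes. The code below is illustrative only: `lookupFunc`, `checkDomain`, and `flakyLookup` are hypothetical names, the lookup is simulated instead of hitting a real resolver, and the 5-second overall deadline is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// lookupFunc has the same shape as the documented Resolver.LookupHost method.
type lookupFunc func(ctx context.Context, host string) ([]string, error)

// consistencyConfig mirrors two fields of the documented ConsistencyCheckConfig.
type consistencyConfig struct {
	QueriesPerCheck        int
	IntervalBetweenQueries time.Duration
}

// checkDomain runs the configured number of rapid queries and counts results.
// A mix of successes and failures points at an intermittent problem.
func checkDomain(lookup lookupFunc, host string, cfg consistencyConfig) (successes, failures int) {
	// Assumption: bound the whole check with a fixed deadline.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	for i := 0; i < cfg.QueriesPerCheck; i++ {
		if i > 0 {
			select {
			case <-ctx.Done():
				return successes, failures
			case <-time.After(cfg.IntervalBetweenQueries):
			}
		}
		if _, err := lookup(ctx, host); err != nil {
			failures++
		} else {
			successes++
		}
	}
	return successes, failures
}

// flakyLookup simulates a resolver that fails on every failEvery-th call.
func flakyLookup(failEvery int) lookupFunc {
	calls := 0
	return func(ctx context.Context, host string) ([]string, error) {
		calls++
		if failEvery > 0 && calls%failEvery == 0 {
			return nil, errors.New("simulated DNS failure")
		}
		return []string{"10.96.0.1"}, nil
	}
}

func main() {
	cfg := consistencyConfig{QueriesPerCheck: 5, IntervalBetweenQueries: 200 * time.Millisecond}
	ok, failed := checkDomain(flakyLookup(2), "kubernetes.default.svc.cluster.local", cfg)
	fmt.Printf("successes=%d failures=%d intermittent=%v\n", ok, failed, ok > 0 && failed > 0)
}
```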
type DNSErrorType ¶ added in v1.5.0
type DNSErrorType string
DNSErrorType represents the classification of DNS errors for better diagnostics.
const (
// DNSErrorTimeout indicates the DNS query timed out.
// Suggests: Network issues, server overload, or firewall blocking.
DNSErrorTimeout DNSErrorType = "Timeout"
// DNSErrorNXDOMAIN indicates the domain does not exist.
// Suggests: Typo in domain, missing DNS record, or DNS zone not configured.
DNSErrorNXDOMAIN DNSErrorType = "NXDOMAIN"
// DNSErrorSERVFAIL indicates the server failed to complete the query.
// Suggests: Upstream DNS server error or DNSSEC validation failure.
DNSErrorSERVFAIL DNSErrorType = "SERVFAIL"
// DNSErrorRefused indicates the connection to the DNS server was refused.
// Suggests: DNS server down, wrong port, or firewall blocking.
DNSErrorRefused DNSErrorType = "Refused"
// DNSErrorTemporary indicates a temporary/transient DNS failure.
// Suggests: Retry may succeed.
DNSErrorTemporary DNSErrorType = "Temporary"
// DNSErrorUnknown indicates an unclassified DNS error.
DNSErrorUnknown DNSErrorType = "Unknown"
)
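One plausible way to map a Go resolver error onto these categories is to inspect the standard library's `net.DNSError`. The sketch below is a heuristic, not the package's classifier: `classifyDNSError` and `sampleDNSError` are hypothetical, the local constants only mirror the documented string values, and the substring checks for "server misbehaving" and "connection refused" are assumptions about how such errors are worded.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// dnsErrorType and its constants mirror the documented DNSErrorType values.
type dnsErrorType string

const (
	dnsErrorTimeout   dnsErrorType = "Timeout"
	dnsErrorNXDOMAIN  dnsErrorType = "NXDOMAIN"
	dnsErrorSERVFAIL  dnsErrorType = "SERVFAIL"
	dnsErrorRefused   dnsErrorType = "Refused"
	dnsErrorTemporary dnsErrorType = "Temporary"
	dnsErrorUnknown   dnsErrorType = "Unknown"
)

// classifyDNSError is a hypothetical classifier built on net.DNSError flags.
// Callers are expected to pass a non-nil error.
func classifyDNSError(err error) dnsErrorType {
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		switch {
		case dnsErr.IsTimeout:
			return dnsErrorTimeout
		case dnsErr.IsNotFound:
			return dnsErrorNXDOMAIN
		case strings.Contains(dnsErr.Err, "server misbehaving"):
			// Assumption: treat this resolver message as SERVFAIL.
			return dnsErrorSERVFAIL
		case dnsErr.IsTemporary:
			return dnsErrorTemporary
		}
	}
	if err != nil && strings.Contains(err.Error(), "connection refused") {
		return dnsErrorRefused
	}
	return dnsErrorUnknown
}

// sampleDNSError builds representative errors for demonstration purposes.
func sampleDNSError(kind string) error {
	switch kind {
	case "timeout":
		return &net.DNSError{Err: "i/o timeout", Name: "example.com", IsTimeout: true}
	case "notfound":
		return &net.DNSError{Err: "no such host", Name: "example.com", IsNotFound: true}
	case "refused":
		return errors.New("dial udp 10.0.0.10:53: connect: connection refused")
	default:
		return errors.New("unexpected failure")
	}
}

func main() {
	for _, kind := range []string{"timeout", "notfound", "refused", "other"} {
		fmt.Printf("%-8s -> %s\n", kind, classifyDNSError(sampleDNSError(kind)))
	}
}
```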
type DNSMonitor ¶
type DNSMonitor struct {
// BaseMonitor for lifecycle management
*monitors.BaseMonitor
// contains filtered or unexported fields
}
DNSMonitor monitors DNS resolution health.
type DNSMonitorConfig ¶
type DNSMonitorConfig struct {
// ClusterDomains are Kubernetes cluster internal domains to test
ClusterDomains []string `json:"clusterDomains"`
// ExternalDomains are external domains to test for internet connectivity
ExternalDomains []string `json:"externalDomains"`
// CustomQueries are additional custom DNS queries to perform
CustomQueries []DNSQuery `json:"customQueries"`
// LatencyThreshold is the maximum acceptable DNS query latency
LatencyThreshold time.Duration `json:"latencyThreshold"`
// NameserverCheckEnabled enables checking nameserver reachability
NameserverCheckEnabled bool `json:"nameserverCheckEnabled"`
// ResolverPath is the path to the resolver configuration file
ResolverPath string `json:"resolverPath"`
// FailureCountThreshold is the number of consecutive failures before reporting NetworkUnreachable
FailureCountThreshold int `json:"failureCountThreshold"`
// SuccessRateTracking configures sliding window success rate tracking
SuccessRateTracking *SuccessRateConfig `json:"successRateTracking"`
// ConsistencyChecking configures DNS consistency verification via multiple rapid queries
ConsistencyChecking *ConsistencyCheckConfig `json:"consistencyChecking"`
}
DNSMonitorConfig holds the configuration for the DNS monitor.
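The JSON tags above suggest a configuration document shaped like the one below. This sketch decodes it into local structs that mirror a subset of the documented fields. It is an assumption that the monitor reads its settings this way, and the duration field is left out because the documentation does not state how durations are encoded. `dnsConfig`, `successRateConfig`, `parseDNSConfig`, and the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// successRateConfig mirrors the documented SuccessRateConfig JSON tags.
type successRateConfig struct {
	Enabled              bool    `json:"enabled"`
	WindowSize           int     `json:"windowSize"`
	FailureRateThreshold float64 `json:"failureRateThreshold"`
	MinSamplesRequired   int     `json:"minSamplesRequired"`
}

// dnsConfig mirrors a subset of the documented DNSMonitorConfig JSON tags.
type dnsConfig struct {
	ClusterDomains         []string           `json:"clusterDomains"`
	ExternalDomains        []string           `json:"externalDomains"`
	NameserverCheckEnabled bool               `json:"nameserverCheckEnabled"`
	ResolverPath           string             `json:"resolverPath"`
	FailureCountThreshold  int                `json:"failureCountThreshold"`
	SuccessRateTracking    *successRateConfig `json:"successRateTracking"`
}

// sampleConfig is an illustrative document; the values are made up.
const sampleConfig = `{
  "clusterDomains": ["kubernetes.default.svc.cluster.local"],
  "externalDomains": ["example.com"],
  "nameserverCheckEnabled": true,
  "resolverPath": "/etc/resolv.conf",
  "failureCountThreshold": 3,
  "successRateTracking": {
    "enabled": true,
    "windowSize": 10,
    "failureRateThreshold": 0.3,
    "minSamplesRequired": 5
  }
}`

// parseDNSConfig decodes a JSON document into the mirrored struct.
func parseDNSConfig(raw string) (*dnsConfig, error) {
	var cfg dnsConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := parseDNSConfig(sampleConfig)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("cluster domains:", cfg.ClusterDomains)
	fmt.Println("window size:", cfg.SuccessRateTracking.WindowSize)
}
```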
type DNSQuery ¶
type DNSQuery struct {
Domain string
RecordType string // Currently only "A" is supported
TestEachNameserver bool // Test this domain against each nameserver individually
ConsistencyCheck bool // Enable consistency checking with multiple rapid queries
}
DNSQuery represents a custom DNS query configuration.
type DiscoveryConfig ¶
type DiscoveryConfig struct {
// Method is the discovery method ("kubernetes" or "static").
Method string
// Namespace to search for peers (for kubernetes method).
Namespace string
// LabelSelector for filtering pods (for kubernetes method).
LabelSelector string
// RefreshInterval is how often to refresh the peer list.
RefreshInterval time.Duration
// StaticPeers is a list of static peer IPs (for static method).
StaticPeers []string
// OverlayTestEnabled enables overlay network testing mode.
// When enabled, discovers overlay-test pods and pings their overlay IPs
// instead of node IPs. This provides accurate CNI/overlay network testing.
OverlayTestEnabled bool
// OverlayTestLabelSelector is the label selector for overlay test pods.
OverlayTestLabelSelector string
}
DiscoveryConfig holds peer discovery configuration.
type EndpointConfig ¶
type EndpointConfig struct {
// Name is a human-readable identifier for this endpoint.
Name string
// URL is the endpoint URL to check.
URL string
// Method is the HTTP method to use (GET, HEAD, POST, etc.).
Method string
// ExpectedStatusCode is the HTTP status code expected for a successful check.
ExpectedStatusCode int
// Timeout is the maximum duration for the HTTP request.
Timeout time.Duration
// Headers are optional HTTP headers to include in the request.
Headers map[string]string
// FollowRedirects determines whether to follow HTTP redirects.
FollowRedirects bool
}
EndpointConfig defines the configuration for a connectivity endpoint check.
type EndpointResult ¶
type EndpointResult struct {
// Success indicates whether the endpoint check succeeded.
Success bool
// StatusCode is the HTTP response status code (0 if request failed).
StatusCode int
// ResponseTime is the duration of the HTTP request.
ResponseTime time.Duration
// Error contains any error that occurred during the check.
Error error
// ErrorType classifies the error (DNS, Connection, HTTP, Timeout, etc.).
ErrorType string
}
EndpointResult contains the result of an endpoint connectivity check.
type GatewayMonitor ¶
type GatewayMonitor struct {
*monitors.BaseMonitor
// contains filtered or unexported fields
}
GatewayMonitor monitors the default gateway's reachability and latency.
type GatewayMonitorConfig ¶
type GatewayMonitorConfig struct {
// PingCount is the number of pings to send per check cycle.
PingCount int
// PingTimeout is the timeout for each individual ping.
PingTimeout time.Duration
// LatencyThreshold is the threshold above which latency is considered high.
LatencyThreshold time.Duration
// AutoDetectGateway enables automatic gateway detection from route table.
AutoDetectGateway bool
// ManualGateway allows manual specification of gateway IP (overrides auto-detection).
ManualGateway string
// FailureCountThreshold is the number of consecutive failures before reporting NetworkUnreachable.
FailureCountThreshold int
}
GatewayMonitorConfig holds the configuration for the gateway monitor.
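The precedence between `ManualGateway` and `AutoDetectGateway` could look like the sketch below. The route-table parsing assumes the Linux `/proc/net/route` layout (whitespace-separated columns, gateway as a hexadecimal value in little-endian byte order), and the documentation does not confirm that this is the source the monitor reads. `parseDefaultGateway`, `chooseGateway`, and `sampleRoutes` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// sampleRoutes imitates the first columns of /proc/net/route.
const sampleRoutes = "Iface\tDestination\tGateway\tFlags\n" +
	"eth0\t00000000\t0102A8C0\t0003\n" +
	"eth0\t0002A8C0\t00000000\t0001\n"

// parseDefaultGateway returns the gateway of the first default route
// (destination 00000000) that has a non-zero gateway address.
func parseDefaultGateway(routeTable string) (string, bool) {
	for i, line := range strings.Split(routeTable, "\n") {
		if i == 0 {
			continue // skip the header row
		}
		fields := strings.Fields(line)
		if len(fields) < 3 || fields[1] != "00000000" {
			continue
		}
		raw, err := strconv.ParseUint(fields[2], 16, 32)
		if err != nil || raw == 0 {
			continue
		}
		// Assumption: little-endian byte order, as on x86 and most ARM hosts.
		ip := net.IPv4(byte(raw), byte(raw>>8), byte(raw>>16), byte(raw>>24))
		return ip.String(), true
	}
	return "", false
}

// chooseGateway applies the documented precedence: a manual gateway
// overrides auto-detection.
func chooseGateway(manual string, autoDetect bool, routeTable string) (string, error) {
	if manual != "" {
		return manual, nil
	}
	if autoDetect {
		if gw, ok := parseDefaultGateway(routeTable); ok {
			return gw, nil
		}
	}
	return "", errors.New("no gateway configured or detected")
}

func main() {
	gw, err := chooseGateway("", true, sampleRoutes)
	fmt.Println(gw, err)
	gw, err = chooseGateway("10.0.0.1", true, sampleRoutes)
	fmt.Println(gw, err)
}
```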
type HTTPClient ¶
type HTTPClient interface {
// CheckEndpoint performs a connectivity check against the given endpoint.
CheckEndpoint(ctx context.Context, endpoint EndpointConfig) (*EndpointResult, error)
}
HTTPClient interface abstracts HTTP endpoint checking for testability.
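A type satisfying an interface of this shape can be sketched with `net/http`. Everything below is an illustration under assumptions: `endpointConfig` and `endpointResult` mirror only some documented fields, `simpleClient` and `checkLocal` are hypothetical, the 5-second fallback timeout is made up, and returning request failures inside the result (with a nil error) is one possible convention, not necessarily the package's.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// endpointConfig mirrors a subset of the documented EndpointConfig fields.
type endpointConfig struct {
	Name               string
	URL                string
	Method             string
	ExpectedStatusCode int
	Timeout            time.Duration
	Headers            map[string]string
	FollowRedirects    bool
}

// endpointResult mirrors a subset of the documented EndpointResult fields.
type endpointResult struct {
	Success      bool
	StatusCode   int
	ResponseTime time.Duration
	Error        error
}

// simpleClient is a hypothetical implementation of the CheckEndpoint method.
type simpleClient struct{}

func (simpleClient) CheckEndpoint(ctx context.Context, ep endpointConfig) (*endpointResult, error) {
	timeout := ep.Timeout
	if timeout <= 0 {
		timeout = 5 * time.Second // assumption: fallback when unset
	}
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, ep.Method, ep.URL, nil)
	if err != nil {
		return nil, err
	}
	for k, v := range ep.Headers {
		req.Header.Set(k, v)
	}

	client := &http.Client{}
	if !ep.FollowRedirects {
		// Return the redirect response itself instead of following it.
		client.CheckRedirect = func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		}
	}

	start := time.Now()
	resp, err := client.Do(req)
	elapsed := time.Since(start)
	if err != nil {
		return &endpointResult{ResponseTime: elapsed, Error: err}, nil
	}
	defer resp.Body.Close()
	return &endpointResult{
		Success:      resp.StatusCode == ep.ExpectedStatusCode,
		StatusCode:   resp.StatusCode,
		ResponseTime: elapsed,
	}, nil
}

// checkLocal runs the client against a throwaway local test server.
func checkLocal(serverStatus, expected int) (*endpointResult, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(serverStatus)
	}))
	defer srv.Close()
	return simpleClient{}.CheckEndpoint(context.Background(), endpointConfig{
		Name: "local", URL: srv.URL, Method: http.MethodGet,
		ExpectedStatusCode: expected, Timeout: 2 * time.Second,
	})
}

func main() {
	res, err := checkLocal(http.StatusNoContent, http.StatusNoContent)
	if err != nil {
		fmt.Println("request could not be built:", err)
		return
	}
	fmt.Println("success:", res.Success, "status:", res.StatusCode)
}
```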
type IPForwardingConfig ¶ added in v1.7.2
type IPForwardingConfig struct {
// CheckIPv4 enables checking /proc/sys/net/ipv4/ip_forward.
CheckIPv4 bool
// CheckIPv6 enables checking /proc/sys/net/ipv6/conf/all/forwarding.
CheckIPv6 bool
// CheckPerInterface enables checking per-interface forwarding settings.
CheckPerInterface bool
// Interfaces limits per-interface checks to these specific interfaces.
// Empty means check all interfaces found via glob.
Interfaces []string
// ProcPath is the base path for the proc filesystem.
// Defaults to "/proc", but can be set to "/host/proc" for containerized deployments.
ProcPath string
}
IPForwardingConfig holds the configuration for the IP forwarding monitor.
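Reading a forwarding flag relative to `ProcPath` might be done as in the sketch below, which treats a file content of `1` as enabled. `forwardingEnabled` and `checkFakeProc` are hypothetical helpers. The demonstration writes to a temporary directory instead of the real proc filesystem, so it can run anywhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// ipv4ForwardPath is the IPv4 flag location relative to the proc base path.
const ipv4ForwardPath = "sys/net/ipv4/ip_forward"

// forwardingEnabled reads a forwarding flag below procPath and reports
// whether it is set to "1". An empty procPath falls back to "/proc".
func forwardingEnabled(procPath, rel string) (bool, error) {
	if procPath == "" {
		procPath = "/proc"
	}
	data, err := os.ReadFile(filepath.Join(procPath, rel))
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(data)) == "1", nil
}

// checkFakeProc builds a temporary proc-like tree containing the given
// value and evaluates it, which keeps the example independent of the host.
func checkFakeProc(value string) (bool, error) {
	dir, err := os.MkdirTemp("", "fakeproc")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, ipv4ForwardPath)
	if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
		return false, err
	}
	if err := os.WriteFile(target, []byte(value), 0o644); err != nil {
		return false, err
	}
	return forwardingEnabled(dir, ipv4ForwardPath)
}

func main() {
	for _, value := range []string{"1\n", "0\n"} {
		enabled, err := checkFakeProc(value)
		fmt.Printf("content %q -> enabled=%v err=%v\n", value, enabled, err)
	}
}
```

In a containerized deployment the same function would be called with a base path such as "/host/proc", as the `ProcPath` comment describes.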
type IPForwardingMonitor ¶ added in v1.7.2
type IPForwardingMonitor struct {
*monitors.BaseMonitor
// contains filtered or unexported fields
}
IPForwardingMonitor monitors IP forwarding settings required for Kubernetes networking.
type NameserverDomainStatus ¶ added in v1.5.0
type NameserverDomainStatus struct {
Nameserver string
Domain string
FailureCount int
LastSuccess time.Time
LastLatency time.Duration
}
NameserverDomainStatus tracks the status of DNS resolution for a specific nameserver and domain combination.
type Peer ¶
type Peer struct {
// Name is the pod name.
Name string
// NodeName is the Kubernetes node the peer is running on.
NodeName string
// NodeIP is the IP address of the node (used for pinging since hostNetwork=true).
NodeIP string
// PodIP is the pod IP (same as NodeIP when using hostNetwork).
PodIP string
// LastSeen is when this peer was last seen in discovery.
LastSeen time.Time
}
Peer represents a discovered node-doctor peer.
type PeerDiscovery ¶
type PeerDiscovery interface {
// GetPeers returns the current list of discovered peers.
GetPeers() []Peer
// Refresh forces an immediate refresh of the peer list.
Refresh(ctx context.Context) error
// Start begins background peer discovery.
Start(ctx context.Context) error
// Stop stops background peer discovery.
Stop()
}
PeerDiscovery handles discovery of node-doctor peer instances.
func NewKubernetesPeerDiscovery ¶
func NewKubernetesPeerDiscovery(config *PeerDiscoveryConfig) (PeerDiscovery, error)
NewKubernetesPeerDiscovery creates a new peer discovery instance using Kubernetes API.
func NewKubernetesPeerDiscoveryWithClient ¶
func NewKubernetesPeerDiscoveryWithClient(config *PeerDiscoveryConfig, client kubernetes.Interface) (PeerDiscovery, error)
NewKubernetesPeerDiscoveryWithClient creates a peer discovery instance with an existing client. This is useful for testing with mock clients.
func NewStaticPeerDiscovery ¶
func NewStaticPeerDiscovery(peers []Peer) PeerDiscovery
NewStaticPeerDiscovery creates a peer discovery instance with a static list.
type PeerDiscoveryConfig ¶
type PeerDiscoveryConfig struct {
// Namespace to search for peers.
Namespace string
// LabelSelector for filtering pods (e.g., "app=node-doctor").
LabelSelector string
// RefreshInterval is how often to refresh the peer list.
RefreshInterval time.Duration
// SelfNodeName is the name of the current node (to exclude self).
SelfNodeName string
// Kubeconfig path (optional, uses in-cluster config if empty).
Kubeconfig string
}
PeerDiscoveryConfig holds configuration for peer discovery.
type PeerStatus ¶
type PeerStatus struct {
Peer Peer
Reachable bool
LastLatency time.Duration
AvgLatency time.Duration
FailureCount int
LastCheck time.Time
LastSuccess time.Time
ConsecutiveFails int
}
PeerStatus tracks the status of a single peer.
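How the failure counters might evolve across probes can be sketched as follows. This is an illustration, not the monitor's logic: `peerStatus` mirrors only four of the documented fields, `recordProbe` is hypothetical, and the rule that a peer becomes unreachable once consecutive failures reach the threshold is taken from the `FailureThreshold` comment in `ConnectivityConfig`.

```go
package main

import (
	"fmt"
	"time"
)

// peerStatus mirrors a subset of the documented PeerStatus fields.
type peerStatus struct {
	Reachable        bool
	LastLatency      time.Duration
	FailureCount     int
	ConsecutiveFails int
}

// recordProbe folds one probe outcome into the status. A success resets the
// consecutive counter; a failure marks the peer unreachable only after
// failureThreshold consecutive failures.
func recordProbe(s *peerStatus, ok bool, rtt time.Duration, failureThreshold int) {
	if ok {
		s.Reachable = true
		s.ConsecutiveFails = 0
		s.LastLatency = rtt
		return
	}
	s.FailureCount++
	s.ConsecutiveFails++
	if s.ConsecutiveFails >= failureThreshold {
		s.Reachable = false
	}
}

func main() {
	st := &peerStatus{Reachable: true}
	outcomes := []bool{false, false, false, true}
	for _, ok := range outcomes {
		recordProbe(st, ok, 3*time.Millisecond, 3)
		fmt.Printf("probe ok=%v -> reachable=%v consecutive=%d total=%d\n",
			ok, st.Reachable, st.ConsecutiveFails, st.FailureCount)
	}
}
```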
type PingResult ¶
type PingResult struct {
// Success indicates whether the ping succeeded.
Success bool
// RTT is the round-trip time for successful pings.
RTT time.Duration
// Error contains the error if the ping failed.
Error error
}
PingResult represents the result of a single ping operation.
type Pinger ¶
type Pinger interface {
// Ping sends connectivity probes to the target IP address.
// It returns a slice of results, one for each probe attempt.
Ping(ctx context.Context, target string, count int, timeout time.Duration) ([]PingResult, error)
}
Pinger is an interface for connectivity probe operations (ICMP or HTTP). This interface allows for mocking probe operations in tests.
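Mocking this interface in a test might look like the sketch below, which also shows one way to reduce a slice of results to a success count and an average round-trip time. The local `pinger` and `pingResult` types mirror the documented declarations; `fakePinger` and `summarize` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pingResult mirrors the documented PingResult fields.
type pingResult struct {
	Success bool
	RTT     time.Duration
	Error   error
}

// pinger mirrors the documented Pinger interface.
type pinger interface {
	Ping(ctx context.Context, target string, count int, timeout time.Duration) ([]pingResult, error)
}

// fakePinger replays scripted results instead of sending real probes.
type fakePinger struct {
	script []pingResult
}

func (f fakePinger) Ping(ctx context.Context, target string, count int, timeout time.Duration) ([]pingResult, error) {
	if count > len(f.script) {
		return nil, errors.New("fake pinger: not enough scripted results")
	}
	return f.script[:count], nil
}

// summarize counts successful probes and averages their round-trip times.
func summarize(results []pingResult) (successes int, avg time.Duration) {
	var total time.Duration
	for _, r := range results {
		if r.Success {
			successes++
			total += r.RTT
		}
	}
	if successes > 0 {
		avg = total / time.Duration(successes)
	}
	return successes, avg
}

func main() {
	var p pinger = fakePinger{script: []pingResult{
		{Success: true, RTT: 2 * time.Millisecond},
		{Success: false, Error: errors.New("simulated timeout")},
		{Success: true, RTT: 4 * time.Millisecond},
	}}
	results, err := p.Ping(context.Background(), "10.0.0.2", 3, time.Second)
	if err != nil {
		fmt.Println("ping failed:", err)
		return
	}
	ok, avg := summarize(results)
	fmt.Printf("%d/%d probes succeeded, average RTT %v\n", ok, len(results), avg)
}
```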
type Resolver ¶
type Resolver interface {
// LookupHost looks up the given host using the local resolver.
// It returns a slice of that host's addresses.
LookupHost(ctx context.Context, host string) ([]string, error)
// LookupAddr performs a reverse lookup for the given address,
// returning a list of names mapping to that address.
LookupAddr(ctx context.Context, addr string) ([]string, error)
}
Resolver is an interface for DNS resolution operations. This interface allows for mocking DNS resolution in tests.
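The standard library's `*net.Resolver` provides both methods with these signatures, so it can stand behind an interface of this shape, while tests can substitute a stub. The sketch below shows both, together with a timed lookup compared against a latency threshold. `resolver` is a local mirror of the documented interface; `stubResolver` and `timedLookup` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolver mirrors the documented Resolver interface.
type resolver interface {
	LookupHost(ctx context.Context, host string) ([]string, error)
	LookupAddr(ctx context.Context, addr string) ([]string, error)
}

// Compile-time check that the standard library resolver satisfies it.
var _ resolver = (*net.Resolver)(nil)

// stubResolver answers from an in-memory table, which suits unit tests.
type stubResolver struct {
	hosts map[string][]string
}

func (s stubResolver) LookupHost(ctx context.Context, host string) ([]string, error) {
	if addrs, ok := s.hosts[host]; ok {
		return addrs, nil
	}
	return nil, &net.DNSError{Err: "no such host", Name: host, IsNotFound: true}
}

func (s stubResolver) LookupAddr(ctx context.Context, addr string) ([]string, error) {
	for name, addrs := range s.hosts {
		for _, a := range addrs {
			if a == addr {
				return []string{name}, nil
			}
		}
	}
	return nil, &net.DNSError{Err: "no such host", Name: addr, IsNotFound: true}
}

// timedLookup measures a forward lookup and flags it as slow when the
// latency exceeds the given threshold.
func timedLookup(r resolver, host string, threshold time.Duration) (latency time.Duration, slow bool, err error) {
	start := time.Now()
	_, err = r.LookupHost(context.Background(), host)
	latency = time.Since(start)
	return latency, latency > threshold, err
}

func main() {
	stub := stubResolver{hosts: map[string][]string{
		"kubernetes.default.svc.cluster.local": {"10.96.0.1"},
	}}
	latency, slow, err := timedLookup(stub, "kubernetes.default.svc.cluster.local", time.Second)
	fmt.Println("latency:", latency, "slow:", slow, "err:", err)
}
```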
type RingBuffer ¶ added in v1.5.0
type RingBuffer struct {
// contains filtered or unexported fields
}
RingBuffer implements a fixed-size circular buffer for tracking check results. It provides O(1) insertions and O(n) success rate calculation.
func NewRingBuffer ¶ added in v1.5.0
func NewRingBuffer(size int) *RingBuffer
NewRingBuffer creates a new ring buffer with specified capacity.
func (*RingBuffer) Add ¶ added in v1.5.0
func (rb *RingBuffer) Add(result *CheckResult)
Add appends a result to the buffer, overwriting the oldest entry if full.
func (*RingBuffer) Count ¶ added in v1.5.0
func (rb *RingBuffer) Count() int
Count returns the number of valid entries in the buffer.
func (*RingBuffer) GetFailureRate ¶ added in v1.5.0
func (rb *RingBuffer) GetFailureRate() float64
GetFailureRate calculates the failure percentage in the buffer.
func (*RingBuffer) GetSuccessRate ¶ added in v1.5.0
func (rb *RingBuffer) GetSuccessRate() float64
GetSuccessRate calculates the success percentage in the buffer. Returns 0.0 if no entries exist.
func (*RingBuffer) Size ¶ added in v1.5.0
func (rb *RingBuffer) Size() int
Size returns the capacity of the buffer.
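The O(1) insertion and O(n) rate calculation described above follow from the usual circular-buffer layout, sketched below. This is not the package's code: `ringBuffer` stores plain booleans instead of `*CheckResult` values, the method names are lowercase stand-ins, and returning the rate as a fraction between 0 and 1 is an assumption, since the documentation does not state the scale.

```go
package main

import "fmt"

// ringBuffer is a simplified stand-in for the documented RingBuffer.
type ringBuffer struct {
	results []bool
	next    int // index the next add will write to
	count   int // number of valid entries, at most len(results)
}

// newRingBuffer allocates a buffer with the given capacity.
// Assumption: capacities below 1 are clamped to 1.
func newRingBuffer(size int) *ringBuffer {
	if size < 1 {
		size = 1
	}
	return &ringBuffer{results: make([]bool, size)}
}

// add stores a result in O(1), overwriting the oldest entry once full.
func (rb *ringBuffer) add(success bool) {
	rb.results[rb.next] = success
	rb.next = (rb.next + 1) % len(rb.results)
	if rb.count < len(rb.results) {
		rb.count++
	}
}

// successRate scans the valid entries in O(n) and returns the fraction of
// successes, or 0.0 when the buffer is empty.
func (rb *ringBuffer) successRate() float64 {
	if rb.count == 0 {
		return 0.0
	}
	successes := 0
	for i := 0; i < rb.count; i++ {
		if rb.results[i] {
			successes++
		}
	}
	return float64(successes) / float64(rb.count)
}

func main() {
	rb := newRingBuffer(4)
	for _, ok := range []bool{true, true, false, true} {
		rb.add(ok)
	}
	fmt.Printf("entries=%d success rate=%.2f\n", rb.count, rb.successRate())

	rb.add(false) // buffer is full, so this overwrites the oldest entry
	fmt.Printf("entries=%d success rate=%.2f\n", rb.count, rb.successRate())
}
```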
type SuccessRateConfig ¶ added in v1.5.0
type SuccessRateConfig struct {
Enabled bool `json:"enabled"`
WindowSize int `json:"windowSize"` // Number of checks to track (default: 10)
FailureRateThreshold float64 `json:"failureRateThreshold"` // Alert if failure rate exceeds this (default: 0.3 = 30%)
MinSamplesRequired int `json:"minSamplesRequired"` // Minimum samples before alerting (default: 5)
}
SuccessRateConfig holds configuration for success rate tracking.
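Taken together, the fields above suggest an alerting rule along these lines: fill in the documented defaults, wait for enough samples, then compare the observed failure rate with the threshold. The sketch is hypothetical (`withDefaults` and `shouldAlert` are not part of the package), and the strict "greater than" comparison is an assumption based on the word "exceeds" in the field comment.

```go
package main

import "fmt"

// successRateConfig mirrors the documented SuccessRateConfig fields.
type successRateConfig struct {
	Enabled              bool
	WindowSize           int
	FailureRateThreshold float64
	MinSamplesRequired   int
}

// withDefaults fills unset fields with the defaults named in the comments.
func withDefaults(cfg successRateConfig) successRateConfig {
	if cfg.WindowSize <= 0 {
		cfg.WindowSize = 10
	}
	if cfg.FailureRateThreshold <= 0 {
		cfg.FailureRateThreshold = 0.3
	}
	if cfg.MinSamplesRequired <= 0 {
		cfg.MinSamplesRequired = 5
	}
	return cfg
}

// shouldAlert reports whether the failure rate over the collected samples
// exceeds the threshold, once enough samples exist.
func shouldAlert(failures, samples int, cfg successRateConfig) bool {
	if !cfg.Enabled || samples == 0 || samples < cfg.MinSamplesRequired {
		return false
	}
	return float64(failures)/float64(samples) > cfg.FailureRateThreshold
}

func main() {
	cfg := withDefaults(successRateConfig{Enabled: true})
	fmt.Println(shouldAlert(2, 4, cfg))  // too few samples to decide
	fmt.Println(shouldAlert(4, 10, cfg)) // 40% failures against a 30% threshold
}
```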