distance

package module
v1.0.5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 6, 2025 License: MIT Imports: 9 Imported by: 0

README ΒΆ

go-distance

Go Reference Go Report Card Coverage

A comprehensive, production-ready Go library providing 100+ distance, similarity, and divergence metrics for vectors, strings, probability distributions, time series, graphs, and geographic coordinates.

Why go-distance?

  • Comprehensive: 100+ metrics across 8 categories - the most complete distance library in Go
  • Type-Safe: Leverages Go 1.18+ generics for compile-time type safety
  • Zero Dependencies: Pure Go implementation with no external dependencies
  • Production-Ready: 80% test coverage, extensively benchmarked, battle-tested algorithms
  • High Performance: Optimized implementations with O(1) space complexity where possible
  • Well-Documented: Every function includes complexity analysis and usage examples
  • Versatile: Use cases spanning ML, NLP, GIS, bioinformatics, data science, and optimization

Installation

go get github.com/reeshijoshi/go-distance

Requirements: Go 1.21 or later

Quick Start

package main

import (
    "fmt"
    "github.com/reeshijoshi/go-distance"
)

func main() {
    // Vector distances
    euclidean, _ := distance.Euclidean([]float64{1, 2, 3}, []float64{4, 5, 6})
    fmt.Printf("Euclidean: %.2f\n", euclidean) // 5.20

    // String distances
    levenshtein, _ := distance.Levenshtein("kitten", "sitting")
    fmt.Printf("Levenshtein: %d\n", levenshtein) // 3

    // Statistical divergence
    kl, _ := distance.KLDivergence([]float64{0.5, 0.5}, []float64{0.6, 0.4})
    fmt.Printf("KL Divergence: %.4f\n", kl) // 0.0202

    // Geographic distance
    nyc := distance.Coord{Lat: 40.7128, Lon: -74.0060}
    london := distance.Coord{Lat: 51.5074, Lon: -0.1278}
    haversine := distance.Haversine(nyc, london)
    fmt.Printf("NYC to London: %.0f km\n", haversine) // 5570 km
}

Feature Categories

πŸ”’ Vector Distances

For numerical data, feature vectors, embeddings

Function Description Use Case
Euclidean L2 norm, straight-line distance General-purpose, ML features
Manhattan L1 norm, taxicab distance Grid-based problems, sparse data
Chebyshev L∞ norm, maximum difference Chess moves, uniform grids
Minkowski Generalized Lp norm Flexible distance metric
Cosine Angular distance Text embeddings, recommendations
Canberra Weighted Manhattan Positive data, outlier-sensitive
BrayCurtis Ecological dissimilarity Compositional data
Hamming Differing positions Binary vectors, error detection
WeightedEuclidean Weighted L2 norm Feature importance weighting

Example:

// K-nearest neighbors with custom metric
vectors := [][]float64{{1, 2}, {3, 4}, {5, 6}}
neighbors, _ := distance.KNearestNeighbors(vectors, 2, distance.Euclidean[float64])
πŸ“ String Distances

For text comparison, spell checking, fuzzy matching

Function Description Use Case
Levenshtein Edit distance (insert/delete/substitute) Spell checking, DNA alignment
DamerauLevenshtein Edit distance with transpositions OCR correction, typo detection
Jaro String similarity Record linkage, name matching
JaroWinkler Jaro with prefix bonus Names, addresses
HammingString Substitution-only distance Fixed-length codes
LongestCommonSubsequence LCS length Diff algorithms, plagiarism
NGramDistance N-gram overlap Language detection, fuzzy search
Soundex Phonetic encoding Name search, homophones
TokenSortRatio Word-order invariant Document similarity

Example:

// Fuzzy string matching for search
query := "color"
candidates := []string{"colour", "cooler", "collar", "colored"}
for _, candidate := range candidates {
    dist, _ := distance.Levenshtein(query, candidate)
    if dist <= 2 {
        fmt.Printf("Match: %s (distance: %d)\n", candidate, dist)
    }
}
πŸ“Š Statistical Distances

For probability distributions, histograms, data analysis

Function Description Use Case
KLDivergence Kullback-Leibler divergence Information theory, ML loss
JensenShannonDivergence Symmetric KL divergence Distribution comparison
Bhattacharyya Distribution overlap Classification, image processing
Hellinger Distribution distance Density estimation
ChiSquare χ² distance Histogram comparison
TotalVariation L1 on probabilities Statistical testing
CrossEntropy Information-theoretic loss ML training
PearsonCorrelation Linear correlation Regression analysis
SpearmanCorrelation Rank correlation Non-parametric statistics
Wasserstein1D Earth mover's distance Optimal transport

Example:

// Compare distributions
observed := []float64{0.25, 0.35, 0.20, 0.20}
expected := []float64{0.25, 0.25, 0.25, 0.25}

kl, _ := distance.KLDivergence(observed, expected)
js, _ := distance.JensenShannonDivergence(observed, expected)
fmt.Printf("KL: %.4f, JS: %.4f\n", kl, js)
πŸ”€ Set-Based Distances

For comparing sets, bags, collections

Function Description Use Case
JaccardSet Intersection over union Document similarity
DiceSorensen 2*intersection / (size1+size2) Image segmentation
OverlapCoefficient Subset similarity Query matching
TanimotoCoefficient Generalized Jaccard Chemical fingerprints
CosineSimilaritySet Bag-of-words cosine Text retrieval

Example:

// Find similar documents by tags
doc1 := []string{"go", "programming", "tutorial"}
doc2 := []string{"go", "golang", "programming"}

jaccard, _ := distance.JaccardSet(doc1, doc2)
dice, _ := distance.DiceSorensen(doc1, doc2)
fmt.Printf("Jaccard: %.2f, Dice: %.2f\n", 1-jaccard, dice)
🌍 Geographic Distances

For location data, GIS, mapping applications

Function Description Accuracy Speed
Haversine Great circle distance Β±0.5% Fast
Vincenty Geodesic on ellipsoid (WGS-84) Β±0.5mm Moderate
GreatCircle Spherical law of cosines Β±0.5% Fast
Equirectangular Fast approximation Β±1% (short) Very Fast

Example:

// Calculate delivery distance
warehouse := distance.Coord{Lat: 37.7749, Lon: -122.4194} // SF
customer := distance.Coord{Lat: 34.0522, Lon: -118.2437}  // LA

km := distance.Haversine(warehouse, customer)
miles := distance.HaversineMiles(warehouse, customer)
fmt.Printf("Distance: %.0f km (%.0f miles)\n", km, miles)
⏱️ Time Series Distances

For temporal data, signal processing, pattern matching

Function Description Use Case
DTW Dynamic Time Warping Speech recognition, gesture matching
DTWWithWindow DTW with Sakoe-Chiba band Faster DTW with constraints
Frechet Discrete FrΓ©chet distance Curve similarity, GPS tracks
Hausdorff Maximum point-to-set distance Shape matching, image comparison
SmithWaterman Local sequence alignment DNA/protein sequence analysis
NeedlemanWunsch Global sequence alignment Bioinformatics
Autocorrelation Self-similarity at lag Seasonality detection

Example:

// Find similar patterns in sensor data
signal1 := []float64{1, 2, 3, 4, 5}
signal2 := []float64{0, 1, 2, 3, 4, 5, 6}

dtwDist, _ := distance.DTW(signal1, signal2)
fmt.Printf("DTW Distance: %.2f\n", dtwDist)
πŸ“ˆ Graph Distances

For network analysis, shortest paths, graph theory

Function Description Complexity
Dijkstra Single-source shortest path O((V+E)log V)
BellmanFord Handles negative weights O(VE)
FloydWarshall All-pairs shortest paths O(VΒ³)
AStar Heuristic search O(E log V)
BFS Unweighted shortest path O(V+E)
GraphDiameter Maximum shortest path O(VΒ³)
GraphEditDistance Graph similarity Approximate

Example:

// Route planning
g := distance.NewGraph()
g.AddEdge(0, 1, 10.0) // Node 0 to 1, weight 10
g.AddEdge(1, 2, 5.0)
g.AddEdge(0, 2, 25.0)

dist, path := g.Dijkstra(0, 2)
fmt.Printf("Shortest path: %v, distance: %.0f\n", path, dist)
🎯 Optimization Algorithms

For parameter tuning, function minimization, search

Algorithm Type Use Case
GradientDescent First-order Convex optimization
Adam Adaptive moment ML training
BFGS Quasi-Newton Smooth non-convex
NelderMead Derivative-free Black-box optimization
SimulatedAnnealing Stochastic Combinatorial problems
GeneticAlgorithm Evolutionary Multi-modal search
ParticleSwarmOptimization Swarm intelligence Continuous optimization

Example:

// Minimize a function
f := func(x []float64) float64 {
    return x[0]*x[0] + x[1]*x[1] // f(x,y) = xΒ² + yΒ²
}

grad := func(x []float64) []float64 {
    return []float64{2*x[0], 2*x[1]}
}

minimum := distance.GradientDescent(f, grad, []float64{5, 5}, 0.1, 100)
fmt.Printf("Minimum at: %v\n", minimum) // Near [0, 0]

Advanced Features

Batch Operations

Efficient parallel computation for large datasets:

// Compute distance matrix for all pairs
vectors := [][]float64{{1, 2}, {3, 4}, {5, 6}, {7, 8}}
matrix, _ := distance.BatchComputeParallel(vectors, distance.Euclidean[float64], 4)
Context-Aware Computation

Cancel long-running operations:

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

matrix, _ := distance.BatchComputeWithContext(ctx, vectors, distFunc, 8)

Performance Benchmarks

Benchmarks run on Apple M1, Go 1.21:

BenchmarkEuclidean-8         100000000    10.2 ns/op     0 B/op   0 allocs/op
BenchmarkManhattan-8         200000000     8.5 ns/op     0 B/op   0 allocs/op
BenchmarkCosine-8             50000000    24.3 ns/op     0 B/op   0 allocs/op
BenchmarkLevenshtein-8         1000000  1243.0 ns/op   512 B/op   2 allocs/op
BenchmarkHaversine-8          10000000   112.0 ns/op     0 B/op   0 allocs/op
BenchmarkDTW-8                    5000 245000.0 ns/op  8192 B/op   2 allocs/op

Real-World Use Cases

Machine Learning
// K-means clustering
func assignClusters(points, centroids [][]float64) []int {
    assignments := make([]int, len(points))
    for i, point := range points {
        idx, _, _ := distance.NearestNeighbor(centroids, point, distance.Euclidean[float64])
        assignments[i] = idx
    }
    return assignments
}
Information Retrieval
// Document ranking by relevance
func rankDocuments(query string, docs []string) []struct{ idx int; score float64 } {
    scores := make([]struct{ idx int; score float64 }, len(docs))
    for i, doc := range docs {
        sim, _ := distance.CosineSimilarityStrings(query, doc)
        scores[i] = struct{ idx int; score float64 }{i, sim}
    }
    // Sort by score descending...
    return scores
}
GIS/Location Services
// Find nearby locations
func findNearby(user distance.Coord, places []distance.Coord, radiusKm float64) []int {
    var nearby []int
    for i, place := range places {
        if distance.Haversine(user, place) <= radiusKm {
            nearby = append(nearby, i)
        }
    }
    return nearby
}

API Design Philosophy

  1. Consistency: All distance functions follow func(a, b Type) (float64, error) pattern
  2. Type Safety: Generic functions provide compile-time guarantees
  3. Error Handling: Explicit error returns for invalid inputs
  4. Performance: Zero-allocation hot paths, O(1) space complexity where possible
  5. Documentation: Every function includes time/space complexity and use cases

Testing & Quality

  • 80% Test Coverage: Comprehensive test suite with edge cases
  • Extensive Benchmarks: Performance regression tracking
  • No External Dependencies: Reduced supply chain risk
  • Semantic Versioning: Stable public API

Contributing

Contributions are welcome! Please:

  1. Open an issue to discuss major changes
  2. Add tests for new functionality
  3. Update documentation
  4. Follow existing code style

License

MIT License - see LICENSE for details

Citation

If you use go-distance in academic work, please cite:

@software{go_distance,
  title = {go-distance: Comprehensive distance metrics library for Go},
  author = {go-distance contributors},
  year = {2025},
  url = {https://github.com/reeshijoshi/go-distance}
}

Acknowledgments

Algorithms implemented from:

  • Levenshtein (1966) - Binary codes capable of correcting deletions
  • Damerau (1964) - A technique for computer detection and correction of spelling errors
  • Vincenty (1975) - Direct and inverse solutions of geodesics on the ellipsoid
  • Sakoe & Chiba (1978) - Dynamic programming algorithm optimization for spoken word recognition

Support


Made with ❀️ for the Go community | Star ⭐ if you find this useful!

Documentation ΒΆ

Overview ΒΆ

Package distance provides comprehensive distance, similarity, and divergence metrics for vectors, strings, sets, probability distributions, and time series.

Package distance provides distance metrics and optimization algorithms.

Cryptographic randomness is not required for these mathematical optimization functions (simulated annealing, genetic algorithms, PSO, differential evolution). Using crypto/rand would be unnecessarily slow and provide no security benefit.

Index ΒΆ

Constants ΒΆ

This section is empty.

Variables ΒΆ

View Source
var (
	// ErrDimensionMismatch is returned when vector dimensions don't match.
	ErrDimensionMismatch = errors.New("dimension mismatch between vectors")

	// ErrEmptyInput is returned when input is empty.
	ErrEmptyInput = errors.New("empty input provided")

	// ErrInvalidParameter is returned when a parameter value is invalid.
	ErrInvalidParameter = errors.New("invalid parameter value")

	// ErrZeroVector is returned when a zero vector is encountered.
	ErrZeroVector = errors.New("zero vector encountered")

	// ErrNegativeValue is returned when a negative value is found in input that requires non-negative values.
	ErrNegativeValue = errors.New("negative value in input")
)

Functions ΒΆ

func Adam ΒΆ

func Adam(
	_ OptimizationFunc,
	grad GradientFunc,
	initial []float64,
	learningRate float64,
	beta1, beta2 float64,
	epsilon float64,
	iterations int,
) []float64

Adam optimizer (Adaptive Moment Estimation) Time: O(iterations * d), Space: O(d)

func Autocorrelation ΒΆ

func Autocorrelation[T Number](data []T, lag int) (float64, error)

Autocorrelation computes autocorrelation at lag k. Measures correlation of a signal with a delayed copy of itself. Time: O(n), Space: O(1)

func BFGS ΒΆ

func BFGS(
	f OptimizationFunc,
	grad GradientFunc,
	initial []float64,
	iterations int,
	tolerance float64,
) []float64

BFGS performs BFGS quasi-Newton optimization Time: O(iterations * dΒ²), Space: O(dΒ²)

func BatchCompute ΒΆ

func BatchCompute[T Number](vectors [][]T, distFn DistanceFunc[T]) ([][]float64, error)

BatchCompute computes distances between all pairs of vectors (distance matrix). Time: O(nΒ²d), Space: O(nΒ²) where n=vectors, d=dimensions

func BatchComputeParallel ΒΆ

func BatchComputeParallel[T Number](vectors [][]T, distFn DistanceFunc[T], workers int) ([][]float64, error)

BatchComputeParallel computes distance matrix in parallel. Time: O(nΒ²d/workers), Space: O(nΒ²)

func BatchComputeWithContext ΒΆ

func BatchComputeWithContext[T Number](ctx context.Context, vectors [][]T, distFn DistanceFunc[T], workers int) ([][]float64, error)

BatchComputeWithContext computes distance matrix with cancellation.

func Bhattacharyya ΒΆ

func Bhattacharyya[T Float](p, q []T) (float64, error)

Bhattacharyya computes Bhattacharyya distance. Measures similarity between probability distributions. Range [0, +∞) where 0=identical Time: O(n), Space: O(1)

func BrayCurtis ΒΆ

func BrayCurtis[T Number](a, b []T) (float64, error)

BrayCurtis computes the Bray-Curtis dissimilarity. Used in ecology and biology for compositional data. Range [0, 1] where 0=identical, 1=completely different Time: O(n), Space: O(1)

func Canberra ΒΆ

func Canberra[T Number](a, b []T) (float64, error)

Canberra computes the Canberra distance (weighted Manhattan distance). Sensitive to small changes near zero. Used in biology/ecology. Time: O(n), Space: O(1)

func Centroid ΒΆ

func Centroid[T Number](vectors [][]T) ([]float64, error)

Centroid computes the centroid (mean) of a set of vectors. Time: O(nd), Space: O(d)

func Chebyshev ΒΆ

func Chebyshev[T Number](a, b []T) (float64, error)

Chebyshev computes the L-infinity norm (maximum absolute difference). Also known as: Chessboard distance Time: O(n), Space: O(1)

func ChiSquare ΒΆ

func ChiSquare[T Float](p, q []T) (float64, error)

ChiSquare computes Chi-square distance. Used for histogram comparison in computer vision. Time: O(n), Space: O(1)

func ComputeToPoint ΒΆ

func ComputeToPoint[T Number](vectors [][]T, point []T, distFn DistanceFunc[T]) ([]float64, error)

ComputeToPoint computes distances from all vectors to a single point. Time: O(nd), Space: O(n)

func ComputeWithContext ΒΆ

func ComputeWithContext[T Number](ctx context.Context, a, b []T, distFn DistanceFunc[T]) (float64, error)

ComputeWithContext computes distance with cancellation support.

func ConjugateGradient ΒΆ

func ConjugateGradient(
	f OptimizationFunc,
	grad GradientFunc,
	initial []float64,
	iterations int,
	tolerance float64,
) []float64

ConjugateGradient performs conjugate gradient optimization Time: O(iterations * d), Space: O(d)

func Cosine ΒΆ

func Cosine[T Number](a, b []T) (float64, error)

Cosine computes the cosine distance (1 - cosine similarity). Measures angle between vectors, range [0, 2] (0=identical direction, 2=opposite) Time: O(n), Space: O(1)

func CosineDistanceSet ΒΆ

func CosineDistanceSet[T comparable](a, b []T) (float64, error)

CosineDistanceSet computes cosine distance for bag-of-words sets. Time: O(n+m), Space: O(n+m)

func CosineSimilarity ΒΆ

func CosineSimilarity[T Number](a, b []T) (float64, error)

CosineSimilarity computes the cosine similarity (dot product of normalized vectors). Range [-1, 1] where 1=identical, 0=orthogonal, -1=opposite Time: O(n), Space: O(1)

func CosineSimilaritySet ΒΆ

func CosineSimilaritySet[T comparable](a, b []T) (float64, error)

CosineSimilaritySet computes cosine similarity for bag-of-words. Uses term frequency vectors constructed from sets. Time: O(n+m), Space: O(n+m)

func CosineSimilarityStrings ΒΆ

func CosineSimilarityStrings(a, b string) (float64, error)

CosineSimilarityStrings computes cosine similarity for strings Treats strings as character frequency vectors Range [0, 1] where 1=identical distribution Time: O(n+m), Space: O(n+m)

func CrossEntropy ΒΆ

func CrossEntropy[T Float](p, q []T) (float64, error)

CrossEntropy computes cross-entropy H(P,Q). Used in ML loss functions: H(P,Q) = -Ξ£ p(x) log q(x) Time: O(n), Space: O(1)

func DTW ΒΆ

func DTW[T Number](a, b []T) (float64, error)

DTW computes Dynamic Time Warping distance between two time series. Allows matching sequences of different lengths. Time: O(mn), Space: O(min(m,n)) with optimization

func DTWWithWindow ΒΆ

func DTWWithWindow[T Number](a, b []T, window int) (float64, error)

DTWWithWindow computes DTW with Sakoe-Chiba band constraint. Window limits how far sequences can deviate (improves performance). Time: O(mn), Space: O(min(m,n))

func DamerauLevenshtein ΒΆ

func DamerauLevenshtein(a, b string) (int, error)

DamerauLevenshtein computes Damerau-Levenshtein distance. Includes transposition of adjacent characters (ab -> ba). Time: O(mn), Space: O(mn)

func DiceDistance ΒΆ

func DiceDistance[T comparable](a, b []T) (float64, error)

DiceDistance computes Dice distance (1 - Dice coefficient). Time: O(n+m), Space: O(n)

func DiceSorensen ΒΆ

func DiceSorensen[T comparable](a, b []T) (float64, error)

DiceSorensen computes Dice-Sørensen coefficient. Similarity = 2|A ∩ B| / (|A| + |B|) Range [0, 1] where 1=identical Time: O(n+m), Space: O(n)

func DifferentialEvolution ΒΆ

func DifferentialEvolution(
	f OptimizationFunc,
	dimensions int,
	bounds [][]float64,
	popSize int,
	generations int,
	mutationFactor float64,
	crossoverProb float64,
) []float64

DifferentialEvolution performs differential evolution Time: O(generations * popSize * d), Space: O(popSize * d)

func DotProduct ΒΆ

func DotProduct[T Number](a, b []T) (float64, error)

DotProduct computes the dot product (inner product) of two vectors. Time: O(n), Space: O(1)

func EditDistance ΒΆ

func EditDistance(a, b string, insertCost, deleteCost, replaceCost int) (int, error)

EditDistance computes generic edit distance with custom costs Time: O(mn), Space: O(min(m,n))

func Equirectangular ΒΆ

func Equirectangular(a, b Coord) float64

Equirectangular computes approximate distance using equirectangular projection. Fast but less accurate approximation for small distances. Returns distance in kilometers. Time: O(1), Space: O(1)

func EquirectangularWithRadius ΒΆ

func EquirectangularWithRadius(a, b Coord, radius float64) float64

EquirectangularWithRadius computes equirectangular distance with custom radius. Time: O(1), Space: O(1)

func Euclidean ΒΆ

func Euclidean[T Number](a, b []T) (float64, error)

Euclidean computes the L2 norm (straight-line distance) between two vectors. Time: O(n), Space: O(1)

func EuclideanSquared ΒΆ

func EuclideanSquared[T Number](a, b []T) (float64, error)

EuclideanSquared computes squared Euclidean distance (faster, avoids sqrt). Time: O(n), Space: O(1)

func Frechet ΒΆ

func Frechet[T Number](a, b [][]T) (float64, error)

Frechet computes discrete FrΓ©chet distance between two curves. Measures similarity considering the flow of the curves. Time: O(mn), Space: O(mn)

func GeneticAlgorithm ΒΆ

func GeneticAlgorithm(
	f OptimizationFunc,
	dimensions int,
	bounds [][]float64,
	popSize int,
	generations int,
	mutationRate float64,
	crossoverRate float64,
) []float64

GeneticAlgorithm performs genetic algorithm optimization Time: O(generations * popSize * d), Space: O(popSize * d)

func GradientDescent ΒΆ

func GradientDescent(
	_ OptimizationFunc,
	grad GradientFunc,
	initial []float64,
	learningRate float64,
	iterations int,
) []float64

GradientDescent performs gradient descent optimization Time: O(iterations * d), Space: O(d)

func GradientDescentWithMomentum ΒΆ

func GradientDescentWithMomentum(
	_ OptimizationFunc,
	grad GradientFunc,
	initial []float64,
	learningRate float64,
	momentum float64,
	iterations int,
) []float64

GradientDescentWithMomentum performs gradient descent with momentum Time: O(iterations * d), Space: O(d)

func GraphEditDistance ΒΆ

func GraphEditDistance(g1, g2 *Graph) float64

GraphEditDistance computes graph edit distance between two graphs Time: Exponential (NP-hard), Space: O(VΒ²)

func GreatCircle ΒΆ

func GreatCircle(a, b Coord) float64

GreatCircle computes great-circle distance using spherical law of cosines. Simpler but less accurate for small distances than Haversine. Returns distance in kilometers. Time: O(1), Space: O(1)

func GreatCircleWithRadius ΒΆ

func GreatCircleWithRadius(a, b Coord, radius float64) float64

GreatCircleWithRadius computes great-circle distance with custom radius. Time: O(1), Space: O(1)

func Hamming ΒΆ

func Hamming[T Number](a, b []T) (float64, error)

Hamming computes the Hamming distance (number of differing positions). Time: O(n), Space: O(1)

func HammingString ΒΆ

func HammingString(a, b string) (int, error)

HammingString computes Hamming distance for strings (must be equal length). Time: O(n), Space: O(1)

func Hausdorff ΒΆ

func Hausdorff[T Number](a, b [][]T) (float64, error)

Hausdorff computes Hausdorff distance between two point sets. Measures maximum distance from a point in one set to the nearest point in the other. Time: O(nm), Space: O(1)

func Haversine ΒΆ

func Haversine(a, b Coord) float64

Haversine computes great-circle distance using Haversine formula. Returns distance in kilometers by default. Time: O(1), Space: O(1)

func HaversineMiles ΒΆ

func HaversineMiles(a, b Coord) float64

HaversineMiles computes Haversine distance in miles. Time: O(1), Space: O(1)

func HaversineWithRadius ΒΆ

func HaversineWithRadius(a, b Coord, radius float64) float64

HaversineWithRadius computes Haversine distance with custom Earth radius. Time: O(1), Space: O(1)

func Hellinger ΒΆ

func Hellinger[T Float](p, q []T) (float64, error)

Hellinger computes Hellinger distance. Related to Bhattacharyya: HΒ² = 1 - BC Range [0, 1] where 0=identical, 1=completely different Time: O(n), Space: O(1)

func JaccardIndex ΒΆ

func JaccardIndex(a, b string, n int) (float64, error)

JaccardIndex computes Jaccard index for strings (using n-grams) Range [0, 1] where 1=identical Time: O(n+m), Space: O(n+m)

func JaccardSet ΒΆ

func JaccardSet[T comparable](a, b []T) (float64, error)

JaccardSet computes Jaccard distance for sets. Distance = 1 - |A ∩ B| / |A βˆͺ B| Range [0, 1] where 0=identical, 1=completely different Time: O(n+m), Space: O(n)

func JaccardSimilarity ΒΆ

func JaccardSimilarity[T comparable](a, b []T) (float64, error)

JaccardSimilarity computes Jaccard similarity coefficient. Similarity = |A ∩ B| / |A βˆͺ B| Range [0, 1] where 1=identical, 0=no overlap Time: O(n+m), Space: O(n)

func Jaro ΒΆ

func Jaro(a, b string) (float64, error)

Jaro computes the Jaro similarity between two strings. Returns similarity in [0, 1] where 1=identical Time: O(mn), Space: O(max(m,n))

func JaroWinkler ΒΆ

func JaroWinkler(a, b string, prefixScale float64) (float64, error)

JaroWinkler computes Jaro-Winkler similarity (Jaro with prefix bonus). prefixScale: scaling factor for prefix (standard: 0.1) Returns similarity in [0, 1] where 1=identical Time: O(mn), Space: O(max(m,n))

func JensenShannonDivergence ΒΆ

func JensenShannonDivergence[T Float](p, q []T) (float64, error)

JensenShannonDivergence computes Jensen-Shannon divergence. Symmetric version of KL divergence: JS(P||Q) = JS(Q||P) Bounded: 0 ≀ JS ≀ log(2) Time: O(n), Space: O(n)

func KLDivergence ΒΆ

func KLDivergence[T Float](p, q []T) (float64, error)

KLDivergence computes Kullback-Leibler divergence KL(P||Q). Measures how probability distribution P diverges from Q. NOTE: Asymmetric (KL(P||Q) β‰  KL(Q||P)) Time: O(n), Space: O(1)

func KNearestNeighbors ΒΆ

func KNearestNeighbors[T Number](vectors [][]T, k int, distFn DistanceFunc[T]) ([][]int, error)

KNearestNeighbors finds k nearest neighbors for each vector. Returns indices of k nearest neighbors for each vector. Time: O(nΒ²d), Space: O(nk)

func LCSDistance ΒΆ

func LCSDistance(a, b string) (int, error)

LCSDistance computes distance based on LCS. Returns: len(a) + len(b) - 2*LCS(a,b) Time: O(mn), Space: O(min(m,n))

func LCSRatio ΒΆ

func LCSRatio(a, b string) (float64, error)

LCSRatio computes LCS-based similarity ratio Range [0, 1] where 1=identical Time: O(mn), Space: O(min(m,n))

func Levenshtein ΒΆ

func Levenshtein(a, b string) (int, error)

Levenshtein computes the Levenshtein edit distance between two strings. Counts minimum insertions, deletions, and substitutions. Time: O(mn), Space: O(min(m,n)) with optimization

func LongestCommonSubsequence ΒΆ

func LongestCommonSubsequence(a, b string) (int, error)

LongestCommonSubsequence computes the length of LCS. Time: O(mn), Space: O(min(m,n))

func LongestCommonSubstring ΒΆ

func LongestCommonSubstring[T comparable](a, b []T) int

LongestCommonSubstring computes longest common substring length for sequences. Time: O(mn), Space: O(mn)

func Manhattan ΒΆ

func Manhattan[T Number](a, b []T) (float64, error)

Manhattan computes the L1 norm (sum of absolute differences). Also known as: Taxicab distance, City Block distance Time: O(n), Space: O(1)

func Metaphone ΒΆ

func Metaphone(s string) string

Metaphone computes metaphone phonetic encoding Returns phonetic code for phonetic matching Time: O(n), Space: O(n)

func Minkowski ΒΆ

func Minkowski[T Number](a, b []T, p float64) (float64, error)

Minkowski computes the Lp norm with parameter p. p=1: Manhattan, p=2: Euclidean, p=inf: Chebyshev Time: O(n), Space: O(1)

func MongeElkan ΒΆ

func MongeElkan(a, b string, tokenSim func(string, string) float64) (float64, error)

MongeElkan computes Monge-Elkan similarity Uses maximum token similarity Range [0, 1] where 1=identical Time: O(nΒ²m), Space: O(n)

func NGramDistance ΒΆ

func NGramDistance(a, b string, n int) (float64, error)

NGramDistance computes distance based on n-gram overlap. Time: O(m+n), Space: O(m+n)

func NearestNeighbor ΒΆ

func NearestNeighbor[T Number](vectors [][]T, query []T, distFn DistanceFunc[T]) (int, float64, error)

NearestNeighbor finds the nearest neighbor for a query point. Time: O(nd), Space: O(1)

func NeedlemanWunsch ΒΆ

func NeedlemanWunsch[T comparable](a, b []T, match, mismatch, gap int) (int, error)

NeedlemanWunsch computes global sequence alignment score. Time: O(mn), Space: O(mn)

func NelderMead ΒΆ

func NelderMead(
	f OptimizationFunc,
	initial []float64,
	iterations int,
	alpha, gamma, rho, sigma float64,
) []float64

NelderMead performs Nelder-Mead simplex optimization Time: O(iterations * dΒ²), Space: O(dΒ²)

func Norm ΒΆ

func Norm[T Number](v []T, p float64) (float64, error)

Norm computes the Lp norm of a vector. Time: O(n), Space: O(1)

func NormalizedLevenshtein ΒΆ

func NormalizedLevenshtein(a, b string) (float64, error)

NormalizedLevenshtein computes normalized Levenshtein distance Range [0, 1] where 0=identical Time: O(mn), Space: O(min(m,n))

func OverlapCoefficient ΒΆ

func OverlapCoefficient[T comparable](a, b []T) (float64, error)

OverlapCoefficient computes overlap coefficient. Overlap = |A ∩ B| / min(|A|, |B|) Range [0, 1] where 1=one is subset of the other Time: O(n+m), Space: O(n)

func PairwiseDistinctCount ΒΆ

func PairwiseDistinctCount[T Number](vectors [][]T, threshold float64, distFn DistanceFunc[T]) (int, error)

PairwiseDistinctCount counts pairs with distance above threshold. Useful for diversity metrics. Time: O(nΒ²d), Space: O(1)

func ParticleSwarmOptimization ΒΆ

func ParticleSwarmOptimization(
	f OptimizationFunc,
	dimensions int,
	bounds [][]float64,
	swarmSize int,
	iterations int,
	inertia float64,
	cognitive float64,
	social float64,
) []float64

ParticleSwarmOptimization performs PSO Time: O(iterations * swarmSize * d), Space: O(swarmSize * d)

func PearsonCorrelation ΒΆ

func PearsonCorrelation[T Number](a, b []T) (float64, error)

PearsonCorrelation computes Pearson correlation coefficient. Range [-1, 1] where 1=perfect positive, -1=perfect negative, 0=no correlation Time: O(n), Space: O(1)

func PearsonDistance ΒΆ

func PearsonDistance[T Number](a, b []T) (float64, error)

PearsonDistance computes distance based on Pearson correlation. Range [0, 2] where 0=perfect correlation, 2=perfect anti-correlation Time: O(n), Space: O(1)

func PhoneticDistance ΒΆ

func PhoneticDistance(a, b string, encoder func(string) string) int

PhoneticDistance computes distance between phonetic encodings Returns 0 if phonetically similar Time: O(1), Space: O(1)

func QGramDistance ΒΆ

func QGramDistance(a, b string, q int) (int, error)

QGramDistance computes q-gram distance Time: O(n+m), Space: O(n+m)

func RadiusNeighbors ΒΆ

func RadiusNeighbors[T Number](vectors [][]T, radius float64, distFn DistanceFunc[T]) ([][]int, error)

RadiusNeighbors finds all neighbors within radius for each vector. Time: O(nΒ²d), Space: O(n*m) where m=avg neighbors

func RatcliffObershelp ΒΆ

func RatcliffObershelp(a, b string) (float64, error)

RatcliffObershelp computes Ratcliff/Obershelp similarity Also known as Gestalt Pattern Matching Range [0, 1] where 1=identical Time: O(nΒ²), Space: O(n)

func SimulatedAnnealing ΒΆ

func SimulatedAnnealing(
	f OptimizationFunc,
	initial []float64,
	initialTemp float64,
	coolingRate float64,
	iterations int,
	stepSize float64,
) []float64

SimulatedAnnealing performs simulated annealing optimization Time: O(iterations * d), Space: O(d)

func SmithWaterman ΒΆ

func SmithWaterman[T comparable](a, b []T, match, mismatch, gap int) (int, error)

SmithWaterman computes local sequence alignment score. Used for DNA/protein sequence comparison. Time: O(mn), Space: O(mn)

func SmithWatermanString ΒΆ

func SmithWatermanString(a, b string, match, mismatch, gap int) (int, error)

SmithWatermanString computes Smith-Waterman local alignment for strings Returns alignment score Time: O(mn), Space: O(mn)

func SoftDTW ΒΆ

func SoftDTW[T Number](a, b []T, gamma float64) (float64, error)

SoftDTW computes differentiable DTW using soft-min. Useful for machine learning applications. gamma controls smoothness (smaller = closer to DTW). Time: O(mn), Space: O(mn)

func SorensenDice ΒΆ

func SorensenDice(a, b string) (float64, error)

SorensenDice computes SΓΈrensen-Dice coefficient for strings Uses bigrams (2-grams) for comparison Range [0, 1] where 1=identical Time: O(n+m), Space: O(n+m)

func Soundex ΒΆ

func Soundex(s string) string

Soundex computes Soundex phonetic encoding Returns 4-character code for phonetic matching Time: O(n), Space: O(1)

func SpearmanCorrelation ΒΆ

func SpearmanCorrelation[T Number](a, b []T) (float64, error)

SpearmanCorrelation computes Spearman rank correlation. Measures monotonic relationship between variables. Time: O(n log n), Space: O(n)

func TanimotoCoefficient ΒΆ

func TanimotoCoefficient[T Number](a, b []T) (float64, error)

TanimotoCoefficient computes Tanimoto coefficient (generalized Jaccard). For binary vectors: Tanimoto = dot(a,b) / (||a||Β² + ||b||Β² - dot(a,b)) Time: O(n), Space: O(1)

func TanimotoDistance ΒΆ

func TanimotoDistance[T Number](a, b []T) (float64, error)

TanimotoDistance computes Tanimoto distance (1 - Tanimoto coefficient). Time: O(n), Space: O(1)

func TokenSetRatio ΒΆ

func TokenSetRatio(a, b string) (float64, error)

TokenSetRatio computes similarity using set intersection of tokens Range [0, 1] where 1=identical Time: O(n), Space: O(n)

func TokenSortRatio ΒΆ

func TokenSortRatio(a, b string) (float64, error)

TokenSortRatio computes similarity after sorting tokens Useful for comparing similar strings with different word order Range [0, 1] where 1=identical Time: O(n log n), Space: O(n)

func TotalVariation ΒΆ

func TotalVariation[T Float](p, q []T) (float64, error)

TotalVariation computes total variation distance. Maximum difference in probabilities across all events. Range [0, 1] Time: O(n), Space: O(1)

func TverskyIndex ΒΆ

func TverskyIndex(a, b string, alpha, beta float64) (float64, error)

TverskyIndex computes Tversky index (asymmetric Jaccard) alpha and beta control asymmetry Time: O(n+m), Space: O(n+m)

func Validate ΒΆ

func Validate[T Number](a, b []T) error

Validate checks if two vectors have the same dimension

func ValidateWeights ΒΆ

func ValidateWeights[T Number](v []T, weights []float64) error

ValidateWeights checks if weights match vector dimensions

func Vincenty ΒΆ

func Vincenty(a, b Coord) (float64, error)

Vincenty computes geodesic distance using Vincenty formula. More accurate than Haversine for oblate spheroid (WGS-84 ellipsoid). Returns distance in meters. Time: O(1) with iteration, Space: O(1)

func VincentyKm ΒΆ

func VincentyKm(a, b Coord) (float64, error)

VincentyKm computes Vincenty distance in kilometers. Time: O(1) with iteration, Space: O(1)

func Wasserstein1D ΒΆ

func Wasserstein1D[T Number](a, b []T) (float64, error)

Wasserstein1D computes 1D Wasserstein (Earth Mover's) distance. For 1D distributions, this equals the area between CDFs. Time: O(n log n), Space: O(n)

func WeightedEuclidean ΒΆ

func WeightedEuclidean[T Number](a, b []T, weights []float64) (float64, error)

WeightedEuclidean computes weighted Euclidean distance. weights[i] scales the contribution of dimension i. Time: O(n), Space: O(1)

Types ΒΆ

type BatchComputer ΒΆ

type BatchComputer[T Number] interface {
	ComputePairwise(vectors [][]T) ([][]float64, error)
	ComputeToPoint(vectors [][]T, point []T) ([]float64, error)
	ComputeWithContext(ctx context.Context, a, b []T) (float64, error)
}

BatchComputer for efficient batch distance calculations

type Coord ΒΆ

type Coord struct {
	Lat float64 // Latitude in degrees [-90, 90]
	Lon float64 // Longitude in degrees [-180, 180]
}

Coord represents a geographic coordinate (latitude, longitude).

type DistanceFunc ΒΆ

type DistanceFunc[T Number] func(a, b []T) (float64, error)

DistanceFunc is a function that computes distance between two vectors. Note: This name stutters with the package name. Consider using batch.Func or similar patterns in client code to avoid repetition.

type Float ΒΆ

type Float interface {
	~float32 | ~float64
}

Float constraint for floating point types

type GradientFunc ΒΆ

type GradientFunc func([]float64) []float64

GradientFunc computes the gradient of the function

type Graph ΒΆ

type Graph struct {
	// contains filtered or unexported fields
}

Graph represents a weighted graph for distance calculations

func NewGraph ΒΆ

func NewGraph() *Graph

NewGraph creates a new graph

func (*Graph) AStar ΒΆ

func (g *Graph) AStar(source, target int, heuristic func(int, int) float64) (float64, []int)

AStar computes shortest path using A* with heuristic Time: O(E log V) with good heuristic, Space: O(V)

func (*Graph) AddEdge ΒΆ

func (g *Graph) AddEdge(from, to int, weight float64)

AddEdge adds a weighted edge between two nodes

func (*Graph) AddUndirectedEdge ΒΆ

func (g *Graph) AddUndirectedEdge(a, b int, weight float64)

AddUndirectedEdge adds an undirected edge

func (*Graph) BFS ΒΆ

func (g *Graph) BFS(source, target int) (int, []int)

BFS computes shortest path in unweighted graph Time: O(V+E), Space: O(V)

func (*Graph) BellmanFord ΒΆ

func (g *Graph) BellmanFord(source int) (map[int]float64, bool)

BellmanFord computes shortest paths handling negative weights Returns distances and whether negative cycle exists Time: O(VE), Space: O(V)

func (*Graph) CommuteTime ΒΆ

func (g *Graph) CommuteTime(source, target int) float64

CommuteTime computes approximate expected commute time for random walk. WARNING: This is a simplified approximation using shortest path distance. True commute time requires computing hitting times using the fundamental matrix of the random walk, which involves matrix inversion. For accurate commute time, use a specialized graph analysis library. Time: O((V+E)logV), Space: O(V)

func (*Graph) ConnectedComponents ΒΆ

func (g *Graph) ConnectedComponents() [][]int

ConnectedComponents finds connected components Time: O(V+E), Space: O(V)

func (*Graph) Dijkstra ΒΆ

func (g *Graph) Dijkstra(source, target int) (float64, []int)

Dijkstra computes shortest path distance from source to target Returns distance and path. Returns inf if no path exists. Time: O((V+E)logV), Space: O(V)

func (*Graph) FloydWarshall ΒΆ

func (g *Graph) FloydWarshall() map[int]map[int]float64

FloydWarshall computes all-pairs shortest paths Time: O(VΒ³), Space: O(VΒ²)

func (*Graph) GraphDiameter ΒΆ

func (g *Graph) GraphDiameter() float64

GraphDiameter computes the diameter (maximum shortest path) Time: O(VΒ³), Space: O(VΒ²)

func (*Graph) GraphRadius ΒΆ

func (g *Graph) GraphRadius() float64

GraphRadius computes the radius (minimum eccentricity) Time: O(VΒ³), Space: O(VΒ²)

func (*Graph) IsConnected ΒΆ

func (g *Graph) IsConnected() bool

IsConnected checks if graph is connected Time: O(V+E), Space: O(V)

func (*Graph) ResistanceDistance ΒΆ

func (g *Graph) ResistanceDistance(source, target int) float64

ResistanceDistance computes approximate effective resistance between nodes. WARNING: This is a simplified approximation using shortest path distance. A full implementation requires computing the Moore-Penrose pseudoinverse of the graph Laplacian matrix, which is computationally expensive. For accurate resistance distance, use a specialized linear algebra library. Time: O((V+E)logV), Space: O(V)

type Individual ΒΆ

type Individual struct {
	Genes   []float64
	Fitness float64
}

Individual represents a genetic algorithm individual

type Metric ΒΆ

type Metric interface {
	Name() string
	Distance(a, b any) (float64, error)
	IsSymmetric() bool // Some metrics like KL divergence are asymmetric
	IsMetric() bool    // Satisfies metric space axioms (triangle inequality, etc.)
}

Metric interface for any distance metric

type Number ΒΆ

type Number interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 |
		~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
		~float32 | ~float64
}

Number constraint for generic numeric types

type OptimizationFunc ΒΆ

type OptimizationFunc func([]float64) float64

OptimizationFunc represents a function to minimize/maximize

type Options ΒΆ

type Options struct {
	Normalize   bool      // Normalize result to [0,1]
	Weights     []float64 // Dimension weights
	Parallel    bool      // Use parallel computation for batch operations
	MaxDistance float64   // Early termination threshold (0 means no limit)
}

Options for configurable distance calculations

type Particle ΒΆ

type Particle struct {
	Position     []float64
	Velocity     []float64
	BestPosition []float64
	BestFitness  float64
	Fitness      float64
}

Particle represents a PSO particle

type StringDistanceFunc ΒΆ

type StringDistanceFunc func(a, b string) (int, error)

StringDistanceFunc computes distance between strings

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL