CardinalityEstimation 1.15.0
dotnet add package CardinalityEstimation --version 1.15.0
NuGet\Install-Package CardinalityEstimation -Version 1.15.0
<PackageReference Include="CardinalityEstimation" Version="1.15.0" />
<PackageVersion Include="CardinalityEstimation" Version="1.15.0" />
<PackageReference Include="CardinalityEstimation" />
paket add CardinalityEstimation --version 1.15.0
#r "nuget: CardinalityEstimation, 1.15.0"
#:package CardinalityEstimation@1.15.0
#addin nuget:?package=CardinalityEstimation&version=1.15.0
#tool nuget:?package=CardinalityEstimation&version=1.15.0
CardinalityEstimation
HyperLogLog-based set cardinality estimation library
This library estimates the number of unique elements in a set, in a quick and memory-efficient manner. It's based on the following:
- Flajolet et al., "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm", DMTCS proc. AH 2007, http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf
- Heule, Nunkesser and Hall 2013, "HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", https://research.google/pubs/hyperloglog-in-practice-algorithmic-engineering-of-a-state-of-the-art-cardinality-estimation-algorithm/
The accuracy/memory usage are user-selectable. Typically, a cardinality estimator will give a perfect estimate of small cardinalities (up to 100 unique elements), and 97% accuracy or better (usually much better) for any cardinality up to near 2^64, while consuming several KB of memory (no more than 16KB).
Usage
Usage is very simple:
ICardinalityEstimator<string> estimator = new CardinalityEstimator();
estimator.Add("Alice");
estimator.Add("Bob");
estimator.Add("Alice"); // duplicate — already counted
estimator.Add("George Michael");
ulong numberOfuniqueElements = estimator.Count(); // will be 3 — "Alice" was added twice but counted once
Nuget Package
This code is available as the Nuget package CardinalityEstimation. A strong-named build is also published as CardinalityEstimation.Signed.
To install, use the .NET CLI:
dotnet add package CardinalityEstimation
Or, equivalently, the Package Manager Console in Visual Studio:
Install-Package CardinalityEstimation
Controlling Accuracy and Memory
The constructor accepts a b parameter that controls the tradeoff between accuracy and memory:
// b = 14 (default) — ~3% standard error or better, up to ~16 KB per estimator
var estimator = new CardinalityEstimator(b: 14);
// b = 4 — minimum, < 1 KB but high error (up to ~100%)
var tinyEstimator = new CardinalityEstimator(b: 4);
// b = 16 — maximum, ~1% error or better, up to ~64 KB
var preciseEstimator = new CardinalityEstimator(b: 16);
b must be in the range [4, 16]. The standard error for large cardinalities is approximately 1.04 * 2^(-b/2), and memory consumption is bounded by 2^b bytes for the dense representation. The estimator gives perfect counts for the first 100 elements regardless of b, then transitions to a sparse representation, and finally to the dense HyperLogLog representation as cardinality grows.
Hash Functions
By default the estimator uses XxHash128 (from System.IO.Hashing), which is fast and has excellent distribution properties. Two alternatives ship in the box:
Murmur3(CardinalityEstimation.Hash.Murmur3)FNV-1a(CardinalityEstimation.Hash.Fnv1A)
You can supply your own GetHashCodeDelegate to the constructor:
using CardinalityEstimation;
using CardinalityEstimation.Hash;
GetHashCodeDelegate murmur = Murmur3.GetHashCode;
var estimator = new CardinalityEstimator(hashFunction: murmur);
The hash must be a 64-bit hash with good distribution — biased hashes will degrade estimate accuracy. When merging or deserializing estimators, all participants must have been built with the same hash function.
Thread-Safe Usage
CardinalityEstimator is not thread-safe. For concurrent producers, use ConcurrentCardinalityEstimator, which wraps the same algorithm with a ReaderWriterLockSlim:
var estimator = new ConcurrentCardinalityEstimator();
Parallel.ForEach(events, e => estimator.Add(e.UserId));
ulong uniqueUsers = estimator.Count();
Use ConcurrentCardinalityEstimator when multiple threads add to (or read from) the same estimator concurrently. If each thread owns its own estimator and you only combine results at the end, the basic CardinalityEstimator is faster — combine them with Merge (see below).
Merging Estimators
Estimators built with the same b and the same hash function can be merged losslessly. This is the core primitive for distributed and parallel cardinality counting — partition your data, count each shard independently, then merge:
var a = new CardinalityEstimator();
var b = new CardinalityEstimator();
a.Add("Alice"); a.Add("Bob");
b.Add("Bob"); b.Add("Carol");
// In-place merge of b into a
a.Merge(b);
ulong unique = a.Count(); // 3
// Static merge of many estimators into a new one
CardinalityEstimator combined = CardinalityEstimator.Merge(new[] { a, b });
For large numbers of estimators you can merge them in parallel via the extension method in CardinalityEstimatorExtensions:
using CardinalityEstimation;
ConcurrentCardinalityEstimator merged = shardEstimators.ParallelMerge();
ParallelMerge returns a ConcurrentCardinalityEstimator and uses all available cores by default; pass parallelismDegree to cap it.
Serialization
Use CardinalityEstimatorSerializer to persist an estimator to a stream and restore it later (e.g. to checkpoint state, ship it across the wire, or store it in a cache):
var serializer = new CardinalityEstimatorSerializer();
// Serialize
using (var stream = File.Create("estimator.bin"))
{
serializer.Serialize(stream, estimator);
}
// Deserialize — pass the same hash function the estimator was built with
using (var stream = File.OpenRead("estimator.bin"))
{
CardinalityEstimator restored = serializer.Deserialize(stream);
}
The serializer uses a versioned binary format (see DataFormatMajorVersion / DataFormatMinorVersion in CardinalityEstimatorSerializer) so newer minor versions can read older payloads of the same major version.
Zero-Allocation / High-Performance
For hot paths where you already have bytes (e.g. from a buffer pool, a network packet, or stackalloc), CardinalityEstimator implements ICardinalityEstimatorMemory, which exposes Add overloads for Span<byte>, ReadOnlySpan<byte>, Memory<byte>, and ReadOnlyMemory<byte>:
ICardinalityEstimatorMemory estimator = new CardinalityEstimator();
Span<byte> buffer = stackalloc byte[16];
// ...fill buffer...
estimator.Add(buffer); // no allocation
These overloads route through a GetHashCodeSpanDelegate and avoid the byte-array allocation of the legacy path.
Code Coverage
Coverage is collected locally with dotnet-coverage, pinned as a local .NET tool in .config/dotnet-tools.json. From a fresh checkout:
dotnet tool restore
dotnet build
dotnet dotnet-coverage collect --output-format cobertura --output coverage.xml \
"dotnet test --no-build --nologo"
The resulting coverage.xml is gitignored and can be fed into ReportGenerator, Codecov, or any Cobertura-aware viewer.CI does not currently collect coverage — this is a developer-local workflow.
Release Notes
1.15.0
- Hardened
CardinalityEstimatorSerializeragainst DoS via crafted input. - Switched target frameworks to
net8.0/net10.0. - Updated
System.IO.Hashingto10.0.7. - Fixed O(n)
ConcurrentCardinalityEstimatordirect-count storage (ConcurrentBag→ConcurrentDictionary). - Zero-allocation primitive
Addoverloads (stackalloc+ span hash). - Precomputed inverse-powers-of-two table in
Count()(removesMath.Powfrom hot loop). - Bulk-write dense lookup array in serializer.
- Fixed
CardinalityEstimator.Merge(IEnumerable)double-countingCountAdditionsfor the seed element; copy constructor now preservesCountAdditions. - Fixed
ConcurrentCardinalityEstimatorspan-delegate constructor silently discarding the supplied delegate. - Added null-argument validation to
ConcurrentCardinalityEstimator.Add(string)andAdd(byte[])for parity withCardinalityEstimator. - Compute
mvia bit shift (1 << bitsPerIndex) instead of(int)Math.Pow(2, bitsPerIndex)in constructors. - Documented the intentionally empty version-3 branch in
CardinalityEstimatorSerializer.Read. - Consolidated duplicated constants (
DirectCounterMaxElements,StackallocByteThreshold) and helpers (GetAlphaM,GetSubAlgorithmSelectionThreshold,CreateEmptyState) intoHllConstants. - Honored
parallelismDegreeinConcurrentCardinalityEstimator.ParallelMergeand removed deadParallelQueryvariable. - Replaced
GetHashCode-based lock ordering inConcurrentCardinalityEstimator.Merge/Equalswith a unique per-instance ID to eliminate a deadlock window on hash collisions. - Exposed
HashFunction/HashFunctionSpanproperties onCardinalityEstimatorandConcurrentCardinalityEstimatorand made cross-type conversions preserve the source hash functions losslessly. - Replaced the unused
coverlet.collectortest package with adotnet-coveragelocal tool manifest; coverage is now a developer-local opt-in viadotnet tool restore+dotnet dotnet-coverage collect.
1.14.0
- Added support for
Span<byte>,ReadOnlySpan<byte>,Memory<byte>, andReadOnlyMemory<byte>viaICardinalityEstimatorMemory(zero-allocation hot path). - Added
ConcurrentCardinalityEstimatorfor thread-safe usage, plusParallelMerge/SafeMergeextensions. - Updated and expanded XML documentation.
1.13.0
- Switched the default hash function to
XxHash128(fromSystem.IO.Hashing) on .NET 8+ for better speed and distribution.Murmur3andFNV-1aremain available. - Optimized
GetSigmaon .NET 8.
1.12.0
- Targets
net8.0andnet10.0; dropped support for end-of-life .NET versions. - Added the
CardinalityEstimation.Benchmarkproject (BenchmarkDotNet). - Made the hashing classes (
Murmur3,Fnv1A) public.
Keeping things friendly
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
-
net10.0
- murmurhash (>= 1.0.3)
- System.IO.Hashing (>= 10.0.7)
-
net8.0
- murmurhash (>= 1.0.3)
- System.IO.Hashing (>= 10.0.7)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on CardinalityEstimation:
| Package | Downloads |
|---|---|
|
AutoStat
Records statistics for a stream or collection of objects. |
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.15.0 | 33 | 5/1/2026 |
| 1.14.0 | 2,252 | 9/7/2025 |
| 1.13.0 | 254 | 9/4/2025 |
| 1.12.0 | 276 | 9/4/2025 |
| 1.11.1 | 11,261 | 11/27/2022 |
| 1.11.0 | 6,469 | 6/26/2022 |
| 1.10.0 | 54,332 | 2/1/2021 |
| 1.9.0 | 1,110 | 11/8/2020 |
| 1.8.0 | 1,133 | 10/7/2020 |
| 1.7.0 | 10,189 | 9/24/2018 |
| 1.6.0 | 3,290 | 10/23/2017 |
| 1.5.0 | 2,758 | 7/26/2017 |
| 1.4.0 | 5,005 | 3/19/2017 |
| 1.3.0 | 1,508 | 3/13/2017 |
| 1.2.1 | 1,664 | 8/21/2016 |
| 1.2.0 | 2,037 | 2/20/2016 |
| 1.1.0 | 1,784 | 11/18/2015 |
| 1.0.1 | 2,028 | 2/9/2015 |
| 1.0.0 | 2,317 | 2/9/2015 |
Hardened CardinalityEstimatorSerializer against DoS via crafted input
Switched target frameworks to net8.0/net10.0
Updated System.IO.Hashing to 10.0.7
Fixed O(n) ConcurrentCardinalityEstimator direct-count storage (ConcurrentBag -> ConcurrentDictionary)
Zero-allocation primitive Add overloads (stackalloc + span hash)
Precomputed inverse-powers-of-two table in Count() (removes Math.Pow from hot loop)
Bulk-write dense lookup array in serializer
Fixed CardinalityEstimator.Merge(IEnumerable) double-counting CountAdditions for the seed element; copy constructor now preserves CountAdditions
Fixed ConcurrentCardinalityEstimator span-delegate constructor silently discarding the supplied delegate
Added null-argument validation to ConcurrentCardinalityEstimator.Add(string) and Add(byte[]) for parity with CardinalityEstimator
Compute m via bit shift (1 << bitsPerIndex) instead of (int)Math.Pow(2, bitsPerIndex) in constructors
Documented the intentionally empty version-3 branch in CardinalityEstimatorSerializer.Read
Consolidated duplicated constants (DirectCounterMaxElements, StackallocByteThreshold) and helpers (GetAlphaM, GetSubAlgorithmSelectionThreshold, CreateEmptyState) into HllConstants
Honored parallelismDegree in ConcurrentCardinalityEstimator.ParallelMerge and removed dead ParallelQuery variable
Replaced GetHashCode-based lock ordering in ConcurrentCardinalityEstimator.Merge/Equals with a unique per-instance ID to eliminate a deadlock window on hash collisions
Exposed HashFunction/HashFunctionSpan properties on CardinalityEstimator and ConcurrentCardinalityEstimator and made cross-type conversions preserve the source hash functions losslessly
Replaced unused coverlet.collector test package with a dotnet-coverage local tool manifest