whfmt.Analysis
Field-level semantic diff between binary files using whfmt.FileFormatCatalog definitions.
Instead of comparing raw bytes, whfmt.Analysis understands the structure of a file:
- Groups entries by logical key (e.g. filename inside a ZIP)
- Ignores noise fields (timestamps, padding, calculated offsets)
- Surfaces only meaningful structural changes
- Side-by-side hex diff per changed field (BytesA / BytesB / DiffMask)
- Checksum validation — detects corrupted checksums (CRC32 / MD5 / SHA1 / SHA256)
- Structural diff — block-level OnlyInA / OnlyInB / InBoth using MD5 hashes
- Outputs rich text, JSON, CSV, Markdown, or dark-themed HTML reports
Powered by 799 whfmt format definitions covering Archives, Images, Executables, Documents, Audio, Databases, and more.
Full documentation: whfmt-Analysis-guide.md — API reference, architecture, integration guides, and usage examples.
What's New
v1.1.1 — Catalog 1.3.2 alignment
- Catalog bump to `whfmt.FileFormatCatalog 1.3.2` (Phase B audit + 2 bug fixes, 799 definitions, schema v3 canonical).
- No Analysis API changes: drop-in upgrade from 1.1.0.
v1.1.0 — Hex diff, checksums, structural diff
- `HexDiff`: per-byte `BytesA` / `BytesB` / `DiffMask` rendered inline for changed binary fields
- `ChecksumStatus`: CRC32 / MD5 / SHA1 / SHA256 stored-vs-computed validation per file (`CorruptedCountA`, `CorruptedCountB`)
- `StructuralDiff`: block-level `OnlyInA` / `OnlyInB` / `InBoth` using MD5 block hashes
- `FieldChange.IsCorrupted`: surfaces fields with broken checksums in diff output
- `DiffResult` is now an immutable value-object (all properties `init`-only)
- `CompareAsync()`: file-path and stream overloads
- New renderers: `DiffRenderer.ToCsv()` and `DiffRenderer.ToMarkdown()` (GitHub-flavored tables with emoji status)
- Single-pass field extraction in `FormatDiff`: half the JSON iteration cost on large definitions
Install
dotnet add package whfmt.Analysis
Quick Start
using WhfmtAnalysis;
using WpfHexEditor.Core.Definitions;
var catalog = EmbeddedFormatCatalog.Instance;
// Compare two ZIP files semantically
var result = FormatDiff.Compare(catalog, "v1.zip", "v2.zip");
if (result.IsIdentical)
{
    Console.WriteLine("Files are semantically identical.");
}
else
{
    Console.WriteLine($"Format: {result.FormatName}");
    Console.WriteLine($"Changed fields: {result.ChangedCount}");
    foreach (var change in result.FieldChanges.Where(c => c.IsChanged))
        Console.WriteLine($" {change.FieldName}: {change.ValueA} → {change.ValueB}");
}
API Reference
FormatDiff.Compare()
// From file paths — format auto-detected
DiffResult Compare(IEmbeddedFormatCatalog catalog, string fileA, string fileB)
// From byte arrays — format auto-detected from extension
DiffResult Compare(
    IEmbeddedFormatCatalog catalog,
    byte[] dataA, string nameA,
    byte[] dataB, string nameB)
FormatDiff.CompareAsync()
// Async from file paths
Task<DiffResult> CompareAsync(IEmbeddedFormatCatalog catalog, string fileA, string fileB)
// Async from streams
Task<DiffResult> CompareAsync(
    IEmbeddedFormatCatalog catalog,
    Stream streamA, string nameA,
    Stream streamB, string nameB)
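The stream overload could be used roughly as follows. This is a minimal sketch that assumes `CompareAsync` is a static method on `FormatDiff`, like `Compare` in the Quick Start, and that the `nameA` / `nameB` arguments are what drive format detection. The file names are placeholders, and whether the library disposes the streams itself is not stated, so the caller disposes them here.

```csharp
using System;
using System.IO;
using WhfmtAnalysis;
using WpfHexEditor.Core.Definitions;

var catalog = EmbeddedFormatCatalog.Instance;

// Placeholder inputs; the caller owns the streams and disposes them.
await using var streamA = File.OpenRead("v1.zip");
await using var streamB = File.OpenRead("v2.zip");

// Assumption: CompareAsync is static on FormatDiff, mirroring Compare.
var result = await FormatDiff.CompareAsync(
    catalog,
    streamA, "v1.zip",
    streamB, "v2.zip");

Console.WriteLine(result.IsIdentical
    ? "Files are semantically identical."
    : $"{result.ChangedCount} field(s) changed, size delta {result.RawByteDelta} bytes");
```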
DiffResult
| Property | Type | Description |
|---|---|---|
| `FileA` / `FileB` | `string` | File names |
| `SizeA` / `SizeB` | `long` | File sizes in bytes |
| `FormatName` | `string` | Detected format name |
| `FieldChanges` | `IReadOnlyList<FieldChange>` | All field comparisons |
| `ChangedCount` | `int` | Number of changed fields (excl. ignored) |
| `IsIdentical` | `bool` | True when all key fields match |
| `RawByteDelta` | `long` | Size difference in bytes |
| `ChecksumsA` / `ChecksumsB` | `IReadOnlyList<ChecksumStatus>` | Per-checksum validation |
| `StructuralDiff` | `StructuralDiff?` | Block-level diff (OnlyInA/B/InBoth) |
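To illustrate what stored-versus-computed checksum validation involves, here is a self-contained sketch for CRC-32. It is not the library's code: `Crc32` and `IsChecksumCorrupted` are hypothetical helpers written for this example, and the payload is the standard CRC-32 test string rather than real file data.

```csharp
using System;
using System.Text;

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), the variant used by ZIP and PNG.
uint Crc32(ReadOnlySpan<byte> data)
{
    uint crc = 0xFFFFFFFFu;
    foreach (byte value in data)
    {
        crc ^= value;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return ~crc;
}

// Hypothetical check: a stored checksum counts as corrupted when it does not
// match the value recomputed from the payload.
bool IsChecksumCorrupted(uint storedValue, ReadOnlySpan<byte> data) =>
    storedValue != Crc32(data);

byte[] payload = Encoding.ASCII.GetBytes("123456789");
uint stored = 0xCBF43926u; // standard CRC-32 check value for "123456789"

Console.WriteLine(IsChecksumCorrupted(stored, payload));      // False
Console.WriteLine(IsChecksumCorrupted(stored ^ 1u, payload)); // True
```

The same stored-versus-computed pattern applies to MD5, SHA1, and SHA256; only the hash function changes.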
FieldChange
| Property | Type | Description |
|---|---|---|
| `FieldName` | `string` | Field name from the whfmt definition |
| `ValueA` / `ValueB` | `string` | Hex-formatted values |
| `IsChanged` | `bool` | True when the values differ |
| `IsIgnored` | `bool` | True for noise fields (timestamps, etc.) |
| `IsCorrupted` | `bool` | True when a checksum field is invalid |
| `HexDiff` | `HexDiff?` | Per-byte diff mask for binary fields |
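The idea behind a per-byte diff mask can be shown without the library. The sketch below uses hypothetical helpers (`BuildDiffMask`, `RenderSideBySide`) and represents the mask as a `bool[]`; the actual shape of `HexDiff` is not documented here, so treat this as an illustration of the concept only. The input values are the `entry_count` bytes from the output example in this README.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true at every byte position where the values differ.
// Positions past the end of the shorter value count as different
// (an assumption made for this sketch).
bool[] BuildDiffMask(byte[] a, byte[] b)
{
    int length = Math.Max(a.Length, b.Length);
    var mask = new bool[length];
    for (int i = 0; i < length; i++)
        mask[i] = i >= a.Length || i >= b.Length || a[i] != b[i];
    return mask;
}

// Renders both values in hex, then a marker row with "^^" under each differing byte.
string RenderSideBySide(byte[] a, byte[] b)
{
    bool[] mask = BuildDiffMask(a, b);
    string hexA = string.Join(" ", a.Select(x => x.ToString("X2")));
    string hexB = string.Join(" ", b.Select(x => x.ToString("X2")));
    string marks = string.Join(" ", mask.Select(d => d ? "^^" : "  "));
    return $"A: {hexA}\nB: {hexB}\n   {marks}";
}

byte[] valueA = { 0x00, 0x00, 0x00, 0x23 };
byte[] valueB = { 0x00, 0x00, 0x00, 0x24 };
Console.WriteLine(RenderSideBySide(valueA, valueB));
```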
DiffRenderer
// Plain text (console-friendly, includes hex diff + checksum + structural sections)
string text = DiffRenderer.RenderText(result);
// JSON (for tooling / CI pipelines)
string json = DiffRenderer.RenderJson(result);
// CSV (Field,ValueA,ValueB,IsChanged,IsIgnored,IsCorrupted,DifferentBytes)
string csv = DiffRenderer.ToCsv(result);
// GitHub Markdown table with emoji status
string md = DiffRenderer.ToMarkdown(result);
// Dark-themed HTML (for reports)
string html = DiffRenderer.RenderHtml(result);
Output Examples
Text
whfmt.Analysis — Semantic Binary Diff
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Format : ZIP
File A : build-v1.zip (1,204,812 bytes)
File B : build-v2.zip (1,209,044 bytes)
Delta : +4,232 bytes
Key fields:
✓ entry_count 00000023 → 00000024 CHANGED
✓ compression_method 00000008 → 00000008 unchanged
✓ crc32 A3F2019C → B7C3A4D1 CHANGED
Ignored: timestamps (2), offsets (4)
Result : 2 field(s) changed
JSON
{
  "formatName": "ZIP",
  "fileA": "build-v1.zip",
  "fileB": "build-v2.zip",
  "sizeA": 1204812,
  "sizeB": 1209044,
  "rawByteDelta": 4232,
  "isIdentical": false,
  "changedCount": 2,
  "fieldChanges": [
    { "fieldName": "entry_count", "valueA": "00000023", "valueB": "00000024", "isChanged": true, "isIgnored": false },
    { "fieldName": "crc32", "valueA": "A3F2019C", "valueB": "B7C3A4D1", "isChanged": true, "isIgnored": false }
  ]
}
Supported Formats with Semantic Diff
| Format | Key Fields | Ignored |
|---|---|---|
| ZIP | entry_name, compression_method, crc32, entry_count | timestamps, offsets |
| PNG | width, height, bit_depth, ihdr_crc | - |
| PE/EXE | machine_type, entry_point_rva, size_of_image | time_date_stamp, checksum |
| PDF | pdf_version, page_count, root_obj_id | creation_date, producer |
| MP3 | mpeg_version, bitrate, sample_rate, id3 tags | - |
| SQLite | page_size, schema_format, user_version | change_counter |
All 799 catalog formats are supported for raw-byte fallback. Formats with a diff block in their .whfmt definition get full semantic key-field comparison.
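As a rough picture of key-field comparison with an ignore list, the following self-contained sketch compares two byte buffers field by field. The field layout, offsets, and header bytes are invented for illustration and do not come from a real `.whfmt` definition; `Hex` and `CompareFields` are hypothetical helpers, not library APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical layout: field name, byte offset, length, and whether it is a noise field.
var fields = new (string Name, int Offset, int Length, bool Ignored)[]
{
    ("machine_type",    0, 2, false),
    ("time_date_stamp", 2, 4, true),
    ("entry_point_rva", 6, 4, false),
};

string Hex(byte[] data, int offset, int length) =>
    Convert.ToHexString(data, offset, length);

// Extracts each field from both buffers and records whether the hex values differ.
List<(string Name, string ValueA, string ValueB, bool IsChanged, bool IsIgnored)>
    CompareFields(byte[] a, byte[] b, (string Name, int Offset, int Length, bool Ignored)[] layout)
{
    var changes = new List<(string Name, string ValueA, string ValueB, bool IsChanged, bool IsIgnored)>();
    foreach (var field in layout)
    {
        string va = Hex(a, field.Offset, field.Length);
        string vb = Hex(b, field.Offset, field.Length);
        changes.Add((field.Name, va, vb, va != vb, field.Ignored));
    }
    return changes;
}

byte[] headerA = { 0x4C, 0x01, 0x10, 0x20, 0x30, 0x40, 0x00, 0x10, 0x00, 0x00 };
byte[] headerB = { 0x4C, 0x01, 0x99, 0x88, 0x77, 0x66, 0x00, 0x20, 0x00, 0x00 };

var comparison = CompareFields(headerA, headerB, fields);

// Only changed, non-ignored fields count, so the timestamp difference is dropped.
int changedCount = comparison.Count(x => x.IsChanged && !x.IsIgnored);
Console.WriteLine($"Changed key fields: {changedCount}");
foreach (var change in comparison.Where(x => x.IsChanged && !x.IsIgnored))
    Console.WriteLine($"  {change.Name}: {change.ValueA} -> {change.ValueB}");
```

Here the timestamp differs between the two buffers but is marked as ignored, so only `entry_point_rva` is reported.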
CI Integration
# .github/workflows/binary-diff.yml
- name: Semantic binary diff
  run: |
    dotnet script diff.csx artifacts/v1/app.exe artifacts/v2/app.exe
// diff.csx
#r "nuget: whfmt.Analysis, 1.1.1"
using WhfmtAnalysis;
using WpfHexEditor.Core.Definitions;
var result = FormatDiff.Compare(EmbeddedFormatCatalog.Instance, Args[0], Args[1]);
Console.WriteLine(DiffRenderer.RenderText(result));
if (!result.IsIdentical) Environment.Exit(1);
Architecture
whfmt.Analysis
├── FormatDiff — entry point, format detection, single-pass field extraction
├── DiffResult — immutable value-object result model
├── FieldChange — per-field comparison (IsCorrupted, HexDiff)
├── ChecksumStatus — stored vs computed checksum per entry
├── StructuralDiff — block-level OnlyInA / OnlyInB / InBoth
└── DiffRenderer — text / JSON / CSV / Markdown / HTML output
Depends on: whfmt.FileFormatCatalog 1.3.2+ and WpfHexEditor.Core.Contracts 1.0.0+ (no other dependencies, cross-platform net8.0).
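The block-level OnlyInA / OnlyInB / InBoth classification can be sketched with MD5 block hashes as below. This is an illustration under assumptions, not the library's implementation: it splits the data into fixed-size blocks, whereas the real `StructuralDiff` may use format-defined blocks, and `HashBlocks` / `ClassifyBlocks` are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: MD5-hashes each fixed-size block of the buffer.
HashSet<string> HashBlocks(byte[] data, int blockSize)
{
    var hashes = new HashSet<string>();
    for (int offset = 0; offset < data.Length; offset += blockSize)
    {
        int count = Math.Min(blockSize, data.Length - offset);
        byte[] digest = MD5.HashData(data.AsSpan(offset, count));
        hashes.Add(Convert.ToHexString(digest));
    }
    return hashes;
}

// Classifies block hashes by which input they appear in.
(List<string> OnlyInA, List<string> OnlyInB, List<string> InBoth)
    ClassifyBlocks(byte[] a, byte[] b, int blockSize)
{
    var blocksA = HashBlocks(a, blockSize);
    var blocksB = HashBlocks(b, blockSize);
    return (
        blocksA.Except(blocksB).ToList(),
        blocksB.Except(blocksA).ToList(),
        blocksA.Intersect(blocksB).ToList());
}

byte[] fileA = Encoding.ASCII.GetBytes("AAAABBBBCCCC");
byte[] fileB = Encoding.ASCII.GetBytes("AAAAXXXXCCCC");

var diff = ClassifyBlocks(fileA, fileB, blockSize: 4);
Console.WriteLine($"OnlyInA: {diff.OnlyInA.Count}, OnlyInB: {diff.OnlyInB.Count}, InBoth: {diff.InBoth.Count}");
```

Because the hashes are kept in sets, this sketch ignores block order and repeated blocks; a position-aware diff would need to keep offsets alongside the hashes.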
Related Packages
| Package | Description |
|---|---|
| whfmt.FileFormatCatalog | 799 format definitions — required dependency |
| whfmt.Validate | dotnet tool — validate binary files from the CLI |
| whfmt.Fuzz | Format-aware binary fuzzer for parser testing |
| whfmt.CodeGen | dotnet tool — generate C# parser classes from .whfmt |
License
GNU AGPL v3.0 — © 2016–2026 Derek Tremblay / Pulsar Informatique
Dependencies
- net8.0
  - whfmt.FileFormatCatalog (>= 1.3.2)
  - WpfHexEditor.Core.Contracts (>= 1.0.0)
Release Notes
- 1.1.1: Catalog bump to whfmt.FileFormatCatalog 1.3.2 (Phase B audit + 2 bug fixes, 799 definitions, schema v3 canonical). No Analysis API changes.
- 1.1.0: Format-aware diff over whfmt.diff blocks (keyFields, ignoreFields, groupBy).
- 1.0.0: Initial release. FormatDiff.Compare() with field-level grouping, ignore lists, and JSON/text/HTML rendering.