Technical Research|December 19, 2025|45 min read

Hidden Execution Debt Inside SQL Server

Decoding the Statistics Blob — and Why It Matters

The Invisible Layer of Execution Risk

Modern systems rarely fail where we can see them.

They fail in the layers we rely on but never inspect.

In SQL Server, one of those layers is the internal statistics blob, the binary structure exposed through the undocumented STATS_STREAM option of DBCC SHOW_STATISTICS. It drives cardinality estimation, which shapes execution plans, which in turn determine performance, cost, and risk exposure.

Every production system using SQL Server depends on this structure.

Almost no one understands its internal format.

At PeopleNotTech, we define Execution Debt as the risk that accumulates when system-critical complexity outpaces visibility.

The statistics blob is a precise example of that pattern.

This research makes that layer visible.

Why This Matters

This is not curiosity-driven reverse engineering.

When internal structures are opaque:

Query plans become unpredictable
Performance tuning becomes reactive
Root-cause analysis slows
Governance narratives weaken
Operational risk accumulates

Execution Debt does not form in dramatic failures.

It forms in silent opacity.

When organisations cannot inspect the structures driving their systems, they cannot reason about execution risk in a disciplined way.

Understanding the statistics blob is one intervention against that opacity.

What This Research Establishes

Systematic empirical analysis of hundreds of SQL Server 2025 RTM statistics objects mapped the internal structure of the statistics blob in full. The sample spanned data types, sampling modes, multi-column configurations, filtered statistics, temporal tables, and feature flag variations.

This research builds on pioneering community work that first made aspects of the STATS_STREAM format legible, such as Joe Chang's detailed documentation of SQL Server query optimizer internals and statistics behaviour.

The result is a stable, field-level structural model of the statistics blob format. Every structural field except the runtime checksum can be deterministically reconstructed and validated against engine-generated artefacts. SQL Server enforces the checksum, and deriving its algorithm requires runtime instrumentation. That boundary does not affect read-only parsing or structural validation.

This is not folklore.

It is a structural specification derived from observation and validation.

The Broader Pattern

The statistics blob is not interesting because it is obscure.

It is interesting because it is foundational.

It is:

Undocumented
Operationally central
Strictly validated
Rarely inspected
Capable of influencing large-scale execution behaviour

That combination is where Execution Debt forms.

When teams cannot see what governs their execution layer, they rely on emergent behaviour rather than structural clarity. Execution risk then migrates:

From engine internals
To performance firefighting
To cultural strain
To board-level exposure

Technical opacity becomes organisational fragility.

Methodological Position

This work was not a one-off decode.

It was conducted using an AI-assisted reverse-engineering methodology based on:

Controlled input generation
Systematic output observation (regardless of pass/fail)
Multi-variable behavioural fingerprint formation
Iterative hypothesis refinement
Observation-based specification synthesis

Rather than attempting to infer undocumented source code, this approach constructs a behavioural model derived solely from observed system responses.

The output of the process is a machine-readable derived specification suitable for validation, analysis, and governance.

This methodology is repeatable across opaque systems — not limited to SQL Server.
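
As a loose illustration of the first three steps listed above (controlled inputs, output observation, fingerprint formation), the Python sketch below varies one input at a time against a stand-in black box and records which output bytes change. Everything in it is an assumption made for illustration: toy_encode is an invented stand-in rather than SQL Server, its byte layout is made up, and the code is not the methodology described in this research.

```python
import struct
from typing import Callable, Dict, List


def toy_encode(row_count: int, sampled: bool, columns: int) -> bytes:
    """Invented stand-in for an opaque system; the 6-byte layout is made up."""
    return struct.pack("<IBB", row_count, 1 if sampled else 0, columns)


def changed_offsets(a: bytes, b: bytes) -> List[int]:
    """Byte offsets at which two equal-length outputs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def fingerprint(
    encode: Callable[..., bytes],
    baseline: dict,
    variations: Dict[str, dict],
) -> Dict[str, List[int]]:
    """Change one input at a time and record which output bytes move."""
    base_out = encode(**baseline)
    result = {}
    for label, change in variations.items():
        out = encode(**{**baseline, **change})
        result[label] = changed_offsets(base_out, out)
    return result


baseline = {"row_count": 1000, "sampled": False, "columns": 1}
variations = {
    "row_count": {"row_count": 1001},
    "sampled": {"sampled": True},
    "columns": {"columns": 2},
}
fp = fingerprint(toy_encode, baseline, variations)
print(fp)  # {'row_count': [0], 'sampled': [4], 'columns': [5]}
```

Against a real opaque system the same loop would need many more variations per input, since a single change often moves several bytes at once.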

Intellectual Property Position

The methodology underpinning this research — specifically the AI-assisted process of controlled input generation, behavioural fingerprint formation, iterative refinement, and specification synthesis — is the subject of a filed UK patent application.

The patent application covers:

AI-assisted controlled input generation
Multi-variable behavioural fingerprint modelling
Iterative hypothesis refinement loops
Observation-based specification synthesis
Production of machine-readable technical debt reduction artefacts

The claims are not limited to SQL Server or database engines.

They apply to opaque or undocumented systems more broadly.

This document describes a structural case study. It does not disclose protected implementation details beyond what is necessary to describe observed format behaviour.

Why This Belongs at PeopleNotTech

PeopleNotTech exists to surface hidden human and technical risk.

The statistics blob is technical.

The consequences of not understanding it are organisational.

Execution Debt is never purely technical. It migrates into decision quality, delivery speed, confidence in infrastructure, and governance resilience.

We do not publish frameworks we cannot live inside.

This research is a case study in making invisible execution layers legible.

Where This Goes Next

The structural understanding of opaque system behaviour enables:

Statistics health diagnostics
Structural integrity scoring
Execution risk mapping
Governance-ready reporting
Drift detection against baseline behavioural fingerprints

We are developing structured Execution Debt diagnostics for opaque systems.

Authority is not opinion.

Authority is visibility.

And visibility is how execution risk becomes governable.

Value	Binary	Interpretation
0x00	00000000	Baseline uniform distribution, single-column, no special features
0x01	00000001	Filtered statistics
0x02	00000010	Memory-optimized table
0x0a	00001010	Persisted sample
0x0b	00001011	Ascending distribution or temporal table
0x38	00111000	Two-column statistics
0x40	01000000	Indexed view
0x68	01101000	Trace Flag 9612 enabled, or three-or-more-column statistics
0x76	01110110	Temporal history table
0xa2	10100010	Incremental statistics with sampling
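
The table above can be turned into a small lookup for reading observed values. The Python sketch below is a minimal illustration; OBSERVED_FLAGS and describe_flag are hypothetical names, and because this document does not name the field or its byte offset, the function takes a byte value that has already been extracted. Several rows list more than one cause for the same value, so the lookup returns the table's wording rather than decoding individual bits.

```python
# Observed values copied from the table above; the names are illustrative.
OBSERVED_FLAGS = {
    0x00: "Baseline uniform distribution, single-column, no special features",
    0x01: "Filtered statistics",
    0x02: "Memory-optimized table",
    0x0A: "Persisted sample",
    0x0B: "Ascending distribution or temporal table",
    0x38: "Two-column statistics",
    0x40: "Indexed view",
    0x68: "Trace Flag 9612 enabled, or three-or-more-column statistics",
    0x76: "Temporal history table",
    0xA2: "Incremental statistics with sampling",
}


def describe_flag(value: int) -> str:
    """Return 'hex  binary  interpretation' for one observed byte value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    meaning = OBSERVED_FLAGS.get(value, "not in the observed table")
    return f"0x{value:02x}  {value:08b}  {meaning}"


print(describe_flag(0x38))  # 0x38  00111000  Two-column statistics
```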

Value	Interpretation
0x11	Baseline statistics (initial creation)
0x19	Statistics after INSERT/UPDATE operations
0x13	CHAR type statistics
0x1b	CHAR type after updates
0x40	Columnstore multi-column statistics
0xb1	Incremental statistics
0xb9	Incremental statistics with sampling
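
One pattern in the table above can be checked with plain arithmetic: each base value and its paired variant (0x11 and 0x19, 0x13 and 0x1b, 0xb1 and 0xb9) differ in a single bit. The Python sketch below XORs each pair to show this. The pairing is an assumption drawn from the row descriptions, and what that bit means inside the engine is not established here; PAIRS and differing_bits are hypothetical names.

```python
# Base/variant pairs read off the table above (the pairing is an assumption).
PAIRS = [
    (0x11, 0x19, "baseline vs. after INSERT/UPDATE"),
    (0x13, 0x1B, "CHAR vs. CHAR after updates"),
    (0xB1, 0xB9, "incremental vs. incremental with sampling"),
]


def differing_bits(a: int, b: int) -> int:
    """Bitmask of the positions where two byte values differ."""
    return a ^ b


for base, variant, label in PAIRS:
    diff = differing_bits(base, variant)
    print(f"0x{base:02x} ^ 0x{variant:02x} = 0x{diff:02x} ({diff:08b})  {label}")
```

All three pairs yield 0x08, so the variant rows differ from their base rows only in bit 3.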

Type ID	SQL Type	Storage	Range High Key Size
48	TINYINT	1 byte	1 byte
52	SMALLINT	2 bytes	2 bytes
56	INT	4 bytes	4 bytes
127	BIGINT	8 bytes	8 bytes
59	REAL	4 bytes	4 bytes
62	FLOAT	8 bytes	8 bytes
60	MONEY	8 bytes	8 bytes
122	SMALLMONEY	4 bytes	4 bytes
106	DECIMAL/NUMERIC	Variable	Variable
61	DATETIME	8 bytes	8 bytes
58	SMALLDATETIME	4 bytes	4 bytes
40	DATE	3 bytes	3 bytes
41	TIME	Variable	Variable
42	DATETIME2	Variable	Variable
43	DATETIMEOFFSET	Variable	Variable
167	VARCHAR	Variable	Variable
175	CHAR	Fixed	Fixed
231	NVARCHAR	Variable	Variable
239	NCHAR	Fixed	Fixed
165	VARBINARY	Variable	Variable
173	BINARY	Fixed	Fixed
35	TEXT	Variable	Variable
99	NTEXT	Variable	Variable
34	IMAGE	Variable	Variable
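
For parsing work, the type table above is easiest to use as a lookup. The Python sketch below is one possible encoding; TYPE_TABLE and range_high_key_size are hypothetical names. Rows marked Variable or Fixed carry no byte count in the table, so the helper returns None for them rather than guessing a size.

```python
from typing import Optional

# type ID -> (SQL type, range high key size as given in the table above)
TYPE_TABLE = {
    48: ("TINYINT", 1),
    52: ("SMALLINT", 2),
    56: ("INT", 4),
    127: ("BIGINT", 8),
    59: ("REAL", 4),
    62: ("FLOAT", 8),
    60: ("MONEY", 8),
    122: ("SMALLMONEY", 4),
    106: ("DECIMAL/NUMERIC", "Variable"),
    61: ("DATETIME", 8),
    58: ("SMALLDATETIME", 4),
    40: ("DATE", 3),
    41: ("TIME", "Variable"),
    42: ("DATETIME2", "Variable"),
    43: ("DATETIMEOFFSET", "Variable"),
    167: ("VARCHAR", "Variable"),
    175: ("CHAR", "Fixed"),
    231: ("NVARCHAR", "Variable"),
    239: ("NCHAR", "Fixed"),
    165: ("VARBINARY", "Variable"),
    173: ("BINARY", "Fixed"),
    35: ("TEXT", "Variable"),
    99: ("NTEXT", "Variable"),
    34: ("IMAGE", "Variable"),
}


def range_high_key_size(type_id: int) -> Optional[int]:
    """Byte size of the range high key when the table states one, else None."""
    _name, size = TYPE_TABLE[type_id]  # raises KeyError for unlisted type IDs
    return size if isinstance(size, int) else None


print(TYPE_TABLE[56][0], range_high_key_size(56))    # INT 4
print(TYPE_TABLE[167][0], range_high_key_size(167))  # VARCHAR None
```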