NQ · research

Touch rate ≠ close direction — the audit that demoted three Tier 1s in 24 hours

Three flagship signals fired at 80%+ for years. An audit against the right metric — close-direction at RTH, not excursion-touch — demoted two to Tier 2 magnets and killed a third entirely. The methodology bug, the fixes, and the rule of thumb.

updated 2026-04-21

shipped · methodology · audit · touch-vs-close · amd · lon-sweep · pwhl

The question the old backtests were answering

Every flagship on this board used to ship with probabilities from a backtest that looked roughly like:

hit = (df.high >= target) | (df.max_favorable_excursion >= target)

That tells you: did price ever touch the target during the window?

That is a touch rate / magnet rate. It is not “does the session close in this direction.” When you draw a line on a chart labeled AMD BEAR 92%, a trader reading it assumes the 92% is a directional conviction. In every one of these three models, it wasn’t.
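A minimal sketch of how far the two questions can diverge on the same sessions. The four rows, the column names (entry, target, high, rth_close) and the BULL framing are all made up for illustration; they are not the board's data or schema.

```python
import pandas as pd

# Hypothetical per-session rows for a BULL signal: entry price, a target above
# entry, the session high after entry, and the 16:00 RTH close.
sessions = pd.DataFrame(
    {
        "entry": [100.0, 100.0, 100.0, 100.0],
        "target": [105.0, 105.0, 105.0, 105.0],
        "high": [106.0, 105.5, 107.0, 103.0],
        "rth_close": [98.0, 101.0, 99.0, 102.0],
    }
)

# Touch / magnet metric: did price ever reach the target during the window?
touched = sessions["high"] >= sessions["target"]

# Close-direction metric: did the session close on the signal side of entry?
closed_bull = sessions["rth_close"] > sessions["entry"]

touch_rate = touched.mean()
close_rate = closed_bull.mean()
print(f"touch rate: {touch_rate:.0%}, close-direction rate: {close_rate:.0%}")
# prints: touch rate: 75%, close-direction rate: 50%
```

Three of the four sessions tag the target, yet only two close above entry. Both numbers are correct; they answer different questions.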

Three models audited, three different failure modes

Over 24 hours we re-ran the audit against a proper close-direction metric (close at RTH 16:00 vs entry, baseline BULL 53.5% / BEAR 46.5%):

Model 14 — AMD

Touch rate 88–92% ✓. Close direction: 47–53%, indistinguishable from the baseline. Demoted Tier 1 → Tier 2, but the touch rate is real, so we kept it as a magnet. A later re-audit recovered a proper 65% touch-at-120-min number with the right conditioning (see AMD).

Model 9 — LON SWEEP H / L

Touch rate 77–88% ✓. Close direction:

Cohort	Touch %	Close %	Edge
LON SWEEP H, bull_signals ≥ 3	90.5%	54.9% BULL	+1.4pt
LON SWEEP L, bear_signals ≥ 3	86.4%	50.0% BEAR	+3.5pt

+1.4pt of edge dressed up as 90%. Demoted Tier 1 → Tier 2 magnet. Probabilities kept as valid touch rates.

Model 9 — LON REV MID

This one was worse. The model fired a reversion line at 58–61% after a sweep, betting price would revert to session midpoint and the day would close opposite the sweep.

Cohort	n	Touch mid	Close in signal direction
Swept H → LON REV MID BEAR @ 58%	926	82.4%	30.2% BEAR (−16pt anti-predictive)
Swept L → LON REV MID BULL @ 61%	869	85.5%	35.3% BULL (−18pt anti-predictive)

The signal was actively betting the wrong way. Sweeps predict continuation, not reversion. Deleted from both .cs files.
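The edge figures in the two LON tables are the close rate in the signal direction minus the baseline for that direction. The sketch below reproduces that arithmetic from the numbers quoted above. The z-score is an addition of ours, not part of the original audit: it uses a plain normal approximation and assumes sessions are independent, and it is computed only where n is reported.

```python
import math

# Close-direction baseline quoted in the audit.
BASELINE = {"BULL": 0.535, "BEAR": 0.465}


def edge_pts(close_rate: float, direction: str) -> float:
    """Close-direction edge over baseline, in percentage points."""
    return (close_rate - BASELINE[direction]) * 100


def z_score(close_rate: float, direction: str, n: int) -> float:
    """Normal-approximation z-score of the close rate against baseline."""
    p0 = BASELINE[direction]
    return (close_rate - p0) / math.sqrt(p0 * (1 - p0) / n)


# (cohort, signal direction, close rate in signal direction, n if reported)
cohorts = [
    ("LON SWEEP H", "BULL", 0.549, None),
    ("LON SWEEP L", "BEAR", 0.500, None),
    ("LON REV MID BEAR", "BEAR", 0.302, 926),
    ("LON REV MID BULL", "BULL", 0.353, 869),
]

for name, direction, rate, n in cohorts:
    line = f"{name}: {edge_pts(rate, direction):+.1f}pt"
    if n is not None:
        line += f", z = {z_score(rate, direction, n):.1f}"
    print(line)
```

With roughly 900 sessions per cohort, a 16–18pt shortfall sits many standard errors below baseline under these assumptions, which is why the REV MID lines read as anti-predictive rather than merely noisy.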

Model 13 — PWH/PWL RETEST

The largest single bug found.

Cohort	Current prob	Reality	Delta
BULL RETEST, RTH	62%	55.4%	−7pt
BEAR RETEST, RTH	72%	40.9%	−31pt, ANTI-predictive
BULL RETEST, overnight	56%	35%	no edge
BEAR RETEST, overnight	64%	28–41%	no edge

BEAR RTH RETEST at 40.9% means 59% of signals reverse. Three of four cohorts were disabled. Only BULL RTH survives at 55%.

The BREAK state (price closes past PWH/L without retest) was verified clean at 99.5–100% touch and stayed Tier 1.

Running tally after 24 hours

Model	Status	Prob change
14 AMD	Tier 1 → 2 magnet	kept 88–92% as touch
9 LON SWEEP H/L	Tier 1 → 2 magnet	kept 77–88% as touch
9 LON REV MID	Deleted	58/61% was anti-predictive
13 PWH/L RETEST	3 of 4 cohorts disabled	only BULL RTH 55% survives
10 OB MID MAGNET	Recalibrated	92% → 70%
10 OB BULL/BEAR	Validated ✓	kept 82–83% (real +18–21pt close-dir edge)
10 OB STRETCH B	Validated ✓	64–66% touch

The OB family was the positive surprise: its 10:30 positioning rule filters for days with genuine directional bias on top of the touch rate. 18–21pt of real close-direction edge above baseline.

Why this happened

Three reinforcing problems:

df.max_favorable_excursion >= target is a deceptively intuitive primitive. It answers “did price get there,” not “did the session resolve there.”
Live visualizations reward touch rates. A chart line that gets tagged 90% of the time feels predictive even when the close direction is a coin flip.
Nobody re-audits a flagship. Once AMD shipped at 92%, nothing forced a second look until we built the close-direction script as part of a different investigation.

The rule of thumb

Any backtest claim above 70% must be re-measured as:

“What fraction of sessions closed on the signal side at a fixed horizon (RTH close, or +N minutes)?”
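One way to phrase that question in code, as a hedged sketch: the function name, the bar layout (a time-indexed frame with a close column) and the sample prices are assumptions for illustration, not the interface of the audit scripts.

```python
import pandas as pd


def closed_on_signal_side(
    bars: pd.DataFrame,
    entry_time: pd.Timestamp,
    direction: str,
    horizon: pd.Timedelta,
) -> bool:
    """Compare the last close at or before entry_time + horizon with the
    close at entry. Only the price at the horizon matters, not the path."""
    entry_px = bars.loc[:entry_time, "close"].iloc[-1]
    exit_px = bars.loc[: entry_time + horizon, "close"].iloc[-1]
    if direction == "BULL":
        return bool(exit_px > entry_px)
    return bool(exit_px < entry_px)


# Hypothetical 30-minute bars: price rallies to 106, then fades below entry.
idx = pd.date_range("2026-01-05 09:30", periods=6, freq="30min")
bars = pd.DataFrame(
    {"close": [100.0, 104.0, 106.0, 101.0, 99.0, 98.0]}, index=idx
)
entry = pd.Timestamp("2026-01-05 09:30")

print(closed_on_signal_side(bars, entry, "BULL", pd.Timedelta(minutes=60)))
print(closed_on_signal_side(bars, entry, "BULL", pd.Timedelta(minutes=150)))
```

The same BULL signal passes at +60 minutes and fails at +150 minutes, so the horizon has to be fixed before the backtest runs rather than chosen afterward.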

If the original code used any of these patterns, assume it’s a magnet/touch metric until proven otherwise:

.any()
max_favorable_excursion
ever_touched
continued_after_retest
(high >= target) | (low <= target)
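The list above can be turned into a quick scan over backtest source files. This is an illustrative sketch only: the regexes are our translation of the patterns, and flag_touch_metrics is a hypothetical helper, not one of the shipped audit scripts. A hit means "re-measure this", not "this is wrong".

```python
import re

# Regex translations of the touch/magnet tells listed above (assumed, not
# taken from the audit tooling).
TOUCH_PATTERNS = {
    ".any()": re.compile(r"\.any\("),
    "max_favorable_excursion": re.compile(r"max_favorable_excursion"),
    "ever_touched": re.compile(r"ever_touched"),
    "continued_after_retest": re.compile(r"continued_after_retest"),
    "high/low vs target": re.compile(r"high\s*>=\s*target|low\s*<=\s*target"),
}


def flag_touch_metrics(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in TOUCH_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


sample = """\
hit = (df.high >= target) | (df.max_favorable_excursion >= target)
win = df.rth_close > df.entry
"""
print(flag_touch_metrics(sample))
# prints: [(1, 'max_favorable_excursion'), (1, 'high/low vs target')]
```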

A magnet signal is still useful — it tells you where price is likely to visit. But it is a different question from “which way does the day close” and should never be presented with the same probability label without explicit distinction. We now tag every model page as either close-direction, liquidity-magnet, or liquidity-sweep (continuation) to keep the two questions visually separated.
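A sketch of how such a tag could travel with the probability so the number is never rendered on its own. The three category values come from the text above; the ModelLabel class, its fields and the example label are hypothetical and do not describe PredictionModel.cs.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    CLOSE_DIRECTION = "close-direction"
    LIQUIDITY_MAGNET = "liquidity-magnet"
    LIQUIDITY_SWEEP = "liquidity-sweep (continuation)"


@dataclass(frozen=True)
class ModelLabel:
    name: str
    prob: float
    category: Category  # required field: a label cannot exist without a tag

    def render(self) -> str:
        # The category is always printed next to the probability.
        return f"{self.name} {self.prob:.0%} [{self.category.value}]"


label = ModelLabel("AMD BEAR", 0.92, Category.LIQUIDITY_MAGNET)
print(label.render())
# prints: AMD BEAR 92% [liquidity-magnet]
```

Making the category a required constructor argument means an untagged probability fails at construction time instead of reaching a chart.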

What shipped

audit_lon_sweep_close.py, audit_pwhl_retest_close.py, audit_ob_range_close.py as permanent audit tooling
Tier demotions + line deletion in PredictionModel.cs / PredictionModelWeb.cs
Every new flagship (VWAP REGIME, SB CONT, 0809 HOLD, PRELON SWEEP) ships with close-direction numbers as the primary metric and touch rate only as a secondary metric
Model-page category taxonomy so readers never confuse the two

The 24-hour audit was the single most valuable day of work on this project. The stack lost three “Tier 1” signals. What remains is honest.