Squall · the open scoreboard

When we say 73 percent,
this is how often 73 percent verifies.

Every alert is graded against ground truth. The number on your phone means something because you can check it here, every night, going back as far as we have records. The scoreboard is the product.

Headline numbers
Reliability diagram
Predicted versus observed, binned by tier.
The dashed diagonal is perfect calibration. Each dot is a bin of alerts; the y-axis is how often that bin actually verified as a tornado. Closer to the diagonal is better. Dot size is proportional to the number of alerts in the bin.
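The points on a diagram like this can be computed in a few lines. The sketch below is a minimal illustration, not Squall's actual pipeline: the function name `reliability_bins` is hypothetical, and it assumes equal-width probability bins rather than the alert tiers used on this page.

```python
from collections import defaultdict


def reliability_bins(probs, outcomes, n_bins=10):
    """Group forecasts into equal-width probability bins (an assumption;
    the page bins by alert tier).

    Returns one (mean_predicted, observed_frequency, count) tuple per
    non-empty bin, ordered from lowest to highest probability.
    """
    acc = defaultdict(lambda: [0.0, 0, 0])  # bin index -> [sum of p, events, count]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        acc[idx][0] += p
        acc[idx][1] += int(y)
        acc[idx][2] += 1
    return [(s / n, events / n, n) for _, (s, events, n) in sorted(acc.items())]
```

Each tuple is one dot: x is the mean predicted probability, y is the observed frequency, and the count sets the dot size. A bin whose y falls below its x is over-forecasting.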

The upper-right bins (predicted 0.7 to 0.9) sit visibly below the diagonal. Every one of those "misses" was a verified severe storm (hail, downburst, damaging wind) that did not produce a tornado. That gap is the radar-only ceiling on tornado-vs-severe discrimination; the next reduction has to come from new sensing (GLM lightning jump, ProbSevere v3, dual-pol), not threshold tuning.
By EF bucket
EF | n | POD | FAR | Mean lead (min) | Brier
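For readers who want to check the POD, FAR, and Brier columns themselves, here is a minimal sketch using the standard definitions. It assumes FAR means the false alarm ratio (false alarms divided by all alerts issued), which is the usual convention in warning verification; the function names are illustrative, not Squall's.

```python
def pod_far(alerted, observed):
    """Probability of detection and false alarm ratio from paired booleans.

    alerted[i]  -- True if an alert was issued for case i
    observed[i] -- True if the event verified for case i
    """
    pairs = list(zip(alerted, observed))
    hits = sum(1 for a, o in pairs if a and o)
    misses = sum(1 for a, o in pairs if not a and o)
    false_alarms = sum(1 for a, o in pairs if a and not o)
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return pod, far


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
```

Lower is better for FAR and Brier; higher is better for POD.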
Live feed · most recent 10
Hurricane: rapid-intensification predictor
Squall predicts rapid intensification (30+ kt strengthening in 24 h) hours before NHC.
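The 30 kt in 24 h definition translates directly into a labeling rule over a storm's intensity record. The sketch below assumes evenly spaced 6-hourly fixes with no gaps; the function name `label_ri` and its parameters are hypothetical, and the page does not say how Squall itself labels events.

```python
def label_ri(winds_kt, step_hours=6, window_hours=24, threshold_kt=30):
    """Label each observation by whether rapid intensification follows it.

    winds_kt -- maximum sustained winds in knots, one per fix, evenly spaced
    Returns True/False per fix, or None where no fix exists 24 h ahead.
    """
    steps = window_hours // step_hours
    labels = []
    for i, wind in enumerate(winds_kt):
        j = i + steps
        if j >= len(winds_kt):
            labels.append(None)  # the record ends before the window closes
        else:
            labels.append(winds_kt[j] - wind >= threshold_kt)
    return labels
```

For example, `label_ri([35, 40, 50, 60, 65, 65, 70])` marks only the first fix as RI (35 kt to 65 kt over 24 h) and leaves the last four unlabeled.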

Verified retrospectively on the 2015-2024 Atlantic + East-Pacific HURDAT2 best-track corpus. Model: logistic regression on trajectory features only (current intensity, prior 12 h and 24 h intensity change, latitude, motion, calendar position). Trained on 1990-2014 storms (857 storms); evaluated on held-out 2015-2024 storms (436 storms, 12,830 observation points, 653 confirmed RI events).
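To make the feature list concrete, here is one way the trajectory features and the train/held-out split could be built. This is a sketch under stated assumptions, not Squall's code: the record layout (`time`, `lat`, `lon`, `wind_kt`, `year`), the feature names, the use of 6-hour displacement as a stand-in for motion, and day of year as the calendar position are all hypothetical choices.

```python
from datetime import datetime


def day_of_year(t: datetime) -> int:
    """Calendar position as day of year (1 to 366)."""
    return t.timetuple().tm_yday


def trajectory_features(track, i):
    """Features for fix i of a 6-hourly track (list of dicts).

    Returns None when fewer than 24 h of history precede the fix.
    """
    if i < 4:
        return None
    now, prev6, prev12, prev24 = track[i], track[i - 1], track[i - 2], track[i - 4]
    return {
        "wind_kt": now["wind_kt"],                      # current intensity
        "dv12_kt": now["wind_kt"] - prev12["wind_kt"],  # prior 12 h change
        "dv24_kt": now["wind_kt"] - prev24["wind_kt"],  # prior 24 h change
        "lat": now["lat"],
        "dlat_6h": now["lat"] - prev6["lat"],           # motion, degrees per 6 h
        "dlon_6h": now["lon"] - prev6["lon"],
        "day_of_year": day_of_year(now["time"]),
    }


def split_by_year(storms, first_test_year=2015):
    """Temporal split: earlier seasons train, later seasons are held out."""
    train = [s for s in storms if s["year"] < first_test_year]
    test = [s for s in storms if s["year"] >= first_test_year]
    return train, test
```

Splitting by season rather than by observation keeps every fix from a given storm on one side of the split, so the held-out score is not inflated by near-duplicate points from the same track.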

Reference: NHC SHIPS-RII operational baseline, DeMaria et al. 2021 NHC RI verification report (AUC ~0.78-0.83, POD ~0.45, FAR ~0.70). Squall's gain over the operational baseline is driven by perfect-history retrospective evaluation: the model is scored on reanalyzed best-track intensities, which are not available in real time. Real-time performance against ATCF advisories is the next verification milestone. Inner-core lightning (GLM), SST, and shear features are not yet in the fit; we expect those to materially improve real-time POD.
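AUC, the first metric quoted above, has a simple interpretation: it is the probability that a randomly chosen RI case receives a higher score than a randomly chosen non-RI case. The sketch below computes it by direct pairwise comparison; `roc_auc` is an illustrative name, and this is the textbook definition rather than the specific procedure used in the NHC report or by Squall.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 means the scores carry no ranking information and 1.0 means every event outranks every non-event. The pairwise loop is quadratic in the number of cases, which is still manageable at the scale of the held-out set described here.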