Sigñal is in early beta. Scores reflect reviewed public marketing claims and visible evidence references — not product efficacy, safety, or regulatory status.
Version 1.3 · April 2026

Scoring methodology

Everything about how Sigñal measures the gap between what brands claim and what they visibly support — published in full. Any brand, researcher, or journalist can audit this. Nothing is hidden.

What this methodology measures: the alignment between public-facing marketing claims and visibly available supporting information at the time of scan. It does not assess product efficacy, safety, or regulatory status. It does not assume intent.

Overview

Signal is a claims-intelligence and evidence-alignment system. It measures the gap between what wellness brands publicly claim, what evidence they visibly provide, and how clearly those claims are connected to that evidence.

The Evidence Score (0–100) reflects that alignment. Higher score = stronger evidence alignment between what a brand claims and what it visibly supports. The score is composite, weighted, location-sensitive, and category-adjusted. It is calculated every time a page is scanned and stored in the Sigñal Index.

Green Signal
70 – 100
Claims appear proportionate to available evidence. Visible support is present and clearly connected.
Yellow Signal
45 – 69
Evidence exists, but is not clearly connected to all claims. Gaps are present and notable.
Red Signal
0 – 44
Claims extend beyond visible support. The gap between what is claimed and what is shown is material.
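The three bands above can be expressed as a small lookup. A minimal sketch in Python; the function name `signal_band` is illustrative, not part of Signal's published API:

```python
def signal_band(evidence_score: float) -> str:
    """Map an Evidence Score (0-100) to its signal band per the published thresholds."""
    if evidence_score >= 70:
        return "Green Signal"
    if evidence_score >= 45:
        return "Yellow Signal"
    return "Red Signal"
```

Boundary values land in the higher band, matching the ranges shown (45 is Yellow, 70 is Green).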

The formula

gap_score = (base_gap × location_factor) + intensity_penalty + category_addons
// internal gap measure, clamped to 0–100 (higher = worse alignment)

base_gap = Σ (dimension_gap_i × weight_i)
// 60% deterministic + 40% AI analysis, blended per dimension

location_factor = 0.5 + (avg_claim_location_multiplier / 3.0) × 1.0
intensity_penalty = Σ (intensity_level × (100 − visible_support_score) × 0.15) × (location_mult / 3.0)
category_addons = Σ category_specific_checks // capped at +15 total

Evidence Score = 100 − gap_score // displayed score: higher = better
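The formula can be sketched end to end in Python. This is an illustrative reading of the published equations, not Signal's implementation: the argument names are assumptions, and the Σ terms are assumed to sum over dimensions and over individual claims respectively.

```python
def evidence_score(dimension_gaps, weights, avg_location_mult,
                   intensity_terms, category_addons):
    """Compute the displayed Evidence Score from the published formula.

    dimension_gaps / weights: per-dimension gap (0-100) and its weight
    avg_location_mult: weighted-average zone multiplier for the page's claims
    intensity_terms: list of (intensity_level, visible_support_score, location_mult)
    category_addons: sum of category-specific checks (capped at +15 here)
    """
    base_gap = sum(g * w for g, w in zip(dimension_gaps, weights))
    location_factor = 0.5 + avg_location_mult / 3.0
    intensity_penalty = sum(
        level * (100 - support) * 0.15 * (loc / 3.0)
        for level, support, loc in intensity_terms
    )
    gap = base_gap * location_factor + intensity_penalty + min(category_addons, 15)
    gap = max(0.0, min(100.0, gap))  # internal gap measure, clamped to 0-100
    return 100 - gap                 # displayed score: higher = better
```

Plugging in the dimension gaps from the worked example that follows, with a 1.95 average zone multiplier, reproduces the 41.9 × 1.15 = 48.2 step.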

Worked example

A supplement brand hosts ingredient-level studies on a /science page but has no finished-product trial, and its product page makes a hero claim of "clinically proven" with no visible citation.

// Step 1: dimension gap scores × weights → base_gap (higher = worse alignment)

Evidence Alignment        48 × 0.25 = 12.0
Claim Clarity             42 × 0.15 =  6.3
Evidence Quality          52 × 0.15 =  7.8
Consumer Distortion Risk  38 × 0.15 =  5.7
Claim Strength            46 × 0.10 =  4.6
Evidence Accessibility    55 × 0.10 =  5.5
base_gap = 41.9

// Step 2: location factor — hero headline + benefit bullets dominant
location_factor = 1.15  // weighted avg of zone multipliers
41.9 × 1.15 = 48.2

// Step 3: intensity penalty — "clinically proven" (high intensity) with no hero-page citation
intensity_penalty = +2.8

// Step 4: category add-ons — supplements: no finished-product trial visible (+3)
category_addons = +3.0
gap_score = 48.2 + 2.8 + 3.0 = 54
Evidence Score = 100 − 54 = 46 → Yellow Signal
// Ingredient studies exist but the finished-product "clinically proven" claim has no visible support

Six dimensions

Each dimension measures a specific aspect of evidence alignment and receives an internal gap score from 0–100 (higher = larger gap between claims and visible support). These sub-scores feed the formula above and are inverted in the final Evidence Score. Scores are blended from deterministic rules (60%) and AI analysis (40%).

Evidence Alignment (25%): How well visible evidence matches the strength and specificity of claims made. A brand with peer-reviewed studies clearly connected to its claims scores well here regardless of where those studies are hosted on the domain.

Claim Clarity (15%): How precisely claims are stated — whether outcomes are quantified, durations specified, and subject populations identified. Vague language with no visible support scores higher risk than specific language with visible backing.

Evidence Quality (15%): The verifiability of cited sources — clickable study links, author names, DOIs, COA links. Evidence that exists but cannot be inspected is scored differently from evidence that is fully transparent.

Consumer Distortion Risk (15%): Combined signal of regulatory proximity (disease-treatment language), comparison fairness ("3× faster" without a baseline), and trust marker integrity (badges or testimonials used as proof without a link to underlying evidence).

Claim Strength (10%): How plausibly the described biological or physiological mechanism is grounded in visible references. A mechanism claim with no citations scores higher risk than one with linked human trial data.

Evidence Accessibility (10%): How easily a consumer can locate the supporting evidence from the page they are reading. Evidence buried several clicks away or hidden behind login gates scores higher than evidence linked directly from the claim.
Note on internal analysis: The scoring engine uses seven sub-dimensions internally for analysis fidelity. The six dimensions above are the public-facing surface derived from those sub-scores. The full internal model is available to researchers on request.
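The 60/40 blend can be sketched as follows. Note that the published weights above sum to 0.90; the note on internal analysis suggests the remainder sits with the internal sub-dimension model, but that is an inference. This sketch uses only the six public dimensions, and all names are illustrative:

```python
# Public per-dimension weights from the table above (they sum to 0.90;
# the remaining weight presumably lives in the internal seven-dimension model).
DIMENSION_WEIGHTS = {
    "evidence_alignment": 0.25,
    "claim_clarity": 0.15,
    "evidence_quality": 0.15,
    "consumer_distortion_risk": 0.15,
    "claim_strength": 0.10,
    "evidence_accessibility": 0.10,
}

def blended_gap(deterministic: float, ai: float) -> float:
    """Blend rule-based and AI gap scores for one dimension (60% / 40%)."""
    return 0.6 * deterministic + 0.4 * ai

def base_gap(det_scores: dict, ai_scores: dict) -> float:
    """Weighted sum of blended per-dimension gaps."""
    return sum(
        blended_gap(det_scores[d], ai_scores[d]) * w
        for d, w in DIMENSION_WEIGHTS.items()
    )
```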

Location multipliers

Claims are weighted by where they appear on the page. A "clinically proven" claim in the hero headline carries 6× more weight than the same claim buried in the footer, because that's where brands put the claims they most want you to believe.

Hero / above-fold: 3.0×
Benefit bullets / CTAs: 2.0×
Science / clinical section: 2.0×
Comparison charts / tables: 2.0×
FAQ / testimonials: 1.5×
Footer / disclaimer: 0.5×
Hidden / collapsed content: 0.3×
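Combining the zone table with the formula's location term might look like the following sketch. The methodology describes a weighted average of zone multipliers without publishing the weighting, so this assumes equal weight per claim; the zone keys are illustrative:

```python
# Zone multipliers from the table above.
ZONE_MULTIPLIERS = {
    "hero": 3.0,
    "benefit_bullets": 2.0,
    "science_section": 2.0,
    "comparison_chart": 2.0,
    "faq_testimonials": 1.5,
    "footer": 0.5,
    "hidden": 0.3,
}

def location_factor(claim_zones: list) -> float:
    """Average the zone multipliers of a page's claims (equal weight per
    claim, an assumption), then map into the formula's 0.5-1.5 range."""
    avg = sum(ZONE_MULTIPLIERS[z] for z in claim_zones) / len(claim_zones)
    return 0.5 + avg / 3.0
```

A page whose only claim sits in the hero gets the maximum factor of 1.5; a footer-only claim gets roughly 0.67, shrinking the base gap.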

Claim intensity ladder

Claims are classified by intensity. Higher-intensity language with lower visible support generates a proportionally larger penalty — because the gap between what's claimed and what's shown is wider.

Low intensity — baseline risk
supports · may help · designed to · formulated for · promotes · helps maintain · may support
These claims have a low evidence requirement. Modest language, modest scrutiny.
Medium intensity — moderate penalty
improves · boosts · enhances · increases · reduces · optimizes · regulates · strengthens
These claims imply measurable outcomes. Visible evidence becomes more important.
High intensity — heavy penalty
clinically proven · scientifically proven · guaranteed · treats · prevents · reverses · cures · rewires · 3× faster · instant results
High-intensity language with no visible evidence generates the largest penalty. These claims carry an implicit promise that demands visible proof.
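A simple classifier over the ladder might look like this. The numeric levels 1-3 are an assumption (the methodology names low/medium/high without publishing numeric values), and naive phrase matching is a simplification of whatever parsing the engine actually does:

```python
# Phrase lists from the ladder above; numeric levels are assumed.
INTENSITY_LEVELS = {
    1: ["supports", "may help", "designed to", "formulated for",
        "promotes", "helps maintain", "may support"],
    2: ["improves", "boosts", "enhances", "increases", "reduces",
        "optimizes", "regulates", "strengthens"],
    3: ["clinically proven", "scientifically proven", "guaranteed",
        "treats", "prevents", "reverses", "cures", "rewires",
        "3x faster", "instant results"],
}

def claim_intensity(claim: str) -> int:
    """Return the highest intensity level whose phrases appear in the claim."""
    text = claim.lower()
    for level in (3, 2, 1):
        if any(phrase in text for phrase in INTENSITY_LEVELS[level]):
            return level
    return 0  # no listed phrase matched
```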

Category-specific add-ons

Each product category has additional risk checks applied after the base score. These are capped at a combined +15 added to the internal gap score, so category add-ons can lower the final Evidence Score by at most 15 points.

Supplements · Neurotech & brain devices · Recovery devices (red light, PEMF, cold therapy, massage) · Wearables & trackers
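The cap can be sketched directly. Check names and penalty values other than the worked example's +3 supplements check are hypothetical:

```python
def category_addons(failed_checks: dict) -> float:
    """Sum the penalties for a category's failed checks, capped at +15 total."""
    return min(sum(failed_checks.values()), 15.0)
```

For the worked example's supplement brand, `category_addons({"no_finished_product_trial": 3.0})` contributes the +3.0 shown in Step 4; a category with many failed checks still tops out at 15.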

Confidence score

Separate from the Evidence Score, confidence reflects how reliably the scan captured the page's content. Low confidence forces a "Limited Signal" state regardless of score.
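As a sketch, the gating might look like this. Signal does not publish its confidence threshold, so the 0.5 default here is an arbitrary placeholder:

```python
def display_state(evidence_score: float, confidence: float,
                  min_confidence: float = 0.5) -> str:
    """Force 'Limited Signal' when scan confidence is low, regardless of score;
    otherwise fall through to the published threshold bands."""
    if confidence < min_confidence:
        return "Limited Signal"
    if evidence_score >= 70:
        return "Green Signal"
    if evidence_score >= 45:
        return "Yellow Signal"
    return "Red Signal"
```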

Ranking

Within each category, products are sorted by Evidence Score descending — highest score = Rank #1. Tie-breakers in order:

Low-confidence scans are placed at the bottom with "Rank Pending" until confidence improves.
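The ranking rule can be sketched as follows; the tie-breaker sequence is not reproduced here, and the confidence threshold is a placeholder:

```python
def rank(products: list, min_confidence: float = 0.5) -> list:
    """Sort products by Evidence Score descending; low-confidence scans sink
    to the bottom as 'Rank Pending' until their confidence improves."""
    ranked = sorted(
        (p for p in products if p["confidence"] >= min_confidence),
        key=lambda p: -p["score"],
    )
    pending = [p for p in products if p["confidence"] < min_confidence]
    for i, p in enumerate(ranked, start=1):
        p["rank"] = i              # mutates the input dicts in place
    for p in pending:
        p["rank"] = "Rank Pending"
    return ranked + pending
```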

Signal thresholds

The Evidence Score maps to three signals. Signal does not tell consumers what to buy. Signal shows what holds.

The role of AI

The scoring engine is deterministic — parsing, weighting, math, caching, and ranking are all rule-based and fully auditable. Claude (Anthropic) provides the analyst layer only: nuanced language understanding, per-dimension scores, and human-readable explanations.

Claude's scores are blended in at 40% weight per dimension. The deterministic engine contributes 60%. No score is determined solely by AI output — and all Claude responses are validated against Signal's language requirements before being stored or displayed.

Signal's AI layer operates under strict language constraints. It is never permitted to speculate about intent, render verdicts about honesty, or use: fraud, scam, lie, fake, false, illegal, deceptive, misleading (as a verdict), noncompliant, or "doesn't work."

Approved language includes: "This claim extends beyond visible support", "Evidence exists, but is not clearly connected to this claim", "Claims appear proportionate to available evidence", "visible support appears limited", "source transparency appears incomplete."
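A minimal sketch of the validation gate. The real check must be more nuanced than pattern matching, since "misleading" is banned only as a verdict, not in every grammatical position; word-boundary matching here is a simplification:

```python
import re

# Forbidden verdict terms from the constraint list above.
FORBIDDEN = [
    "fraud", "scam", "lie", "fake", "false", "illegal",
    "deceptive", "misleading", "noncompliant", "doesn't work",
]

def passes_language_check(ai_text: str) -> bool:
    """Reject AI output containing any forbidden verdict term before it is
    stored or displayed. Word boundaries avoid false hits inside words
    (e.g. 'lie' inside 'believe')."""
    text = ai_text.lower()
    return not any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in FORBIDDEN
    )
```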

What Signal does not measure

Signal is not a review platform. Signal is not a medical authority. Signal is not a product efficacy judge.

Signal measures the gap between what brands claim, what evidence they provide, and how clearly those claims are connected to that evidence. Nothing more.

A red signal means claims extend beyond visible support on the page at the time of scan — not that a product is unsafe, ineffective, or that any intent to mislead exists.

Disputes & corrections

Sigñal is built on public data crawled at a point in time. Crawlers have limits: JavaScript-rendered pages may be partially captured, evidence hosted on subdomains may be missed, and content changes after a scan are not reflected until the next scan.

If a score is materially wrong — because our crawler missed a /science page, misread a claim, or the page has changed — brands and third parties can request a correction.

How to dispute a score

What we correct: crawling errors, missed public evidence pages, outdated scans. What we do not change based on disputes: methodology weights, scoring thresholds, or how the system categorizes claim intensity. Those are fixed rules, published above.

Update cadence

Questions about the methodology? Email signal@bhvd.com.

Version 1.3 · Last updated April 2026 · Sigñal