How we source and score every number

Company Status shows its work. A number without a source does not exist on this site, and that rule is enforced in our database rather than left to editorial policy. This page explains exactly where numbers come from, how disagreements between sources are handled, and how the confidence score next to every fact is computed.

Every fact carries its receipts

Click the small citation chip next to any number and you will see every source behind it: the link, the publisher, when it was published, and what kind of source it is. If two sources disagree, we show both, including the value the disagreeing source reported. Nothing is silently discarded.

Source types and their weights

Not all sources are equal. An official filing outranks a news article, which outranks a social media post. These weights feed the confidence score:

Source type	Weight	Examples
Official filing	1	Companies House, SEC EDGAR
Company statement	0.9	Company site, press release, verified submission
News	0.75	Established tech and business press
Job board	0.6	Greenhouse, Lever, Ashby boards
Estimate	0.5	Data vendors, analyst estimates
Social media	0.35	Posts by founders or employees

The confidence formula

Confidence is deterministic and recomputed nightly. No AI model assigns it.

confidence = clamp(W_source × C_corroboration × R_recency × P_conflict, 0.05, 1.0)

W_source: the highest weight among sources that agree on the value (table above).
C_corroboration: starts at 1.0 and adds 0.1 for each additional independent source (different publisher), capped at 1.25.
R_recency: facts age, so this factor shrinks as a fact gets older. Each metric has a half-life (employee counts: 180 days, open roles: 30 days, funding: 365 days). The factor never drops below 0.3.
P_conflict: starts at 1.0 and subtracts 0.15 for each source that disagrees with the published value, floored at 0.5.
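
To make the four factors concrete, here is a minimal Python sketch of the formula. It is an illustration, not our production code. The function and parameter names are hypothetical. It also makes two assumptions that the text above does not spell out: that the corroboration and conflict factors start from a baseline of 1.0, and that recency decays exponentially, halving once per half-life.

```python
# Weights from the source-type table above.
SOURCE_WEIGHTS = {
    "official_filing": 1.0,
    "company_statement": 0.9,
    "news": 0.75,
    "job_board": 0.6,
    "estimate": 0.5,
    "social_media": 0.35,
}

# Half-lives in days, per metric, as listed above.
HALF_LIFE_DAYS = {"employee_count": 180, "open_roles": 30, "funding": 365}


def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def confidence(agreeing_types, independent_publishers, disagreeing_sources,
               metric, age_days):
    """Hypothetical helper combining the four factors.

    agreeing_types: source types that agree on the published value
    independent_publishers: distinct publishers among the agreeing sources
    disagreeing_sources: sources reporting a different value
    """
    # W_source: the highest weight among agreeing sources.
    w_source = max(SOURCE_WEIGHTS[t] for t in agreeing_types)

    # C_corroboration: assumed baseline 1.0, +0.1 per additional
    # independent publisher, capped at 1.25.
    extra = max(independent_publishers - 1, 0)
    c_corroboration = min(1.0 + 0.1 * extra, 1.25)

    # R_recency: assumed exponential decay with the metric's half-life,
    # never below 0.3.
    r_recency = max(0.5 ** (age_days / HALF_LIFE_DAYS[metric]), 0.3)

    # P_conflict: assumed baseline 1.0, -0.15 per disagreeing source,
    # floored at 0.5.
    p_conflict = max(1.0 - 0.15 * disagreeing_sources, 0.5)

    return clamp(w_source * c_corroboration * r_recency * p_conflict, 0.05, 1.0)
```

Under these assumptions, take an employee count backed by an official filing and one news article, first reported 180 days ago, with one disagreeing source. It scores 1.0 × 1.1 × 0.5 × 0.85, or about 0.47.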

Scores at or above 0.7 show a green dot, scores from 0.45 up to (but not including) 0.7 show amber, and scores below 0.45 show grey with a low-confidence badge. The full factor breakdown is visible in every citation popover.
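
The thresholds map to a simple lookup. This is a sketch with a hypothetical function name and hypothetical return labels:

```python
def confidence_badge(score):
    """Map a confidence score to the dot colour shown next to a fact."""
    if score >= 0.7:
        return "green"
    if score >= 0.45:
        return "amber"
    return "grey (low confidence)"
```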

When sources disagree

If a new source reports a value close to what we already show (within a small tolerance per metric), it corroborates the existing fact and confidence rises. If it diverges beyond the tolerance, the value from the higher-weight source wins, the disagreeing source stays attached and visible, confidence drops, and the conflict is queued for human review. Corrections supersede old values rather than overwrite them, so history stays intact.
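
The decision rule can be sketched roughly as follows. This is a simplified illustration, not our actual pipeline. The `Fact` class, its field names, and the `ingest` function are hypothetical. The sketch also assumes the tolerance is relative (a fraction of the current value) and that the existing value is kept when weights tie; neither detail is specified above.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    value: float
    weight: float                                      # weight of the source behind the published value
    corroborating: list = field(default_factory=list)  # (value, weight) pairs that agree
    conflicting: list = field(default_factory=list)    # (value, weight) pairs that disagree
    history: list = field(default_factory=list)        # superseded (value, weight) pairs
    needs_review: bool = False


def ingest(fact, new_value, new_weight, tolerance):
    """Apply a newly collected source to an existing fact (illustrative only)."""
    # Assumption: tolerance is relative to the currently published value.
    if abs(new_value - fact.value) <= tolerance * abs(fact.value):
        # Close enough: the new source corroborates the existing fact.
        fact.corroborating.append((new_value, new_weight))
        return fact

    # Divergence beyond tolerance: always queue for human review.
    fact.needs_review = True
    if new_weight > fact.weight:
        # The higher-weight source wins. The old value is superseded, not
        # overwritten, and stays attached as a visible disagreement.
        fact.history.append((fact.value, fact.weight))
        fact.conflicting.append((fact.value, fact.weight))
        fact.value, fact.weight = new_value, new_weight
    else:
        # Assumption: on a tie or lower weight, the published value stands.
        fact.conflicting.append((new_value, new_weight))
    return fact
```

For example, if a news article reports 100 employees and a later official filing reports 150, the published value becomes 150, the 100 figure is preserved in the history and shown as a disagreeing source, and the fact is flagged for review.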

How data is collected

Automated collectors run on fixed schedules: job boards twice a week, news every weekday, headcounts weekly from filings and press, funding weekly and on news triggers, and a discovery system scans launch platforms, GitHub, and marketplaces daily for companies we do not track yet. Each collector stores the page it read, so every fact is auditable end to end.

Expert reviews

Reviews marked verified were provided directly to us by the named expert, whose identity we confirmed. Quoted reviews are public statements, always attributed and linked to where they were made, and never shown with a verification badge.

Corrections

Spotted something wrong? Email nicola.raimondo@companystatus.ai. Corrections are applied by superseding the value, with the prior value and its sources preserved.