About & Methodology

The Honest Copy

Legislative Transparency Through Data
Our Methods

How the Detectors Work

The Honest Copy uses deterministic algorithms to identify statistical outliers in U.S. congressional legislation. A "signal" is not an accusation — it is a data point that deviates from baseline patterns observed across the full legislative corpus.

A statistical outlier is a measurement that falls more than two standard deviations from the median for its category. For fiscal concentration, this means the ratio of institutional-directed dollar amounts to individual beneficiary language exceeds 2:1 in a bill that references named beneficiaries. For stealth velocity, it means a bill went from introduction to enactment in under 60 days with fewer than 5 cosponsors.
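As a rough illustration of the outlier definition above, the following Python sketch flags values that lie more than two standard deviations from the median of their category. The function name `is_outlier`, the choice of population standard deviation, and the sample values are assumptions made for this example; the text does not specify how the deviation is computed.

```python
from statistics import median, pstdev


def is_outlier(value: float, category_values: list[float], k: float = 2.0) -> bool:
    """Return True if value lies more than k standard deviations from the median."""
    if len(category_values) < 2:
        return False  # not enough data to estimate spread
    center = median(category_values)
    spread = pstdev(category_values)  # assumption: population standard deviation
    if spread == 0:
        return False  # all values identical, so nothing can deviate
    return abs(value - center) > k * spread


# Hypothetical days-to-enactment values for one category of bills.
days = [310, 325, 340, 350, 365, 380, 400]
print(is_outlier(45, days))   # True: far below the median of 350
print(is_outlier(345, days))  # False: close to the median
```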

Every signal enters a human review queue. An editor examines the source documents, verifies that the data was parsed correctly, and decides whether the anomaly warrants publication. No signal is auto-published. The algorithms detect; editors decide.


Data Pipeline

From Government APIs to Signal Queue

1. Ingestion — Bill text, status, cosponsorship, and fiscal entity data are downloaded from official government APIs (Congress.gov, GovInfo) and processed into structured records.

2. Offline scoring — Pre-trained classifiers and rule-based detectors assign balance scores (beneficiary weight vs. institutional weight) and compute timeline metrics for each bill.

3. Signal generation — Bills that exceed detection thresholds enter the signal queue with a confidence score between 0 and 1.

4. Editorial review — Queued signals are reviewed by an editor who reads primary sources before any publication decision.
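The last two steps above can be sketched as a small data model: a signal carries a confidence score between 0 and 1, and nothing becomes publishable until an editor records a decision. This is a minimal sketch under stated assumptions; the class names `Signal`, `ReviewQueue`, and `ReviewStatus`, and the example identifier and score, are hypothetical and not taken from the actual pipeline.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Signal:
    bill_id: str      # hypothetical identifier for illustration
    detector: str     # name of the detector that fired
    confidence: float # must fall between 0 and 1
    status: ReviewStatus = ReviewStatus.PENDING

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


@dataclass
class ReviewQueue:
    signals: list[Signal] = field(default_factory=list)

    def enqueue(self, signal: Signal) -> None:
        self.signals.append(signal)

    def decide(self, bill_id: str, approve: bool) -> Signal:
        """Record an editor's decision on the first pending signal for bill_id."""
        for s in self.signals:
            if s.bill_id == bill_id and s.status is ReviewStatus.PENDING:
                s.status = ReviewStatus.APPROVED if approve else ReviewStatus.REJECTED
                return s
        raise KeyError(bill_id)

    def publishable(self) -> list[Signal]:
        # Only editor-approved signals are ever returned: no auto-publish.
        return [s for s in self.signals if s.status is ReviewStatus.APPROVED]


queue = ReviewQueue()
queue.enqueue(Signal("example-bill-1", "velocity", 0.8))
print(len(queue.publishable()))  # 0: still pending review
queue.decide("example-bill-1", approve=True)
print(len(queue.publishable()))  # 1: approved by an editor
```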

Detection Thresholds

Minimum institutional allocation: $1,000,000
Institutional-to-individual ratio: 2:1
Maximum velocity (introduction to law): 60 days
Maximum cosponsors (velocity detector): 5

Editorial Standards

Independence & Corrections

No auto-publish. Every anomaly flagged by our detectors is reviewed by a human editor before it reaches readers. Publication requires verification against primary source documents.

No partisan framing. Signals are reported with bill identifiers, dollar amounts, timelines, and source links. We do not characterize legislation as good or bad. Readers draw their own conclusions from the data.

Corrections. If we publish an error, we correct it promptly and note the correction at the top of the affected article with the date and nature of the change.

Independence. The Honest Copy is editorially independent. Our data pipeline is powered by What The Vote, an open civic data platform. Neither advertisers nor data partners influence editorial decisions.

Source Registry

Where Our Data Comes From

Source              Data Used                                              Update Cadence
Congress.gov        Bill text, status, cosponsors, committees, actions     Daily
GovInfo             Enrolled bill XML, appropriations text                 Daily
U.S. Census Bureau  Demographic context (poverty, income, population)      Annual (ACS)
FEC                 Campaign finance records, donor disclosures            Quarterly
The Fast Track

Velocity Detection: Methodology

Congressional bills follow a well-documented procedural timeline: introduction, committee referral, markup, floor vote, conference (if the House and Senate pass differing versions), and presidential action. The median time from introduction to enactment across the 113th–119th Congresses is approximately 340 days. Most bills never reach enactment at all.

Our velocity detector identifies bills that completed this entire lifecycle in under 60 days — roughly one-sixth of the median — while carrying fewer than 5 cosponsors. Low cosponsorship does not imply wrongdoing; emergency appropriations and naming bills routinely pass with few cosponsors. But the combination of speed and low visibility places these bills in a statistical minority worth examining.
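The rule described above reduces to two comparisons, sketched here in Python. The function name `is_velocity_signal` and the example dates are hypothetical; only the 60-day and 5-cosponsor limits and the 340-day median come from the text.

```python
from datetime import date, timedelta
from typing import Optional

MAX_DAYS = 60        # "under 60 days" from introduction to enactment
MAX_COSPONSORS = 5   # "fewer than 5 cosponsors"


def is_velocity_signal(introduced: date, enacted: Optional[date], cosponsors: int) -> bool:
    """Flag bills enacted in under MAX_DAYS with fewer than MAX_COSPONSORS cosponsors."""
    if enacted is None:
        return False  # most bills are never enacted and cannot trigger this rule
    elapsed = (enacted - introduced).days
    return 0 <= elapsed < MAX_DAYS and cosponsors < MAX_COSPONSORS


# Hypothetical bill: introduced 6 January, enacted 45 days later, 3 cosponsors.
intro = date(2025, 1, 6)
print(is_velocity_signal(intro, intro + timedelta(days=45), 3))   # True
print(is_velocity_signal(intro, intro + timedelta(days=45), 12))  # False: too many cosponsors
print(is_velocity_signal(intro, intro + timedelta(days=60), 3))   # False: not under 60 days

# The 60-day limit relative to the stated 340-day median.
print(round(60 / 340, 2))  # 0.18, roughly one-sixth
```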

Speed isn't inherently suspicious — but speed combined with low visibility warrants a second look.
Editor's Note

Stories pending editorial review. Signals that clear the review process will appear here with full sourcing and context.


The Ledger

Fiscal Allocation Detection: Methodology

The fiscal concentration detector compares two measurements within each bill: beneficiary weight (the frequency and prominence of language naming individual beneficiaries — families, workers, seniors, veterans, students) and institutional weight (dollar amounts associated with institutional recipients — federal agencies, contractors, corporations, grant recipients).

When a bill's institutional-directed dollar amounts exceed its individual allocations by a ratio of 2:1 or greater — and the bill's text references named beneficiaries — the detector generates a signal. The minimum institutional allocation threshold is $1,000,000 to filter routine administrative appropriations.
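A minimal sketch of this rule in Python follows. It assumes both sides of the ratio have already been reduced to dollar amounts, which is a simplification: the text describes the individual side partly in terms of beneficiary language. The names `BillFiscalSummary` and `is_fiscal_signal` and the example figures are hypothetical.

```python
from dataclasses import dataclass

MIN_INSTITUTIONAL_USD = 1_000_000  # filters routine administrative appropriations
MIN_RATIO = 2.0                    # institutional-to-individual ratio of 2:1 or greater


@dataclass
class BillFiscalSummary:
    institutional_usd: float   # dollars directed to institutional recipients
    individual_usd: float      # dollars directed to individual beneficiaries (assumed)
    names_beneficiaries: bool  # bill text references named beneficiaries


def is_fiscal_signal(bill: BillFiscalSummary) -> bool:
    """Apply the three conditions: named beneficiaries, minimum amount, ratio."""
    if not bill.names_beneficiaries:
        return False
    if bill.institutional_usd < MIN_INSTITUTIONAL_USD:
        return False
    # Multiplying instead of dividing avoids a zero-division error when a bill
    # allocates nothing to individuals; such a bill meets the ratio trivially.
    return bill.institutional_usd >= MIN_RATIO * bill.individual_usd


print(is_fiscal_signal(BillFiscalSummary(5_000_000, 2_000_000, True)))   # True: 2.5:1
print(is_fiscal_signal(BillFiscalSummary(3_000_000, 2_000_000, True)))   # False: 1.5:1
print(is_fiscal_signal(BillFiscalSummary(500_000, 100_000, True)))       # False: under $1,000,000
```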

All signals require editorial verification against primary sources before publication.
Editor's Note

Stories pending editorial review. Signals that clear the review process will appear here with full sourcing and context.


Procedural Anomalies in American Legislation

Legislative Intelligence

16 Detectors. 101,688 Bills Scanned. 1,776+ Anomalies Detected.

The work of democracy doesn't end on Election Day; it begins the morning after. An automated transparency audit across seven Congresses surfaces the daily legislative moves that shape policy between cycles: bills rewritten after introduction, same-day votes, dead legislation revived behind closed doors, and the quiet procedural decisions that rarely make the news.

Most civic attention concentrates on elections — who wins, who loses, which party holds power. But the legislation that shapes daily life is written, amended, and enacted in the months and years between those races. Committee markups, floor amendments, conference reports, and procedural votes happen every week Congress is in session. Most of it goes unnoticed.

The Honest Copy was built to watch the work that happens between elections. Our 16 detectors scan 101,688 bills for patterns that warrant public attention: fiscal concentration, committee bypasses, party-line votes, omnibus bundling, quiet repeals, same-day stampedes, and legislation that was statistically classified as dead and then suddenly revived.

We compare introduced bill text against the version that passed — highlighting every word added, removed, or replaced. We track which bills are alive, which are dead, and which came back from the grave. We measure the gap between who a bill names as beneficiaries and where the money actually flows.
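One common way to produce a word-level comparison like the one described above is Python's standard `difflib.SequenceMatcher`. The sketch below is an illustration of that general technique, not the site's actual diff engine; the function name `word_diff` and the sample sentences are invented for the example.

```python
from difflib import SequenceMatcher

# Map difflib's opcode tags to the three change types named in the text.
LABELS = {"insert": "added", "delete": "removed", "replace": "replaced"}


def word_diff(introduced: str, passed: str) -> list[tuple[str, str, str]]:
    """Return (change, old_words, new_words) for each differing span of words."""
    old_words = introduced.split()
    new_words = passed.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text is not reported
        changes.append((LABELS[tag], " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


# Invented sentences, not real bill text.
for change in word_diff("The Secretary shall award grants",
                        "The Secretary may award grants annually"):
    print(change)
# ('replaced', 'shall', 'may')
# ('added', '', 'annually')
```

Splitting on whitespace keeps the sketch short; a production comparison would likely need to normalize punctuation and section numbering first.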

This is not about left or right. It is about open or closed. Every signal enters a human review queue where an editor verifies the evidence against primary sources, checks for routine procedural explanations, and makes the editorial call. The algorithms detect; people decide.


Late Bulletin From the Wire · Pending Editorial Review
Shortcut: Enacted with minimal committee review (119-s4530)
Stampede: Voted on the same day it was introduced (119-hconres66)
Stampede: Voted on the same day it was introduced (119-hconres74)
Stampede: Voted on the same day it was introduced (119-hres6)
Stampede: Voted on the same day it was introduced (119-hres4)
Stampede: Voted on the same day it was introduced (119-hres313)