One Hundred Years
A century leaves fingerprints.
We read them.

These datasets track where institutions failed, where systems drifted, and where patterns hid in plain sight. One hundred years of American life — measured, visualized, and published open.

The Crossing · DHS · 1892–2026
Featured · Vol. I, entry 06
From 1892 to 1995, the US averaged 17,000 deportations per year. After 1996, the system scaled by an order of magnitude.
Five volumes. Each one a question history can finally answer.
The Archive
Vol. I · 10 of 10 live
The Reckoning
A century of American institutional failure. The data does not take sides. It keeps score.
Each entry pairs a primary dataset with the moment its line bends. Ten entries form one argument: that the twentieth century is best understood not as a sequence of events but as a sequence of slopes.
01
The Sighting
NUFORC · 1905–2023
112,000 witness narratives treated as a sociolinguistic corpus. Vocabulary tracks culture, not the sky. Independent witnesses converge. Thirty archetypes emerge.
On November 7, 1957, 43 people across six states independently reported the same object. They did not know each other. Their descriptions match.
02
The Given
SSA · 1880–2024
104,000 names across 144 years. Every name is a cultural wave — rise, peak, decay. The shape encodes information invisible to people living through it.
In 1972, 3.2% of all baby girls were named Jennifer. By 1988, there were none. The decay curve follows the shape of radioactive decay.
03
The Color Line
Seamheads · 1920–2024
The barrier held forty-seven years. 2,300+ Negro Leagues careers, finally counted. A civil rights document written in statistics.
Josh Gibson hit .466 in 1943. That is the actual single-season batting record. It was always the record. It just wasn't being counted.
04
The Atmosphere
NOAA GHCN-Daily · 1895–2026
12,847 stations. 125 years of daily observations. The strangest day for any place you can name. Whether "once a century" still means once a century.
Portland reached 116°F on June 28, 2021 — nine degrees past its all-time record. The model assigns this a probability that rounds to zero.
05
The Trigger
CDC / FBI UCR · 1900–2024
A century of American gun violence. 134 laws tracked across 50 states. The rate peaked, then fell by more than half. Almost nobody knows.
The gun homicide rate peaked in 1974 at 10.2 per 100,000. By 2024 it was 4.7 — a 54% decline. Most people believe the opposite.
06
The Crossing
DHS Yearbooks · 1892–2026
A century of arrivals, a century of departures. What the record shows about who was welcome, and when, and why. The 1996 law shows up as a vertical line.
From 1892 to 1995, the US averaged 17,000 deportations per year. After 1996, the system scaled by an order of magnitude.
07
The Untreated
NIMH / SAMHSA · 1955–2026
The deinstitutionalization of American psychiatry. 558,922 beds became fewer than 40,000. The incarceration line rose as the bed count fell. The lines crossed in 1972.
The three largest psychiatric facilities in America are not hospitals. They are county jails.
08
The Red Line
HOLC / Mapping Inequality · 1935–2024
239 cities graded by a federal agency in the 1930s. Ninety years later, the grades still predict homeownership, wealth, and life expectancy at the block level.
The homeownership gap between Black and white Americans is wider today than in 1960, when housing discrimination was still legal.
09
Dead Air
FCC Broadcast License DB · 1934–2026
The Telecommunications Act of 1996 didn't deregulate radio. It ended it. One company owned 1,200 stations within five years. Local news vanished.
In 1996 there were 5,100 independent radio owners. Five years later, 3,800. One law. One signature.
10
The Black Box
NTSB · 1926–2026
How flight became the safest form of travel. The cockpit voice recorder changed everything. Every crash investigated. Every cause found. Every fix implemented.
The last fatal crash of a major US commercial airline was in 2009. Sixteen years. The investigation methodology is why.
Updated 22 May 2026 · Source CSVs on GitHub
Coming · Volumes II–V
Vol. II · Announced
The Strange
A century of things that don't fit. Mass hysteria. Classified documents. Anomalous signals. Ghost geographies. Serious methodology. Uncanny data.
Vol. III · Announced
The Self
Personal scale. Every entry is about the visitor. Your name. Your neighborhood. Your ancestors. Your birth year's cultural fingerprint.
Vol. IV · 2027
The Future
The first hundred years of the internet. A century of climate forcing. Same methodology. Forward-facing questions.
Vol. V · 2028
The Beautiful
A century of American aesthetic culture as data. Color in painting. Sentiment in music. Architecture mapped against economic conditions.
About the project
Why this exists.
One Hundred Years is an independent research and visualization project. Each entry takes a single dataset — one that spans at least a century of American life — and asks what patterns become visible only at that scale.
The project is built by one person. There is no newsroom, no institution, no funding. The work is its own justification. Every dataset is public. Every methodology is published. Every line of code is open.
The name comes from the conviction that a century of data is enough to see things that are invisible at shorter timescales — and that most of these datasets have never been visualized with the care they deserve.
Press & contact
For press inquiries, interviews, or collaboration proposals:
hello@onehundredyears.report
You are free to excerpt, quote, and embed any visualization from this project with attribution. Formal citation format is below.
Methodology
How the analysis works.
M1 · Station Anomaly Engine
Per-station daily anomalies computed against a 1901–2000 baseline. Each station's record is cleaned, gap-filled where coverage allows, and scored against its own historical distribution. No interpolation between stations.
Used in · The Atmosphere
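As a rough illustration of the per-station scoring summarized above, here is a minimal Python sketch. It assumes one station's record is a list of (date, value) pairs, takes the same-calendar-day mean and standard deviation over 1901–2000 as the station's "historical distribution", and reports a z-score. The function name `daily_anomalies`, the choice of a z-score, and the sample readings are assumptions made for illustration, not the project's published code. Cleaning and gap-filling are left out.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

BASELINE_YEARS = range(1901, 2001)  # 1901-2000 inclusive, per the summary above


def daily_anomalies(records):
    """Score one station's readings against its own baseline.

    records: list of (datetime.date, float) for a SINGLE station.
    Returns {date: z-score}; days without a usable baseline are skipped.
    No other station is consulted (no spatial interpolation).
    """
    # Collect baseline-period values per calendar day (month, day).
    by_day = defaultdict(list)
    for d, value in records:
        if d.year in BASELINE_YEARS:
            by_day[(d.month, d.day)].append(value)

    # Baseline mean and population standard deviation per calendar day.
    # Assumption: at least two baseline values are needed for a usable spread.
    stats = {
        key: (mean(vals), pstdev(vals))
        for key, vals in by_day.items()
        if len(vals) >= 2
    }

    anomalies = {}
    for d, value in records:
        key = (d.month, d.day)
        if key in stats and stats[key][1] > 0:
            mu, sigma = stats[key]
            anomalies[d] = (value - mu) / sigma
    return anomalies


# Made-up readings for one hypothetical station (not real GHCN data).
sample = [
    (date(1950, 6, 28), 70.0),
    (date(1951, 6, 28), 80.0),
    (date(2021, 6, 28), 95.0),
]
print(daily_anomalies(sample))
```

Scoring each station only against itself keeps the result independent of neighboring stations, which matches the "no interpolation between stations" note; a production version would need far more baseline years per calendar day than this toy input has.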
M2 · Corpus Linguistics Pipeline
TF-IDF temporal windowing, UMAP dimensionality reduction, HDBSCAN clustering. Vocabulary tracked per-decade. Semantic similarity scored via sentence embeddings. Archetype discovery via narrative structure templates.
Used in · The Sighting
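The per-decade vocabulary tracking can be sketched in a few lines of Python. This covers only the TF-IDF windowing step; the UMAP, HDBSCAN, and sentence-embedding stages are not shown. It assumes each decade's pooled text is treated as one "document", uses a plain lowercase word tokenizer, and applies the unsmoothed formula tf × log(N / df). The function name `decade_tfidf`, those choices, and the sample reports are hypothetical, not taken from the project's code.

```python
import math
import re
from collections import Counter, defaultdict


def decade_tfidf(reports):
    """Per-decade TF-IDF over a list of (year, text) pairs.

    Each decade's pooled text is treated as one document (an assumption).
    Returns {decade: {term: score}}.
    """
    # Term counts per decade window.
    counts = defaultdict(Counter)
    for year, text in reports:
        decade = (year // 10) * 10
        counts[decade].update(re.findall(r"[a-z]+", text.lower()))

    # Document frequency: in how many decade windows each term appears.
    n_windows = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    scores = {}
    for decade, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[decade] = {
            term: (n / total) * math.log(n_windows / doc_freq[term])
            for term, n in term_counts.items()
        }
    return scores


# Made-up narratives for illustration only (not NUFORC records).
sample = [
    (1952, "silver saucer saucer"),
    (1957, "saucer light"),
    (1994, "black triangle light"),
]
for decade, terms in sorted(decade_tfidf(sample).items()):
    print(decade, max(terms, key=terms.get))
```

Terms that appear in every window score zero under this formula, so what remains is the vocabulary that distinguishes one decade from the others.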
M3 · Cultural Signal Detection
Name frequency time series modeled as rise-peak-decay curves. Half-life extraction, contagion modeling, and cross-name correlation. Trigger event detection via changepoint analysis on first-derivative signals.
Used in · The Given
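Half-life extraction from a rise-peak-decay curve can be sketched as follows. The sketch assumes the post-peak tail is roughly exponential and fits it by ordinary least squares on the log of the series, so the half-life is ln 2 divided by the fitted decay rate. The function name `decay_half_life`, the log-linear fit, and the synthetic series are assumptions for illustration; the contagion modeling and changepoint steps mentioned above are not shown.

```python
import math


def decay_half_life(years, shares):
    """Estimate the peak year and post-peak half-life of a name's share.

    years, shares: equal-length sequences, one value per year.
    Fits log(share) = a + slope * t on the tail from the peak onward
    and returns (peak_year, ln(2) / -slope).
    """
    peak = max(range(len(shares)), key=shares.__getitem__)
    # Years since peak paired with log share; zero shares cannot be logged.
    pts = [
        (y - years[peak], math.log(s))
        for y, s in zip(years[peak:], shares[peak:])
        if s > 0
    ]
    if len(pts) < 2:
        raise ValueError("need at least two positive points from the peak on")

    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in pts) / sxx
    if slope >= 0:
        raise ValueError("series does not decay after its peak")
    return years[peak], math.log(2) / -slope


# Synthetic series (not SSA data): rises to a peak in 1972,
# then halves every four years.
years = list(range(1970, 1981))
shares = [0.5, 1.0] + [2.0 * 0.5 ** ((y - 1972) / 4) for y in range(1972, 1981)]
print(decay_half_life(years, shares))
```

A log-linear fit is the simplest option and works only while the share stays above zero; a name that drops out of the data entirely would need a different treatment of the tail.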
Each entry publishes its own methodology tab with full technical detail, known limitations, and confidence ratings. The methods listed here are summaries.
Data sources
Where the data comes from.
Dataset | Source | Coverage | Entry
GHCN-Daily | NOAA / NCEI | ~12,847 stations | 04 · Atmosphere
NUFORC Narrative Archive | NUFORC | 111,961 reports | 01 · Sighting
SSA Baby Names | Social Security Admin | 104,819 names | 02 · Given
Negro Leagues Database | Seamheads | 2,300+ careers | 03 · Color Line
CDC / FBI UCR | CDC / FBI | 1900–2024 | 05 · Trigger
DHS / INS Yearbooks | DHS / National Archives | 1820–2024 | 06 · Crossing
NIMH / SAMHSA | NIMH / BJS | 1900–2024 | 07 · Untreated
HOLC Redlining Maps | Mapping Inequality | 239 cities | 08 · Red Line
FCC Broadcast License DB | FCC | 1934–2024 | 09 · Dead Air
NTSB Aviation Database | NTSB | ~90,000 accidents | 10 · Black Box
License
All source code is released under the MIT License. Original datasets and derived data files are released under CC0 1.0 (public domain).
You may use, modify, and redistribute any part of this project for any purpose, including commercial use, without asking permission.
Raw data
Cleaned datasets for each live entry are available in the project's GitHub repository under data/.
How to cite
Haynes, J. (2026). One Hundred Years: [Entry Title]. onehundredyears.report. Retrieved [date].
Replace [Entry Title] with the specific entry name and [date] with your access date. BibTeX available in each entry's methodology tab.
Dispatches
Get notified when a new entry goes live.
No spam. No tracking pixels. One email per entry — roughly every few months. Unsubscribe anytime.
"The data does not take sides.
It keeps score."

Every dataset is open. Every model is documented. Every finding is labeled — HIGH CONFIDENCE, CANDIDATE, or SPECULATIVE. Nothing is overstated. Code is MIT licensed. Data is CC0. No ads. No tracking. No paywall. Ever.

onehundredyears.report · Open data · MIT · CC0