Methodology

1. The Data

We use multiple independent archaeological databases that were built for entirely different purposes. This page describes the seven used in the original analysis; the merged paper uses ten (adding EAMENA, DARE, and Historic England):

The Megalithic Portal (62,000 sites) is a community-maintained catalogue of ancient and megalithic sites, overwhelmingly concentrated in Western Europe and the British Isles. It was not designed for great-circle analysis — it was built by amateur archaeologists who wanted to document standing stones, barrows, and dolmens near their homes.

The Pleiades Gazetteer (34,000 sites) is an academic database from the classical world (Greek, Roman, Near Eastern). It was built by professional classicists and ancient historians. Its coverage is Mediterranean-centered, with very different geographic biases than the Portal.

The p3k14c radiocarbon database (37,000 sites) contributes independently dated sites from the global radiocarbon record, providing temporal anchoring independent of typological dating.

DARE (Digital Atlas of the Roman Empire) (27,000 sites) provides comprehensive coverage of Roman-era settlements, military installations, and infrastructure across the Mediterranean basin and northern Europe. It was built by classical archaeologists focused on the Roman world's spatial distribution.

OSM (OpenStreetMap historical/archaeological sites) (350,000 sites) aggregates crowdsourced data on archaeological sites and heritage locations globally, capturing community-contributed local knowledge from regions underrepresented in academic databases.

DARMC (Digital Atlas of Roman and Medieval Civilizations) (4,200 sites) covers ancient ports, shipwrecks, and medieval settlement patterns across Eurasia, developed by historians tracking urbanization dynamics across centuries of cultural change.

Wikidata (archaeological/heritage sites) (106,000 sites) represents a modern collective effort to document world heritage and archaeological sites, pulling from multiple institutional sources and creating a global index independent of any single research tradition.

Together these databases contain approximately 259,000 unique archaeological sites (over 550,000 raw database entries) from every inhabited continent, spanning from ~9600 BCE (Gobekli Tepe) to the medieval period. The land-constrained spatial test in Section 5 uses a 180,000-site subset (5 databases with monument/settlement classification). No database was created with any awareness of the great circle hypothesis, and their geographic and disciplinary biases are largely independent of one another.

Key point: If a pattern appears in multiple databases with different geographic biases, built by different communities, for different purposes — it is unlikely to be an artifact of how any single database was constructed.

2. The Circle

A great circle is the largest possible circle you can draw on a sphere, like the equator or a line of longitude, but tilted at any angle. Any two points on Earth that are not exactly opposite each other define a unique great circle, and the shorter arc between them along that circle is the shortest path between them on the surface.

This particular great circle was proposed by Jim Alison in 1995, long before this analysis. It is defined by its pole at 59.682°N, 138.646°W (in southeastern Alaska). The circle passes through or near Giza, Nazca, Easter Island, Angkor Wat, Persepolis, and Mohenjo-daro.
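Because every point on a great circle lies exactly 90° from its pole, a site's distance from the circle follows directly from its angular separation from the pole. The sketch below illustrates that geometry; the function name and the 6371 km mean Earth radius are assumptions for illustration, not values taken from the analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the source does not state one


def distance_to_circle_km(lat, lon, pole_lat=59.682, pole_lon=-138.646):
    """Distance in km from (lat, lon) to the great circle with the given pole."""
    s_lat, s_lon = math.radians(lat), math.radians(lon)
    p_lat, p_lon = math.radians(pole_lat), math.radians(pole_lon)
    # Angular separation between site and pole (spherical law of cosines).
    cos_sep = (math.sin(s_lat) * math.sin(p_lat)
               + math.cos(s_lat) * math.cos(p_lat) * math.cos(s_lon - p_lon))
    sep = math.acos(max(-1.0, min(1.0, cos_sep)))  # clamp against rounding error
    # Points on the circle are exactly a quarter turn (pi/2 radians) from the pole.
    return abs(sep - math.pi / 2) * EARTH_RADIUS_KM
```

Under this definition, a site counts as "on the circle" when the returned distance is at most the chosen threshold, which is 50 km in the tests described on this page.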

Critically, the circle was fixed before any statistical testing began. We did not search for the "best" circle; we tested the one that was already proposed. This avoids the "p-hacking" problem of trying many circles and picking the best one.

3. The Baseline

The central question is: do more sites fall near this circle than we'd expect by chance?

To answer this, we need a baseline — what "chance" looks like. A naive approach would assume sites are sprinkled uniformly across the Earth. But sites are not uniformly distributed. They cluster in habitable regions: river valleys, coastlines, temperate zones. Europe has far more catalogued sites than Antarctica.

Analogy: Imagine scattering tennis balls across a hilly park. They'll roll into valleys and collect along paths — they won't be uniform. If you then draw a line and count how many tennis balls are near it, you need to compare against what a random line through the same hilly park would collect — not a line through a flat, featureless field.
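For scale, the naive uniform baseline can be computed in closed form: on a sphere of radius R, the band within distance d of any great circle covers a fraction sin(d/R) of the surface. A minimal sketch, assuming a spherical Earth with a mean radius of 6371 km (a value the source does not state):

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def uniform_band_fraction(half_width_km, radius_km=EARTH_RADIUS_KM):
    """Fraction of a sphere's surface within half_width_km of a great circle."""
    # The band is a spherical zone of height 2*R*sin(d/R); dividing its area
    # 2*pi*R*(2*R*sin(d/R)) by the sphere's area 4*pi*R^2 leaves sin(d/R).
    return math.sin(half_width_km / radius_km)


fraction = uniform_band_fraction(50.0)
print(f"{fraction:.2%} of uniformly scattered points fall within 50 km")
```

With a 50 km half-width this comes to just under 0.8% of the surface. That is the figure a flat, featureless field would give, which is why a clustering-aware baseline is needed instead.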

Our baseline is a distribution-matched Monte Carlo simulation (land-constrained). We generate 200 random great circles that have the same relationship to the actual distribution of sites as our test circle does. Specifically:

Take the real pole position and jitter it by a random offset (up to 2°)
Count how many sites fall within 50 km of that random circle
Repeat 200 times to build a distribution of "expected" counts
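The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the function names are hypothetical, the jitter is drawn as independent uniform offsets in pole latitude and longitude, and the Earth is treated as a sphere of radius 6371 km.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def count_near_circle(sites, pole_lat, pole_lon, threshold_km=50.0):
    """Count (lat, lon) sites within threshold_km of the circle with this pole."""
    p_lat, p_lon = math.radians(pole_lat), math.radians(pole_lon)
    count = 0
    for lat, lon in sites:
        s_lat, s_lon = math.radians(lat), math.radians(lon)
        cos_sep = (math.sin(s_lat) * math.sin(p_lat)
                   + math.cos(s_lat) * math.cos(p_lat) * math.cos(s_lon - p_lon))
        sep = math.acos(max(-1.0, min(1.0, cos_sep)))
        if abs(sep - math.pi / 2) * EARTH_RADIUS_KM <= threshold_km:
            count += 1
    return count


def jittered_baseline(sites, pole_lat, pole_lon, n_trials=200,
                      max_jitter_deg=2.0, threshold_km=50.0, seed=0):
    """Return the site count for each of n_trials circles with a jittered pole."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    counts = []
    for _ in range(n_trials):
        # Assumption: uniform, independent offsets of up to max_jitter_deg.
        j_lat = pole_lat + rng.uniform(-max_jitter_deg, max_jitter_deg)
        j_lon = pole_lon + rng.uniform(-max_jitter_deg, max_jitter_deg)
        counts.append(count_near_circle(sites, j_lat, j_lon, threshold_km))
    return counts
```

The returned list is the distribution of "expected" counts against which the count for the proposed circle is compared.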

This preserves the geographic clustering of sites — if our circle happens to pass through a site-dense region like the Mediterranean, the random circles will too. The question becomes: does the proposed circle collect more sites than other circles in similar regions?

4. The Z-Score

The Z-score measures how far the observed count is from the expected count, in units of standard deviation. A Z-score of 0 means "exactly average." A Z-score of 3 means "three standard deviations above average" — very unlikely by chance.
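As a small worked example of that definition, the sketch below computes a Z-score and an enrichment ratio from a list of baseline counts. The numbers are made up for illustration, and the choice of the sample standard deviation (rather than the population one) is an assumption.

```python
import statistics


def z_score(observed, baseline_counts):
    """Standard deviations by which observed exceeds the baseline mean."""
    mean = statistics.mean(baseline_counts)
    sd = statistics.stdev(baseline_counts)  # sample standard deviation (assumed)
    if sd == 0:
        raise ValueError("baseline has no spread; Z-score is undefined")
    return (observed - mean) / sd


def enrichment(observed, baseline_counts):
    """Observed count divided by the mean baseline count."""
    return observed / statistics.mean(baseline_counts)


toy_baseline = [8, 10, 12]  # illustrative counts, not real data
print(z_score(16, toy_baseline), enrichment(16, toy_baseline))
```

Here the baseline mean is 10 and its sample standard deviation is 2, so an observed count of 16 sits three standard deviations above expectation at 1.6× enrichment.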

In plain English: Under land-constrained Monte Carlo (N = 10,000 trials), the per-database Z-scores are: Megalithic Portal Z = 8.77, Pleiades Z = 6.74 (monuments), XRONOS Z = 5.91, p3k14c Z = 2.09. The land-constrained baseline rejects ocean placements via a Natural Earth 1:10M land mask, eliminating the coastal-bias inflation present in earlier unconstrained baselines. For context, the Higgs boson discovery required a Z-score of 5, and medical trials typically require Z ≥ 2. The corrected monument signal exceeds the Z = 5 threshold in three of these four databases; p3k14c (Z = 2.09) only just clears the conventional Z ≥ 2 bar.

Even after land-constrained correction, the monument Z-scores (6.74–8.77) remain far above conventional significance thresholds. The monument–settlement divergence (D = 9.65) is the most diagnostic result: 0 of 10,000 random circles produce a comparable divergence.
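When none of the simulated circles matches the observed statistic, the simulation itself can only bound the p-value from above. A common convention is the add-one estimator sketched below; whether the original analysis used it is not stated, so treat it as an assumption.

```python
def empirical_p_value(n_as_extreme, n_trials):
    """Conservative Monte Carlo p-value, (k + 1) / (N + 1)."""
    return (n_as_extreme + 1) / (n_trials + 1)


# 0 of 10,000 random circles matched the observed divergence.
p_bound = empirical_p_value(0, 10_000)
print(f"empirical p is at most about {p_bound:.1e}")
```

With 0 exceedances in 10,000 trials the bound is roughly 1 in 10,000. Any smaller p-value has to come from a parametric assumption about the shape of the null distribution.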

5. The Settlement Test

A crucial objection: maybe the circle just passes through a geographically favorable region (fertile river valleys, coastlines) where all kinds of sites concentrate, not specifically ancient monuments.

To test this, we used the Pleiades Gazetteer (which classifies sites by type) and ran the same test separately on:

Monumental sites (temples, pyramids, tombs, sanctuaries): Z = 6.74, enrichment = 2.52× (land-constrained baseline)
Settlement sites (villages, farmsteads, ports, villas): Z = −2.91 (anti-clustered)

Monumental sites cluster on the circle at 2.52× the expected rate under land-constrained baselines (Z = 6.74, p < 10⁻¹¹). Settlements are anti-clustered along the corridor (Z = −2.91), the opposite of the monument signal. If geography alone explained the pattern, settlements would show the same clustering. They don't.
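The figures in this paragraph can be checked with a few lines of arithmetic. The sketch assumes that the null distribution is approximately normal (so that a Z-score maps to a tail probability) and that the divergence D is the simple difference of the two Z-scores; the second assumption reproduces the reported 9.65 but is inferred rather than stated in the source.

```python
import math


def one_sided_p(z):
    """Upper-tail probability that a standard normal variate exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


monument_z = 6.74
settlement_z = -2.91

p_monument = one_sided_p(monument_z)
divergence = monument_z - settlement_z  # assumed definition of D

print(f"p(monuments) = {p_monument:.1e}, D = {divergence:.2f}")
```

The tail probability for Z = 6.74 falls below 10⁻¹¹, in line with the figure quoted above.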

Verdict: The enrichment is specific to ancient monumental architecture, not to human habitation in general. Geography alone does not account for the pattern.

5b. Coastal Bias Correction (D16)

A community member identified a potential methodological flaw in the unconstrained Monte Carlo baseline: synthetic jittered sites could land in the ocean, inflating Z-scores for coastal monument concentrations (especially Egypt, where the Nile corridor sits near the Mediterranean coast).

The land-constrained variant rejects ocean placements using a Natural Earth 1:10M land mask, producing a more conservative and accurate baseline. After this correction the core findings survive across all eight databases.
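The rejection step can be sketched as follows, assuming the baseline displaces each site by a random offset and redraws the offset whenever the result lands in the ocean. Loading the Natural Earth mask is not shown: `is_land` is a hypothetical callable standing in for a real point-in-polygon lookup, `toy_is_land` is a deliberately crude placeholder, and the 2° maximum offset is borrowed from the pole jitter in Section 3 as an assumption.

```python
import random


def jitter_on_land(lat, lon, is_land, rng, max_offset_deg=2.0, max_tries=1000):
    """Jitter a site, redrawing until the jittered point passes is_land."""
    for _ in range(max_tries):
        new_lat = lat + rng.uniform(-max_offset_deg, max_offset_deg)
        new_lon = lon + rng.uniform(-max_offset_deg, max_offset_deg)
        new_lat = max(-90.0, min(90.0, new_lat))      # keep latitude valid
        new_lon = (new_lon + 180.0) % 360.0 - 180.0   # wrap longitude
        if is_land(new_lat, new_lon):
            return new_lat, new_lon
    # Fall back to the original position if no land placement was found.
    return lat, lon


def toy_is_land(lat, lon):
    """Placeholder mask: pretend everything east of 31 degrees E is land."""
    return lon >= 31.0


rng = random.Random(42)
synthetic = [jitter_on_land(30.0, 31.1, toy_is_land, rng) for _ in range(100)]
```

Without the `is_land` check, roughly half of the synthetic points in this toy setup would fall in the "ocean", which is the kind of placement the correction is meant to prevent.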

Corrected Key Numbers (land-constrained baselines):

Monument enrichment	2.52× (Z = 6.74)
Settlement Z-score	−2.91 (anti-clustered)
Monument–settlement divergence	9.65
Random circles cross-validation	Monument Z = 3.45 (independent method)
Deep-time enrichment	Z = 4.35 (unaffected by correction)

Spatial thinning sensitivity: Monument Z-scores are sensitive to the spatial thinning radius: Z drops to 0.09 at 25 km thinning, which indicates that the absolute Z-scores reflect dense clustering within monument complexes (especially Memphis). The monument–settlement divergence and the bandwidth peak (2.83× at 20 km) are robust to thinning.
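Spatial thinning collapses tightly packed sites so that one dense complex cannot dominate the count. The source does not say which thinning algorithm was used; the sketch below shows one simple greedy variant with hypothetical function names, keeping a site only if no previously kept site lies within the thinning radius.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def thin_sites(sites, radius_km):
    """Greedy thinning: keep a site only if it is radius_km from all kept sites."""
    kept = []
    for site in sites:
        if all(haversine_km(site, other) >= radius_km for other in kept):
            kept.append(site)
    return kept


# Two points about 1.5 km apart plus one distant point (illustrative values).
toy_sites = [(29.98, 31.13), (29.99, 31.14), (25.0, 32.0)]
thinned = thin_sites(toy_sites, 25.0)
```

In the toy example the two neighbouring points collapse to one at a 25 km radius, so three sites become two. The result depends on input order, and the pairwise loop is quadratic, so a spatial index would be needed at the scale of the real databases.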

6. Limitations

We are transparent about the limitations of this analysis:

Database bias: The Megalithic Portal is heavily Western European. Many sites in Africa, Asia, and the Americas are underrepresented. The Pleiades Gazetteer is Mediterranean-centered. The true signal may be stronger or weaker than what these databases capture.
Dating uncertainty: Many sites have approximate or debated dates. Our "prehistoric vs. later" split uses broad categories, not precise dating.
No causal mechanism: We demonstrate a statistically significant spatial pattern. We do not propose a mechanism for why it exists. The pattern is consistent with multiple hypotheses — from ancient survey knowledge to coincidental independent development along climatically favorable latitudes.
Single circle: We test one specific circle. A full multiple-testing analysis across all possible great circles would require different statistical treatment (and has been partially done — the proposed circle remains exceptional).
Monte Carlo resolution: The land-constrained baseline uses N = 10,000 trials; the estimates stabilise by N ≈ 5,000 (see 05_corrected_statistics_table.csv). Reported Z-scores (6.74–8.77) are stable across N ≥ 1,000.
Supplementary sites: We added 114 well-known archaeological sites to fill gaps in the Portal's coverage. All results are reported both with and without these additions. The Portal-only result (Z = 23.71) is already overwhelming.

Per-Database Z-Scores (Land-Constrained)

Z-scores at the 50 km threshold under land-constrained Monte Carlo (N = 10,000 trials). See Section 5b for the correction protocol.

Dataset	Sites	Monument Z (50 km)	Enrichment
UNESCO ancient	871	1.92	—
p3k14c	36,693	2.09	1.76×
XRONOS	28,127	5.91	3.62×
Pleiades (pre-2000 BCE)	778	6.45	2.0×
Pleiades monuments	406	6.74	2.52×
Megalithic Portal merged	61,913	8.77	2.52×

Temporal Analysis

Older sites cluster more strongly than newer ones — a 2.5× ratio in Z-score:

Period	Sites	Z (50 km)	Enrichment
Prehistoric	35,324	20.86	3.79×
Later periods	25,851	8.30	2.60×

Type Enrichment

Which types of sites concentrate most strongly on the circle (top 10 by percentage within 50 km):

Type	Total	Within 50 km	% on circle
Hill Figure or Geoglyph	45	11	24.4%
Pyramid / Mastaba	195	32	16.4%
Ancient Temple	894	55	6.2%
Ancient Mine / Quarry	288	14	4.9%
Ancient Palace	70	2	2.9%
Carving	438	12	2.7%
Misc Earthwork	644	16	2.5%
Sculptured Stone	479	11	2.3%
Standing Stones	1,066	23	2.2%
Ancient Village/Settlement	4,698	80	1.7%

Pattern: Geoglyphs (24%) and pyramids (16%) — the most monumental site types — are the most concentrated. Settlements and domestic sites cluster least. The enrichment scales with the monumentality of the structure.
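The percentages in the table follow directly from the two count columns. The short sketch below recomputes and ranks them using the counts exactly as listed; the variable and function names are illustrative.

```python
# (total sites, sites within 50 km) per type, copied from the table above.
type_counts = {
    "Hill Figure or Geoglyph": (45, 11),
    "Pyramid / Mastaba": (195, 32),
    "Ancient Temple": (894, 55),
    "Ancient Mine / Quarry": (288, 14),
    "Ancient Palace": (70, 2),
    "Carving": (438, 12),
    "Misc Earthwork": (644, 16),
    "Sculptured Stone": (479, 11),
    "Standing Stones": (1066, 23),
    "Ancient Village/Settlement": (4698, 80),
}


def percent_on_circle(total, within):
    """Share of a type's sites that lie within the 50 km band, in percent."""
    return 100.0 * within / total


ranked = sorted(type_counts.items(),
                key=lambda item: percent_on_circle(*item[1]),
                reverse=True)
for name, (total, within) in ranked:
    print(f"{name:28s} {percent_on_circle(total, within):5.1f}%")
```

These are raw shares within the band, not enrichment ratios against a baseline, so small categories such as the 45 geoglyphs carry wide uncertainty.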