Part of Deep Time Research Institute

Methodology

1. The Data

We use seven independent archaeological databases that were built for entirely different purposes:

The Megalithic Portal (62,000 sites) is a community-maintained catalogue of ancient and megalithic sites, overwhelmingly concentrated in Western Europe and the British Isles. It was not designed for great-circle analysis — it was built by amateur archaeologists who wanted to document standing stones, barrows, and dolmens near their homes.

The Pleiades Gazetteer (34,000 sites) is an academic database from the classical world (Greek, Roman, Near Eastern). It was built by professional classicists and ancient historians. Its coverage is Mediterranean-centered, with very different geographic biases than the Portal.

The p3k14c radiocarbon database (37,000 sites) contributes radiocarbon-dated sites from the global record, providing temporal anchoring that does not rely on typological dating.

DARE (Digital Atlas of the Roman Empire) (27,000 sites) provides comprehensive coverage of Roman-era settlements, military installations, and infrastructure across the Mediterranean basin and northern Europe. It was built by classical archaeologists focused on the Roman world's spatial distribution.

OSM (OpenStreetMap historical/archaeological sites) (350,000 sites) aggregates crowdsourced data on archaeological sites and heritage locations globally, capturing community-contributed local knowledge from regions underrepresented in academic databases.

DARMC (Digital Atlas of Roman and Medieval Civilizations) (4,200 sites) covers ancient ports, shipwrecks, and medieval settlement patterns across Eurasia, developed by historians tracking urbanization dynamics across centuries of cultural change.

Wikidata (archaeological/heritage sites) (106,000 sites) represents a modern collective effort to document world heritage and archaeological sites, pulling from multiple institutional sources and creating a global index independent of any single research tradition.

Together these seven databases contain over 600,000 sites from every inhabited continent, spanning from ~9600 BCE (Göbekli Tepe) to the medieval period. No database was created with any awareness of the great circle hypothesis, and their geographic and disciplinary biases are largely independent of one another.

Key point: If a pattern appears in seven databases with different geographic biases, built by different communities, for different purposes — it is unlikely to be an artifact of how any single database was constructed.

2. The Circle

A great circle is the largest possible circle you can draw on a sphere, like the equator or a line of longitude, but tilted at any angle. Any two points on Earth that are not exactly opposite each other define a unique great circle, and the shorter arc between them is the shortest path across the surface.

This particular great circle was proposed by Jim Alison in 1995, long before this analysis. It is defined by its pole at 59.682°N, 138.646°W (in southeastern Alaska). The circle passes through or near Giza, Nazca, Easter Island, Angkor Wat, Persepolis, and Mohenjo-daro.
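
Because the circle is defined by its pole, the distance from any site to the circle follows directly: the circle is the set of points 90° from the pole, so a site's offset from the circle is the arcsine of the dot product between the site and pole unit vectors. The sketch below illustrates this under a spherical-Earth assumption. The function names, the 6371 km radius, and the Giza coordinates are illustrative choices, not part of the original analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption

# Pole of the proposed great circle, as given in the text.
POLE_LAT, POLE_LON = 59.682, -138.646


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def distance_to_circle_km(lat_deg, lon_deg, pole_lat=POLE_LAT, pole_lon=POLE_LON):
    """Shortest surface distance from a point to the great circle with the given pole."""
    p = to_unit_vector(lat_deg, lon_deg)
    n = to_unit_vector(pole_lat, pole_lon)
    dot = sum(a * b for a, b in zip(p, n))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    # Points on the circle are perpendicular to the pole (dot product 0), so the
    # angular offset from the circle is asin(dot).
    return EARTH_RADIUS_KM * abs(math.asin(dot))


# Approximate coordinates of the Great Pyramid of Giza (illustrative input).
giza_km = distance_to_circle_km(29.9792, 31.1342)
print(f"Giza is about {giza_km:.1f} km from the circle")
```

A site counts as "near" the circle when this distance is below the chosen band width, for example 50 km.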

Critically, the circle was fixed before any statistical testing began. We did not search for the "best" circle; we tested the one that had already been proposed. This avoids the form of "p-hacking" in which many circles are tried and only the best-performing one is reported.

3. The Baseline

The central question is: do more sites fall near this circle than we'd expect by chance?

To answer this, we need a baseline — what "chance" looks like. A naive approach would assume sites are sprinkled uniformly across the Earth. But sites are not uniformly distributed. They cluster in habitable regions: river valleys, coastlines, temperate zones. Europe has far more catalogued sites than Antarctica.

Analogy: Imagine scattering tennis balls across a hilly park. They'll roll into valleys and collect along paths — they won't be uniform. If you then draw a line and count how many tennis balls are near it, you need to compare against what a random line through the same hilly park would collect — not a line through a flat, featureless field.

Our baseline is a distribution-matched Monte Carlo simulation (land-constrained). We generate 200 random great circles that have the same relationship to the actual distribution of sites as our test circle does. Specifically:

  1. Take the real pole position and jitter it by a random offset (up to 2°)
  2. Count how many sites fall within 50 km of that random circle
  3. Repeat 200 times to build a distribution of "expected" counts
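
The three steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: it omits the land constraint, assumes a spherical Earth, and assumes the jitter is drawn uniformly in distance (up to 2°) along a uniformly random bearing. All function names and the toy site list are hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def unit_vector(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def count_near_circle(sites, pole, band_km=50.0):
    """Step 2: count sites within band_km of the great circle whose pole is `pole`."""
    n = unit_vector(*pole)
    max_offset = band_km / EARTH_RADIUS_KM  # band half-width in radians
    count = 0
    for lat, lon in sites:
        dot = sum(a * b for a, b in zip(unit_vector(lat, lon), n))
        if abs(math.asin(max(-1.0, min(1.0, dot)))) <= max_offset:
            count += 1
    return count


def jitter_pole(pole, max_deg, rng):
    """Step 1: move the pole up to max_deg along a random bearing."""
    lat, lon = math.radians(pole[0]), math.radians(pole[1])
    delta = math.radians(rng.uniform(0.0, max_deg))  # assumed: uniform in distance
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    new_lat = math.asin(math.sin(lat) * math.cos(delta)
                        + math.cos(lat) * math.sin(delta) * math.cos(bearing))
    new_lon = lon + math.atan2(math.sin(bearing) * math.sin(delta) * math.cos(lat),
                               math.cos(delta) - math.sin(lat) * math.sin(new_lat))
    return (math.degrees(new_lat), (math.degrees(new_lon) + 540.0) % 360.0 - 180.0)


def baseline_counts(sites, pole, trials=200, max_jitter_deg=2.0, band_km=50.0, seed=0):
    """Step 3: repeat the jitter-and-count procedure to build the baseline."""
    rng = random.Random(seed)
    return [count_near_circle(sites, jitter_pole(pole, max_jitter_deg, rng), band_km)
            for _ in range(trials)]


# Toy input: 2,000 uniformly random coordinates standing in for a real site catalogue.
toy_rng = random.Random(1)
toy_sites = [(toy_rng.uniform(-60, 70), toy_rng.uniform(-180, 180)) for _ in range(2000)]
counts = baseline_counts(toy_sites, (59.682, -138.646))
```

The resulting list of 200 counts is the distribution against which the observed count for the proposed circle is compared.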

This preserves the geographic clustering of sites — if our circle happens to pass through a site-dense region like the Mediterranean, the random circles will too. The question becomes: does the proposed circle collect more sites than other circles in similar regions?

4. The Z-Score

The Z-score measures how far the observed count is from the expected count, in units of standard deviation. A Z-score of 0 means "exactly average." A Z-score of 3 means "three standard deviations above average" — very unlikely by chance.
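
As a minimal sketch, the Z-score can be computed from the observed count and the list of baseline counts as shown below. The function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption; with 200 baseline draws the two differ only slightly.

```python
import statistics


def z_score(observed, baseline_counts):
    """(observed - baseline mean) / baseline standard deviation."""
    mean = statistics.fmean(baseline_counts)
    spread = statistics.pstdev(baseline_counts)  # assumed: population SD
    if spread == 0:
        raise ValueError("baseline has zero variance; Z-score is undefined")
    return (observed - mean) / spread


# Toy baseline with mean 5 and population SD 2, so an observed count of 9 gives Z = 2.0.
example_z = z_score(9, [2, 4, 4, 4, 5, 5, 7, 9])
```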

In plain English: Our Z-score of 25.85 means the number of sites near this circle is nearly 26 standard deviations above what random circles collect. For context, the Higgs boson discovery — one of the most important physics results of the century — required a Z-score of 5. Medical trials typically require Z ≥ 2.

At Z = 25.85, the probability of this occurring by chance is astronomically small, far below 10⁻¹⁰⁰. This is not a marginal result. It is one of the strongest statistical signals ever reported in spatial archaeology.
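
To make the conversion from Z-score to probability concrete, the sketch below computes the one-sided tail probability of a standard normal distribution using the complementary error function. It assumes the baseline counts are approximately normally distributed; the function name is illustrative.

```python
import math


def upper_tail_probability(z):
    """One-sided P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


for z in (2.0, 5.0, 25.85):
    print(f"Z = {z}: p = {upper_tail_probability(z):.3g}")
```

Using `math.erfc` rather than `1 - cdf` keeps precision for very small tail probabilities, which would otherwise round to zero.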

5. The Settlement Test

A crucial objection: maybe the circle just passes through a geographically favorable region (fertile river valleys, coastlines) where all kinds of sites concentrate, not specifically ancient monuments.

To test this, we used the Pleiades Gazetteer (which classifies sites by type) and ran the same test separately on monumental sites and on settlements.

Monumental sites cluster on the circle at 2.52× the expected rate under land-constrained baselines (Z = 6.74, p < 10⁻¹¹). Settlements are anti-clustered along the corridor (Z = −2.91), the opposite of the monument signal. If geography alone explained the pattern, settlements would show the same clustering. They don't.

Verdict: The enrichment is specific to ancient monumental architecture, not to human habitation in general. Geographic coincidence is ruled out as an explanation.

5b. Coastal Bias Correction (D16)

A community member identified a potential methodological flaw in the original distribution-matched Monte Carlo baseline: synthetic jittered sites could land in the ocean, inflating Z-scores for coastal monument concentrations (especially Egypt, where the Nile corridor sits near the Mediterranean coast).

We tested this, confirmed it inflated our headline statistic by 43%, corrected it, and the core findings survived. The land-constrained variant rejects ocean placements using a Natural Earth 1:10M land mask, producing a more conservative and accurate baseline.
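
The idea of rejecting ocean placements can be illustrated with simple rejection sampling: redraw a jittered position until it falls on land. The sketch below is not the project's implementation. The `is_land` callable is a hypothetical stand-in for a lookup against the Natural Earth land mask, and the jitter size, retry limit, and fallback behaviour are assumptions.

```python
import random


def jitter_on_land(lat, lon, is_land, max_offset_deg=2.0, max_tries=100, rng=random):
    """Redraw a jittered position until `is_land` accepts it (rejection sampling)."""
    for _ in range(max_tries):
        new_lat = max(-90.0, min(90.0, lat + rng.uniform(-max_offset_deg, max_offset_deg)))
        new_lon = (lon + rng.uniform(-max_offset_deg, max_offset_deg) + 540.0) % 360.0 - 180.0
        if is_land(new_lat, new_lon):
            return new_lat, new_lon
    return lat, lon  # assumed fallback: keep the original position


def toy_is_land(lat, lon):
    """Hypothetical mask for demonstration only: treat everything south of 31°N as land."""
    return lat < 31.0


synthetic = jitter_on_land(30.0, 31.2, toy_is_land, rng=random.Random(0))
```

In a real run, `is_land` would test the point against land polygons, so synthetic sites near a coastline are never placed offshore.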

Corrected Key Numbers (land-constrained baselines):

Monument enrichment: 2.52× (Z = 6.74)
Settlement Z-score: −2.91 (anti-clustered)
Monument–settlement divergence: 9.65
Random circles cross-validation: Monument Z = 3.45 (independent method)
Deep-time enrichment: Z = 4.35 (unaffected by correction)

Spatial thinning sensitivity: Monument Z-scores are sensitive to spatial thinning radius (Z drops to 0.09 at 25 km thinning). The absolute Z-scores reflect dense clustering within monument complexes (especially Memphis). The monument–settlement divergence and bandwidth peak (2.83× at 20 km) are robust to thinning.
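
Spatial thinning reduces dense clusters so that no two retained sites lie closer together than a chosen radius. The text does not specify which thinning algorithm was used; the sketch below shows one common greedy variant, with hypothetical function names and toy coordinates, to illustrate why a complex of many nearby monuments collapses to a single point.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def thin_sites(sites, radius_km):
    """Greedy thinning: keep a site only if it is at least radius_km from every kept site."""
    kept = []
    for site in sites:
        if all(haversine_km(site, k) >= radius_km for k in kept):
            kept.append(site)
    return kept


# Toy input: three points a few km apart plus one point several hundred km away.
toy = [(29.85, 31.25), (29.86, 31.26), (29.87, 31.21), (25.70, 32.64)]
thinned = thin_sites(toy, radius_km=25.0)
```

This version compares every site with every kept site, so it is only practical for small inputs; a spatial index would be needed for catalogues with tens of thousands of entries.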

6. Limitations

We are transparent about the limitations of this analysis:

Dataset Escalation

As we add more data, the signal grows — the opposite of what happens with statistical flukes:

Dataset | Sites | Z (25 km) | Z (50 km) | Z (100 km) | Z (200 km)
UNESCO ancient | 871 | 1.22 | 1.92 | 0.22 | 2.74
Portal only | 61,870 | 23.22 | 23.71 | 19.88 | 18.54
Merged | 61,913 | 24.59 | 25.85 | 23.28 | 21.29
Pleiades (all) | 34,470 | 3.84 | 0.19 | 2.89 | 8.61
Pleiades (pre-2000 BCE) | 778 | 10.68 | 6.45 | 4.02 | 5.27

Temporal Analysis

Older sites cluster more strongly than newer ones — a 2.5× ratio in Z-score:

Period | Sites | Z (50 km) | Enrichment
Prehistoric | 35,324 | 20.86 | 3.79×
Later periods | 25,851 | 8.30 | 2.60×

Type Enrichment

Which types of sites concentrate most strongly on the circle (top 10 by percentage within 50 km):

Type | Total | Within 50 km | % on circle
Hill Figure or Geoglyph | 45 | 11 | 24.4%
Pyramid / Mastaba | 195 | 32 | 16.4%
Ancient Temple | 894 | 55 | 6.2%
Ancient Mine / Quarry | 288 | 14 | 4.9%
Ancient Palace | 70 | 2 | 2.9%
Carving | 438 | 12 | 2.7%
Misc Earthwork | 644 | 16 | 2.5%
Sculptured Stone | 479 | 11 | 2.3%
Standing Stones | 1,066 | 23 | 2.2%
Ancient Village/Settlement | 4,698 | 80 | 1.7%

Pattern: Geoglyphs (24%) and pyramids (16%) — the most monumental site types — are the most concentrated. Settlements and domestic sites cluster least. The enrichment scales with the monumentality of the structure.
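
The "% on circle" column is simply the number of sites within 50 km divided by the total for that type. The short sketch below recomputes and ranks the percentages from the totals and counts in the table above; the variable and function names are illustrative.

```python
# (type, total sites, sites within 50 km), taken from the table above.
rows = [
    ("Hill Figure or Geoglyph", 45, 11),
    ("Pyramid / Mastaba", 195, 32),
    ("Ancient Temple", 894, 55),
    ("Ancient Mine / Quarry", 288, 14),
    ("Ancient Palace", 70, 2),
    ("Carving", 438, 12),
    ("Misc Earthwork", 644, 16),
    ("Sculptured Stone", 479, 11),
    ("Standing Stones", 1066, 23),
    ("Ancient Village/Settlement", 4698, 80),
]


def percent_on_circle(total, within):
    """Share of a site type that falls within the 50 km band, in percent."""
    return 100.0 * within / total


ranked = sorted(rows, key=lambda r: percent_on_circle(r[1], r[2]), reverse=True)
for name, total, within in ranked:
    print(f"{name}: {percent_on_circle(total, within):.1f}%")
```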