Authors: Benjamin Ducke (Archäologisches Institut), Fleur Schweigart (Deutsches Archäologisches Institut, Zuse Institute Berlin), Robin Chemnitz (Zuse Institute Berlin)

Description: Archaeological and spatial (GIS) data on the Romanisation of northern Africa (146 BC to c. 400 AD). Includes archaeological observations at geographical locations ("sites") that pertain to architectural, structural and administrative features of the remains of Roman-era settlements within the territorial boundaries of modern Tunisia, plus topographical support data from various free sources.

Purpose: The archaeological data in this package provides a detailed picture of the available material evidence for the process of Romanisation in the settlement system of ancient Tunisia. All relevant archaeological sites available in the published literature were compiled and attributed with observations that indicate typical Roman urban design structures. In addition, historical evidence about the administrative rankings of settlements was included for all known cases. In its present form, enriched with auxiliary data on both modern and historic landscape features, the data provided here constitutes a uniquely comprehensive and rich source of archaeological data for a well-defined historical epoch and a culturally rich geographical region. Some of the questions that motivated the original data collection:

(a) How accurately can one model the time dynamics and spreading patterns of the Romanisation process, based on the available data? How do we best account for missing and incomplete observations (uncertainty) in such models?

(b) What is the most effective way of extracting the temporal component from the different types of evidence (inscriptions on milestones, registries of bishoprics, etc.) and creating plausible distributions to date those sites that have no direct time attribution?

(c) How do the different types and ranks of settlements relate to the natural landscape and to other sites in their region? Are there simple patterns of locational preference that can be extracted?

(d) Are there simple yet flexible methods that allow us to plausibly reconstruct those parts of the Roman road network that have not been preserved? How do we assess the plausibility of a reconstruction?

(e) Is it possible to infer the spreading pattern of the process of Christianisation in the fully Romanised settlement system of ancient Tunisia?

(f) Due to its archaeological nature, the data is both incomplete/fragmentary and very uncertain in its temporal dimension. Therefore, it will not be possible to directly extract reliable information from the data that could answer research questions such as the above with sufficient certainty. Instead, the data supports a model-based approach, where computed outcomes are compared to the archaeological observation to assess their plausibility.


(a) Scale effects: data derives from different original scales; e.g. modern country boundaries do not match the accuracy of place locations.

(b) Edge effect: borders of Tunisia are well-defined towards N and E (open sea), as well as S (inhospitable desert), but not towards W (modern Algeria); this will affect the results of models towards the western boundary of the study area.

(c) Similar to the above: modern country boundaries are meaningless; many more sites lie in other modern territories (Algeria!)

(d) What constitutes a "site"? The data is a mixture of actual municipalities and disjunct discoveries ("near") that were once part of the "parent site".

(e) Quality checking was not sufficient to ensure that all data is consistent (e.g. there might still be identical sites with differently spelled names; many Tunisian places have more than one Western transcription).

(f) Contexts and mixed levels of observation: good examples are aligned sites that each have "has_aquedu" set to "1" but no other "Roman attributes", as opposed to sites that represent Roman towns with aqueducts and many other features.

(g) Temporal dynamics: They are in there, but very well hidden... See "branches/ temporal_activation" for additional temporal support.
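
Issue (e) above can be screened for semi-automatically. The following is a minimal sketch, not part of the data package: it flags pairs of sites whose names are spelled similarly and whose coordinates lie close together. The record layout (name, latitude, longitude), the function names and both thresholds are assumptions made for illustration, and the example records use illustrative values that were not read from the package.

```python
from difflib import SequenceMatcher
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def duplicate_candidates(sites, min_similarity=0.8, max_km=5.0):
    """Return pairs of site names that are similarly spelled and close together.

    `sites` is a list of (name, lat, lon) tuples (assumed layout).
    Both thresholds are illustrative and need tuning against real data.
    """
    pairs = []
    for (n1, la1, lo1), (n2, la2, lo2) in combinations(sites, 2):
        similarity = SequenceMatcher(None, n1.lower(), n2.lower()).ratio()
        if similarity >= min_similarity and haversine_km(la1, lo1, la2, lo2) <= max_km:
            pairs.append((n1, n2))
    return pairs


# Illustrative records only; values are not taken from the data package.
example = [
    ("Dougga", 36.42, 9.22),
    ("Dugga", 36.42, 9.22),
    ("Sbeitla", 35.24, 9.12),
]
print(duplicate_candidates(example))
```

Pairs reported this way are only candidates: two distinct neighbouring sites can have similar names, so each hit still needs a manual check.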


Connectivity: design a simple generator for a plausible road network between all major sites (use the observed remains of the network to test your reconstruction).

Spatial pattern analysis: design a model to predict the locations of lower ranking sites based on the locations of highest ranking sites (or vice versa).

Cluster detection: find an advanced measure of site density that takes into account the fact that lower ranking sites cluster along linear structures (roads, rivers, valleys, etc.).

Diffusion modelling: find a simple model that will explain the spatiotemporal sequence of the emergence of bishoprics.
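
As a possible starting point for the connectivity task, the sketch below builds a minimum spanning tree over site coordinates and scores it against observed road segments. This is a generic baseline, not the authors' method: the site names, coordinates and observed segments are hypothetical, straight-line (great-circle) distance stands in for real travel cost, and a tree contains no loops, whereas a real road network does.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    dlat = radians(b[0] - a[0])
    dlon = radians(b[1] - a[1])
    h = sin(dlat / 2) ** 2 + cos(radians(a[0])) * cos(radians(b[0])) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def mst_edges(sites):
    """Prim's algorithm: minimum spanning tree as a set of frozenset name pairs.

    `sites` maps a site name to a (lat, lon) tuple (assumed layout).
    """
    names = list(sites)
    if not names:
        return set()
    in_tree = {names[0]}
    edges = set()
    while len(in_tree) < len(names):
        # Pick the cheapest link from the tree to any site not yet connected.
        _, u, v = min(
            (haversine_km(sites[u], sites[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        in_tree.add(v)
        edges.add(frozenset((u, v)))
    return edges


def recall(predicted, observed):
    """Share of observed road segments that the reconstruction reproduces."""
    return len(predicted & observed) / len(observed) if observed else 0.0


# Hypothetical sites and observed segments, for illustration only.
sites = {"A": (36.0, 9.0), "B": (36.0, 9.5), "C": (36.0, 10.0), "D": (36.5, 9.5)}
observed = {frozenset(("A", "B")), frozenset(("B", "D")), frozenset(("C", "D"))}

predicted = mst_edges(sites)
print(sorted(tuple(sorted(e)) for e in predicted))
print(recall(predicted, observed))
```

Replacing `haversine_km` with a terrain-aware cost, or adding extra edges on top of the tree, would be natural next steps; the recall score then shows whether such changes bring the reconstruction closer to the observed remains.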

Contact: Benjamin Ducke

Download: here

License: The data provided here has been released under liberal (open data) licenses. Some data is in the public domain. Please consult the LICENSE file in each subfolder of this package.


Explanatory slides from the Opening Day: here

