
3 Limitations of interpreting genetic data in light of human identity

Archaeogenetic data can provide valuable insights for understanding past human societies, population movements or the spread of pathogens. However, there are important limitations and caveats with a purely genetic reconstruction of the human past. Genomic data are generally too complex to be processed, visualized and communicated without significant reduction of dimensionality. In performing this reduction, scientists introduce subjective choices and simplifications that may not represent the full complexity of the data. Geneticists need to be particularly aware of these limitations and biases, and develop a habit of making them transparent. One such limitation arises from the need to group individuals for data analysis, which may hide heterogeneity at the individual level. Another issue is how to label such groups without promoting inappropriate associations. This chapter discusses how to recognize inappropriate connections between genetic information and human identity (Section 3.1), covers limitations regarding grouping individuals (Section 3.2) and, finally, reflects on the naming of present-day and ancient groups and their related ancestries (Section 3.3).

3.1 Genetic information and human identity

Many questions that archaeogenetic analyses can address touch upon human identity. For example, on the individual level, genetic information can reveal a person’s biological sex, biological relatedness, genetic ancestry and, to some extent, physical characteristics. While these features can be meaningful, they can also be misinterpreted. For example, biological sex does not necessarily reflect a person’s gender identity, biological relatedness does not necessarily reflect kinship, and genetic ancestry does not necessarily reflect an individual’s group identity. Researchers need to be aware of these limitations, and of potential biases that arise from their own subjective experiences when describing and interpreting individual-level data.

Similarly, at the group level, reconstructing genetic ancestries from ancient genomes can provide meaningful insights into past human movements and interaction. While it is valuable to contextualise such processes with historical sources, linguistic patterns or archaeological affiliations, such contextualisation carries the danger of applying naive concepts of group identity. For example, while historically attested groups like the ancient Philistines (Feldman et al. 2019), Saxons (Schiffels et al. 2016) or Vikings (Margaryan et al. 2020) can motivate meaningful genetic investigations into specific time periods or contexts, it is important to avoid simplistically equating genetic signals or genetically defined groups with historical “peoples”. Researchers need to be transparent about the complexities involved in identifying such historically named groups.

3.2 Grouping individuals and discretization

Human biological diversity is continuous, but many genetic analyses require grouping of individuals into “populations”. In the biological sense, a population is an abstract model in which individuals in one isolated group mate freely with others from the same group. Such a model is never strictly realised, because no group is strictly isolated: historically, genetic ancestry has been distributed along geographical gradients or shaped by long-distance migrations (Winther et al. 2015). In addition, individuals do not mate at random, which creates additional structure within a group and further violates the abstract concept of a population. While grouping individuals may still be necessary for many analysis methods, researchers need to be aware of the underlying assumptions and limitations of this approach.

An important drawback of such population groupings in archaeogenetics is that they usually represent only a momentary snapshot in time. Therefore, what we call a specific “population” may cease to be a useful concept after a certain period of time due to the fluid nature of human migrations. Because of high human mobility, groups at any one point in time are typically mixtures of populations from preceding time periods and geographic locations. This poses a challenge when describing this process: we need to apply meaningful labels (see below), but at the same time we need to make sure those labels capture the transient nature of such groupings. In addition, any such transient groups are represented only by a finite number of samples, which may also reflect biases in the archaeological record (for example, inhumations are accessible to genetic analyses, while cremations are not) (Perreault 2019).

Admixture models, in which the ancestry of a population under study is modelled as a mixture of discrete source populations, are an example of necessary discretizations in population genetic analyses. Such an analysis may encourage simplistic ideas of mixed (the target) vs. pure (the sources) populations. This is inappropriate, since all groups appear as mixed when modelled with even earlier sources, and thus no population can be seen as pure.
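The point that sources are themselves mixtures can be illustrated with a small numerical sketch. This is a toy calculation, not an admixture-inference method: the group names (`SourceA`, `EarlierX`, and so on), the proportions, and the helper `expand_sources` are all hypothetical and chosen only for illustration.

```python
# Toy illustration: every name and number below is hypothetical.

def expand_sources(target_mix, source_mixes):
    """Re-express a target's ancestry in terms of earlier sources.

    target_mix:   proportion of each proximal source in the target.
    source_mixes: for each proximal source, its own proportions
                  of earlier (more distal) sources.
    """
    expanded = {}
    for source, weight in target_mix.items():
        for earlier, share in source_mixes[source].items():
            # Each earlier source contributes via every proximal source.
            expanded[earlier] = expanded.get(earlier, 0.0) + weight * share
    return expanded


# The target is modelled as a mixture of two "source" groups ...
target = {"SourceA": 0.6, "SourceB": 0.4}

# ... but each source is itself a mixture when modelled with earlier groups.
earlier = {
    "SourceA": {"EarlierX": 0.5, "EarlierY": 0.5},
    "SourceB": {"EarlierY": 0.25, "EarlierZ": 0.75},
}

result = expand_sources(target, earlier)
for name, proportion in sorted(result.items()):
    print(f"{name}: {proportion:.2f}")
```

In this sketch the two sources that looked unmixed relative to the target each resolve into several earlier components, and the same step could be repeated with still earlier groups. Which level is treated as a "source" is a modelling choice rather than a property of the groups themselves.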

3.3 Naming of ancient and present-day groups

It is often tempting to name a genetic population after an archaeological culture, and in some cases this can be very helpful to contextualize genetic results. However, as described above, equating genetic groupings with archaeological cultures can promote naive concepts of group identity, or falsely imply that culture only spreads through movements of people. We also must acknowledge that assignments of archaeological cultures are themselves biased and imperfect, and their adoption may transfer these imperfections into the genetic interpretations (Shennan 1989). There is currently no consensus on how to name sampled groups in analyses and scientific publications. For both historic and prehistoric populations, we have recently reviewed current practices (Eisenmann et al. 2018), and concluded that especially for prehistoric populations, a primarily geographic-temporal system should be preferred, which avoids ascribing cultural affinity to biological populations.

A different issue concerns labeling of ancient individuals. We usually label individuals by the code associated with the archaeological collection or by our internal laboratory database. In some cases, however, human remains have been given nicknames, especially when these remains originate from unusual or extraordinary archaeological contexts (e.g. “Ötzi the Iceman” or “The Ancient One/Kennewick Man”). While such names can in principle facilitate the communication of scientific results to a popular audience and hence can make the past more accessible, in general, we refrain from this practice. Our position is that these individuals had a name in life, and it is not our place to rename them.

There are also issues with naming present-day groups. A first caveat is that national or ethnic boundaries do not typically delineate genetic ancestries. For example, “French”, as used in the Human Genome Diversity Project collection of publicly available DNA samples (https://cephb.fr/en/hgdp_panel.php), does not represent the entire genetic diversity in France today (Birney et al. 2021). A second caveat is that in many cases, even choosing a label can be challenging. Self-identified ethnicity, cultural affiliation, and biological ancestry do not necessarily overlap and the latter two can be fluid and multi-layered. In some cases, certain names have been established by repeated use. However, such conventional terminology may not be appropriate, since a significant number of these names may be offensive, misleading, or both. Particularly during the colonial era, the self-given names of indigenous groups were often replaced by colonial administrations, and some of the imposed names remain in use today (e.g. the colonial name “Mbuti Pygmy” for a particular group of central African rain forest foragers, whereas their self-chosen term is “Mbuti” or “Bambuti”).

When referring to living populations, researchers should try to use the name by which members of that group self-identify. However, while a certain name might have been chosen by a group as a self-identification, it is not necessarily appropriate for people not belonging to the group to use it. There are cases where there is no universally recognized, appropriate and respectful name that all members of a given group share. This is especially true for indigenous umbrella terms and ethnonyms. We acknowledge that assigning population labels runs the risk of offending certain sections of a community. This problem cannot be fully resolved, and needs to be addressed on a case-by-case basis. In some cases, following official representative bodies for specific ethnic groups may be a solution. For example, the Saami Council (https://www.saamicouncil.net), a joint body representing the Saami ethnic group in Finland, Russia, Norway and Sweden, makes the use of the name “Saami” unproblematic. More problematic examples include the terms “First Peoples” and “Indigenous Peoples”, which are used differently in Canada and the US. In these cases authors must make a decision and should at least mention that their use of names is not universally accepted.


Birney, Ewan, Michael Inouye, Jennifer Raff, Adam Rutherford, and Aylwyn Scally. 2021. “The Language of Race, Ethnicity, and Ancestry in Human Genetic Research.” arXiv [q-bio.PE]. arXiv. http://arxiv.org/abs/2106.10041.

Feldman, Michal, Daniel M. Master, Raffaela A. Bianco, Marta Burri, Philipp W. Stockhammer, Alissa Mittnik, Adam J. Aja, Choongwon Jeong, and Johannes Krause. 2019. “Ancient DNA Sheds Light on the Genetic Origins of Early Iron Age Philistines.” Science Advances 5(7). https://doi.org/10.1126/sciadv.aax0061.

Margaryan, Ashot, Daniel J. Lawson, Martin Sikora, Fernando Racimo, Simon Rasmussen, Ida Moltke, Lara M. Cassidy, et al. 2020. “Population Genomics of the Viking World.” Nature 585 (7825): 390–96. https://doi.org/10.1038/s41586-020-2688-8

Perreault, Charles. 2019. The Quality of the Archaeological Record. University of Chicago Press.

Schiffels, Stephan, Wolfgang Haak, Pirita Paajanen, Bastien Llamas, Elizabeth Popescu, Louise Loe, Rachel Clarke, et al. 2016. “Iron Age and Anglo-Saxon Genomes from East England Reveal British Migration History.” Nature Communications 7 (January): 10408. https://doi.org/10.1038/ncomms10408

Shennan, Stephan. 1989. “Introduction: Archaeological Approaches to Cultural Identity.” In Archaeological Approaches to Cultural Identity, 1–32. London.

Winther, Rasmus Grønfeldt, Ryan Giordano, Michael D. Edge, and Rasmus Nielsen. 2015. “The Mind, the Lab, and the Field: Three Kinds of Populations in Scientific Practice.” Studies in History and Philosophy of Biological and Biomedical Sciences 52 (August): 12–21. https://doi.org/10.1016/j.shpsc.2015.01.009