The Thousand Polish Genomes—A Database of Polish Variant Allele Frequencies

Elżbieta Kaja, Adrian Lejman, Dawid Sielski, Mateusz Sypniewski, Tomasz Gambin, Mateusz Dawidziuk, Tomasz Suchocki, Paweł Golik, Marzena Wojtaszewska, Magdalena Mroczek, Maria Stępień, Joanna Szyda, Karolina Lisiak-Teodorczyk, Filip Wolbach, Daria Kołodziejska, Katarzyna Ferdyn, Maciej Dąbrowski, Alicja Woźna, Marcin Żytkiewicz, Anna Bodora-TroińskaWaldemar Elikowski, Zbigniew J. Król, Artur Zaczyński, Agnieszka Pawlak, Robert Gil, Waldemar Wierzba, Paula Dobosz, Katarzyna Zawadzka, Paweł Zawadzki, Paweł Sztromwasser*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review


Although Slavic populations account for over 4.5% of world inhabitants, no centralised, open-source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for clinical genetics, biomedical research, as well as archeological and historical studies. The Polish population, which is homogenous and sedentary in its nature but influenced by many migrations of the past, is unique and could serve as a genetic reference for the Slavic nations. In this study, we analysed whole genomes of 1222 Poles to identify and genotype a wide spectrum of genomic variation, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups, and de novo variants. Common variant analyses showed that the Polish cohort is highly homogenous and shares ancestry with other European populations. In rare variant analyses, we identified 32 autosomal-recessive genes with significantly different frequencies of pathogenic alleles in the Polish population as compared to the non-Finish Europeans, including C2, TGM5, NUP93, C19orf12, and PROP1. The allele frequencies for small and structural variants, calculated for 1076 unrelated individuals, are released publicly as The Thousand Polish Genomes database, and will contribute to the worldwide genomic resources available to researchers and clinicians.

Original languageEnglish
Article number4532
JournalInternational Journal of Molecular Sciences
Issue number9
StatePublished - 1 May 2022
Externally publishedYes


  • Polish genomes
  • allele frequency
  • allelic distribution
  • genome
  • population genomics
  • variant
  • whole-genome sequencing


Dive into the research topics of 'The Thousand Polish Genomes—A Database of Polish Variant Allele Frequencies'. Together they form a unique fingerprint.

Cite this