Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR [RESOURCES]

Wouter De Coster1,2, Ida Höijer3, Inge Bruggeman2, Svenn D'Hert2,4, Malin Melin3, Adam Ameur3 and Rosa Rademakers1,2

1Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
2Department of Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
3Department of Immunology, Genetics and Pathology, SciLifeLab, Uppsala University, 751 85 Uppsala, Sweden
4Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium

Corresponding author: wouter.decoster@uantwerpen.be

Abstract

The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap using our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals from the 1000 Genomes Project cohort sequenced on the Oxford Nanopore Technologies PromethION. A comprehensive set of medically relevant tandem repeats has been genotyped using STRdust and LongTR to determine the tandem repeat length and sequence composition. PathSTR provides rich visualizations of this data set and the option to upload one's own data for comparison alongside the control cohort. We demonstrate the implementation of this application using data from targeted nanopore sequencing of a patient with myotonic dystrophy type 1. This resource will empower the genetics community to get a more complete overview of normal variation in tandem repeat length and sequence composition and, as such, enable a better assessment of rare tandem repeat alleles observed in patients.

Footnotes

[Supplemental material is available for this article.]

Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279265.124.

Freely available online through the Genome Research Open Access option.

Received March 4, 2024. Accepted August 2, 2024.
