Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations [RESOURCES]

Wenyu Zhang1,2,3, Anja Guenther3,4, Yuanxiao Gao5, Kristian Ullrich3, Bruno Huettel6, Aftab Ahmad1, Lei Duan1, Kaizong Wei1 and Diethard Tautz3

1 Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710129, China
2 Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China
3 Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
4 Research Group Behavioral Ecology of Individual Differences, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
5 School of Mathematics and Data Science, Shaanxi University of Science and Technology, Xi'an 710021, China
6 Max-Planck-Genome-Centre Cologne, MPI for Plant Breeding Research, Cologne 50829, Germany

Corresponding authors: wyzhang@nwpu.edu.cn, tautz@evolbio.mpg.de

Abstract

The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcript enrichment protocol with 5′ CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations and subspecies of Mus musculus, and from the closely related sister species Mus spretus and Mus spicilegus as outgroups. The data set represents the most extensive full-length high-quality isoform catalog at the population level to date. In total, we reliably identify 117,728 distinct isoforms, of which only 51% were previously annotated. We show that the population-specific distribution pattern of isoforms is phylogenetically informative and reflects the segregating single nucleotide polymorphism (SNP) diversity between the populations. We find that ancient housekeeping genes are a major source of the overall isoform diversity, and that the generation of alternative first exons plays a major role in generating new isoforms. Given that our data allow us to distinguish between population-specific isoforms and isoforms that are conserved across multiple populations, it is possible to refine the annotation of the reference mouse genome to a set of about 40,000 isoforms that should be most relevant for comparative functional analysis across species.

Received February 20, 2024. Accepted September 10, 2024.
