Full-resolution HLA and KIR gene annotations for human genome assemblies [METHODS]

Ying Zhou1, Li Song2 and Heng Li1,3

1Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA; 2Department of Biomedical Data Science, Dartmouth College, Hanover, New Hampshire 03755, USA; 3Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA

Corresponding author: hli@jimmy.harvard.edu

Abstract

The human leukocyte antigen (HLA) genes and the killer cell immunoglobulin-like receptor (KIR) genes are critical to immune responses and are associated with many immune-related diseases. Because these genes are located in highly polymorphic regions, they are difficult to study with traditional short-read alignment-based methods. Although modern long-read assemblers can often assemble these genes, using existing tools to annotate HLA and KIR genes in these assemblies remains a nontrivial task. Here, we describe Immuannot, a new computational tool to annotate the gene structures of HLA and KIR genes and to type the allele of each gene. Applying Immuannot to 56 regional and 212 whole-genome assemblies from previous studies, we annotated 9931 HLA and KIR genes and found that almost half of these genes, 4068, have novel sequences compared with the current Immuno Polymorphism Database (IPD). These novel gene sequences are represented by 2664 distinct alleles, some of which contain nonsynonymous variations, resulting in 92 novel protein sequences. We demonstrate the complex haplotype structures at the two loci and report the linkage between HLA/KIR haplotypes and gene alleles. We anticipate that Immuannot will speed up the discovery of new HLA/KIR alleles and enable the association of HLA/KIR haplotype structures with clinical outcomes in the future.

Received January 12, 2024. Accepted May 22, 2024.
