Duplication hotspots, rare genomic disorders, and common disease

https://doi.org/10.1016/j.gde.2009.04.003

The human genome is enriched in interspersed segmental duplications that sensitize approximately 10% of our genome to recurrent microdeletions and microduplications as a result of unequal crossing over. We review the recent discovery of recurrent rearrangements within these genomic hotspots and their association with both syndromic and nonsyndromic diseases. Studies of common complex genetic disease show that a subset of these recurrent events plays an important role in autism, schizophrenia, and epilepsy. The genomic hotspot model may provide a powerful approach for understanding the role of rare variants in common disease.

Introduction

The development of cytogenetic techniques, including high-resolution karyotyping and fluorescence in situ hybridization (FISH), in the early 1980s resulted in the identification of the microdeletions responsible for Prader–Willi (15q11–q13 deletions) [1] and Smith–Magenis (17p11.2 deletions) [2] syndromes. The term genomic disorder was originally introduced to describe conditions resulting from nonallelic homologous recombination (NAHR), or unequal crossing over, between segmental duplications (a.k.a. low copy repeats) [3••]. Over the next decade, continued efforts to fine-map recurrent deletions implicated NAHR in the recurrent rearrangements underlying Charcot–Marie–Tooth disease [4], hereditary neuropathy with liability to pressure palsies [5], and Prader–Willi [6], Angelman [7], Smith–Magenis [8], velocardiofacial [9], Williams–Beuren [10], and Sotos [11] syndromes, as well as spinal muscular atrophy [12] and juvenile nephronophthisis type I [13] (Figure 1), to name a few. Molecular diagnosis became possible but relied first on suspecting a specific disorder based on clinical features and second on using a targeted FISH assay for the chromosomal region to confirm the suspected diagnosis: a ‘phenotype first’ approach.

Advances in technology, most notably the introduction of array comparative genomic hybridization (CGH) and single nucleotide polymorphism (SNP) microarrays, now allow rapid evaluation of many targeted loci or the entire genome for submicroscopic deletions and duplications. A significant advantage of these approaches is that a suspected diagnosis is not necessary before performing the diagnostic test. The application of both targeted and whole-genome technologies to large series of patients with mental retardation (MR) or developmental delay [14•, 15, 16, 17, 18•, 19], autism [20, 21, 22, 23•, 24, 25], congenital anomalies [26, 27•, 28, 29], and schizophrenia [30, 31, 32•] has had several important consequences. First, the rate of discovery of novel disorders has increased dramatically. Since 2005, 18 new genomic disorders involving 12 regions of the genome have been described, more than doubling the number of disorders described in the previous 20 years (Table 1). Perhaps more importantly, whole-genome approaches have led to a remarkable shift from a ‘phenotype first’ to a ‘genotype first’ definition of genomic disorders. Whereas disorders were previously described using clinical features, new disorders are described by their genomic rearrangement, and clinical features are compared among patients after a common rearrangement is identified. As the diversity of phenotypes evaluated for pathogenic copy number changes expands, so does the phenotypic diversity associated with at least a subset of recurrent rearrangements; in fact, for some of the rearrangements described below, the ‘phenotype first’ approach would have been nearly impossible.

The underlying genomic architecture in each of the genomic disorders identified to date is similar: a stretch of unique sequence (50 kb–10 Mb) flanked by large (>10 kb), highly homologous (>95%) segmental duplications that provide the substrate for NAHR. In 2002, we used these criteria to identify rearrangement ‘hotspots’ (regions predicted to be susceptible to recurrent rearrangement based on the flanking genomic architecture) [33••] and developed a targeted array CGH assay to evaluate copy number variation in both affected and unaffected individuals. An updated map of predicted hotspots and associated disorders is shown in Figure 1; there are now 21 discrete regions of the genome that undergo recurrent rearrangement, resulting in 33 diseases, and at least 10 additional diseases are the result of NAHR in regions of the genome that are flanked by duplications but do not meet our strict definition of a hotspot.
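The hotspot criteria stated above can be written as a simple filter. The following is a minimal sketch only, not the procedure used in [33••]: it assumes a hypothetical input of pairwise segmental duplication alignments on a single chromosome, and the record type `DupPair`, the function names, and the threshold constants are all invented for illustration. Only the numeric cutoffs (50 kb–10 Mb of unique sequence, duplications >10 kb, identity >95%) come from the text. The sketch ignores duplication orientation and any overlap between neighboring intervals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DupPair:
    """Hypothetical record: one pair of duplicated blocks on the same chromosome."""
    chrom: str
    start_a: int     # proximal copy, start coordinate (bp)
    end_a: int       # proximal copy, end coordinate (bp)
    start_b: int     # distal copy, start coordinate (bp)
    end_b: int       # distal copy, end coordinate (bp)
    identity: float  # sequence identity between the copies, as a fraction


# Cutoffs taken from the criteria in the text; the constant names are illustrative.
MIN_DUP_LEN = 10_000      # flanking duplications larger than 10 kb
MIN_IDENTITY = 0.95       # more than 95% sequence identity
MIN_SPACER = 50_000       # intervening unique sequence of at least 50 kb
MAX_SPACER = 10_000_000   # and at most 10 Mb


def is_candidate_hotspot(pair: DupPair) -> bool:
    """Return True if the pair satisfies all three architectural criteria."""
    shorter_copy = min(pair.end_a - pair.start_a, pair.end_b - pair.start_b)
    spacer = pair.start_b - pair.end_a  # sequence between the two copies
    return (
        shorter_copy > MIN_DUP_LEN
        and pair.identity > MIN_IDENTITY
        and MIN_SPACER <= spacer <= MAX_SPACER
    )


def candidate_hotspots(pairs: List[DupPair]) -> List[Tuple[str, int, int]]:
    """Return (chrom, start, end) of the flanked interval for each qualifying pair."""
    return [(p.chrom, p.end_a, p.start_b) for p in pairs if is_candidate_hotspot(p)]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    demo = [
        DupPair("chrA", 1_000_000, 1_050_000, 2_500_000, 2_548_000, 0.98),
        DupPair("chrA", 5_000_000, 5_004_000, 6_000_000, 6_004_000, 0.99),
    ]
    for chrom, start, end in candidate_hotspots(demo):
        print(chrom, start, end)
```

In this sketch the second demo pair is rejected because both copies are only 4 kb long; a real analysis would also need to merge overlapping candidate intervals and handle duplication clusters with more than two copies.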

Mental retardation syndromes

The majority of the genomic disorders identified before 2006 were characterized by developmental delay, learning disability, and/or MR. Interestingly, the genetic basis for MR is still unknown in well over 50% of clinical cases. Therefore, many studies have been aimed at identifying submicroscopic copy number changes in this population [14•, 15, 16, 17, 18•, 19], and it is now estimated that large microdeletions and microduplications underlie >15% of MR. We note that many potential pathogenic

Non-MR genomic disorders

Although neurocognitive and neurobehavioral diseases appear to be enriched for genomic disorders, this may simply be a result of ascertainment bias. Recent investigations of other diseases suggest that recurrent genomic rearrangements also underlie some disorders that do not include cognitive deficits as a primary phenotype. Array CGH studies of individuals with thrombocytopenia-absent radius (TAR) syndrome found that 30/30 affected probands shared a ∼500-kb deletion on chromosome 1q21.1 [27].

Genomic disorders defying syndromic classification

One of the most intriguing developments over the past two years has been the discovery of at least three new recurrent microdeletions that are enriched in multiple neuropsychiatric diseases but elude syndromic classification. Although each microdeletion was first identified in a series of individuals with similar phenotypes, the application of whole-genome copy number variation analysis to a wider range of neurocognitive disorders has revealed unprecedented phenotypic diversity.

Genomic hotspot model of common and rare disease

A slight majority of the rearrangements that have been shown to be disease-causing are mediated by segmental duplications. This is simply a consequence of the fact that duplicated sequences promote recurrent rearrangements (Figure 3), so that far fewer patients and controls need to be tested to prove pathogenicity than for large copy number variants (CNVs) not flanked by segmental duplications. The wide range of phenotypes associated with rearrangements of 16p11.2, 1q21.1, and

Future directions and conclusions

As we forge ahead in this ‘genotype first’ era of rapid CNV discovery, we should anticipate the need to screen large disease cohorts (10 000–50 000 affected individuals) in order to assess the pathogenicity of other rare CNVs, especially those not flanked by segmental duplications. Some of these numbers may be achieved by leveraging CNV datasets from seemingly disparate disease cohorts (i.e. autism, MR, schizophrenia, and epilepsy). Until such large supracollaborations are established, targeting

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgments

We thank Ginger Cheng for assistance in preparation of Figure 1. We apologize to our colleagues whom we could not cite due to the limited number of allowed references. Dr. Eichler is an investigator of the Howard Hughes Medical Institute and is supported in part by the NIH grant HD043569. Dr. Mefford is supported in part by NIH grant HD043376.

References (58)

  • D.A. Koolen et al. Clinical and molecular delineation of the 17q21.31 microdeletion syndrome. J Med Genet (2008)
  • M.C. Zody et al. Evolutionary toggling of the MAPT 17q21.31 inversion region. Nat Genet (2008)
  • L. Willatt et al. 3q29 microdeletion syndrome: clinical and molecular characterization of a new syndrome. Am J Hum Genet (2005)
  • N. Brunetti-Pierri et al. Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nat Genet (2008)
  • H.C. Mefford et al. Recurrent rearrangements of chromosome 1q21.1 and variable pediatric phenotypes. N Engl J Med (2008)
  • A.J. Sharp et al. A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nat Genet (2008)
  • D.T. Miller et al. Microdeletion/duplication at 15q13.2q13.3 among individuals with features of autism and other neuropsychiatric disorders. J Med Genet (2009)
  • van Bon BWM, Mefford HC, Menten B, Koolen DA, Sharp AJ, Nillesen WM, Innis JW, de Ravel TJL, Mercer CL, Fichera M, et...
  • M.G. Butler et al. Clinical and cytogenetic survey of 39 individuals with Prader–Labhart–Willi syndrome. Am J Med Genet (1986)
  • A.C. Smith et al. Interstitial deletion of (17)(p11.2p11.2) in nine patients. Am J Med Genet (1986)
  • J.R. Lupski et al. DNA duplication associated with Charcot–Marie–Tooth disease type 1A. Cell (1991)
  • R. Carrozzo et al. Inter- and intrachromosomal rearrangements are both involved in the origin of 15q11–q13 deletions in Prader–Willi syndrome. Am J Hum Genet (1997)
  • K.S. Chen et al. Homologous recombination of a flanking repeat gene cluster is a mechanism for a common contiguous gene deletion syndrome. Nat Genet (1997)
  • N. Kurotaki et al. Fifty microdeletions among 112 cases of Sotos syndrome: low copy repeats possibly mediate the common deletion. Hum Mutat (2003)
  • J. Melki et al. De novo and inherited deletions of the 5q13 region in spinal muscular atrophies. Science (1994)
  • S. Saunier et al. Characterization of the NPHP1 locus: mutational mechanism involved in deletions in familial juvenile nephronophthisis. Am J Hum Genet (2000)
  • B.B. de Vries et al. Diagnostic genome profiling in mental retardation. Am J Hum Genet (2005)
  • D.A. Koolen et al. A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism. Nat Genet (2006)
  • G. Sagoo et al. Array CGH in patients with learning disability (mental retardation) and congenital anomalies: updated systematic review and meta-analysis of 19 studies and 13,926 subjects. Genet Med (2009)