IBIP seminar

Monday, November 2, 2015

The pangenome of Brachypodium distachyon: Estimating the true genomic diversity of a species

Bruno Contreras-Moreira

Bruno Contreras-Moreira1,2 and Sean B Gordon3, Wendy Schackwitz3, Joel Martin3, Shengqiang Shu3, Jeremy Phillips3, Kerrie Barry3, Mike Freeling4, David L. Des Marais5, Ludmila Tyler6, Ana Caicedo6, Luis Mur7, John Doonan7, Pilar Catalan8, John P. Vogel3
1 Estación Experimental de Aula Dei (EEAD-CSIC), 50059 Zaragoza, Spain
2 Fundación ARAID, 50018 Zaragoza, Spain
3 DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
4 UC Berkeley, USA
5 Harvard University, Cambridge, MA, USA
6 University of Massachusetts, Amherst, MA, USA
7 Aberystwyth University, UK
8 University of Zaragoza, Huesca, Spain (bcontreras@eead.csic.es)

The genetic diversity of a species is the sum of the diversity found in all individuals of that species. Many studies have attempted to estimate the diversity of a species by resequencing diverse accessions and aligning the reads to a reference genome. While this approach readily identifies SNPs and small indels, it underestimates total genomic diversity because highly divergent regions align poorly to the reference and any sequence absent from the reference is missed entirely. Thus, the true extent of diversity within a species is largely unknown. De novo genome assemblies and independent annotation can be used to estimate the genomic diversity within a species more accurately. We applied this approach to 54 Brachypodium distachyon accessions to create a pan-genome that contains all the diversity found in the sequenced accessions. Our results indicate that the pan-genome is substantially larger than the genome of any individual accession, twice as large by some measures. Systematic comparison of the individual genomes identified a set of core genes found in all sequenced lines and a larger set of shell genes present in only some accessions. Our results also characterize the variability of conserved non-coding sequences among individuals of this species, which we are correlating with genome-wide expression patterns and gene variability. Together, the core genes, shell genes and all the non-coding sequences constitute the pan-genome of B. distachyon. Overall, this work supports a dynamic view of genomes within a species, which is updated with every new accession studied.
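
The core/shell distinction in the abstract can be illustrated with a minimal sketch. It assumes that genes from each accession have already been grouped into clusters of equivalent genes, so that each accession reduces to a set of cluster identifiers. The function name, accession names and gene identifiers are hypothetical toy values, and this is not the pipeline used in the study.

```python
from typing import Dict, Set


def classify_genes(presence: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene clusters into pan, core and shell sets.

    `presence` maps an accession name to the set of gene-cluster
    identifiers annotated in that accession (hypothetical input format).
    """
    if not presence:
        return {"pan": set(), "core": set(), "shell": set()}

    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)         # found in at least one accession
    core = set.intersection(*gene_sets)   # found in every accession
    shell = pan - core                    # found in some accessions, not all
    return {"pan": pan, "core": core, "shell": shell}


# Toy presence/absence data for three made-up accessions.
toy = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g2", "g4"},
    "acc_C": {"g1", "g3", "g5"},
}
result = classify_genes(toy)
print(sorted(result["core"]))   # clusters shared by all accessions
print(sorted(result["shell"]))  # clusters missing from at least one accession
```

In this toy case the pan set is larger than the gene set of any single accession, and adding another accession can only grow the pan set and shrink the core set, which is the sense in which the pan-genome is updated with every new accession.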


Contact: Tou-Cheu Xiong

IBIP contacts:
Sabine Zimmermann
Alexandre Martiniere
Christine Granier
Chantal Baracco