Antonio Augusto Franco Garcia - ESALQ/USP

Antonio Augusto F. Garcia
Antonio Augusto F. Garcia has a PhD in Genetics and Plant Breeding (Luiz de Queiroz College of Agriculture - ESALQ, University of São Paulo - USP) and did a postdoc in Statistical Genetics (Bioinformatics Research Center, North Carolina State University, USA). He is an Associate Professor at the Department of Genetics (ESALQ/USP) and leads a Statistical Genetics Laboratory with around 10 students (undergraduates, MSc and PhD students, and postdocs). He has been working on the development of statistical models to better understand the genetic architecture of quantitative traits and to implement molecular breeding in several crops, including maize, sorghum and (especially) sugarcane. He participates in the National Institute of Science and Technology (INCT) of Bioethanol (FAPESP and CNPq). His research achievements include the development of software (OneMap, an R package) that is used worldwide for building integrated genetic maps; several statistical methods for QTL mapping (including studies on heterosis and G x E interaction); and methods for genotyping and mapping polyploids. He is an editor of the journals Theoretical and Applied Genetics and BMC Genetics.

Genetic maps: useful tools for breeding and genomic studies

Genetic (or linkage) maps are bi-dimensional representations of the distances between loci and their order on chromosomes. Although directly related to genome sequences, they are built on different principles, involving probability distributions, phenomena such as interference, and the occurrence of crossing over; all of these are inferred from segregating populations. Even with the recent availability of a large number of sequenced genomes, maps remain very useful in a number of situations, since they can be used to understand genetic properties of populations, predict the extent of linkage disequilibrium, help locate loci controlling the variation of quantitative traits (QTL), and study the genetic architecture of quantitative traits. In fact, genome assembly and genetic map construction can benefit each other, since they provide similar information derived from different and complementary sources. Also, the majority of plant species do not have sequenced genomes, so genetic maps are good alternatives to sequencing.
In this talk, I will discuss in detail the statistical and computational principles involved in building linkage maps, showing recent advances regarding marker loci based on high-throughput genotyping technologies (including SNPs) and new algorithms to deal with this scenario. I will also cover recent approaches for building maps for outcrossing and polyploid plant species (such as sugarcane and forage crops).

Cathal Seoighe - National University of Ireland Galway

Cathal Seoighe

Cathal Seoighe studied theoretical physics at Trinity College Dublin before undertaking a PhD in molecular evolution/bioinformatics under the supervision of Prof. Ken Wolfe at the Smurfit Institute of Genetics. Following a brief period as a post-doctoral researcher at the Royal College of Surgeons in Dublin in 2000, he took up a position as director of training at the South African National Bioinformatics Institute (SANBI), in a joint appointment with the South African Medical Research Council. He became a senior lecturer in bioinformatics at SANBI, and subsequently associate professor and director of the computational biology group at the University of Cape Town in 2004. In 2009 he joined NUI Galway as Stokes Professor of Bioinformatics. His research interests include the sources and consequences of variation in genomics data, including the origins of the variation itself (genetics, sample composition etc.), as well as the development of methods to analyze high-throughput data from heterogeneous samples. The group also has interests in diverse areas of molecular evolution, from probabilistic modeling of viral sequence evolution to evolutionary aspects of gene expression regulation and mRNA splicing.



Inferring ancient mutator alleles from haplotype data

Abstract: The rate of germline mutation varies widely between species but little is known about the extent of variation in the germline mutation rate between individuals of the same species. Here we demonstrate that an allele that increases the rate of germline mutation can result in a distinctive signature in the genomic region linked to the affected locus, characterized by a number of haplotypes with a locally high proportion of derived alleles, against a background of haplotypes carrying a typical proportion of derived alleles. We searched for this signature in human haplotype data from phase 3 of the 1000 Genomes Project and report a number of candidate mutator loci, several of which are located close to or within genes involved in DNA repair or the DNA damage response. To investigate whether mutator alleles remained active at any of these loci, we used de novo mutation counts from human parent-offspring trios in the 1000 Genomes and Genome of the Netherlands cohorts, looking for an elevated number of de novo mutations in the offspring of parents carrying a candidate mutator haplotype at each of these loci. We found some support for two of the candidate loci, including one locus just upstream of the BRSK2 gene, which is expressed in the testis and has been reported to be involved in the response to DNA damage.

Miguel Rocha - University of Minho

Miguel Rocha

Miguel Rocha is currently an Associate Professor at the Informatics Department (http://www.di.uminho.pt) in the School of Engineering, University of Minho, Portugal.
He is also a researcher within the Centre of Biological Engineering (CEB) (http://ceb.uminho.pt), where he co-leads a research team in Bioinformatics and Systems Biology, integrated within the newly formed Biosystems group (http://ceb.uminho.pt/biosystems/Labs?lab=1), which currently involves over 25 researchers. He is the author of around 120 publications in international journals and peer-reviewed conferences, of which around 90 are indexed in ISI Web of Science. Over the last few years he has also been the PI of, or a collaborator in, several research projects funded by the Portuguese FCT, the European Commission and private companies.
He currently teaches courses at the undergraduate, master's and doctoral levels in the areas of Bioinformatics, Machine Learning/Data Mining and basic Computer Science. He is on the board of the master's course in Bioinformatics, a degree that he co-founded in 2007 and of which he was the first Director (2007 to 2010). He was also the Director of the Computer Sciences and Technologies Centre (CCTC) from 2010 to 2013.
Furthermore, he is one of the founders and the Chief Technological Officer of the spin-off company Silico Life (http://www.silicolife.com), created in 2010, which offers Bioinformatics and Computational Biology solutions for industry. The company has won a national prize in entrepreneurship (Atreve-te 2010 contest).
Miguel Rocha graduated in Systems and Informatics Engineering (1995) from the University of Minho, where he also completed his Master's in Informatics (1998) and his PhD in Informatics (2004).



Computational tools for metabolomics data mining: applications in natural products and food research

Metabolomics data plays an important role in the functional analysis of biological systems, allowing a direct focus on the metabolism of microbes, plants and other complex organisms by measuring the amounts of metabolites in biological samples. In this talk, we will discuss some of the computational tools developed in our work for the analysis of this type of data, encompassing data from measurement technologies such as gas or liquid chromatography coupled with mass spectrometry, nuclear magnetic resonance, and other types of spectral data, such as infrared, ultraviolet-visible or Raman. The R specmine package will be highlighted, as well as the web-based interfaces that make it easier to use.
We will also address a number of applications of metabolomics data analysis and mining, related to the analysis of natural products (e.g. propolis) and to food research, both in agriculture (e.g. cassava, maize or rice) and aquaculture.

Dario Grattapaglia - EMBRAPA

Dario Grattapaglia

1990 ‐ 1994 Ph.D. Genetics and Forestry (co‐major) North Carolina State University, Raleigh, NC, USA ‐ Advisor: Prof. Ronald Sederoff

Elected to the Honor Society Phi Kappa Phi, Chapter 33, 1992
1981 ‐ 1985 B.S. Forest Engineering, University of Brasilia, Brazil 

1994 ‐ present Research Scientist ‐ Project Leader, Plant Genetics Laboratory EMBRAPA – Genetic Resources and Biotechnology

2000 ‐ present Professor – Graduate Program in Genomic Sciences and Biotechnology Catholic University of Brasilia
2016 ‐ present Adjunct Professor ‐ Department of Forestry and Environmental Resources, North Carolina State University, USA
1996 ‐ present Co‐founder and director of Heréditas ‐ Tecnologia em Analise de DNA, a private DNA analysis laboratory for human, plant and animal forensics and plant molecular breeding
1995 ‐ present Associate Faculty at University of Brasilia ‐ Dept. of Molecular Biology
1994 Assistant Professor at UNESP State University of São Paulo ‐ Dept. of Genetics
1985 ‐ 1990 Research Scientist – Dept. of Plant Cell Biology ‐ Bioplanta Tecnologia de Plantas Ltda. (Joint venture between Native Plant Inc. ‐ South Lake City and BAT ‐ British American Tobacco) ‐ Campinas, São Paulo, Brazil


The convergence of genomics and quantitative genetics: genome-wide prediction of complex traits in forest trees

Planted forests supply woody biomass in a sustainable fashion that would otherwise come from deforestation and forest degradation. They also provide environmental services in erosion and water cycle regulation and act as long-term carbon sinks. The challenge of tree improvement programs is, however, the long interval between the breeding investment and the deployment of improved material. After twenty-five years of forest tree genomics research, and despite important advances in QTL mapping and association genetics (AG), genomics has not had any significant impact on operational tree breeding. Reasons include the limitations of early genomic technologies, the genetic heterogeneity of largely undomesticated trees and, mainly, the overly optimistic and rather naïve outlook on the architecture of complex traits. The advent of high-throughput genotyping technologies coupled with genomic selection (GS) has provided a new paradigm for integrating genomics and quantitative genetics into breeding. By fitting thousands of genome-wide markers concurrently in predictive models, GS can capture most of the ‘missing heritability’ of complex traits that QTL and AG classically fail to explain. The milestone publication of the Eucalyptus grandis genome allowed us to develop a multi-species SNP genotyping chip based on whole-genome resequencing of 240 Eucalyptus tree genomes from 12 species. Using this genotyping platform in a number of industrial breeding programs in Brazil, we have shown that GS accuracies can match or surpass those of conventional phenotypic selection for growth and wood property traits. GS significantly reduces the length of a breeding cycle by applying ultra-early selection to seedlings ranked genomically for multiple traits, precluding the progeny trial stage. Top-ranked seedlings can be subjected to early flower induction and inter-mated to create the next breeding generation and/or immediately propagated and deployed as clones in validation field trials.
We have found, however, that GS predicts poorly across unrelated populations and variable environments, and therefore requires models specific to each breeding population. Genome-wide prediction brings a new perspective to the understanding of quantitative trait variation in forest trees and should finally bring genomics into applied breeding. The strategic and logistical aspects of operational GS adoption are now the main challenges to its full integration into routine tree breeding operations.

Tulio de Oliveira - University of KwaZulu-Natal

Tulio de Oliveira

My research interests concentrate on viral evolution under the selection pressure created in the transmission, acquisition and drug escape processes. A particular point of interest is the study of virus origins and the effect of transmission networks on the spread of epidemics in Africa and other continents. I also enjoy developing bioinformatics software applications and running KRISP, a next generation sequencing (NGS) and bioinformatics laboratory. Prof. Tulio de Oliveira is a bioinformatician who has been working in HIV research since 1997. He received his PhD from the Nelson R Mandela School of Medicine, UKZN, South Africa. He was a Marie Curie research fellow at the University of Oxford, U.K. He is currently a Research Fellow of the U.K. Royal Society, the President of the South African Society for Bioinformatics (SASBi) and the director of the newly established KwaZulu-Natal Research and Innovation Sequencing Platform (KRISP), UKZN, Durban, South Africa.


KRISP – DNA Sequencing and Bioinformatics to Answer Some of the Key Global Health Questions: Pathogen transmission and effective intervention design

The incidence and prevalence of HIV infection in young women in Africa are extremely high. We did a large-scale community-wide DNA sequencing and phylogenetic study to examine the underlying HIV transmission dynamics and the source and consequences of high rates of HIV infection in young women in South Africa. From June 11, 2014, to June 22, 2015, we enrolled 9812 participants, 3969 of whom tested HIV positive. HIV prevalence (weighted) was 59.8% in 2835 women aged 25-40 years, 40.3% in 1548 men aged 25-40 years, 22.3% in 2224 women younger than 25 years, and 7.6% in 1472 men younger than 25 years. HIV genotyping was done in 1589 individuals. In 90 transmission clusters, 123 women were linked to 103 men. Of 60 possible phylogenetically linked pairings with the 43 women younger than 25 years, 18 (30.0%) probable male partners were younger than 25 years, 37 (61.7%) were aged 25-40 years, and five (8.3%) were aged 41-49 years: mean age difference 8.7 years (95% CI 6.8-10.6; p<0.0001). For the 92 possible phylogenetically linked pairings with the 56 women aged 25-40 years, the age difference dropped to 1.1 years (95% CI -0.6 to 2.8; p=0.111). 16 (39.0%) of 41 probable male partners linked to women younger than 25 years were also linked to women aged 25-40 years. 78.5% of the men were unaware of their HIV-positive status, 76 (96.2%) were not on antiretroviral therapy, and 29 (36.7%) had viral loads of more than 50 000 copies per mL. Sexual partnering between young women and older men, who might have acquired HIV from women of similar age, is a key feature of the sexual networks driving transmission. DNA sequencing and phylogenetic analysis identify the source of transmission. Expansion of treatment and combination prevention strategies that include interventions to address age-disparate sexual partnering is crucial to reducing HIV incidence and enabling Africa to reach the goal of ending AIDS as a public health threat.
These results were published in The Lancet HIV (2017), received major coverage from Science and Nature, and became the building block of the UNAIDS 2016 report. More info: http://www.krisp.org.za

Manuel Ruiz - CIRAD

Manuel Ruiz

PhD in bioinformatics, University of Montpellier, France

Head of the Bioinformatics team "Data Integration" of the Joint Research Unit Genetic Improvement and Adaptation of Mediterranean and Tropical Plants, Montpellier, France (2007-present), and Associate Member of Staff of CIAT (Centro Internacional de Agricultura Tropical), Cali, Colombia (2013-present)

Scientific manager of the South Green Bioinformatics Platform

His research interests are: (i) knowledge representation and management of data related to plant genetics and genomics, (ii) the Semantic Web for information systems interoperability, and (iii) comparative genomics and automatic annotation of genomes.

He is an author of 38 scientific publications in international peer-reviewed journals.



The South Green Bioinformatics platform, a comprehensive resource for crop genomics

The analysis and visualization of massive genomics datasets is an ongoing trend in plant sciences. The South Green Bioinformatics platform provides an ecosystem of tools that were originally developed as independent entities to meet the needs of specific projects or crops, but have evolved over time into generic tools for the comprehensive study of crop genomics.
We have built a large panel of public information systems dedicated to specialized datasets (markers, genes, gene families, transcriptomes, genotypes, phenotypes, etc.) and crop-specific resources called Genome Hubs. Target users of bioinformatic analytical workflows are usually divided between people who use the command line and those who do not. We addressed both categories by offering complementary solutions, such as Galaxy-based and command-line applications.
Various groups have used the South Green infrastructure to obtain their data and results, and were able to publish high-quality biological information on the coffee genome, banana, cocoa, African rice and large transcriptome resources. Tools developed for these studies are adaptable to a wide range of other organisms.