Chris Smith machine learning for ecology

Research

Machine learning tools for spatial demographic inference

Genetic variation is shaped in part by a population’s ability to disperse, and by the density of individuals in the habitat. Obtaining estimates for such parameters is important for studying range shifts in response to climate change, genomic clines across hybrid zones, phylogeography, and the spread of adaptive alleles through space. A promising strategy for inferring demographic parameters is using spatial genomic data. However, current genetics-based methods have constraints that prevent their use in many species, leaving a critical gap in our methods toolbox.

I have developed machine learning approaches for estimating dispersal rate from population genetic data. These methods can be used with single nucleotide polymorphism datasets, making it possible to infer dispersal rate for species with limited genomic resources. Most recently I developed a tool for estimating maps of population density and dispersal across a landscape. This method is useful for identifying barriers to migration, source-sink dynamics, or population-dense areas, and I have applied it to publicly available North American grey wolf data.

Software:

disperseNN
disperseNN2
mapNN

Relevant publications:

Smith et al. 2023, Genetics
Smith and Kern 2023, BMC Bioinformatics
Smith et al. 2024, Molecular Ecology Resources

         

Inferring the timing of population differentiation

To conserve biodiversity, it is important to understand how populations diverge into new species. In collaboration with Dr. Rebecca Safran’s lab, I analyzed the timescale of divergence between subspecies of the barn swallow (Hirundo rustica) using whole genome sequencing data. My analysis indicated that subspecies divergence was an order of magnitude more recent than the previously published estimate, which was based on mitochondrial DNA. In an ongoing collaboration with the Safran Lab, we are analyzing the geographic population history of the barn swallow subspecies using phylogenetic approaches.

Relevant publications:

Smith et al. 2018, Molecular Ecology
Smith and Flaxman 2020, Molecular Ecology Resources

         

Alternative splicing during population differentiation

With Dr. Nolan Kane and colleagues, I found differentially spliced mRNA isoforms between wild and domesticated sunflowers (Helianthus annuus). Ours is one of the first studies to explore transcriptome-wide splicing differentiation between closely related, non-human populations. Next, I discovered incorrectly spliced transcripts in hybrid sunflowers. The erroneous transcripts were negatively associated with seedling growth rate, and many were regulated by multiple alleles with nonadditive interactions. These findings suggest that splicing errors could be the molecular manifestation of small-effect genetic incompatibilities. To characterize the role of splicing in population divergence more generally, we must study additional diverging populations or species. I am currently working with the Kane Lab to study divergent splice forms in a dune-adapted population of H. petiolaris.

Relevant publications:

Smith et al. 2018, PNAS
Smith et al. 2021, Evolution
Innes et al. 2023, Heredity

         

Host-microbe interactions

To explain differences in gut microbial communities, we must determine how the processes regulating microbial community assembly (colonization, persistence) differ among hosts and affect microbiota composition. With the Bolnick Lab, I studied natural populations of threespine stickleback (Gasterosteus aculeatus), a small fish, to identify major pathways of microbial colonization of the gastrointestinal tract. Using high-throughput 16S rRNA sequencing and bioinformatics, I found that sticklebacks from different lakes harbored different gut microbes, and that after controlling for food-associated and environmental microbes the gut microbiota appeared to be under genetic control. Next, I collaborated with the Bolnick Lab to examine changes in gut microbiota in response to macroparasite infection. Our findings in sticklebacks are important for understanding the factors at play during population differentiation and, more generally, the factors that may shape the gut microbiota.

Relevant publications:

Smith et al. 2015, ISME J
Ling et al. 2020, ISME J

         

Knowledge-guided machine learning for estimating methane uptake by soil microbes

Methane is arguably the most important greenhouse gas to mitigate; however, the natural sources and sinks of methane are not well understood. With Dr. Youmi Oh, I am developing a machine learning approach for quantifying the methane soil sink that incorporates (i) information about the underlying biogeochemical process, (ii) chamber measurements, (iii) tower measurements, and (iv) satellite measurements.