Fish Modeling

Ontario Aquatic Ecosystem Classification Biotic Modeling

The goal of this project is to use data from Ontario’s Aquatic Ecosystem Classification (AEC) together with the Flowing Waters Information System (FWIS) to develop predictive models of fish communities across Ontario. The biomass and density of 16 fish taxa are modeled using AEC variables and landcover summaries from the Ontario Land Cover Compilation (OLC). Landcover types were summarized into groups and into a Landscape Disturbance Index (LDI), which quantifies the potential impacts of different landcover types on aquatic ecosystems. Predictions of current populations are available across all tributaries to the Great Lakes and St. Lawrence River. Predictions for simulated 'Reference' landscapes are also available. Reference landscapes were approximated by removing urban and agricultural landcovers and proportionally increasing all natural landcovers in the catchment to account for those removed. Additionally, the LDI was set to nearly 0 for urban and agricultural landcovers in the simulated 'Reference' landscapes.
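The reference-landscape adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the landcover class names, the example proportions, and the function name `to_reference` are hypothetical, and the LDI adjustment is not shown.

```python
# Hypothetical names for the landcover classes treated as disturbed.
DISTURBED = {"urban", "agriculture"}


def to_reference(landcover):
    """Remove disturbed landcovers and rescale natural ones proportionally.

    `landcover` maps a landcover class to its proportion of the catchment.
    """
    natural = {k: v for k, v in landcover.items() if k not in DISTURBED}
    natural_total = sum(natural.values())
    if natural_total == 0:
        raise ValueError("catchment has no natural landcover to rescale")
    total = sum(landcover.values())
    # Each natural class keeps its share of the natural area, scaled up so
    # the catchment still sums to its original total.
    reference = {k: v / natural_total * total for k, v in natural.items()}
    # Disturbed classes are set to zero in the reference landscape.
    reference.update({k: 0.0 for k in landcover if k in DISTURBED})
    return reference


# Hypothetical catchment: half natural, half disturbed.
current = {"forest": 0.40, "wetland": 0.10, "urban": 0.20, "agriculture": 0.30}
reference = to_reference(current)
print(reference)
```

In this example, forest and wetland keep their 4:1 ratio while expanding to fill the area previously classed as urban or agricultural.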

Results presented here are preliminary.

The code for the project is available on GitHub
LightGBMLSS models (an extension of LightGBM to probabilistic modelling) were used.
SHAP scores were used for model interpretation
Models were fit to Zero-adjusted Gamma distributions
Aquatic ecosystem classification (AEC) for Ontario
Ontario Land Cover Compilation v.2.0
Landscape data were processed using the ihydro R package
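To illustrate what a Zero-adjusted Gamma distribution represents, the sketch below treats each value as exactly zero with some probability (an absence) and otherwise as a draw from a Gamma distribution (a positive biomass or density). The parameter names `p_zero`, `shape` and `scale`, and their values, are assumptions chosen for illustration; they are not the parameterization or estimates used by the fitted models.

```python
import random


def sample_zero_adjusted_gamma(p_zero, shape, scale, rng):
    """Draw one value: exactly 0 with probability p_zero, else Gamma(shape, scale)."""
    if rng.random() < p_zero:
        return 0.0
    return rng.gammavariate(shape, scale)


def zero_adjusted_gamma_mean(p_zero, shape, scale):
    """Overall mean: probability of presence times the mean of the Gamma part."""
    return (1.0 - p_zero) * shape * scale


rng = random.Random(42)
p_zero, shape, scale = 0.3, 2.0, 1.5  # hypothetical parameter values

draws = [sample_zero_adjusted_gamma(p_zero, shape, scale, rng) for _ in range(50_000)]
empirical_mean = sum(draws) / len(draws)
theoretical_mean = zero_adjusted_gamma_mean(p_zero, shape, scale)
zero_fraction = sum(d == 0.0 for d in draws) / len(draws)

print(f"fraction of zeros: {zero_fraction:.3f}")
print(f"empirical mean:    {empirical_mean:.3f}")
print(f"theoretical mean:  {theoretical_mean:.3f}")
```

Separating the zero probability from the positive part is what allows presence/absence and the mean to be interpreted separately.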

Observed and Predicted Fish Distributions

The map below shows the locations of observed and predicted fish biomass and densities. When multiple observations are present on a segment, the median is shown. By default, the observed values are shown, and model predictions can be shown from the layers menu.
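The per-segment median described above can be sketched as a simple group-and-summarize step. The segment identifiers and values below are hypothetical, and the function name `median_by_segment` is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import median


def median_by_segment(observations):
    """Collapse repeated observations on a segment to their median.

    `observations` is an iterable of (segment_id, value) pairs.
    """
    grouped = defaultdict(list)
    for segment_id, value in observations:
        grouped[segment_id].append(value)
    return {segment_id: median(values) for segment_id, values in grouped.items()}


# Hypothetical observations: three samples on one segment, two on another.
observations = [
    ("seg_a", 1.2),
    ("seg_a", 3.4),
    ("seg_a", 2.0),
    ("seg_b", 0.0),
    ("seg_b", 5.0),
]
segment_medians = median_by_segment(observations)
print(segment_medians)
```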

Predictions of current populations are available across all tributaries to the Great Lakes and St. Lawrence River. Predictions for simulated 'Reference' landscapes are also available. Reference landscapes were approximated by removing urban and agricultural landcovers and proportionally increasing all natural landcovers in the catchment to account for those removed. Additionally, the LDI was set to nearly 0 for urban and agricultural landcovers in the simulated 'Reference' landscapes. Finally, the difference between current and reference predictions is mapped as Current - Reference: positive values indicate that present-day predictions are higher than the simulated reference, and negative values indicate that present-day predictions are lower than the simulated reference.
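The sign convention of the difference layer can be made concrete with a small sketch. The segment identifiers and predicted values are hypothetical, as are the function names.

```python
def current_minus_reference(current, reference):
    """Return Current - Reference for every segment present in both inputs."""
    return {
        segment_id: current[segment_id] - reference[segment_id]
        for segment_id in current
        if segment_id in reference
    }


def describe(difference):
    """Translate the sign of a difference into words."""
    if difference > 0:
        return "current higher than reference"
    if difference < 0:
        return "current lower than reference"
    return "no difference"


# Hypothetical predicted values for two segments.
current = {"seg_a": 2.5, "seg_b": 0.5}
reference = {"seg_a": 1.0, "seg_b": 2.0}

differences = current_minus_reference(current, reference)
for segment_id, difference in differences.items():
    print(segment_id, difference, describe(difference))
```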

Selecting a stream segment while either the 'Current' or 'Reference' layer is active will show the predictor values associated with that segment for the 'Current' or 'Reference' predictions, respectively. It will also present a 'SHAP Breakdown' figure showing the contribution of each predictor variable to the final predicted outcome. A positive SHAP score for the mean suggests that the predictor value is increasing the mean prediction, whereas a positive SHAP score for presence/absence suggests that the predictor value is increasing the likelihood of a presence. Once a reach is selected, it will also be highlighted in the 'Predictor Response' tab.
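A SHAP breakdown relies on the additive property of SHAP values: a baseline value plus the sum of the per-predictor contributions gives the model output for that segment, on the scale at which the SHAP values were computed (which may be a transformed scale rather than the response scale). The sketch below illustrates this with hypothetical predictor names and made-up contribution values; it does not use the project's models or data.

```python
def shap_breakdown(base_value, contributions):
    """Order contributions by magnitude and reconstruct the model output.

    `contributions` maps a predictor name to its SHAP value for one segment.
    """
    ordered = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    prediction = base_value + sum(contributions.values())
    return ordered, prediction


# Hypothetical baseline and SHAP values for a single segment.
base_value = 1.0
contributions = {
    "upstream_area": 0.25,
    "ldi": -0.5,
    "air_temperature": 0.125,
}

ordered, prediction = shap_breakdown(base_value, contributions)
for name, value in ordered:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.3f} ({direction} the prediction)")
print(f"baseline {base_value} + contributions = {prediction}")
```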

Layers

Stream Lines

Observed

Predicted - Current

Predicted - Reference

(Current - Reference)

Colour Breaks

Consistent Province-Wide Breaks

Equal Quantile

Distributions of Predictor Variables

The map below shows the spatial distributions of predictors used to model the current distributions.

Predictor

Model Performance

The figure below shows observed vs. predicted values. The data shown are filtered to the data selected on the side panel. The solid black line is fit to the 50th percentile of each observation's predicted conditional distribution, and the blue lines are fit to the 25th and 75th percentiles. All values are shown on the log scale.
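The percentiles referred to above summarize a full predicted distribution for each observation rather than a single point estimate. As a minimal sketch, the example below approximates the 25th, 50th and 75th percentiles from simulated draws of a zero-adjusted Gamma distribution. The parameter values and the helper name `conditional_percentiles` are hypothetical, and simulation is used here only for simplicity; it is not necessarily how the percentiles in the figure were computed.

```python
import random
from statistics import quantiles


def conditional_percentiles(draws):
    """Return the 25th, 50th and 75th percentiles of a set of draws."""
    q25, q50, q75 = quantiles(draws, n=4)
    return q25, q50, q75


rng = random.Random(1)
p_zero, shape, scale = 0.3, 2.0, 1.5  # hypothetical parameters for one observation

# Each draw is 0 with probability p_zero, otherwise a Gamma(shape, scale) value.
draws = [
    0.0 if rng.random() < p_zero else rng.gammavariate(shape, scale)
    for _ in range(20_000)
]

q25, q50, q75 = conditional_percentiles(draws)
print(f"25th: {q25:.3f}  50th: {q50:.3f}  75th: {q75:.3f}")
```

Because the probability of a zero in this example exceeds 0.25, the 25th percentile is exactly zero, which shows how lower percentiles can sit at zero for taxa that are often absent.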

Predictor Importance

The figure below shows the relative importance of each predictor in describing whether a taxon is present in or absent from a sample, as well as in describing the mean. The data shown are filtered to the data selected on the side panel.

Predictor Response Surfaces

The figure below shows the effect of a predictor variable on the presence/absence and mean predicted outcomes. The data shown are filtered to the data selected on the side panel. The predicted effects can be coloured by a separate variable to identify interactions among predictors. A positive SHAP score for the mean suggests that the predictor value is increasing the mean prediction, whereas a positive SHAP score for presence/absence suggests that the predictor value is increasing the likelihood of a presence. If a reach was selected in the 'Fish Map' tab, it will be highlighted with a red circle.

Predictor

Colour