AI Case Study

Universidad de la Republica de Uruguay researchers identify bat species by using machine learning to evaluate audio signals

Researchers from the Universidad de la Republica de Uruguay identify bat species in Uruguay from audio recordings using a random forest model. Prior to the research, no comprehensive database for the acoustic classification of Uruguayan bats existed. The findings are intended to let wind farms assess the movement of bat populations and ultimately avoid mass bat casualties.


Public And Social Sector

Education And Academia

Project Overview

From PS Mag: "Bats emit ultrasound, and the pulse rate of a bat that is just flying differs from the rate it emits when it locates prey and tries to hunt it... Another advantage of acoustics is the acoustic records are more reliable and complete than the population studies of bats made with the traditional mist netting technique. Once scientists compile a database of sound waves, any subsequent analysis of the species using airspace around the turbines and their activities is faster and more rigorous. And most importantly: processes are standardized. Botto and his group have created both a reference library and a specific algorithm for analyzing sounds of the bats of Uruguay and other regions of South America with similar population characteristics." The best model was then implemented in a publicly available web tool for researchers to use.

Reported Results

Of the models evaluated, the random forest (RF) model performed best at identifying bat species. According to the research paper: "This web application is the first tool for automated identification of bats in the area, which is especially relevant as the identification procedures may be sensitive to both geographical variation and species diversity. Given the rapid increase of wind farms in the country and the subsequent need of improved and reproducible environmental impact assessments, our algorithm provides a tool that can be used in Uruguay, with a potential positive and significant impact on bat conservation and policy creation."


According to the research paper, "Random Forests, Support Vector Machines and Artificial Neural Networks algorithms were trained to predict bat species from acoustic variables," and the team compared the results.
"The RF model was performed considering 5000 trees, and SVM model was trained using a probability model and a Radial Basis Kernel. For the ANN, one hidden layer and 10 neurons were used. For each model, a matrix of predicted probabilities with 217 rows (observations) and 8 columns (species) was obtained. The class (i.e. species) that maximized the probability was retained as the final prediction. An observation was classified as “unknown” when the maximum probability did not reach a defined threshold. For the best performing model, an “unknown” class was created by setting a threshold on the probabilities of belonging to each class: if the winning class was not predicted with a higher probability than the defined threshold, an “unknown” was assigned as the resulting class. The value of the threshold was optimized considering the model global accuracy in final model. For the optimization process, an error function was calculated."
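
The thresholding rule quoted above can be illustrated with a minimal sketch. It is not the authors' code: the function name `classify_with_unknown`, the placeholder species labels and the threshold value of 0.5 are all assumptions made for illustration. The paper's optimized threshold and its error function are not reproduced here.

```python
from typing import List, Sequence


def classify_with_unknown(
    probs: Sequence[Sequence[float]],
    species: Sequence[str],
    threshold: float,
) -> List[str]:
    """Assign each observation its most probable species, or "unknown".

    probs holds one row per observation and one column per species,
    mirroring the matrix of predicted probabilities described in the paper.
    """
    labels = []
    for row in probs:
        # Index of the winning class, i.e. the column with the highest probability.
        best = max(range(len(row)), key=lambda j: row[j])
        # Keep the winner only if its probability is strictly above the threshold.
        labels.append(species[best] if row[best] > threshold else "unknown")
    return labels


# Hypothetical example: three placeholder species and an assumed threshold of 0.5.
example_probs = [
    [0.70, 0.20, 0.10],  # confident prediction
    [0.40, 0.35, 0.25],  # no class exceeds the threshold
]
example_labels = classify_with_unknown(
    example_probs, ["species_a", "species_b", "species_c"], threshold=0.5
)
print(example_labels)
```

Raising the threshold trades coverage for reliability: more calls are labeled "unknown", and the calls that remain are the ones the model is more confident about.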


R And D

Core Research And Development


From the research paper: "According to a recent study, globally, more than 40 species of bats have been affected by almost 300 mass mortality events related to windfarms... To minimize the negative impacts of wind farms on bat populations, exhaustive impact assessment studies should be conducted to determine which species use the airspace around the turbines and in what manner (e.g., migration patterns, feeding areas and massive roost sites). The use of acoustic surveys in these assessments is a powerful tool for determining the species present at each site because they provide large datasets of presence and activity (e.g., hunting) information."



From the research paper: "A database comprising 662 pulses from 96 individuals of 8 species was obtained. This dataset was split into two independent subsets, one for the variable selection (1/3) and a modeling set for the training and testing (2/3) of the models."
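
The 1/3 and 2/3 split described above could look roughly like the following sketch. The paper excerpt does not say whether the split was made per pulse or per individual, nor how it was randomized. This example therefore assumes a simple seeded random split over pulse indices, and the function name `split_selection_modeling` is hypothetical.

```python
import random
from typing import List, Tuple


def split_selection_modeling(
    n_items: int,
    selection_fraction: float = 1 / 3,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Split item indices into a variable-selection subset and a modeling subset.

    Assumption: items are shuffled uniformly at random with a fixed seed,
    then cut once, so the two subsets never share an index.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = int(n_items * selection_fraction)
    return sorted(indices[:cut]), sorted(indices[cut:])


# 662 pulses, as reported in the paper; the seed is arbitrary.
selection_idx, modeling_idx = split_selection_modeling(662)
print(len(selection_idx), len(modeling_idx))
```

If pulses from the same individual are strongly correlated, splitting by individual instead of by pulse would give a stricter separation between the two subsets.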