AI Case Study
Cornell University researchers increase processing efficiency 300x by developing an Adaboost-based algorithm to automate detection of elephant rumbles
Researchers developed an algorithm that automatically detects elephant rumbles in sound recordings from forest sites in Central Africa. This gives a better picture of elephant population density over time, which is difficult to obtain through visual methods. The Adaboost-based detector delivered a more than 300-fold increase in processing efficiency over manual review, with recall rates above 70%.
Public And Social Sector
Education And Academia
"An algorithm was created to automatically detect elephant rumbles in recordings by dividing the sound stream into 100-ms time frames, assigning each frame a score indicating the likelihood that it contains an elephant rumble. This process includes three main steps: pre-processing, feature extraction, and classification of frames... A central goal in the design of this algorithm is to enable practical application in the monitoring of African forest elephant populations. The proposed algorithm is currently in use to monitor seasonal and temporal elephant activity in large tracts of forest and at attractive resources like forest clearings. By regularly retrieving sound files from recording units and applying the proposed algorithm, researchers and park managers can accurately track landscape use and relative elephant density over short time intervals.
The automated detection algorithm described here for locating forest elephant vocalizations in field recordings represents a tool that enables efficient large scale monitoring of wild elephant populations. Although improvements are planned, the proposed algorithm nonetheless performed well when tested on an extremely large dataset (3983 h) that included recordings from diverse habitats, including two that were not part of the training data."
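The frame-level scoring described above still has to be turned into discrete detections. The following is a minimal sketch of one way to do that, assuming a fixed score threshold and a simple rule that merges consecutive above-threshold frames into one event. The threshold value, the merging rule, and the name frames_to_events are illustrative assumptions; the quoted passage does not say how the researchers group frames into events.

```python
FRAME_MS = 100  # frame length from the quoted passage (100-ms time frames)


def frames_to_events(scores, threshold=0.5, frame_ms=FRAME_MS):
    """Merge runs of consecutive frames scoring >= threshold into events.

    Returns a list of (start_ms, end_ms) tuples. The threshold and the
    merging rule are assumptions made for illustration only.
    """
    events = []
    start = None  # index of the first frame of the run currently open
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new run begins at this frame
        elif score < threshold and start is not None:
            events.append((start * frame_ms, i * frame_ms))
            start = None  # the run ended at the previous frame
    if start is not None:
        # The recording ended while a run was still open.
        events.append((start * frame_ms, len(scores) * frame_ms))
    return events


# Made-up scores for five consecutive frames (0.5 s of audio).
example_scores = [0.1, 0.8, 0.9, 0.2, 0.7]
example_events = frames_to_events(example_scores)
print(example_events)
```

Lowering the threshold in a sketch like this returns more candidate events, which would typically raise recall at the cost of precision.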
"With recall rates above 70%, automated detection is a reliable tool for monitoring elephant activity and represents a greater than 300-fold increase in analysis efficiency over locating elephant vocalizations by hand. This increase in efficiency and performance, particularly in diverse habitats, fulfills the goals laid out before creating the algorithm. At a very acceptable level of 80% recall, at least 70% of events returned by the algorithm were elephant rumbles and at some sites the precision was 90%."
However, the researchers do note the limitations of the technique: "Even at lowest classification thresholds, the proposed algorithm did not detect all elephant rumbles that were tagged by human review. While this is nearly universal for automated detection, the characteristics of these false negatives are informative for improving the algorithm. The proposed detector performed remarkably well across the study sites tested but nonetheless performance was significantly lower at the two sites that were completely novel in the test dataset (i.e., no examples from Dzanga or Kessala were in the training data)."
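The recall and precision figures quoted above follow from counts of detected, missed, and spurious events. Here is a short sketch of the standard definitions, using made-up counts chosen only so that recall comes out at 80%; they are not figures from the study, and the function name is hypothetical.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from event counts.

    true_pos:  detector events that match a human-tagged rumble
    false_pos: detector events with no matching rumble
    false_neg: human-tagged rumbles the detector missed
    """
    returned = true_pos + false_pos  # everything the detector reported
    actual = true_pos + false_neg    # everything the reviewers tagged
    precision = true_pos / returned if returned else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only: 100 tagged rumbles, 80 found, 30 false alarms.
p, r, f1 = precision_recall_f1(true_pos=80, false_pos=30, false_neg=20)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In these terms, the false negatives the researchers describe are the tagged rumbles that never appear among the returned events, which is why they cap recall even at the lowest thresholds.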
"Two novel sets of feature measurements were developed for this application: (1) “harmonic features,” which quantify the harmonic structure of the audio data within the time frame, and (2) “horizontal features,” which measure the presence of energy distributed horizontally on the spectrogram, i.e., the amount of power present at specific frequencies. To assess the performance of the selected feature sets relative to other commonly used acoustic features, comparative tests were performed using several different suites of measurements. Feature comparison was accomplished by creating several classification models using the Adaboost machine learning algorithm... Classification models were each trained with a different subset of features and the relative performance of each model was evaluated. This classifier assigned each time frame to either class C1 (i.e., a time frame that was thought to contain an elephant rumble), or class C0 (i.e., a time frame that was thought not to contain an elephant rumble).
Three classifiers were considered in the design of this algorithm: linear support vector machines, random forest, and Adaboost. To train a model for each of the candidate classifiers, feature vectors and their corresponding class labels were input into an SVM, random forest, and Adaboost classifier. Performance of all candidate classifiers was highly similar, with maximum F1 scores of the SVM, random forest, and Adaboost classifiers being 0.70, 0.75, and 0.73, respectively. Thus, these methods were viewed as interchangeable for this application. Ultimately, the Adaboost classifier was selected for feature vector classification because of its fast training and testing speed relative to the SVM and random forest."
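To make the classification step concrete, here is a minimal textbook sketch of discrete Adaboost with one-feature threshold rules (decision stumps) as weak learners. It is not the researchers' implementation: the quoted text does not state which weak learner or settings they used, and the two-column feature values below are invented stand-ins for per-frame feature measurements. Labels +1 and -1 stand for classes C1 and C0.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost_train(X, y, rounds=5):
    """Fit `rounds` stumps, reweighting samples after each one."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the log below stays finite for a perfect stump.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_score(model, row):
    """Weighted vote of all stumps; larger means more rumble-like."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in model)


def adaboost_predict(model, row):
    return 1 if adaboost_score(model, row) >= 0 else -1


# Invented toy data: each row is one time frame with two feature values.
X = [[0.9, 0.8], [0.8, 0.3], [0.7, 0.9], [0.2, 0.1], [0.3, 0.7], [0.1, 0.4]]
y = [1, 1, 1, -1, -1, -1]  # +1 = C1 (rumble), -1 = C0 (no rumble)
model = adaboost_train(X, y)
print([adaboost_predict(model, row) for row in X])
```

The continuous value returned by adaboost_score plays the role of a per-frame score, so a detection threshold could be applied to it rather than to the hard class label.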
R And D
Core Research And Development
"African forest elephants (Loxodonta cyclotis) occupy large ranges in dense tropical forests and often use far-reaching vocal signals to coordinate social behavior. Elephant populations in Central Africa are in crisis, having declined by more than 60% in the last decade. Methods currently used to monitor these populations are expensive and time-intensive, though acoustic monitoring technology may offer an effective alternative if signals of interest can be efficiently extracted from the sound stream."
"Over 90% of the recordings used in this study were collected from five sites in Gabon and Central African Republic between 2007 and 2012. Recording sites were each situated in forest clearings... Audio files were recorded by autonomous recording units (ARUs) which recorded audio signals at 16-bit resolution and sampling rates of 2000 or 4000 Hz. ARUs can detect elephant rumbles produced within a radius of approximately 0.8 km.
Two datasets were created for the purposes of training and testing the proposed rumble detection-classification system: A 522-h training dataset which was collected from eight study sites and contains 10721 elephant rumbles, and a 3983-h testing dataset that consists of recordings from five sites and includes 110 042 rumbles. In both datasets all elephant rumbles with durations of at least 2 s were manually tagged by members of the ELP team, providing a beginning and ending time for each truth event. Spectrograms were generated with 1024-sample Hanning windows with a 200-point advance, as these parameters allow for favorable time-frequency resolution of rumbles."
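The time-frequency resolution implied by those spectrogram parameters can be worked out directly from the window length, the advance, and the sampling rate. A small sketch of the arithmetic follows; the function name and dictionary keys are illustrative, and the only inputs taken from the passage are the 1024-sample window, the 200-point advance, and the 2000 and 4000 Hz sampling rates.

```python
def stft_resolution(sample_rate_hz, window_samples=1024, hop_samples=200):
    """Time and frequency resolution of a spectrogram with these settings."""
    return {
        "window_sec": window_samples / sample_rate_hz,  # span of one window
        "hop_sec": hop_samples / sample_rate_hz,        # step between columns
        "bin_hz": sample_rate_hz / window_samples,      # frequency bin spacing
        "nyquist_hz": sample_rate_hz / 2,               # highest frequency shown
    }


for rate in (2000, 4000):
    print(rate, stft_resolution(rate))
```

At 2000 Hz this gives a 0.512 s window, a 0.1 s step, and bins about 1.95 Hz apart; at 4000 Hz it gives a 0.256 s window, a 0.05 s step, and bins about 3.91 Hz apart. The 0.1 s step at 2000 Hz coincides with the 100-ms frame length quoted earlier, though the excerpts do not say how the 4000 Hz recordings were handled.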