AUDIO FEATURE EXTRACTION
OPEN-SOURCE & CROSS-PLATFORM
openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is a toolkit for audio feature extraction and classification of speech and music signals. It is widely applied in automatic emotion recognition for affective computing. openSMILE is completely free to use for research purposes. For commercial use, check out our devAIce™ Technology.
openSMILE 3.0 is the third major release, offering major performance improvements. Beginning with this version, binaries and source code are hosted on GitHub. There you will also find new documentation in HTML format, which lists the numerous major and minor updates, code refactorings and fixes.
openSMILE 3.0 features a large number of incremental improvements and fixes over the previous 2.3 release. Most notably, openSMILE now offers an easy-to-use Python API via opensmile-python. Learn more about what else is new in our announcement blog post. Find a complete Feature List below.
- Standalone opensmile-python library
- New C API with wrappers for Python and .NET
- Modern, revised build process using CMake
- Support for the iOS platform
- Updated Android integration
- FFmpeg audio source component
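As a rough illustration of the Python API mentioned above, the following sketch uses the opensmile-python package to extract one row of functionals from an audio file. The helper name `extract_functionals` and the choice of the eGeMAPSv02 feature set are assumptions made for this example, and the package has to be installed separately before the function can be called.

```python
def extract_functionals(wav_path):
    """Return a pandas DataFrame of functionals for one audio file.

    Hypothetical helper for illustration only; eGeMAPSv02 is just one
    of the feature sets shipped with opensmile-python.
    """
    # Imported lazily so that this module still loads on systems
    # where the opensmile package has not been installed yet.
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.eGeMAPSv02,
        feature_level=opensmile.FeatureLevel.Functionals,
    )
    # process_file reads the audio file and runs the extraction.
    return smile.process_file(wav_path)
```

Calling `extract_functionals("speech.wav")` with a file of your own (the file name here is hypothetical) should give a table with one column per feature.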
The history of openSMILE began in 2008 at the Technical University of Munich (TUM) in the scope of the SEMAINE project, which was funded by the EU. Back then, Florian Eyben, Martin Wöllmer and Björn Schuller – all later part of the audEERING management – started openSMILE. The goal of SEMAINE was to design an automated virtual agent with affective and social skills. openSMILE served as the real-time speech and emotion analysis component in this system.
From 2011 to 2013, Florian Eyben and Felix Weninger further developed openSMILE while working on their PhD theses at TUM. Erik Marchi made major contributions as part of the EU project ASC-Inclusion.
Since 2013, audEERING has held the rights to openSMILE and has continued to develop it. The software remains freely available for academic use. openSMILE 3.0 is the latest major update and was published in 2020.
- Fast: 27k features can be extracted with a real-time factor (RTF) of 0.08
- Cross-platform (Windows, Linux, Mac, Android, iOS)
- Fast and efficient incremental processing in real-time
- High modularity and reusability of components
- Plugin support
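To put the RTF of 0.08 quoted above into perspective, here is a small worked calculation. It assumes RTF is used in its usual sense (processing time divided by the duration of the audio being processed); the function names and the one-hour example are illustrative and not part of openSMILE.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by the duration of the processed audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_time(rtf, audio_seconds):
    """Time needed to process audio_seconds of audio at a given RTF."""
    return rtf * audio_seconds


# One hour of audio (3600 s) at an RTF of 0.08.
seconds_needed = processing_time(0.08, 3600)
```

Under that assumption, one hour of audio takes roughly 288 seconds to process, which is a little under five minutes.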
- PCM WAVE files (read/write)
- Any media file format supported by FFmpeg (read)
- Live sound recording and playback via PortAudio
- Live sound recording via OpenSL ES/Core Audio on Android/iOS
Feature file formats
- Comma-separated value (CSV) files (read/write)
- WEKA ARFF files (read/write)
- Hidden Markov Toolkit (HTK) parameter files (read/write)
- LibSVM feature file format (write)
- Windowing Functions (Hamming, Hann, Gauss, Sine, …)
- Fast Fourier Transform
- Pre-emphasis filter
- FIR filterbanks
- Signal energy
- Voice quality (Jitter, Shimmer)
- Line Spectral Pairs (LSP)
- Spectral Shape descriptors
- Pitch classes (semitone spectrum)
- CHROMA and CENS features
- Weighted differential
- Mean-variance normalisation
- Range normalisation
- Delta-regression coefficients
- Vector operations
- Moving average filters
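For readers unfamiliar with some of the operations listed above, the following sketch shows textbook versions of four of them: a Hamming window, a pre-emphasis filter, mean-variance normalisation and delta-regression coefficients. These are generic formulations written for illustration, not openSMILE's own implementation; the function names and the default values `k=0.97` and `width=2` are assumptions.

```python
import math


def hamming_window(n):
    """Symmetric textbook Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1))
            for i in range(n)]


def pre_emphasis(samples, k=0.97):
    """First-order pre-emphasis: y[t] = x[t] - k * x[t-1]."""
    if not samples:
        return []
    out = [samples[0]]  # the first sample has no predecessor
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - k * prev)
    return out


def mean_variance_normalise(values):
    """Shift values to zero mean and scale them to unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    if std == 0.0:
        return [0.0 for _ in values]  # constant input: nothing to scale
    return [(v - mean) / std for v in values]


def delta_regression(values, width=2):
    """Delta coefficients via the common regression formula.

    Frames beyond either end are replaced by the nearest edge frame.
    """
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    last = len(values) - 1
    deltas = []
    for t in range(len(values)):
        acc = 0.0
        for n in range(1, width + 1):
            ahead = values[min(t + n, last)]
            behind = values[max(t - n, 0)]
            acc += n * (ahead - behind)
        deltas.append(acc / denom)
    return deltas
```

In a typical pipeline, each audio frame would be pre-emphasised and windowed before the transform, while normalisation and delta coefficients are applied to the resulting per-frame feature contours.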
Statistical functionals (feature summaries)
- Means, Extremes
- Linear and quadratic regression
- DCT coefficients
- Modulation spectrum
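To illustrate what a statistical functional does, the sketch below collapses a per-frame feature contour of any length into a fixed set of summary values: mean, extremes and the slope and offset of a least-squares line. It is a generic illustration rather than openSMILE code, and the function name `functionals` and the returned keys are assumptions.

```python
def functionals(contour):
    """Summarise a per-frame feature contour into fixed-length statistics."""
    n = len(contour)
    if n == 0:
        raise ValueError("contour must not be empty")
    mean = sum(contour) / n
    # Least-squares line fitted over the frame indices 0..n-1.
    t_mean = (n - 1) / 2.0
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - mean) for t, v in enumerate(contour))
    slope = sxy / sxx if sxx > 0 else 0.0
    return {
        "mean": mean,
        "min": min(contour),
        "max": max(contour),
        "slope": slope,
        "offset": mean - slope * t_mean,
    }
```

Because the output size does not depend on the number of frames, recordings of different lengths can be fed to the same classifier.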