Open-source audio feature extraction
openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is an open-source toolkit for audio feature extraction and classification of speech and music signals. openSMILE is widely applied in automatic emotion recognition for affective computing. openSMILE is completely free to use for research purposes. For commercial use, check out our devAIce™ Technology.
openSMILE 3.0 is the third major release and brings substantial performance improvements. You can find a feature list below. Beginning with this version, binaries and source code are hosted on GitHub. There you will also find new documentation in HTML format, which lists the numerous updates, code refactorings, and fixes.
2,650+ citations in scientific publications
The new features of openSMILE 3.0
openSMILE 3.0 features a large number of incremental improvements and fixes over the previous release, 2.3. Most notably, openSMILE now offers an easy-to-use Python API via opensmile-python. Learn more in this blog post. You can find a complete feature list below.
- Stand-alone openSMILE Python library
- New C API with wrappers for Python and .NET
- Modern, revised build process using CMake
- Support for the iOS platform
- Updated Android integration
- FFmpeg audio source component
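As a rough illustration of the Python API mentioned above, the following sketch extracts one row of functionals from a synthetic tone with opensmile-python. The choice of the eGeMAPSv02 feature set, the 220 Hz test tone, and the helper name `extract_functionals` are assumptions made for this example only; the import guard simply lets the snippet run (and return `None`) on machines where the package is not installed.

```python
import math

try:
    import numpy as np
    import opensmile  # pip install opensmile
except ImportError:
    # Assumption for this sketch: degrade gracefully when the package is missing.
    opensmile = None


def extract_functionals(duration_s=1.0, sampling_rate=16000):
    """Return a DataFrame of functionals for a synthetic tone, or None if opensmile is unavailable."""
    if opensmile is None:
        return None

    # Synthetic 220 Hz sine wave standing in for a real recording.
    t = np.arange(int(duration_s * sampling_rate)) / sampling_rate
    signal = (0.5 * np.sin(2 * math.pi * 220.0 * t)).astype(np.float32)

    # Configure the extractor: feature set and level are illustrative choices.
    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.eGeMAPSv02,
        feature_level=opensmile.FeatureLevel.Functionals,
    )

    # Functionals summarise the whole signal in a single row of features.
    return smile.process_signal(signal, sampling_rate)


features = extract_functionals()
if features is not None:
    print(features.shape)
```

For audio stored on disk, `smile.process_file(path)` can be used in place of `process_signal`; the path would point to your own recording.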
A short history
openSMILE started in 2008 at the Technical University of Munich (TUM), developed by Florian Eyben, Martin Wöllmer and Björn Schuller, all of whom later became part of audEERING. Within the EU-funded SEMAINE project, the goal was to design a virtual agent with affective and social skills. openSMILE served as the real-time speech and emotion analysis component of this system.
From 2011 to 2013, Florian Eyben and Felix Weninger further developed openSMILE at TUM. Erik Marchi made major contributions for the EU project ASC-Inclusion.
Since 2013, audEERING has held the rights to openSMILE and continues to develop it. The software remains freely available for academic use.
Key features of openSMILE
Below you can find a list with most of the features openSMILE includes. For more information, check out the documentation.
- Resource efficient: 27k features can be extracted with an RTF of 0.08
- Cross-platform (Windows, Linux, Mac, Android, iOS)
- Fast and efficient incremental processing in real-time
- High modularity and reusability of components
- Plugin support
- PCM WAVE files (read/write)
- Any media file format supported by FFmpeg (read)
- Live sound recording and playback via PortAudio
- Live sound recording via OpenSL ES/Core Audio on Android/iOS
- Comma-separated value (CSV) files (read/write)
- WEKA ARFF files (read/write)
- Hidden Markov Toolkit (HTK) parameter files (read/write)
- LibSVM feature file format (write)
- Windowing Functions (Hamming, Hann, Gauss, Sine, …)
- Fast-Fourier Transform
- Pre-emphasis filter
- FIR filterbanks
- Signal energy
- Voice quality (Jitter, Shimmer)
- Line Spectral Pairs (LSP)
- Spectral Shape descriptors
- Pitch classes (semitone spectrum)
- CHROMA and CENS features
- Weighted differential
- Mean-variance normalisation
- Range normalisation
- Delta-regression coefficients
- Vector operations
- Moving average filters
- Means, Extremes
- Linear and quadratic regression
- DCT coefficients
- Modulation spectrum
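The real-time factor (RTF) quoted at the top of the feature list is conventionally the ratio of processing time to audio duration, so values below 1 mean faster than real time. The sketch below works through that arithmetic under this standard definition; the function names and the one-hour example are hypothetical and not part of openSMILE.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimated_processing_seconds(audio_seconds, rtf):
    """Estimate how long extraction takes for a given audio duration and RTF."""
    return audio_seconds * rtf


# One hour of audio at the quoted RTF of 0.08.
one_hour = 3600.0
estimate = estimated_processing_seconds(one_hour, 0.08)
print(f"Estimated processing time: {estimate:.0f} s")

# Conversely, measuring 0.8 s of processing for 10 s of audio gives the same RTF.
print(f"Measured RTF: {real_time_factor(0.8, 10.0):.2f}")
```

At an RTF of 0.08, an hour of audio would take a little under five minutes to process on comparable hardware; actual figures depend on the machine and the configuration used.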
devAIce® Emotion and
Learn more about devAIce®, audEERING’s lightweight technology for emotion detection, scene detection and many other purposes.
audEERING not only further develops openSMILE, but is the worldwide leading innovator in audio AI. Learn more about the company.