openSMILE 3.0
The open-source audio feature extraction toolkit powered by audEERING®.
Open-source audio feature extraction
openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is an open-source toolkit for audio feature extraction and classification of speech and music signals. openSMILE is widely applied in automatic expression recognition for affective computing.
openSMILE is free to use for research purposes. It is written purely in C++, has a fast, efficient, and flexible architecture, and runs on desktop, mobile, and embedded platforms such as Linux, Windows, macOS, Android, iOS, and Raspberry Pi.
See also the standalone opensmile Python package for an easy-to-use wrapper when working in Python.
openSMILE 3.0 on GitHub
openSMILE 3.0 is the third major release and brings significant performance improvements. You can find a feature list below. Beginning with this version, binaries and source code are hosted on GitHub. There you will also find new documentation in HTML format, which lists the numerous updates, code refactorings, and fixes.
150,000+ downloads
2,650+ citations in scientific publications
The new features of openSMILE 3.0
openSMILE 3.0 brings a large number of incremental improvements and fixes over the previous release, 2.3. Most notably, openSMILE now offers an easy-to-use Python API via opensmile-python. Learn more in this blog post. You can find a complete feature list below.
- Stand-alone openSMILE Python library
- New C API with wrappers for Python and .NET
- Modern, revised build process using CMake
- Support for the iOS platform
- Updated Android integration
- FFmpeg audio source component
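As a rough illustration of the Python API mentioned above, the following sketch extracts one row of summary features from an in-memory signal with the opensmile package. It assumes that opensmile and numpy are installed (for example via pip). The helper names `extract_functionals` and `make_test_tone`, the choice of the eGeMAPSv02 feature set, and the synthetic sine tone are illustrative assumptions and not something this page prescribes.

```python
def extract_functionals(signal, sampling_rate):
    """Return a table with one row of functionals for `signal`.

    Hypothetical helper for illustration; the feature set is an assumed choice.
    """
    # Imported inside the function so the sketch loads even without opensmile.
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.eGeMAPSv02,       # assumed feature set
        feature_level=opensmile.FeatureLevel.Functionals,  # one vector per signal
    )
    return smile.process_signal(signal, sampling_rate)


def make_test_tone(sampling_rate=16000, seconds=1.0, frequency=220.0):
    """Synthesize a sine tone so the example needs no audio file."""
    import numpy as np

    t = np.arange(int(sampling_rate * seconds)) / sampling_rate
    return (0.5 * np.sin(2 * np.pi * frequency * t)).astype(np.float32)


# Usage (requires opensmile and numpy):
#   features = extract_functionals(make_test_tone(16000), 16000)
#   print(features.shape)
```

For audio stored on disk, the same `Smile` object also provides `process_file`, which takes a file path instead of an array.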
A short history of openSMILE
audEERING has held the rights to openSMILE since 2013 and continues to develop it. The software remains freely available for academic use.
The audio classifier openSMILE: Key features
The list below covers most of the features openSMILE includes. For more information, check out the documentation.
- Resource efficient: 27k features can be extracted with a real-time factor (RTF) of 0.08
- Cross-platform (Windows, Linux, macOS, Android, iOS)
- Fast and efficient incremental processing in real-time
- High modularity and reusability of components
- Plugin support
- PCM WAVE files (read/write)
- Any media file format supported by FFmpeg (read)
- Live sound recording and playback via PortAudio
- Live sound recording via OpenSL ES/Core Audio on Android/iOS
- Comma-separated value (CSV) files (read/write)
- WEKA ARFF files (read/write)
- Hidden Markov Model Toolkit (HTK) parameter files (read/write)
- LibSVM feature file format (write)
- Windowing functions (Hamming, Hann, Gauss, Sine, …)
- Fast Fourier transform (FFT)
- Pre-emphasis filter
- FIR filterbanks
- Autocorrelation
- Cepstrum
- Signal energy
- Loudness
- Mel-/Bark-/Octave-spectra
- MFCC
- PLP-CC
- Pitch
- Voice quality (Jitter, Shimmer)
- Formants
- LPC
- Line Spectral Pairs (LSP)
- Spectral Shape descriptors
- Pitch classes (semitone spectrum)
- CHROMA and CENS features
- Weighted differential
- Mean-variance normalisation
- Range normalisation
- Delta-regression coefficients
- Vector operations
- Moving average filters
- Means, Extremes
- Moments
- Segments
- Samples
- Peaks
- Linear and quadratic regression
- Percentiles
- Durations
- Onsets
- DCT coefficients
- Zero-crossings
- Modulation spectrum
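To make a few of the listed building blocks concrete, here is a small pure-Python sketch that computes two frame-level descriptors (signal energy under a Hamming window, and zero-crossings) one frame at a time, then summarizes each contour with simple functionals (mean and extremes). It also shows how a real-time factor is computed. This is a toy illustration, not openSMILE's implementation; the function names, frame size, hop size, and test signal are all assumptions chosen for the example.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_descriptors(samples, frame_size, hop_size):
    """Yield (energy, zero_crossings) per frame, one frame at a time."""
    window = hamming(frame_size)
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = samples[start:start + frame_size]
        # Zero-crossings are counted as sign changes on the raw frame.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        # Energy is the mean square of the Hamming-windowed frame.
        energy = sum((w * x) ** 2 for w, x in zip(window, frame)) / frame_size
        yield energy, crossings


def functionals(values):
    """Summarize a descriptor contour by its mean and extremes."""
    return {
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    return processing_seconds / audio_seconds


# One second of a 100 Hz sine at 8 kHz, 25 ms frames with a 10 ms hop.
rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / rate) for n in range(rate)]
contours = list(frame_descriptors(signal, frame_size=200, hop_size=80))
energy_stats = functionals([e for e, _ in contours])
crossing_stats = functionals([c for _, c in contours])
```

Because `frame_descriptors` is a generator, each frame is processed as soon as it is available, which mirrors the idea of incremental processing. With the definition above, an RTF of 0.08 means that one second of audio takes 0.08 seconds to process.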
devAIce® Expression and Scene Detection
Learn more about devAIce®, audEERING®’s lightweight technology for expression detection, scene detection, and many other purposes.
Who is audEERING?
audEERING® not only develops openSMILE further but is also the worldwide leading innovator in Audio & Voice AI. Learn more about the company.