openSMILE 3.0

The open-source audio feature extraction toolkit powered by audEERING®.

Open-source audio feature extraction

openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is an open-source toolkit for audio feature extraction and classification of speech and music signals. openSMILE is widely applied in automatic expression recognition for affective computing.

openSMILE is free to use for research purposes. It is written purely in C++, has a fast, efficient, and flexible architecture, and runs on desktop, mobile, and embedded platforms such as Linux, Windows, macOS, Android, iOS, and Raspberry Pi.

See also the standalone opensmile Python package for an easy-to-use wrapper when working in Python.

For commercial use, click here to get our AI voice translator devAIce®.

openSMILE 3.0 on GitHub

openSMILE 3.0 is the third major release and offers big performance improvements. You can find a feature list below. Beginning with this version, binaries and source code are hosted on GitHub. There you will also find new documentation in HTML format, which lists the numerous updates, code refactorings, and fixes.

150,000+ downloads

2,650+ citations in scientific publications

The new features of openSMILE 3.0

openSMILE 3.0 brings a large number of incremental improvements and fixes over the previous release, 2.3. Most notably, openSMILE now offers an easy-to-use Python API via opensmile-python. Learn more in this blog post. You can find a complete feature list below.

Stand-alone openSMILE Python library
New C API with wrappers for Python and .NET
Modern, revised build process using CMake
Support for the iOS platform
Updated Android integration
FFmpeg audio source component

A short history of openSMILE

openSMILE started in 2008 at the Technical University of Munich (TUM), where it was developed by Dr. Florian Eyben, Martin Wöllmer, and Prof. Björn Schuller, all of whom later became part of audEERING. It originated in the EU-funded SEMAINE project, whose goal was to design a virtual agent with affective and social skills; openSMILE served as the real-time speech and emotion analysis component of this system. From 2011 to 2013, Dr. Florian Eyben and Felix Weninger developed openSMILE further at TUM. Erik Marchi made major contributions for the EU project ASC-Inclusion.

Since 2013, audEERING has held the rights to openSMILE and continues to develop it. The software remains freely available for academic use.

The audio classifier: openSMILE key features

Below is a list of most of the features included in openSMILE. For more information, see the documentation.

Fundamentals

Resource-efficient: 27k features can be extracted with a real-time factor (RTF) of 0.08
Cross-platform (Windows, Linux, macOS, Android, iOS)
Fast and efficient incremental processing in real time
High modularity and reusability of components
Plugin support
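
To put the quoted real-time factor in perspective: RTF is commonly defined as the processing time divided by the duration of the audio processed, so values below 1 mean faster than real time. The sketch below assumes that common definition (the page does not state the hardware or measurement setup), and the helper names are hypothetical.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_time(rtf, audio_seconds):
    """Estimated processing time for a given RTF and audio duration."""
    return rtf * audio_seconds


# Estimated time to process one minute of audio at the quoted RTF of 0.08.
one_minute = processing_time(0.08, 60.0)
print(f"{one_minute:.1f} s")
```

Under this definition, an RTF of 0.08 corresponds to roughly 4.8 seconds of computation per minute of audio.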

Audio input/output

PCM WAVE files (read/write)
Any media file format supported by FFmpeg (read)
Live sound recording and playback via PortAudio
Live sound recording via OpenSL ES/Core Audio on Android/iOS
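
As an illustration of the PCM WAVE format listed above, the following sketch writes and reads a mono 16-bit file using only Python's standard library. It does not use openSMILE itself; the function names, the 16 kHz sample rate, and the test tone are assumptions made for the example.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate=16000):
    """Write floats in [-1, 1] as mono 16-bit PCM."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # bytes per sample
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def read_wav(path):
    """Read mono 16-bit PCM back as floats; returns (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [v / 32767 for v in ints], rate


# Round trip: 0.1 s of a 440 Hz tone (hypothetical test signal).
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_wav(path, tone)
samples, rate = read_wav(path)
```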

Feature file formats

Comma-separated value (CSV) files (read/write)
WEKA ARFF files (read/write)
Hidden Markov Toolkit (HTK) parameter files (read/write)
LibSVM feature file format (write)
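
For orientation, the sketch below serialises the same small feature matrix in the general CSV, ARFF, and LibSVM layouts. It is a hand-written illustration of these formats, not openSMILE's own writers; the exact output of openSMILE (separators, attribute names, extra metadata columns) may differ, and the feature names and values are made up.

```python
import csv
import io


def to_csv(names, rows):
    """Header line followed by one comma-separated line per feature vector."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return buf.getvalue()


def to_arff(relation, names, rows):
    """WEKA ARFF: relation name, one numeric attribute per feature, then data."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in names]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


def to_libsvm(labels, rows):
    """LibSVM: 'label index:value ...' with 1-based indices, zeros omitted."""
    lines = []
    for label, row in zip(labels, rows):
        pairs = [f"{i}:{v}" for i, v in enumerate(row, start=1) if v != 0]
        lines.append(" ".join([str(label)] + pairs))
    return "\n".join(lines) + "\n"


# Hypothetical feature names, values, and class labels.
names = ["energy", "pitch"]
rows = [[0.5, 120.0], [0.0, 98.5]]
labels = [1, -1]

print(to_csv(names, rows))
print(to_arff("demo", names, rows))
print(to_libsvm(labels, rows))
```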

Signal processing

Windowing functions (Hamming, Hann, Gauss, Sine, …)
Fast Fourier transform (FFT)
Pre-emphasis filter
FIR filterbanks
Autocorrelation
Cepstrum
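
Three of the building blocks above can be sketched in a few lines of plain Python. This is a textbook illustration rather than openSMILE's implementation: the pre-emphasis coefficient of 0.97, the 64-sample frame, and the lag search range are assumptions chosen for the example.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def pre_emphasis(x, k=0.97):
    """First-order high-pass: y[t] = x[t] - k * x[t-1], with y[0] = x[0]."""
    if not x:
        return []
    return [x[0]] + [x[t] - k * x[t - 1] for t in range(1, len(x))]


def autocorrelation(x, max_lag):
    """Unnormalised autocorrelation r[lag] = sum_t x[t] * x[t + lag]."""
    n = len(x)
    return [
        sum(x[t] * x[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


# One frame of a sine with a period of 8 samples (hypothetical input).
period = 8
signal = [math.sin(2 * math.pi * t / period) for t in range(64)]

emphasised = pre_emphasis(signal)
window = hamming(len(emphasised))
frame = [s * w for s, w in zip(emphasised, window)]

acf = autocorrelation(frame, max_lag=16)
# Skip very short lags, as a pitch detector would, and pick the strongest peak.
best_lag = max(range(4, 17), key=lambda lag: acf[lag])
print(best_lag)
```

For this periodic input, the strongest autocorrelation peak in the searched range falls at the 8-sample period, which is the basic idea behind autocorrelation-based pitch estimation.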

Speech-related features

Signal energy
Loudness
Mel-/Bark-/Octave-spectra
MFCC
PLP-CC
Pitch
Voice quality (Jitter, Shimmer)
Formants
LPC
Line Spectral Pairs (LSP)
Spectral Shape descriptors

Music-related features

Pitch classes (semitone spectrum)
CHROMA and CENS features
Weighted differential
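
The idea behind pitch classes and chroma features is to fold frequencies onto the twelve semitones of an octave. The sketch below assumes equal temperament with A4 = 440 Hz and works on a handful of hypothetical spectral peaks; a real chroma extractor operates on a full semitone spectrum, and the additional processing used for CENS features is not shown.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Index 0..11 of the nearest equal-tempered semitone, with C = 0."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def chroma(peaks):
    """Fold (frequency, magnitude) peaks into a 12-bin vector that sums to 1."""
    bins = [0.0] * 12
    for freq_hz, magnitude in peaks:
        bins[pitch_class(freq_hz)] += magnitude
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else bins


# Hypothetical spectral peaks of a C major chord (C4, C5, E4, G4).
peaks = [(261.63, 1.0), (523.25, 0.5), (329.63, 0.8), (392.00, 0.6)]
vector = chroma(peaks)
strongest = NOTE_NAMES[vector.index(max(vector))]
print(strongest)
```

Both C4 and C5 land in the same bin, which is what makes chroma features insensitive to the octave in which a note is played.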

Data processing

Mean-variance normalisation
Range normalisation
Delta-regression coefficients
Vector operations
Moving average filters
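
The first three operations in this list can be sketched as follows. The delta computation uses the regression formula widely used for delta features (for example in HTK), with edge values repeated at the boundaries; the context size of 2 and the handling of constant inputs are assumptions for the example, and openSMILE's own defaults may differ.

```python
import statistics


def mean_variance_normalise(x):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = statistics.fmean(x)
    sigma = statistics.pstdev(x)
    if sigma == 0:
        return [0.0 for _ in x]
    return [(v - mu) / sigma for v in x]


def range_normalise(x, lo=0.0, hi=1.0):
    """Linearly map the observed [min, max] onto [lo, hi]."""
    x_min, x_max = min(x), max(x)
    if x_max == x_min:
        return [lo for _ in x]
    scale = (hi - lo) / (x_max - x_min)
    return [lo + (v - x_min) * scale for v in x]


def delta_regression(x, context=2):
    """d[t] = sum_n n * (x[t+n] - x[t-n]) / (2 * sum_n n^2), n = 1..context."""
    denom = 2 * sum(n * n for n in range(1, context + 1))
    last = len(x) - 1
    out = []
    for t in range(len(x)):
        num = sum(
            n * (x[min(t + n, last)] - x[max(t - n, 0)])  # clamp at the edges
            for n in range(1, context + 1)
        )
        out.append(num / denom)
    return out


# Hypothetical feature contour rising by one unit per frame.
contour = [1.0, 2.0, 3.0, 4.0, 5.0]
standardised = mean_variance_normalise(contour)
scaled = range_normalise(contour)
deltas = delta_regression(contour)
print(scaled, deltas)
```

In the interior of this linear contour the delta coefficient equals the slope of one unit per frame; near the edges it is smaller because boundary values are repeated.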

Statistical functionals (feature summaries)

Means, Extremes
Moments
Segments
Samples
Peaks
Linear and quadratic regression
Percentiles
Durations
Onsets
DCT coefficients
Zero-crossings
Modulation spectrum
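
Functionals map a feature contour of any length to a fixed-length summary, which is what makes them convenient as classifier input. The sketch below shows the principle with a small selection of statistics; the selection, the names in the result dictionary, and the interpolation rule for percentiles are assumptions for illustration, not openSMILE's definitions.

```python
import statistics


def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))


def percentile(x, q):
    """q-th percentile (0..100) with linear interpolation between ranks."""
    s = sorted(x)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def functionals(x):
    """Summarise a variable-length contour as a fixed set of statistics."""
    return {
        "mean": statistics.fmean(x),
        "stddev": statistics.pstdev(x),
        "min": min(x),
        "max": max(x),
        "range": max(x) - min(x),
        "median": percentile(x, 50),
        "percentile_90": percentile(x, 90),
        "zero_crossings": zero_crossings(x),
    }


# Hypothetical contours of different lengths yield summaries of equal size.
short = [0.5, -1.0, 2.0, -0.5, 1.0, 3.0]
long = [0.1 * ((-1) ** i) * i for i in range(50)]
summary_short = functionals(short)
summary_long = functionals(long)
print(summary_short)
```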

devAIce® Expression and Scene Detection

Learn more about devAIce®, audEERING®’s lightweight technology for expression detection, scene detection, and many other purposes.


Who is audEERING?

audEERING® not only continues to develop openSMILE but is also the worldwide leading innovator in Audio & Voice AI. Learn more about the company.

