Open Source Image Classifier

Libccv is an open source image classifier and computer vision library.

It runs on Mac OS X, Linux, FreeBSD, Windows, iPhone, iPad, Android, and Raspberry Pi. In fact, anything with a proper C compiler can probably run ccv. Most of ccv (with the notable exception of convolutional networks, which require a BLAS library) will just work with no compilation flags or dependencies.

Among many other things, it has a SIFT implementation.

The sources are available on GitHub.

openSMILE – Speech and Music Interpretation by Large-space Extraction

The openSMILE feature extraction tool enables you to extract large audio feature spaces in real time. It combines features from Music Information Retrieval and Speech Processing. SMILE is an acronym for Speech & Music Interpretation by Large-space Extraction. It is written in C++ and is available both as a standalone command-line executable and as a dynamic library. The main features of openSMILE are its capability for on-line incremental processing and its modularity. Feature extractor components can be freely interconnected to create new and custom features, all via a simple configuration file. New components can be added to openSMILE via an easy binary plugin interface and a comprehensive API.
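
To make the idea of on-line incremental processing with freely interconnected components more concrete, here is a minimal sketch in Python. It is not openSMILE's API or configuration format: the names `frames`, `hamming`, `log_energy` and `connect` are hypothetical, chosen only for illustration, and the sketch assumes a frame size of at least 2 and a hop no larger than the frame size.

```python
import math
from typing import Callable, Iterable, Iterator, List

Frame = List[float]


def frames(samples: Iterable[float], size: int, hop: int) -> Iterator[Frame]:
    """Yield overlapping frames as soon as enough samples have arrived."""
    buf: List[float] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == size:
            yield list(buf)
            del buf[:hop]  # drop the consumed hop, keep the overlap


def hamming(frame_iter: Iterable[Frame]) -> Iterator[Frame]:
    """Apply a Hamming window to each incoming frame."""
    for frame in frame_iter:
        n = len(frame)
        yield [
            x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)
        ]


def log_energy(frame_iter: Iterable[Frame]) -> Iterator[float]:
    """Emit one log-energy value per frame (floored to avoid log(0))."""
    for frame in frame_iter:
        energy = sum(x * x for x in frame)
        yield math.log(max(energy, 1e-12))


def connect(source: Iterable, *components: Callable) -> Iterable:
    """Chain components so each one consumes the previous one's output."""
    stream = source
    for component in components:
        stream = component(stream)
    return stream


if __name__ == "__main__":
    # A short synthetic signal stands in for live audio input.
    signal = [math.sin(0.3 * t) for t in range(16)]
    for value in connect(frames(signal, size=8, hop=4), hamming, log_energy):
        print(value)
```

Because every stage is a generator, a feature value is produced as soon as a full frame is available, without waiting for the end of the input. Rearranging or swapping the arguments to `connect` plays the role that the configuration file plays in openSMILE, in a much simplified form.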

Here’s the extensive feature list:

  • Cross-platform (Windows, Linux, Mac)
  • Fast and efficient incremental processing in real time
  • High modularity and reusability of components
  • Plugin support
  • Multi-threading support for parallel feature extraction
  • Audio I/O:
    • WAVE file reader/writer
    • Sound recording and playback via PortAudio library
    • Acoustic echo cancellation for full duplex recording/playback in an open-microphone setting (via the Speex codec library)
  • General audio signal processing:
    • Windowing functions (Hamming, Hann, Gauss, Sine, …)
    • Fast Fourier Transform
    • Pre-emphasis filter
    • Comb filter (available soon)
    • FIR/IIR filter (available soon)
    • Autocorrelation
    • Cepstrum
  • Extraction of speech-related features:
    • Signal energy
    • Loudness
    • Mel-/Bark-/Octave-spectra
    • MFCC
    • PLP-CC
    • Pitch
    • Voice quality
    • Formants
    • LPC
    • Line Spectral Pairs (LSP)
  • Music-related features:
    • Pitch classes (semitone spectrum)
    • CHROMA and CENS features
    • Weighted differential
  • Moving average smoothing of feature contours
  • Moving average mean subtraction and variance normalisation (e.g. for on-line cepstral mean subtraction)
  • On-line histogram equalisation (used for noise robust speech recognition)
  • Delta Regression coefficients of arbitrary order
  • Functionals:
    • Means, Extremes
    • Moments
    • Segments
    • Samples
    • Peaks
    • Linear and quadratic regression
    • Percentiles
    • Durations
    • Onsets
    • DCT coefficients
    • Zero-crossings
  • Popular feature file formats are supported:
    • Hidden Markov Toolkit (HTK) parameter files (read/write)
    • WEKA ARFF files (currently only non-sparse) (read/write)
    • Comma-separated values (CSV) text (read/write)
    • LibSVM feature file format (write)
  • Fully HTK compatible MFCC, PLP, energy, and delta regression coefficient computation
  • Fast: 27k features can be extracted with a real-time factor (RTF) of 0.08
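
To put the last figure in perspective: RTF stands for real-time factor, the ratio of processing time to the duration of the audio being processed, so values below 1 mean faster than real time. The small sketch below works through that arithmetic; the function names and the 60-second clip are illustrative assumptions, and only the 0.08 figure comes from the list above.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_time(audio_seconds: float, rtf: float) -> float:
    """Time needed to process a clip of the given length at a given RTF."""
    return audio_seconds * rtf


if __name__ == "__main__":
    clip_seconds = 60.0  # hypothetical one-minute recording
    needed = processing_time(clip_seconds, rtf=0.08)
    print(f"{clip_seconds:.0f} s of audio at RTF 0.08 needs about {needed:.1f} s")
    print(f"speed-up over real time: {1 / 0.08:.1f}x")
```

At an RTF of 0.08, one minute of audio would take roughly 4.8 seconds to process, which is about 12.5 times faster than real time.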