Photo-based 3D Reconstruction on Portable Devices

Microsoft Research has released information about a 3D reconstruction app that runs on Windows Mobile. As far as the technology is concerned, there is hardly anything new: it is basically the same as Autodesk’s 123D Catch, which is available on several platforms, with the actual image processing happening on the server.

The app looks very nice, though: it comes with an accelerometer-supported visual guide that helps the user capture all the images required for a successful 3D model construction.


What Makes Paris Look Like Paris?

Researchers from France and the USA have developed an algorithm that analyzes Google Street View images, looking for visual elements (e.g. windows, balconies, and street signs) that are most distinctive of a given geographic area.

In their SIGGRAPH paper, they describe how this information can be used to develop an architectural footprint of a city or city area, and they compare Paris, Barcelona, Prague and London.


openSMILE – Speech and Music Interpretation by Large-space Extraction

The openSMILE feature extraction tool enables you to extract large audio feature spaces in real time. It combines features from Music Information Retrieval and Speech Processing. SMILE is an acronym for Speech & Music Interpretation by Large-space Extraction. It is written in C++ and is available both as a standalone command-line executable and as a dynamic library. The main features of openSMILE are its capability for on-line incremental processing and its modularity. Feature extractor components can be freely interconnected to create new and custom features, all via a simple configuration file. New components can be added to openSMILE via an easy binary plugin interface and a comprehensive API.
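
To give a feel for what "incremental processing" with freely interconnected components means, here is a minimal Python sketch. It is not openSMILE's API or configuration format: the function names (`pre_emphasis`, `frames`, `log_energy`, `pipeline`) and the default parameters are assumptions made up for illustration. Each component consumes a stream and yields results as soon as enough input has arrived, so the chain never needs the whole recording in memory.

```python
import math
from typing import Iterable, Iterator, List


def pre_emphasis(samples: Iterable[float], coeff: float = 0.97) -> Iterator[float]:
    """Yield y[n] = x[n] - coeff * x[n-1], one sample at a time."""
    prev = 0.0
    for x in samples:
        yield x - coeff * prev
        prev = x


def frames(samples: Iterable[float], size: int, step: int) -> Iterator[List[float]]:
    """Yield overlapping frames as soon as `size` samples are buffered."""
    if not 1 <= step <= size:
        raise ValueError("step must satisfy 1 <= step <= size")
    buf: List[float] = []
    for x in samples:
        buf.append(x)
        if len(buf) == size:
            yield list(buf)
            del buf[:step]  # keep the overlap, drop the oldest samples


def log_energy(frame_iter: Iterable[List[float]]) -> Iterator[float]:
    """Yield the log energy of each incoming frame."""
    for frame in frame_iter:
        # Small offset avoids log(0) on silent frames.
        yield math.log(sum(v * v for v in frame) + 1e-12)


def pipeline(samples: Iterable[float], size: int = 4, step: int = 2) -> Iterator[float]:
    """Interconnect the three components into one incremental chain."""
    return log_energy(frames(pre_emphasis(samples), size, step))
```

Because every stage is a generator, `for value in pipeline(sample_source): ...` produces one feature value per frame while the samples are still arriving. Swapping a stage or adding another one only changes how the generators are chained, which is roughly the role the configuration file plays in openSMILE.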

Here’s the extensive feature list:

  • Cross-platform (Windows, Linux, Mac)
  • Fast and efficient incremental processing in real-time
  • High modularity and reusability of components
  • Plugin support
  • Multi-threading support for parallel feature extraction
  • Audio I/O:
    • WAVE file reader/writer
    • Sound recording and playback via PortAudio library
    • Acoustic echo cancellation for full duplex recording/playback in an open-microphone setting (via the Speex codec library)
  • General audio signal processing:
    • Windowing Functions (Hamming, Hann, Gauss, Sine, …)
    • Fast Fourier Transform (FFT)
    • Pre-emphasis filter
    • Comb filter (available soon)
    • FIR/IIR filter (available soon)
    • Autocorrelation
    • Cepstrum
  • Extraction of speech-related features:
    • Signal energy
    • Loudness
    • Mel-/Bark-/Octave-spectra
    • MFCC
    • PLP-CC
    • Pitch
    • Voice quality
    • Formants
    • LPC
    • Line Spectral Pairs (LSP)
  • Music-related features:
    • Pitch classes (semitone spectrum)
    • CHROMA and CENS features
    • Weighted differential
  • Moving average smoothing of feature contours
  • Moving average mean subtraction and variance normalisation (e.g. for on-line cepstral mean subtraction)
  • On-line histogram equalisation (used for noise robust speech recognition)
  • Delta regression coefficients of arbitrary order
  • Functionals:
    • Means, Extremes
    • Moments
    • Segments
    • Samples
    • Peaks
    • Linear and quadratic regression
    • Percentiles
    • Durations
    • Onsets
    • DCT coefficients
    • Zero-crossings
  • Popular feature file formats are supported:
    • Hidden Markov Toolkit (HTK) parameter files (read/write)
    • WEKA Arff files (currently only non-sparse) (read/write)
    • Comma separated value (CSV) text (read/write)
    • LibSVM feature file format (write)
  • Fully HTK compatible MFCC, PLP, energy, and delta regression coefficient computation
  • Fast: 27k features can be extracted with a real-time factor (RTF) of 0.08, i.e. processing takes about 8% of the duration of the audio
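
One item from the list above, delta regression coefficients, is easy to illustrate. The sketch below uses the regression formula known from HTK, d[t] = Σ n·(c[t+n] − c[t−n]) / (2·Σ n²) with n running from 1 to the window size, and replicates the first and last values at the edges. It is plain Python written for this post, not code taken from openSMILE; the function name `delta` and the default window of 2 are assumptions.

```python
from typing import List


def delta(contour: List[float], window: int = 2) -> List[float]:
    """Delta regression coefficients of a feature contour (HTK-style formula)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(contour) - 1
    out: List[float] = []
    for t in range(len(contour)):
        num = 0.0
        for n in range(1, window + 1):
            ahead = contour[min(t + n, last)]   # replicate last value at the end
            behind = contour[max(t - n, 0)]     # replicate first value at the start
            num += n * (ahead - behind)
        out.append(num / denom)
    return out
```

"Arbitrary order" then simply means applying the function repeatedly: `delta(delta(contour))` gives second-order (acceleration) coefficients, and so on.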