Audio Processing on iOS using Aubio

Aubio is a tool designed for the extraction of annotations from audio signals. Its features include segmenting a sound file before each of its attacks, performing pitch detection, tapping the beat and producing MIDI streams from live audio.

It comes prebuilt for iOS in the form of a framework that can simply be dragged into an Xcode project to get going. This post touches on some basics of working with the aubio framework on iOS. Let’s walk through a simple beat detection task with aubio.

The first step is the easiest:

import aubio

To detect beats, we need to create a tempo detector. You need to pass a method, the buffer size, the hop size and the sample rate. The hop size is the number of frames between two consecutive runs; a good value is usually half the buffer size.

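// "default" method, buffer size 1024, hop size 512, 44.1 kHz sample rate
let tempo: COpaquePointer? = new_aubio_tempo("default", 1024, 512, 44100)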
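To get a feel for how the buffer size, hop size and sample rate relate, here is a small worked example in plain Swift with no aubio calls. The values 1024 and 44100 are assumptions chosen for illustration, not requirements of the library.

```swift
// Illustrative values only; 1024 and 44100 are assumptions.
let bufferSize = 1024                 // analysis window, in frames
let hopSize = bufferSize / 2          // frames between two consecutive runs
let sampleRate = 44100.0              // frames per second

// Time covered by one hop, in seconds.
let hopDuration = Double(hopSize) / sampleRate

// Number of detector runs needed to cover one second of audio.
let runsPerSecond = sampleRate / Double(hopSize)

print(hopSize)          // 512
print(hopDuration)      // roughly 0.0116
print(runsPerSecond)    // roughly 86.1
```

With these numbers the detector runs about 86 times per second, each run advancing by roughly 11.6 ms of audio.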

It’s usually a good idea to ignore long silences. aubio has a function for exactly that: it takes the pointer to the tempo detector and a float silence threshold in dB.

aubio_tempo_set_silence(tempo!, silenceThreshold)
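As a rough illustration of what a silence threshold in dB means, the sketch below computes the level of a buffer as 10 * log10 of its mean squared sample value and compares it to a threshold. This is a common definition used here for illustration; it is not taken from aubio's source, and the helper names `levelInDecibels` and `isSilent` are hypothetical.

```swift
import Foundation

/// Level of a buffer in dB: 10 * log10(mean of squared samples).
/// Illustration only; not aubio's internal implementation.
func levelInDecibels(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return -Float.infinity }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return 10 * log10(sumOfSquares / Float(samples.count))
}

/// Hypothetical helper: true when the buffer is quieter than the threshold.
func isSilent(_ samples: [Float], threshold: Float) -> Bool {
    return levelInDecibels(samples) < threshold
}

let quiet = [Float](repeating: 0.001, count: 512)   // about -60 dB
let loud = [Float](repeating: 0.5, count: 512)      // about -6 dB

print(isSilent(quiet, threshold: -40))   // true
print(isSilent(loud, threshold: -40))    // false
```

A lower (more negative) threshold lets quieter passages through to the detector; a higher one gates more of the signal out.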

To perform detection, we need to feed raw audio data to the detector, either from a file or from a live input.

If you want to run this on a file, it’s easy using an aubio source:

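let samples = new_fvec(512)
// Passing 0 as the sample rate keeps the file's own sample rate
let source = new_aubio_source("/path/to/file.wav", 0, 512)
// Should match the sample rate the tempo detector was created with
let samplerate = aubio_source_get_samplerate(source)
let out = new_fvec(1)
var read: uint_t = 0
var total_frames: uint_t = 0
while true {
    // Read the next hop of frames and run the detector on it
    aubio_source_do(source, samples, &read)
    aubio_tempo_do(tempo!, samples, out)
    if fvec_get_sample(out, 0) != 0 {
        let beat_time: Float = Float(total_frames) / Float(samplerate)
        print(String(format: "beat at %.2f", beat_time))
    }
    total_frames += read
    // A short read means we reached the end of the file
    if read < 512 {
        break
    }
}
del_fvec(out)
del_aubio_tempo(tempo!)
del_aubio_source(source)
del_fvec(samples)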
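Once beat times are available, a rough tempo can be derived from them. The sketch below is plain Swift with no aubio calls: `beatTime` mirrors the frames-to-seconds conversion used in the loop above, and `estimateBPM` averages the intervals between beats. Both helper names are hypothetical, and the averaging is a simple illustration rather than how aubio estimates tempo.

```swift
/// Converts a frame position to seconds, as in the loop above.
func beatTime(totalFrames: UInt32, sampleRate: UInt32) -> Float {
    return Float(totalFrames) / Float(sampleRate)
}

/// Hypothetical helper: average tempo in BPM from ascending beat times in seconds.
/// Returns nil when there are fewer than two beats or no time has elapsed.
func estimateBPM(beatTimes: [Float]) -> Float? {
    guard beatTimes.count >= 2,
          let first = beatTimes.first,
          let last = beatTimes.last,
          last > first else { return nil }
    let averageInterval = (last - first) / Float(beatTimes.count - 1)
    return 60 / averageInterval
}

let times: [Float] = [0.5, 1.0, 1.5, 2.0]
print(beatTime(totalFrames: 22050, sampleRate: 44100))   // 0.5
print(estimateBPM(beatTimes: times) ?? 0)                // 120.0
```

Averaging over the whole list smooths out jitter in individual detections, at the cost of reacting slowly to tempo changes.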

But who wants to run on a file? That’s boring! aubio is optimized to run its algorithms even on real-time audio coming from the microphone. The first step is to get raw data from the microphone. There’s a really nice gist to get this done.

Use this, but update processMicrophoneBuffer to feed the data to aubio:

func setupAubio(samplerate: UInt32) {
    samples = new_fvec(sampleSize)
    tempo = new_aubio_tempo("default", 1024, sampleSize, samplerate)
    aubio_tempo_set_silence(tempo!, silenceThreshold)
}

func processMicrophoneBuffer(inputDataList : UnsafeMutablePointer<AudioBufferList>, frameCount : UInt32) {
    guard let samples = samples, tempo = tempo else { return }
    let out = new_fvec(2)
    var sampleCount: UInt32 = 0
    // `count` and `dataArray` are assumed to come from the gist's buffer unpacking code
    for i in 0..<(count/2) {
        let x = Float(dataArray[i+i  ])   // copy left  channel sample
        let y = Float(dataArray[i+i+1])   // copy right channel sample
        
        fvec_set_sample(samples, x*x + y*y, sampleCount)
        sampleCount += 1
        if sampleCount == sampleSize || i == count/2-1 {
            aubio_tempo_do(tempo, samples, out)
            if (fvec_get_sample(out, 0) != 0) {
                // Yay! A BEAT!!!
                break
            }
            sampleCount = 0
        }
    }
    del_fvec(out)
}

func stopRecording() {
    if let tempo = tempo, samples = samples {
        del_aubio_tempo(tempo)
        del_fvec(samples)
        self.tempo = nil
        self.samples = nil
    }
}

Note: Make sure you clean up the pointers in stopRecording to avoid memory leaks. The aubio objects are allocated in C, so ARC will not release them for you.
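The accumulate-then-process pattern in processMicrophoneBuffer can be tried out without any audio hardware. The following is a minimal sketch in plain Swift, under a few assumptions: the input is interleaved stereo Int16 data, the two channels are averaged and scaled to the -1...1 range (a swapped-in choice; the code above sums their squares instead), and a trailing partial block is zero-padded. The function name `monoBlocks` is hypothetical.

```swift
/// Downmixes interleaved stereo Int16 samples to mono and groups them
/// into blocks of `hopSize` frames. A trailing partial block is zero-padded.
func monoBlocks(interleaved: [Int16], hopSize: Int) -> [[Float]] {
    precondition(hopSize > 0, "hopSize must be positive")
    let frameCount = interleaved.count / 2
    var blocks: [[Float]] = []
    var current: [Float] = []
    current.reserveCapacity(hopSize)
    for i in 0..<frameCount {
        let left = Float(interleaved[2 * i])
        let right = Float(interleaved[2 * i + 1])
        // Average the channels and scale to the -1...1 range.
        current.append((left + right) / 2 / 32768)
        if current.count == hopSize {
            // A full block is ready; this is where a detector would run.
            blocks.append(current)
            current.removeAll(keepingCapacity: true)
        }
    }
    if !current.isEmpty {
        // Pad the last block so every block has exactly hopSize frames.
        current += [Float](repeating: 0, count: hopSize - current.count)
        blocks.append(current)
    }
    return blocks
}

// Three stereo frames, grouped into blocks of two frames each.
let input: [Int16] = [16384, 16384, -16384, -16384, 0, 0]
let blocks = monoBlocks(interleaved: input, hopSize: 2)
print(blocks)   // [[0.5, -0.5], [0.0, 0.0]]
```

In a real-time callback you would avoid allocating arrays like this and write into a preallocated buffer instead, as the fvec-based code above does.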

Published 6 May 2017

I build mobile and web applications. Full Stack, Rails, React, Typescript, Kotlin, Swift
Pulkit Goyal on Twitter