What’s that?

Hi! My name is Maximos Kaliakatsos-Papakostas and this is my blog.

  • Do you have an idea to make an iOS app and employ some advanced audio or music techniques?
  • Do you have an app and you want to make it more interactive in terms of audio and music?
  • Do you need to enhance the performance of your app regarding audio and music?

I’ll be happy to discuss possible alternatives with you, help you out with audio and music programming, or even help you implement ideas on any level of algorithmic design and programming.

Follow my blog if you are interested in discussions concerning music apps development or applications of computational sciences in music.

You can find my iOS apps here and my scientific work here.

Inter-app audio compatibility for iOS apps made with libPD

Inter-app audio is an iOS feature that allows compatible apps to send and/or receive audio to and from other compatible apps. So, how can you make your app inter-app audio compatible when you’ve built it with libPD/pd-for-iOS (the Pure Data library for iOS)?

Making an audio app that supports inter-app audio can be quite troublesome because of the lack of information resources… Especially when it comes to an audio app built with libPD, there seems to be literally no information available online! Here’s what I found out while making the inter-app audio compatible version of Echo Pitch. The most helpful text I’ve found online is the tutorial provided by www.raywenderlich.com. Before you start developing, make sure that you have an app able to bridge inter-app audio compatible apps, like GarageBand or the free app Audreio (which I’m also using for testing).

The steps described below summarise the aforementioned tutorial, but add information on the following:

1) integrating an app built with libPD for iOS (rather than directly with Audio Units), and

2) making the app both an audio sender and a receiver in the inter-app audio “conversation” between apps.

The steps are the following:

1) Enable Inter-App Audio in the app’s capabilities:

app_capabilities.png

2) Tweak your app’s Info.plist file by adding the necessary audio component descriptions. Right click on the Info.plist file and choose “Open As -> Source Code”. For making the app both a sender (“aurg”) and a receiver (“aurx”) you need to add both descriptions before the last </dict> line:

<key>AudioComponents</key>
    <array>
        <dict>
            <key>manufacturer</key>
            <string>avax</string>
            <key>name</key>
            <string>EchoPitch</string>
            <key>type</key>
            <string>aurx</string>
            <key>subtype</key>
            <string>iasp</string>
            <key>version</key>
            <integer>1</integer>
        </dict>
        <dict>
            <key>manufacturer</key>
            <string>avax</string>
            <key>name</key>
            <string>EchoPitch</string>
            <key>type</key>
            <string>aurg</string>
            <key>subtype</key>
            <string>iasp</string>
            <key>version</key>
            <integer>1</integer>
        </dict>
    </array>
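Besides these Info.plist entries, the app’s output audio unit also has to be published at runtime so that host apps can discover it. The following is only a minimal sketch of how this might be done when libPD manages the audio session – the way the remote I/O unit is obtained from PdAudioController may differ between libpd versions, and the helper function name is made up:

#import <AudioToolbox/AudioToolbox.h>
#import "PdAudioController.h"

// Hypothetical helper (not part of libpd): publish the app's output unit
// so that Inter-App Audio hosts can discover it. Call it once audio is set up.
static void PublishInterAppAudioNode(PdAudioController *audioController) {
    // Must match one of the AudioComponents entries in Info.plist;
    // the effect ("aurx") entry is shown here, "aurg" would be published similarly.
    AudioComponentDescription desc = {
        .componentType         = kAudioUnitType_RemoteEffect,   // 'aurx'
        .componentSubType      = 'iasp',
        .componentManufacturer = 'avax',
        .componentFlags        = 0,
        .componentFlagsMask    = 0
    };

    // The remote I/O unit that libPD drives internally (assumed to be reachable
    // as audioController.audioUnit.audioUnit -- check your libpd version).
    AudioUnit outputUnit = audioController.audioUnit.audioUnit;

    OSStatus status = AudioOutputUnitPublish(&desc, CFSTR("EchoPitch"), 1, outputUnit);
    if (status != noErr) {
        NSLog(@"Could not publish Inter-App Audio node (error %d)", (int)status);
    }
}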

Mic-in real-time granular synthesis in Pure Data

As with the offline granulator discussed in the previous post, finding resources for real-time granular synthesis in PD was, for me, even harder. As far as I could figure out, the simple approach presented here has not been discussed before, yet I think it is the most efficient, lightweight and intuitive one. So please have a look at this post and the example patch, and I’ll be very happy to receive any comments, suggestions, questions etc…

Goal: we want to have a look at the intuition behind making a patch that allows basic granular synthesis in real-time on the mic-in signal, using some tools that Pure Data offers. Basic granular synthesis in the context of this post includes the following parameters:

  1. Grain starting time: how far back in the mic input the grains will be collected.

  2. Grain length: how long the grains will be.

  3. Playback speed: how quickly the grain will be played, which also affects the pitch of the grain.

The example patch given in this tutorial is a bit more “dirty” than the patch parts shown in the text illustrations, because there are some complicated relations between parameters that need to be taken care of. Those relations are beyond the intuitive purposes of this tutorial, so the interested reader is referred to the example patch, which automatically adjusts the aforementioned parameters plus some additional ones, e.g. grain panning, volume, an option for octave-locking and grain firing rate. By abstracting this patch, bigger patches can be made that produce more impressive real-time granular synthesis results.

Real-time? Circular buffer! 

A circular buffer, or ring buffer, can be imagined as a round tape with a recording head, say on top; while the tape is turning, new audio is recorded over old audio. How far back audio starts being recorded over depends on the tape length. The circular buffer that comes built in with PD is the [delwrite~ ] object. Figure 1 shows a graphical example of a circular buffer and the PD implementation, considering that the tape has a length of 5 seconds.

figure1.png

Figure 1: The circular buffer (ideal for real-time mic-in processing) and its implementation in PD.

Play something on this circular buffer.

To make some sound out of this circular buffer, we need to place a play head, implemented with the [vd~ ] object in PD, at some position on this buffer. The delay time (given as a signal to the [vd~ ] object) defines how far back we take sound from – later on this will be the grain starting time. Figure 2 shows the illustration and the PD implementation of a 300ms delay time.

figure2.png

Figure 2: Playing some audio from the circular buffer/tape.

Isolate a (short) part.

Until now we are able to go back in time in the mic input and continuously play what was played a while ago (300ms in our example). Figure 3 shows how we can trigger an amplitude curve that allows only a short part of what was going on 300ms ago to be heard. The volume envelope comprises a cosine function driven by a [line~ ] object, and the time of the line object defines how long the audible segment will be – in this example the segment is 100ms. Shorter segments are probably more usual in granular synthesis, but let’s consider 100ms for this tutorial.

figure3.png

Figure 3: Isolated short part from the tape.

Pitch up (faster) / pitch down (slower).

If we start moving the playhead on the tape, then the pitch of the recorded material will be affected.

  • If we move it closer to the recording head, then the playhead plays the tape’s content faster and therefore we have pitch-up.

  • If we move it away from the recording head, then the playhead plays the tape’s content slower and therefore we have pitch-down.

The illustration on the right of Figure 4 shows both directions, but the PD patch part on the left shows only the slower case. For playing the tape’s content slower we need to gradually move the playhead away from the recording head (increase the delay time in the [vd~ ] object). This is done by the [line~ ] object that replaces the [sig~ ] object on top of the [vd~ ] object. In this example, during the 100ms that the audible segment “survives”, the playhead moves from 300ms (direct initialisation of the [line~ ] object) to 400ms. The exact relation between the delay ramp and the resulting playback speed is sketched right after Figure 4.

figure4.png

Figure 4: Pitch up/down (only pitch down is shown in the PD patch on the left)
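For reference, the relation between the delay-time ramp and the playback speed can be written down explicitly (this is a general property of variable delay lines, not something shown in the patch): if the delay time of [vd~ ] ramps from d0 to d1 milliseconds over a grain of length T milliseconds, the playback rate is roughly r = 1 − (d1 − d0)/T. An increasing delay (d1 > d0) gives r < 1, i.e. pitch-down, and a decreasing delay gives r > 1, i.e. pitch-up. For instance, with hypothetical values d0 = 300ms, d1 = 350ms and T = 100ms, the grain plays at half speed, roughly one octave down.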

That’s a grain!

As Figure 5 indicates, this patch part is what we need to control the three aforementioned parameters (grain starting time, length and playback speed); thus we have a basic implementation of a grain.

  1. Grain starting time is controlled by the parameter value in the green box.

  2. Grain length is controlled by the parameter value in the red boxes.

  3. Grain pitch is controlled by the parameter value in the blue box (a value greater than the one in the green box causes pitch-down, and vice versa).

figure5.png

Figure 5: That’s a grain!

A simple patch built from a single mic-in grain and randomness

The example patch allows monophonic granular synthesis on the mic input by setting some random parameters. It implements the ideas of granular synthesis discussed up to Figure 5 and, along with some additional parameters, includes controls for setting the minimum and the range of random variation for the following parameters:

  • Firing rate (how often new grains are fired).

  • Grain starting time.

  • Grain length.

  • Grain speed-pitch.

  • Grain starting volume.

  • Grain panning.

Initially, maximum grain length and maximum delay time are set, and almost all horizontal sliders give relative percentages based on those maximum values. There are many interrelations between the different parameters that need to be handled in order to avoid glitches and other things that can go wrong. E.g. it is not possible to have the play head 50ms away from the rec head and request (for pitch-up purposes) to move the playhead 100ms towards the rec head, since this way the playhead would have to go 50ms into the… future!

I’m also preparing a bigger patch that includes combined abstractions of single grains based on this tutorial patch. I’ll make a new post about bigger patches for both file-based and mic-in-based granulators. Stay tuned!

Granular synthesis on an audio file with Pure Data

Finding resources about how to make a proper patch for granular synthesis in PD was a lot harder than I expected. So I’m sharing what I’ve found and what I’ve figured out by myself, in the hope that it will save others a lot of time when they approach granular synthesis with PD.

The patch of this tutorial can be found here and a video that explains what this patch does can be found here.

Goal: what we want to build is a simple instructional patch that loads a wav file and, at random times, takes a single grain (with random parameters) out of the loaded file and plays it. The controllable parameters of the patch will be the following:

  1. Grain density: how many grains per second we expect to hear.

  2. Grain starting time: the position in the loaded file where the grain starts.

  3. Grain length: how long the grains will be.

  4. Playback speed: how quickly the grain will be played, which also affects the pitch of the grain.

Additional parameters could be set, e.g. grain panning or volume, but for the sake of this tutorial we’ll stick to these four.

Basics: We consider a grain to be a (usually very short) segment of an audio file, the amplitude of which is regulated by an envelope. The playback speed of a grain can vary, along with other things that are not considered in this tutorial. Figure 1 shows an audio segment on the left, the envelope on the right, and the resulting waveform obtained by applying the envelope to the audio segment.

In Figure 1 the length of everything is fixed to 100 ms. Additionally, the audio segment does not come from a file but from an [osc~ 100] object.

 

fig1_intro.png

Figure 1: Granular synthesis basics.

Load a file, take a segment and apply envelope.

As a first step, what we need to be able to do is:

  • Load an audio file.

  • Select a starting and an ending point on this file, according to the grain length.

  • Modify the length of the envelope so that it fits that of the segment.

– Load an audio file.

Loading a file is performed in the part of the patch illustrated in Figure 2. Please notice that the patch will need to know later on what the size of the file is ([s fileSize]) and the maximum length that we consider for grains ([s maxLength]). Additionally, notice that the starting index horizontal slider (like any other slider in the patch) is in [0, 1], making all values in the patch relative to the selected file size and maximum grain length.

fig2_loadingFile.png

Figure 2: Loading an audio file to take grains from.

 

– Select a starting and an ending point on this file, according to the grain length.

A suitable object to do this is [line~ ], which allows us to read from any index to any index of the audio file, in any amount of milliseconds we desire. Figure 3 shows a part of a patch where we can select a starting index (left), length (middle) and playback time (right), and play it by hitting the bang on top. Notice that for setting the starting index and length into the [line~ ] object we use an intermediate [f ] object to suppress the bangs while changing these values, since those bangs would activate [line~ ] and cause glitches while the number boxes are being changed. Notice also that the order of bangs is very important when we finally need to play the grain: first we have to set [line~ ] to the starting index directly, second we need to make sure that the desired time duration is indeed the second value in the list, and third we set the first list value, activating the [line~ ] object and playing the file.

fig3_playSegment.png

Figure 3: Selecting starting and ending points on the file.

– Modify the length of the envelope so that it fits that of the segment.

Figure 4 shows the integration of an envelope on the right side of the patch part of Figure 3. The envelope is the same as the one illustrated on the right side of Figure 1, but the one in Figure 4 receives a variable time length from the number box that also sets the grain duration time. Notice that the order of setting the envelope values and triggering it is also important. Specifically, first the envelope [line~ ] is set to 0; then, its time is set along with the time in the segment’s [line~ ] object; finally, the [line~ ] in the envelope part is instructed to go to 1 within the given time interval, directly when the [line~ ] of the segment is activated (by the list that arrives from the [pack 0 0] object on the left).

fig4_envelope_on_segment.png

Figure 4: Modifying the envelope length.

A simple patch built from a single grain and randomness

The example patch that implements the idea of granular synthesis discussed so far is illustrated in Figure 5, along with some annotations that clarify some of the patch parts that haven’t been explained above. The following should be noted:

  • There are horizontal sliders that control the starting index, grain length and speed (as well as an additional slider below the waveform of the loaded array to indicate the part of the file that the grain is taken from).

  • There is an automation section in the bottom-right part of the patch where all these parameters are controlled, plus a metro that fires the current grain (at random time intervals).

  • There is a tricky bit in the box that includes the [min ] object: there is a chance that the grain size is larger than the metro firing rate, e.g. if the grain length is 100 ms and the next grain comes in 50 ms (according to the metro time), then a glitch is going to happen. To avoid such glitches, we take as grain length the minimum of the slider-selected grain length and the current metro firing rate.

fig5_entirePatch_annotated.png

Figure 5: The complete example patch and some explanatory indications.

Feel free to ask, comment or propose anything! I’m planning to make another tutorial patch on real-time granulation of microphone input using Pure Data’s circular buffer: the [delwrite~ ] object. Additionally, I’m planning to make available two heavy-duty PD patches I’m preparing, that take advantage of (too) many grains and additional controls for file-based and real-time granular synthesis. Stay tuned!

Echo Pitch for iPhone is here!

Avi Bortnick and I have been designing and developing an app that will hopefully inspire a lot of people who make music using iPhones.

Echo Pitch – a pitch-shifting multi-delay that has 4 synced delays, where each delayed signal can be pitched one octave up or down in 1/2 steps. Echo Pitch also has a built-in simple looper, amp simulator, master delay and reverb, and can function as a 4-voice harmonizer. Each of the four delays can be routed into the other for incredible, swirling, ambient washes and otherworldly sounds. You really need an interface – like iRig, Fender Slide, Apogee Jam, etc. – to use it, or at least headphones. Otherwise, you’ll get gnarly feedback.

iTunes link is here, while a demo video by Avi Bortnick is below:

 

 

Stay tuned for updates 🙂

Audio programming in iOS: comparing the play-through latencies of libPD and The Amazing Audio Engine

Being an audio/music developer for iOS devices and a guitar player, I had a very simple question:

Which programming framework should I use to achieve minimum latency for developing real-time audio effects processor for iOS devices?

Being acquainted with Pure Data (PD) and having developed some apps using libPD, this seemed like my first option. However, when I came across The Amazing Audio Engine (TAAE), I was impressed both by its accuracy and its efficiency in CPU performance – considering also that it is very easy to use.

So, I made a simple comparison that I think is worth sharing! The comparison involves measuring the latency in a simple play-through scenario, where the audio signal passes through an iPhone 6 without any effects applied. In other words, the question is: how much does the audio signal get delayed simply by passing through an iPhone 6, using libPD and TAAE as audio “managers”?

The process is rather simple and is illustrated in the graph below:

pd_taae_experiment_overview

In a few words: I connected two guitar cables to the two audio inputs of a Presonus Firebox sound card. Input 1 was the “test signal” and input 2 the “ground-truth” signal. For testing the latency introduced by libPD and TAAE, the “test signal” was connected to an iPhone 6 before reaching the sound card. For a sanity-check of the process, I also tested both inputs driven directly (without iPhone intervention) into the sound card – just to make sure that there is no inherent latency between the two inputs of the sound card.

For comparing the difference between input 1 (test signal) and input 2 (ground truth), I recorded both signals in two separate channels in Ableton Live. At some point during the recording I made both cables touch each other, producing an impulse peak in both channels. Then I extracted both signals into separate wav files and imported them in Matlab to measure the distance between the peaks in the two signals. I considered this distance (in samples or seconds) as the latency introduced by using the iPhone.
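Converting this distance to time is simple: the latency in seconds is the distance in samples divided by the recording sample rate, i.e. latency = Δn / fs. For example, assuming the session was recorded at 44.1 kHz, a distance of 742 samples corresponds to 742 / 44100 ≈ 0.0168 seconds.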

Three scenarios were examined:

1) Sanity-check: no iPhone intervening, just to make sure that there is no inherent latency between the two inputs.

Results: sanity-check latency: 0 samples — 0.0000 seconds.

2) Testing libPD latency: the test signal passed through an iPhone 6 running only a simple program using a play-through PD patch. Note that the buffer duration in libPD has to be set up through AVAudioSession – thanks Michael Tyson for bringing that up.

pd_code
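For reference, here is a rough sketch of what such a libPD play-through setup can look like, with the buffer duration set through AVAudioSession (the patch name, the buffer value and the audioController property are illustrative, not the exact code shown above):

#import <AVFoundation/AVFoundation.h>
#import "PdAudioController.h"
#import "PdBase.h"

// Illustrative libPD play-through setup; "playthrough.pd" is assumed to
// contain nothing more than [adc~] connected to [dac~].
- (void)setupPdPlaythrough {
    // Ask for a small I/O buffer; the system may grant a different value.
    NSError *error = nil;
    [[AVAudioSession sharedInstance] setPreferredIOBufferDuration:0.005 error:&error];

    self.audioController = [[PdAudioController alloc] init];
    [self.audioController configurePlaybackWithSampleRate:44100
                                           numberChannels:2
                                             inputEnabled:YES
                                            mixingEnabled:NO];

    // Load the pass-through patch and start audio.
    [PdBase openFile:@"playthrough.pd" path:[[NSBundle mainBundle] resourcePath]];
    self.audioController.active = YES;
}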

 

Results: libPD latency: 2335 samples — 0.0181 seconds

3) Testing TAAE latency: test signal passed through an iPhone 6 running only a TAAE program for simple play-through.

taae_code
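Similarly, a rough sketch of a TAAE play-through setup (again, names are illustrative and this is not the exact code shown above):

#import "TheAmazingAudioEngine.h"
#import "AEPlaythroughChannel.h"

// Illustrative TAAE play-through setup using the built-in playthrough channel.
- (void)setupTAAEPlaythrough {
    NSError *error = nil;

    // Audio controller with the microphone input enabled.
    self.audioController = [[AEAudioController alloc]
        initWithAudioDescription:[AEAudioController nonInterleavedFloatStereoAudioDescription]
                    inputEnabled:YES];

    // AEPlaythroughChannel simply forwards the input to the output.
    AEPlaythroughChannel *playthrough =
        [[AEPlaythroughChannel alloc] initWithAudioController:self.audioController];
    [self.audioController addInputReceiver:playthrough];
    [self.audioController addChannels:@[playthrough]];

    if (![self.audioController start:&error]) {
        NSLog(@"Could not start audio controller: %@", error);
    }
}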

Results: TAAE latency: 742 samples — 0.0168 seconds

Here’s a graph summarising the results:

pdtaaeComparison.png

 

It seems that TAAE has slightly less latency (the difference is not perceptible). In the future I intend to check whether latency changes when audio effects (e.g. equalisation, distortion, etc.) are added to the programs, as well as the CPU performance of programs based on TAAE and libPD for applying real-time audio effects.

Simple implementation of the Boids algorithm in Objective-C: counting time through audio

boidsScreenShot

My first attempt to implement the Boids algorithm for iOS using SpriteKit was a bit disappointing, since everything was moving really slowly when the number of agents increased above around 20 or 30 – both on the simulator and on the device.

While doing some tests for building a very accurate metronome (see previous posts), it became clear to me that audio-level, sample-based metronomes are both very accurate and CPU-friendly:

– accurate: since everything was counted on a 1/44100 of a second “frame rate”, and

– CPU friendly: since the timing mechanism was essentially delegated to the audio hardware.

– Potentially even more CPU friendly, by making two different methods for (i) defining the acceleration (once every, say, 0.3 seconds) and (ii) updating the position according to the acceleration (every, say, 0.1 seconds).

Therefore, the thought was simple and straightforward: instead of making metronome clicks at specific time intervals, an app’s view elements could be updated. Take a look at the ABetterMetronome thread and things will become clear 😉 A minimal sketch of the idea is given below.
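The sketch uses TAAE’s AEBlockChannel as a sample-accurate clock; the update interval, the audioController property and the updateBoidPositions method are hypothetical placeholders:

#import "TheAmazingAudioEngine.h"

// Sketch: use the audio render callback as a sample-accurate clock for
// updating the boids; the channel itself only outputs silence.
- (void)startAudioClock {
    __block UInt64 framesSinceUpdate = 0;
    const UInt64 framesPerUpdate = 4410;   // ~0.1 seconds at 44100 Hz

    __weak typeof(self) weakSelf = self;
    AEBlockChannel *clockChannel = [AEBlockChannel channelWithBlock:
        ^(const AudioTimeStamp *time, UInt32 frames, AudioBufferList *audio) {
            // Fill the output with silence; this channel is only used for timing.
            for (UInt32 i = 0; i < audio->mNumberBuffers; i++) {
                memset(audio->mBuffers[i].mData, 0, audio->mBuffers[i].mDataByteSize);
            }
            framesSinceUpdate += frames;
            if (framesSinceUpdate >= framesPerUpdate) {
                framesSinceUpdate -= framesPerUpdate;
                // UI and SpriteKit work must leave the real-time audio thread.
                [[NSOperationQueue mainQueue] addOperationWithBlock:^{
                    [weakSelf updateBoidPositions];   // hypothetical method
                }];
            }
        }];
    [self.audioController addChannels:@[clockChannel]];
}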

Here’s the code! To run it you need to download The Amazing Audio Engine (TAAE), import it properly and place the TAAE main folder one level above the project’s folder.

A very accurate iOS metronome based on the Amazing Audio Engine and Pure Data

simpleMetro

Making a *very* accurate iOS metronome has been discussed under two different approaches:

1) using The Amazing Audio Engine (TAAE) approach, as presented in the ABetterMetronome example and

2) using Pure Data (PD) and libPD for iOS as presented in the linked video.

The aforementioned approaches are completely accurate, but in both examples there are some things that make them difficult to extend to a not-only-metronome kind of app:

1) The TAAE approach is based on synthesising the metronome’s “click” sound sample-by-sample. The ABetterMetronome tutorial does not show whether and how we could use other means to produce the metronome “click” sound, or play any other instrument note or sound.

2) In the PD approach, time scheduling is performed “inside” the PD patch. It seems that all the music-side reasoning has to be performed inside the PD patch, while it would be more reasonable to do the musical reasoning part of the code in Objective-C or Swift instead.

Therefore, both examples discuss simple cases of making an accurate metronome. However, there is no discussion on how to harness the immense timing accuracy that they both have for iOS programming with more sophisticated musical reasoning. For instance, if you are making an app that decides in real time which notes should be played at certain (accurate) time intervals, both aforementioned examples make it seem hard – while all the info is in there and it is actually quite easy!

The attached file is again a simple metronome that combines the TAAE and the PD approaches. However, in this example the programmer can decide on the Objective-C side of the code which notes are to be played on the PD side. Hence the block frame counter approach of the TAAE example is combined with the facilities that PD offers, bringing both accuracy and creativity to music-based app development.

Bonus: In the attached example, there is also a not-so-pretty but functional screen that updates the current beat of the metronome. It might seem trivial but it is not – when you need to update UI elements from inside an AEBlockChannel, the update has to go through an NSOperationQueue block… Check it out!
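To make the combination more concrete, here is a stripped-down sketch of the pattern (not the attached project itself): the AEBlockChannel counts frames, the musical decision is taken on the Objective-C side, the note is sent to the PD patch as a message, and the UI is updated through the main operation queue. The receive name “note”, the beatLabel and audioController properties and the note-choosing logic are placeholders, and PD’s own audio rendering (which the attached project takes care of) is left out:

#import "TheAmazingAudioEngine.h"
#import "PdBase.h"

// Sketch of the combined TAAE + libPD approach; see the placeholders noted above.
- (void)startMetronomeWithBPM:(double)bpm sampleRate:(double)sampleRate {
    const UInt64 framesPerBeat = (UInt64)(sampleRate * 60.0 / bpm);
    __block UInt64 frameCounter = 0;
    __block int beat = 0;

    __weak typeof(self) weakSelf = self;
    AEBlockChannel *metronome = [AEBlockChannel channelWithBlock:
        ^(const AudioTimeStamp *time, UInt32 frames, AudioBufferList *audio) {
            // Silence this channel's own output; the sound itself comes from PD.
            for (UInt32 i = 0; i < audio->mNumberBuffers; i++) {
                memset(audio->mBuffers[i].mData, 0, audio->mBuffers[i].mDataByteSize);
            }
            frameCounter += frames;
            if (frameCounter >= framesPerBeat) {
                frameCounter -= framesPerBeat;
                beat = (beat % 4) + 1;

                // Musical decision in Objective-C, sound production in the PD patch.
                float midiNote = 60.0f + (beat == 1 ? 12.0f : 0.0f);  // placeholder logic
                [PdBase sendFloat:midiNote toReceiver:@"note"];

                // UI updates must not happen on the audio thread.
                [[NSOperationQueue mainQueue] addOperationWithBlock:^{
                    weakSelf.beatLabel.text = [NSString stringWithFormat:@"%d", beat];
                }];
            }
        }];
    [self.audioController addChannels:@[metronome]];
}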

Accurate timing in iOS: don’t do it this way!

Following this great tutorial from HDEZ for making an iOS metronome with libPD, one can make a very precise metronome.

However, if someone needs Pd only for getting beat-bang messages into Obj-C code, then things do not work so well…

The linked metronome app, besides being ugly, is also not accurate. So if you need a steady beat pulse provider for your music app, don’t do it this way! I’ll focus on the accurate timing mechanism I settled on in a later post. To run it, make sure you properly import libPD as described in these videos.

Screen Shot 2015-12-01 at 20.07.45

 

P.S. This project may be no good for its timing aspects, but it is probably the first implemented example online that discusses sending messages from Pd to Obj-C and not the other way around! A minimal sketch of the receiving side is given below.
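The receiving side goes through libpd’s PdReceiverDelegate protocol; the send name “toObjC” is a placeholder for whatever [s …] name the patch actually uses:

#import "PdBase.h"

// Minimal sketch of receiving bangs from Pd in Objective-C.
@interface BeatReceiver : NSObject <PdReceiverDelegate>
@end

@implementation BeatReceiver

- (instancetype)init {
    if ((self = [super init])) {
        [PdBase setDelegate:self];       // route messages from Pd to this object
        [PdBase subscribe:@"toObjC"];    // listen to the patch's [s toObjC] send name
    }
    return self;
}

// Called whenever the patch sends a bang to [s toObjC].
- (void)receiveBangFromSource:(NSString *)source {
    NSLog(@"bang received from %@", source);
}

@end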

Business through Science: The extra mile in business thinking.

PICT_20151027_094642

Background

While in Malaga, attending the huge ISMIR 2015 conference, I had the chance to see a presentation by Jeff C. Smith – a “tutorial” actually – where he presented a new kind of research that would academically benefit the MIR (Music Information Retrieval) community, giving them new ground to expand their scientific quests. Jeff, after receiving a PhD from CCRMA (and probably prior to that), works at Smule, a company that creates music-related apps for iOS devices. These apps range from karaoke singing to simple piano-like interfaces.

What the MIR community does: researchers in MIR-related academic and industrial institutes work on extracting information from music, e.g. identifying chords or onsets in recordings, creating music recommendation systems (like Last.fm or Spotify), etc. Every year, during the annual MIR conference (ISMIR), they also organise the MIREX (MIR Evaluation eXchange) competitions, where researchers try to build algorithms that beat previous ones on tasks like “find the correct chords from audio”. However, most of these competitions have been well established for many years and the improvements are not so impressive – say, from 95% to 95.5%.

The key thing in Smule’s strategic planning: Smule, being so successful, has many people using its products. So they took the rational step of keeping lots of user data (like age and gender, as well as various app usage data) on their servers, in order to be able to analyse them when planning their marketing strategies. Having such a huge database of user data, they also made an analytics tool that lets them acquire any piece of information they want through customised queries.

The extra mile in business thinking: what Jeff presented was Smule’s proposal to offer all the data they have to the MIR community, which is hungry for new tasks and new research directions. Thereby, MIR researchers, who are ambitious for scientific discoveries, get the chance to initiate new research ventures, trying to figure out what people do and how they are affected when they use these apps. Will this task be scientifically interesting? It will! It’s a huge chance to study human behaviour in musical tasks.

Summarising thoughts:

1) Smule will probably get what they want by offering scientists something that they want.

2) Planning such large scale win-win scenarios seems ingeniously simple!

3) In contrast to some evil-looking industry-science collaborations (see wars, banks), such initiatives remind us that there are ventures that do good to both sides – and to society too.

4) Bonus comment: according to the analytics that Jeff presented, there are over 10M users of such apps! For those who care, there is a market for iOS music-games related apps out there.

P.S. The picture is of Asteris Zacharakis and myself (exhibiting 2 out of 4 COINVENT posters) during the ISMIR 2015 conference.