Trimming audio files with Audio::Nama
Our task
The elves' workshop has been as busy as ever this year. To help them pass the time when not gossiping or recounting heroic pranks of Christmases passed, Igvik Graybeard, one of the IT elves, has been piping music to the office, canteen and tool benches using cabling left over from an obsolete phone service.
In the last years the elves have gotten into radio plays, and Igvik wants to edit some old recordings he has to remove unwanted segments. In other words, he wants to extract relevant parts of an audio file, stitching them together into a new file. Igvik runs Linux and prefers to do his work in the terminal wherever possible. We've decided to help.
Our tools
We'll be accomplishing this with Audio::Nama, a multitrack recording mixing and audio-processing application written in perl, and Ecasound, a general-purpose audio engine. We also need git, which Nama uses to manage project state, provide for branching, undo, etc. nama
is the executable program script and also man page, so "man nama" for the docs, or use nama's internal help.
Nama configures Ecasound for recording, mixing and a variety of tasks via an audio network definition called a chain setup. Aside from housekeeping duties, Nama's main task is to regenerate the chain setup (and re-configure the engine) whenever the graph changes due to user input, such as deactivating a track or changing an output -- a bit like how a spreadsheet recalculates as necessary.
The first run of nama
creates a $HOME/nama
directory for project files and a configuration file $HOME/.namarc
.
Nama assumes a linux environment: that you'll be recording or listening from the default ALSA soundcard device, usually your computer's built-in soundcard. You can change this setting in namarc
.
Create a new project, add a track and import an audio file
We'll create a new project called 'christmas'.
$ nama -c christmas
We get a startup banner, then an empty track listing:
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus CH 1/2 100 50
2 Mixdown OFF track Main -- -- --
nama christmas Mixdown >
In the prompt, 'christmas' is the project, 'Mixdown', the current track.
We create a separate track to accommodate our audio file, then import the file.
nama christmas Mixdown > add-track carol
nama christmas carol > import ~/music/christmas-carol.mp3
format: s16_le,2,12000
importing /home/jroth/music/christmas-carol.mp3 as /home/jroth/nama/christmas/.wav/carol_1.wav, converting to s16_le,2,44100,i
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus CH 1/2 100 50
2 Mixdown OFF track Main -- -- --
3 carol PLAY carol_1.wav Main bus 100 50
We can now press SPACE to start/stop playback.
In the track listing above we see that track 'carol' with source carol_1.wav
is set to PLAY. The output of 'carol' feeds the 'Main' track via the 'Main' bus. The 'Main' track serves as the master fader with output to the sound card. Like 'carol', all new tracks start as members of the 'Main' bus.
Define our clips
We will use a specialized type of bus called a 'sequence' to stitch together clips from our radio play. First we need to mark the beginning and end of each clip we want to keep.
We do that by listening to the track and using the clip-here
command (bound to the F1 key) to mark clip starts and ends.
After some trial and error, we have the correct marks. In this example: four five-minute sections of content separated by 40-second breaks.
nama christmas carol > list-marks
0 40.0 clip-start-0020
1 340.0 clip-end-0021
2 380.0 clip-start-0022
3 680.0 clip-end-0023
4 720.0 clip-start-0024
5 1020.0 clip-end-0025
6 1060.0 clip-start-0026
*7 1360.0 clip-end-0027
The final gathering
We can now assemble our clips.
nama christmas carol > gather
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus CH 1/2 100 50
2 Mixdown OFF track Main -- -- --
3 carol MON carol sequence Main bus 100 50
Note that the source for track 3 has changed from carol_1.wav
to carol sequence
.
Having listened to the output and resolved any hiccups (there were none!) we are ready to render it to an audio file.
nama christmas carol > mixdown
Enabling mixdown to file
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus Mixdown 100 50
2 Mixdown REC v1 Main -- -- --
3 carol MON carol sequence Main bus 100 50
Now at: 0:00
Engine is ready.
Press SPACE to start the mixdown.
Engine is running
Engine is finished at 16:27
recorded: Mixdown_1.wav
Setting mixdown playback mode.
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main OFF Main bus -- 100 50
2 Mixdown PLAY Mixdown_1.wav -- -- --
3 carol MON OFF carol sequence -- 100 50
The mixdown step also generates ogg and mp3 encodings with oggenc and lame, respectively, if desired. These files appear in the project directory:
~/nama/christmas/christmas_1.mp3
~/nama/christmas/christmas_1.ogg
~/nama/christmas/christmas_1.wav
is a symlink to ~/nama/christmas/.wav/Mixdown_1.wav
How the sausage is made
Chain setup graph
An Ecasound chain setup is made of signal-processing chains--edges in the processing graph--and loop devices, which are nodes capable of summing their inputs.
A chain represents a stream with one input and one output.
Here is a setup of one chain, omitting general options related to buffer sizes, etc.
# audio inputs
-a:3 -f:s16_le,2,44100 -i:/home/jroth/nama/christmas/.wav/carol_1.wav
# audio outputs
-a:3 -o:alsa,default
There is one chain, with label 3
. Its input is from a .wav file, and its output goes to the default soundcard. There is an -f
argument that describes the audio signal bit width, channel count and sample frequency.
Chains can terminate at an audio file, a sound device or a loop device. A loop device is a buffer stage that can sum the signal from one or more chains. It supplies the sum to another chain (or possibly multiple chains.) A loop device with its sources and an output chain can represent a mixer or bus.
Significantly for our task, a chain can represent an audio clip, part of an audio file that plays at a particular time.
-a:5 -f:s16_le,2,44100 -i:playat,40,select,80,300,/home/jroth/nama/christmas/.wav/carol_1.wav
The 'select,80,300' object extracts 300 seconds of audio starting at 0:80. The 'playat,40' object plays the audio after a delay of 40 seconds.
Chains also perform effects processing. They can host plugins and plugin controllers called chain operators included in Ecasound and provided by numerous LADSPA and LV2 authors.
Basic mixer configuration
Each time a change in track status occurs, Nama generates a Chain setup file and commands Ecasound to load it.
Let's start with the mixer configuration when we imported our radio play.
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus CH 1/2 100 50
2 Mixdown OFF track Main -- -- --
3 carol PLAY carol_1.wav Main bus 100 50
This is the corresponding chain setup:
nama christmas carol > chains
# audio inputs
-a:1 -f:s16_le,16,44100,i -i:loop,Main_in
-a:3 -f:s16_le,2,44100 -i:/home/jroth/nama/christmas/.wav/carol_1.wav
# audio outputs
-a:1 -o:alsa,default
-a:3 -f:s16_le,16,44100,i -o:loop,Main_in
Above we have two chains, labeled 1
, and 3
, corresponding to our tracks 'Main' and 'carol'.
Chain 3
reads carol_1.wav
and sends its output to loop device Main_in
. Chain 1
read the stream from Main_in
and supplies it to the soundcard.
Debug output
By starting Nama with the -L ChainSetup
logging option, we can see the steps from Nama's representation of the routing graph to the final chain setup. Initially, tracks in the graph are allowed to connect directly. Then the graph is simplified. Dead branches are removed. Finally, loop devices are introduced between tracks. I only include the graph where it changes.
[1513] ChainSetup (L 106) Graph after bus routing:
wav_in-carol, carol-Main
[1514] ChainSetup (L 111) Graph after aux sends:
[1514] ChainSetup (L 114) Graph with paths from Main:
wav_in-carol, carol-Main, Main-soundcard_out
[1514] ChainSetup (L 117) Graph with mixdown mods:
[1514] ChainSetup (L 216) Graph after simplify_send_routing:
[1514] ChainSetup (L 218) Graph after remove_out_of_bounds_tracks:
[1514] ChainSetup (L 220) Graph after recursively_remove_inputless_tracks:
[1515] ChainSetup (L 222) Graph after recursively_remove_outputless_tracks:
[1515] ChainSetup (L 122) Graph after pruning unterminated branches:
[1515] ChainSetup (L 126) Graph after adding loop devices:
wav_in-carol, carol-Main_in, Main_in-Main, Main-soundcard_out
[1515] ChainSetup (L 130) Graph with inserts:
So this is our final graph:
wav_in-carol, carol-Main_in, Main_in-Main, Main-soundcard_out
At this point, one end of each edge is a track. An edge--associating one track and one terminal--is sufficient to write an input or output line in the chain setup.
This happens through dispatch on the type of terminal, and generates IO objects, each used to write one line of the chain setup: a different class for each type of terminal.
[1515] ChainSetup (L 360) dispatch: wav_in -> carol
[1515] ChainSetup (L 362) name: carol, endpoint: wav_in, direction: input
[1516] ChainSetup (L 360) dispatch: Main -> soundcard_out
[1516] ChainSetup (L 362) name: Main, endpoint: soundcard_out, direction: output
[1516] ChainSetup (L 360) dispatch: carol -> Main_in
[1516] ChainSetup (L 362) name: carol, endpoint: Main_in, direction: output
[1516] ChainSetup (L 360) dispatch: Main_in -> Main
[1516] ChainSetup (L 362) name: Main, endpoint: Main_in, direction: input
bless( {
chain_id_ => 3,
endpoint_ => "wav_in",
track_ => "carol",
}, 'Audio::Nama::IO::from_wav' )
bless( {
chain_id_ => 1,
endpoint_ => "soundcard_out",
track_ => "Main",
}, 'Audio::Nama::IO::to_alsa_soundcard_device' )
bless( {
chain_id_ => 3,
device_id_ => "loop,Main_in",
endpoint_ => "Main_in",
track_ => "carol",
}, 'Audio::Nama::IO::to_loop' )
bless( {
chain_id_ => 1,
device_id_ => "loop,Main_in",
endpoint_ => "Main_in",
track_ => "Main",
}, 'Audio::Nama::IO::from_loop' )
These classes provide methods that can be overridden by field values annotated on the graph edges or graph nodes, with the edge (chain) values having the highest priority.
Overrides are used for mixdown and anywhere values for an IO object need to differ from those derived from the parent track.
Assembling our audio clips
We'll look at the project after we issued the gather
command and entered <mixdown>, ready to render the sequenced clips. The output of track 'Main' feeds track 'Mixdown' which will write Mixdown_1.wav
.
No. Name Requested Status Source Destination Vol Pan
================================================================================
1 Main MON Main bus Mixdown 100 50
2 Mixdown REC v1 Main -- -- --
3 carol MON carol sequence Main bus 100 50
If we show hidden tracks, we see our four clips, each represented as a track. The source for each is the same audio file.
nama christmas carol > sha # show all tracks
No. Name Requested Status Source Destination Vol Pan
===============================================================================
1 Main MON Main bus CH 1/2 100 50
2 Mixdown OFF track Main -- -- --
3 carol MON carol sequence Main bus 100 50
4 carol-1-carol-v PLAY carol_1.wav carol sequen 100 50
5 carol-2-carol-v PLAY carol_1.wav carol sequen 100 50
6 carol-3-carol-v PLAY carol_1.wav carol sequen 100 50
7 carol-4-carol-v PLAY carol_1.wav carol sequen 100 50
In the chain setup below we see that select
and playat
objects are used in the inputs to define a region of the audio file and play it with a given delay. The values were generated from our list of marks.
# audio inputs
-a:1 -f:s16_le,16,44100,i -i:loop,Main_in
-a:3 -f:s16_le,16,44100,i -i:loop,carol_in
-a:4 -f:s16_le,2,44100 -i:select,40,300,/home/jroth/nama/christmas/.wav/carol_1.wav
-a:5 -f:s16_le,2,44100 -i:playat,300,select,380,300,/home/jroth/nama/christmas/.wav/carol_1.wav
-a:6 -f:s16_le,2,44100 -i:playat,600,select,720,300,/home/jroth/nama/christmas/.wav/carol_1.wav
-a:7 -f:s16_le,2,44100 -i:playat,900,select,1060,300,/home/jroth/nama/christmas/.wav/carol_1.wav
-a:Mixdown -f:s16_le,16,44100,i -i:loop,Main_out
# audio outputs
-a:1 -f:s16_le,16,44100,i -o:loop,Main_out
-a:3 -f:s16_le,16,44100,i -o:loop,Main_in
-a:4,5,6,7 -f:s16_le,16,44100,i -o:loop,carol_in
-a:Mixdown -f:s16_le,2,44100,i -o:/home/jroth/nama/christmas/.wav/Mixdown_1.wav
Some overriding was involved, as we see -a:Mixdown
, a chain identifier that is not a track number. I think there was some technical reason, but it does stand out.
Thanks for reading!
Showcase
Jeanette Claassen has produced a large number of original songs using Nama in combination with various other music software and hardware. She began recording music using Ecasound and migrated to Nama as it became sufficiently featureful.
Notable dependencies
Git::Repository: Manage project history, branching, undo
Log::Log4perl: Logging, debugging
Parse::RecDescent: Command language parser
Object::Tiny: Accessor generator
Role::Tiny: Breaking up large classes
Term::ReadLine::Gnu: Terminal support, tab completion, history, hotkey mappings
Term::TermKey: More hotkey mappings
Graph: Representing, traversing and transforming the signal processing network
AnyEvent: Event loop
Tk: GUI, available with the -g option, use at your own risk!
Data::Section::Simple: Access sections within the __DATA__ block
Memoize: Turn expensive, redundant calculations into cheap lookups
Devel::NYTProf: Profiling to find and solve bottlenecks