Saturday, September 3, 2011

Fun with WAV

Last year, a "Fun with wav" post got a lot of visibility. The author wanted to read the header of an audio file in the WAV format and output the sum of the remaining data in the file. He states several times that this was a specific hack and not a generalized solution.

Although the original solution was in C, he received other possible solutions on Reddit. The "winner" by code golf rules is probably Ruby:

data = ARGF.read
keys = %w[id totallength wavefmt format
          pcm channels frequency bytes_per_second
          bytes_by_capture bits_per_sample
          data bytes_in_data sum
]
values = data.unpack 'Z4 i Z8 i s s i i s s Z4 i s*'
sum = values.drop(12).map(&:abs).inject(:+)
keys.zip(values.take(12) << sum) {|k, v|
      puts "#{k.ljust 17}: #{v}"
}

Build It

While not attempting to "golf", I wanted to show how this might be implemented in Factor. First, some imports:

USING: alien.c-types classes.struct kernel io
io.encodings.binary io.files math specialized-arrays ;

FROM: sequences => map-sum ;

IN: wavsum

Each WAV file begins with a "master RIFF chunk" followed by format information and the sampled data. We could read each field individually, or we can capture this header information directly into a packed structure (I added support for packed structures in January and recently merged that work into the main Factor repository).

PACKED-STRUCT: header
    { id char[4] }
    { totallength int }
    { wavefmt char[8] }
    { format int }
    { pcm short }
    { channels short }
    { frequency int }
    { bytes_per_second int }
    { bytes_by_capture short }
    { bits_per_sample short }
    { data char[4] }
    { bytes_in_data int } ;

We can easily read from an input stream directly into this structure:

: read-header ( -- header )
    header [ heap-size read ] [ memory>struct ] bi ;
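For comparison, here is a rough Ruby sketch of the same idea: read exactly as many bytes as the packed header occupies (44, since there is no padding between fields) and decode them in one step. The names HEADER_FIELDS, HEADER_FORMAT, HEADER_SIZE and read_wav_header are made up for this illustration, and the header bytes are synthetic. WAV data is little-endian, so the unpack directives say so explicitly rather than relying on the native byte order.

```ruby
require 'stringio'

# Field names follow the packed struct above; the constants and the
# helper below are hypothetical names used only for this sketch.
HEADER_FIELDS = %i[id totallength wavefmt format pcm channels frequency
                   bytes_per_second bytes_by_capture bits_per_sample
                   data bytes_in_data].freeze

# a4/a8 = raw 4- and 8-byte strings, l< = 32-bit little-endian signed,
# s< = 16-bit little-endian signed.
HEADER_FORMAT = 'a4 l< a8 l< s< s< l< l< s< s< a4 l<'

# 4 + 4 + 8 + 4 + 2 + 2 + 4 + 4 + 2 + 2 + 4 + 4 bytes, no padding.
HEADER_SIZE = 44

# Read one header from an IO-like object and return it as a Hash.
def read_wav_header(io)
  bytes = io.read(HEADER_SIZE)
  raise ArgumentError, 'header too short' if bytes.nil? || bytes.bytesize < HEADER_SIZE
  HEADER_FIELDS.zip(bytes.unpack(HEADER_FORMAT)).to_h
end

# Synthetic header with made-up values, not taken from a real file.
fake = ['RIFF', 36, 'WAVEfmt ', 16, 1, 1, 22050, 44100, 2, 16, 'data', 0].pack(HEADER_FORMAT)
header = read_wav_header(StringIO.new(fake))
puts fake.bytesize
puts header[:frequency]
```

Reading a fixed number of bytes first means the stream is left positioned at the start of the sample data, which is what the next step relies on.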

The original solution then produced a sum of the remaining file, treated as a sequence of shorts (16-bit integers).

SPECIALIZED-ARRAY: short

: sum-contents ( -- sum )
    contents short-array-cast [ abs ] map-sum ;
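The same calculation can be sketched in Ruby as a sanity check on what the sum means: treat the bytes as signed 16-bit little-endian integers and add up their absolute values. The helper name sum_abs_samples is hypothetical, and dropping a trailing odd byte is a choice made for this sketch, not something taken from the original solutions.

```ruby
# Sum the absolute values of signed 16-bit little-endian samples.
# A trailing odd byte, if any, is ignored (an assumption of this sketch).
def sum_abs_samples(bytes)
  usable = bytes.bytesize - (bytes.bytesize % 2)
  bytes.byteslice(0, usable).unpack('s<*').sum(&:abs)
end

# Four made-up samples: 1 + 2 + 3 + 32768
samples = [1, -2, 3, -32768].pack('s<*')
puts sum_abs_samples(samples)
```

Ruby integers do not overflow, so the total is exact however large the file is.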

Producing a "wavsum" from a file:

: wavsum ( path -- header sum )
    binary [ read-header sum-contents ] with-file-reader ;

Try It

We can try it on a sample WAV file that I included with the vocabulary, and we get the same output as the Ruby and C versions:

( scratchpad ) "vocab:wavsum/truck.wav" wavsum [ . ] bi@
S{ header
    { id char-array{ 82 73 70 70 } }
    { totallength 66888 }
    { wavefmt char-array{ 87 65 86 69 102 109 116 32 } }
    { format 50 }
    { pcm 2 }
    { channels 1 }
    { frequency 22050 }
    { bytes_per_second 10752 }
    { bytes_by_capture 512 }
    { bits_per_sample 4 }
    { data char-array{ 32 0 -12 3 } }
    { bytes_in_data 16777223 }
}
392717699

It might be useful to add some validation to this example (much like the original C version) for such things as endianness and the 16-bit WAV format. Alternatively, we could improve it to be more general to handle 8-bit or 24-bit encodings, as well as other header formats (not just the "extended WAV" format).
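As a minimal sketch of what such validation might look like, written in Ruby to match the earlier listing, the checks below collect the reasons a header does not look like plain 16-bit PCM. The helper name header_problems and the exact set of checks are assumptions for illustration and are not taken from the original C version.

```ruby
# Return a list of reasons the header is not plain 16-bit PCM WAV.
# The header is a Hash keyed by the field names used in the struct above.
def header_problems(header)
  problems = []
  problems << 'id is not "RIFF"' unless header[:id] == 'RIFF'
  problems << 'tag is not "WAVEfmt "' unless header[:wavefmt] == 'WAVEfmt '
  problems << 'format tag is not 1 (PCM)' unless header[:pcm] == 1
  problems << 'samples are not 16-bit' unless header[:bits_per_sample] == 16
  problems << 'data tag is not "data"' unless header[:data] == 'data'
  problems
end

# Values copied from the truck.wav output printed above.
truck = {
  id: 'RIFF', wavefmt: 'WAVEfmt ', pcm: 2, bits_per_sample: 4,
  data: [32, 0, 244, 3].pack('C*') # bytes 32 0 -12 3, shown unsigned
}
puts header_problems(truck)
```

Run against the values printed for truck.wav, these checks flag the format tag, the sample size, and the data tag, which suggests the sample file is not plain 16-bit PCM and that a check like this would be worth having.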

The code for this is on my GitHub.
