At long last, success is mine! I have overcome my Theora encoding difficulties by adding audio and video buffering layers to the reference encoder shipped with libtheora. It’s only about 300 lines, but it took me many days because I had to teach myself much of what I needed to know along the way, at least as far as memory allocation, pointers, and related C syntax were concerned. (Thanks to my U’s subscription to Safari Books Online, and to the all-powerful Google search for the advanced Linux I/O stuff too.)

As I discussed earlier, the problem actually stems from MPlayer filling the capacity of one or the other of the named pipes (audio or video) before writing any data to the other. With one of MPlayer’s output pipes full, the decoding process is blocked, and with one of the encoder’s input pipes empty, encoding is blocked, so the two processes deadlock each other. The solution is to suck up data on the full pipe and just buffer it – a process that I found often must be repeated numerous times – before MPlayer starts giving out data on the desired pipe again.
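To make the idea concrete, here is a minimal sketch of one "suck up the full pipe" step. It is not the code from my encoder: the function name `spill_other` and its parameters are made up for illustration, and it assumes `poll()` is used to see which pipe has data. If the pipe we actually want is still empty but the other one has data, it reads one block from the other pipe so the writer can move on.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for data on either pipe.
 * If the wanted pipe has data, do nothing and return 0.
 * If only the other pipe has data, read up to cap bytes from it into buf
 * (freeing space in that pipe) and return the number of bytes read.
 * Returns 0 if nothing was ready, -1 on error.
 */
ssize_t spill_other(int wanted_fd, int other_fd,
                    unsigned char *buf, size_t cap, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = wanted_fd, .events = POLLIN },
        { .fd = other_fd,  .events = POLLIN },
    };

    assert(cap > 0);

    if (poll(fds, 2, timeout_ms) < 0)
        return -1;

    if (fds[0].revents & POLLIN)
        return 0;                       /* wanted data is there, nothing to spill */

    if (fds[1].revents & POLLIN)
        return read(other_fd, buf, cap); /* drain one block from the full pipe */

    return 0;                           /* neither pipe is readable yet */
}
```

In a real loop this would be called repeatedly, stashing each spilled block in a buffer, until the wanted pipe finally becomes readable.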

Because the amount of data to be buffered is unknown and can be quite large, dynamic allocation/creation of data structures was necessary. The buffers are basically linked lists with each node containing a pointer to a 64k memory block (a size chosen to match the current Linux kernel’s named pipe capacity). This allows me to avoid copying large amounts of data – usually it is left in the original memory location it was read into until some portion of it must be placed in a contiguous block of memory with existing data for consumption by libtheora.
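For the curious, a buffer like that can be sketched roughly as follows. This is a simplified illustration rather than my actual code: the names `chunk`, `chunk_queue`, `chunk_new`, `queue_push` and `queue_pop` are invented here, and the start/end offsets are just one way of tracking partially consumed blocks. The point is that a block is read into once, linked into the list without copying, and only copied when the consumer needs bytes in contiguous memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 65536 /* 64k per node, as described above */

struct chunk {
    unsigned char *data; /* BLOCK_SIZE bytes, meant to be filled directly by read() */
    size_t start;        /* offset of the first unconsumed byte */
    size_t end;          /* offset one past the last valid byte */
    struct chunk *next;
};

struct chunk_queue {
    struct chunk *head;
    struct chunk *tail;
    size_t total;        /* unconsumed bytes across all chunks */
};

/* Allocate an empty chunk with its 64k block; returns NULL on failure. */
struct chunk *chunk_new(void)
{
    struct chunk *c = calloc(1, sizeof *c);
    if (c == NULL)
        return NULL;
    c->data = malloc(BLOCK_SIZE);
    if (c->data == NULL) {
        free(c);
        return NULL;
    }
    return c;
}

/* Append a chunk whose first `filled` bytes are valid. No data is copied. */
void queue_push(struct chunk_queue *q, struct chunk *c, size_t filled)
{
    assert(filled <= BLOCK_SIZE);
    c->start = 0;
    c->end = filled;
    c->next = NULL;
    if (q->tail != NULL)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    q->total += filled;
}

/* Copy up to n bytes into contiguous memory at dst, freeing chunks as
 * they are drained. Returns the number of bytes actually copied. */
size_t queue_pop(struct chunk_queue *q, unsigned char *dst, size_t n)
{
    size_t copied = 0;
    while (copied < n && q->head != NULL) {
        struct chunk *c = q->head;
        size_t avail = c->end - c->start;
        size_t take = (n - copied < avail) ? n - copied : avail;
        memcpy(dst + copied, c->data + c->start, take);
        c->start += take;
        copied += take;
        if (c->start == c->end) {       /* chunk fully consumed */
            q->head = c->next;
            if (q->head == NULL)
                q->tail = NULL;
            free(c->data);
            free(c);
        }
    }
    q->total -= copied;
    return copied;
}
```

Freeing each node as soon as it is drained keeps memory use proportional to what is actually backed up, and makes leaks easier to spot.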

Another difficulty was properly detecting and handling the case where data cannot be read from either pipe because MPlayer has fallen behind the encoder.
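One way to tell "no data yet" apart from "the writer is finished" is to put the pipe in non-blocking mode and look at what `read()` returns. The sketch below shows that approach; it is an assumption about how such a check could be done, not a copy of my encoder, and the names `set_nonblocking`, `try_read` and the `read_result` enum are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_result { READ_DATA, READ_EMPTY, READ_EOF, READ_ERROR };

/* Switch a descriptor to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Attempt one read and classify the outcome. *got receives the byte count. */
enum read_result try_read(int fd, void *buf, size_t cap, size_t *got)
{
    ssize_t n;

    assert(got != NULL);
    *got = 0;

    n = read(fd, buf, cap);
    if (n > 0) {
        *got = (size_t)n;
        return READ_DATA;
    }
    if (n == 0)
        return READ_EOF;    /* every writer has closed the pipe */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return READ_EMPTY;  /* writer is just behind; try again later */
    return READ_ERROR;
}
```

When both pipes report `READ_EMPTY`, the encoder has simply caught up with MPlayer and can wait (for example with `poll()`) instead of treating it as a failure.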

Performance-wise, I am very pleased with my initial tests – CPU utilization is virtually indistinguishable between my enhanced encoder and the original one when encoding data in real time. And no memory leaks either 🙂

Since my first courses in Computer Science 5 years or so ago, I’ve been taught almost exclusively in languages that work like Java & PHP. I’ve gotta say, there’s definitely something cool about doing your own memory management, finding bits with pointer arithmetic, and producing fast machine code at the end. I feel quite accomplished to have figured everything out.