It’s been a bit since there’s been any word from me on here, and I’ve got some news about the status of my SoC efforts and their ultimate fate. Basically, since SoC wrapped up, my schedule has been completely full with academics and a 17 – 19 hr/week job. Truth be told, I’d rather have been continuing my work with MediaWiki than either of the above, but such was life. Some time ago, Brion (aka my mentor) and I discussed my situation, and he offered to look into having the foundation hire me as a contractor to continue my work. This seems to have gotten approved, and currently we’re working on the specifics. I’ve dropped to 4 – 5 hours of work at my other job to make room in my schedule.

Basically, what I have now can handle a large volume of uploads for one wiki, but there’s no way to have one recoding farm share the recoding tasks generated by numerous wikis. This is of course necessary for use on Wikimedia’s sites, but it’s something I overlooked, being new to the ways of Wikimedia. So, expanding it to handle multiple wikis is priority one.

Well, I closed my books, Google searches, and my editor last Monday, and turned in my code.

 The biggest thing I think I learned this summer is not to spend too long getting caught up on a single aspect/feature of your project if you can’t figure it out. In my case, I spent entirely too long in the beginning and middle stages of the project basically just reading documentation, trying to build a complete picture in my mind of how MW stores, locates, and processes uploaded files, without writing any test code myself or anything. I did this partly because I was essentially afraid I’d miss some existing work that would be useful, or else not make use of some battle-tested code someone had taken the time to design. My efforts were really pretty fruitless, though, because I was trying to grasp too much at the same time. I found later that documentation is much more useful when your cursor is sitting halfway through a line and you need to figure out what function to use next. Tackling the same problems in this way, one step at a time, was of course much easier.

I also simply didn’t realize how much more complicated it would be to start transcoding uploaded media asynchronously immediately at upload time rather than periodically polling for items in a queue. When I wrote my proposal, it said “A second script (running perhaps as a cron job) will routinely monitor the job queue…” I had done a project like that in the past, but once I was accepted and had started working, I didn’t think it would be acceptable for there to be a gap of wasted time where an upload had occurred and there was an idle transcoding node that could be processing it but wasn’t. (That project also didn’t have to simultaneously monitor and control an encoder and decoder, just one app that did both.)

My proposal also makes no commitment to distributing the transcoding load among numerous transcoding nodes, but this too I decided was a must-have feature if my work was going to be widely used. In the end, it turned out not to be that complicated to implement, but in concept it did cause me to do a lot of thinking about “how should I go about this” that I could have totally skipped otherwise. (Actually, the version of my code that will be evaluated for SoC can only be used in a distributed fashion if the nodes can all write to a network filesystem that the recoded media is to be stored on, a requirement I was hoping to lift to be prepared for the possible upcoming removal of NFS from Wikimedia’s private network…but as there was no existing documented facility to write to a file repository that isn’t locally accessible, this wasn’t done.)

I didn’t expect the existing open source MPlayer -> Theora solution to be as limited as it was, but the improvements I made to the theora reference encoder overcame this unexpected roadblock, didn’t take that much time, and turned out to be a valuable opportunity to broaden my programming skills and give me some appreciation for the “power of open source.”

Finally, I spent tons of time working on MW’s support for audio and video media formats, as I was surprised to find out it really didn’t have any at the beginning of the summer. As my past posts discuss, I wrote code to more accurately detect MIME types of these media files, identify and validate them at upload time, and the beginnings of support for extracting metadata. I didn’t think I’d be doing any of this when I wrote my proposal, but what good is a system to recode uploaded videos if you can’t upload them in the first place?

All these things caused me to be behind schedule for the majority of the summer. I did produce a usable implementation of everything in time for Google’s “pencils down” deadline, but the time crunch at the end (also contributed to by confusion about the program end date…) did cause the code to suffer somewhat. Mainly, I want to add better handling for a variety of exceptional situations within the recoding daemon and within the UploadComplete hook that does the initial pushing of bits to add the job to the queue and notify an idle node of the event. Media support in MW still sucks too, and I might want to help out with that – for example, in the existing MW design, “media handlers” are bound to a file based on the file’s MIME type. This works alright for jpeg files vs. djvu files, but not so much for video files, all of which can be examined by existing utilities like ffmpeg. Indeed, to test my code currently, one will need to map each video MIME type they want to test to the video media handler in a config file.

I’m still awaiting any feedback from my mentor. I hope of course that I’ll get a passing evaluation, but even if I didn’t I don’t think I would consider my efforts a lost cause. Surely sooner or later, MediaWiki will have full audio and video support, and I want to continue to be a part of making that happen and ensure that my efforts will be made use of as much as possible. And I hope that sometime in the future, I will wake up, scan the day’s tech headlines, see “Wikipedia adds support for videos,” and know that I had something to do with it.

Since it’s been so long, I thought I’d just let everyone know I haven’t abandoned my work, died, or anything of the like. Blogging just doesn’t get the code written, so I haven’t been spending as much time on it.

As it’s now a week into August, I can say with certainty how the queue will be implemented. I did end up going with the idea that surfaced while authoring my last post, to have a daemon implemented in php running on each recoding node. The primary means of interfacing with it is an accompanying script that must run on the webserver of a recode node (yes, they must run apache), taking input via POST, forwarding it as commands to the daemon, and returning the outcome in an easily machine-processable format. This is accomplished over named pipes, the simplest means of interprocess communication that works for this task. All requests to and responses from the daemon get prefixed with their length to ensure complete transmission of messages. Notification of new jobs, as well as cancellation of the currently running job, is then achieved by simply contacting the node (selected with aid of the DB) over http. The daemon expediently provides the notify script with results of a command so that the entire process can occur at upload time, specifically as manifested in an UploadComplete hook. (Cancellation of a current job is used if a re-upload occurs at the time that the job generated by the old file is running.) The daemon uses a persistent instance of MPlayer for all assigned jobs, via MPlayer’s slave mode.
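The length-prefixing idea can be sketched in a few lines. This is a Python illustration rather than the daemon’s PHP, and the wire format (a 4-byte big-endian length followed by the payload) and the sample command text are assumptions; the post only says that messages are prefixed with their length.

```python
import io
import struct

# Assumed wire format: 4-byte big-endian length, then the payload bytes.
HEADER = struct.Struct(">I")


def write_message(stream, payload: bytes) -> None:
    """Write one length-prefixed message to a binary stream."""
    stream.write(HEADER.pack(len(payload)) + payload)
    stream.flush()


def read_exactly(stream, count: int) -> bytes:
    """Keep reading until `count` bytes have arrived; a pipe may return fewer per read."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(stream) -> bytes:
    """Read the length prefix, then exactly that many payload bytes."""
    (length,) = HEADER.unpack(read_exactly(stream, HEADER.size))
    return read_exactly(stream, length)


# An in-memory stream stands in for the named pipe; the command is hypothetical.
pipe = io.BytesIO()
write_message(pipe, b"cancel job 42")
pipe.seek(0)
reply = read_message(pipe)
```

The receiver knows a message is complete only once the announced number of bytes has arrived, which is what protects against partial reads on a pipe.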

Although this isn’t quite as firm yet, I expect the following database design changes will facilitate operations:

  1. The image table will grow a new ENUM indicating whether the recoded editions of a particular file are available, in queue, or failed. I.E., a file is marked available when all configured recoded versions of it are finished processing.
  2. A new table will serve as the queue, and will contain only job order and the file’s database key. (I did considerable performance testing here and concluded that both columns should be indexed…retrievals will be by both order and name, and individual inserts/deletes seem to keep constant time regardless of number of records. *Very* rarely the order could get reindexed just to keep the integers from running off to infinity, perhaps as a task through the existing job queue. This all only matters if the queue has significant length anyway, but this might be the case if a wiki decides it doesn’t like its existing resample quality or size and wants to redo them all.)
  3. A new table will contain the current status of each recode node, including address to the notify script and the current job if there is one.
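A rough sketch of those three changes, using Python’s built-in SQLite so it runs anywhere. The real tables would live in MySQL, every table and column name here is hypothetical, and a CHECK constraint stands in for the ENUM since SQLite has none.

```python
import sqlite3

# Hypothetical names throughout; "image" is a stand-in for the real image table.
SCHEMA = """
CREATE TABLE image (
    img_name TEXT PRIMARY KEY,
    img_recode_status TEXT NOT NULL DEFAULT 'queued'
        CHECK (img_recode_status IN ('available', 'queued', 'failed'))
);
CREATE TABLE recode_queue (
    rq_order INTEGER NOT NULL,
    rq_img_name TEXT NOT NULL
);
-- Both columns indexed: jobs are fetched by order and removed by name.
CREATE INDEX rq_order_idx ON recode_queue (rq_order);
CREATE INDEX rq_img_name_idx ON recode_queue (rq_img_name);
CREATE TABLE recode_node (
    rn_notify_url TEXT PRIMARY KEY,
    rn_current_job TEXT
);
"""


def open_db():
    """Create an in-memory database holding the sketched schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def enqueue(db, img_name):
    """Append a file to the end of the queue."""
    row = db.execute(
        "SELECT COALESCE(MAX(rq_order), 0) + 1 FROM recode_queue"
    ).fetchone()
    db.execute("INSERT INTO recode_queue VALUES (?, ?)", (row[0], img_name))


def next_job(db):
    """Return the name of the oldest queued file, or None if the queue is empty."""
    row = db.execute(
        "SELECT rq_img_name FROM recode_queue ORDER BY rq_order LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def remove_job(db, img_name):
    """Drop a file from the queue, e.g. when it is re-uploaded or finished."""
    db.execute("DELETE FROM recode_queue WHERE rq_img_name = ?", (img_name,))
```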

One of the recode daemon’s first tasks upon starting a new job is establishing which recoded versions of a given file are missing. I just like this better than storing such job details in the queue, as it greatly simplifies the queue as well as the adding of new formats in future. As a result, the daemon does need to interact with FileRepo and friends, so a good MediaWiki install must exist on recode nodes as well. In fact, much of the MW environment is actually loaded in the daemon at all times.
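Working out what is missing amounts to a comparison between the configured versions and the ones that already exist. A minimal Python sketch, assuming a hypothetical configuration keyed by version name (the names and parameters are invented for illustration):

```python
from typing import Dict, Iterable, List

# Hypothetical configuration: version name -> encoding parameters.
CONFIGURED_VERSIONS: Dict[str, dict] = {
    "low": {"width": 160, "video_kbps": 128},
    "high": {"width": 640, "video_kbps": 768},
}


def missing_versions(
    existing: Iterable[str],
    configured: Dict[str, dict] = CONFIGURED_VERSIONS,
) -> List[str]:
    """Return configured version names that have not been produced yet, in config order."""
    have = set(existing)
    return [name for name in configured if name not in have]
```

Adding a new format then only means adding a configuration entry; anything already in the queue picks it up when its job starts.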

I’ve still got plenty to keep me busy here as the time starts getting tight.

Over the past few days I again put the queue implementation on a back burner, opting to wait for an interlibrary loan, “Advanced Unix Programming,” which I’m hoping will help me understand process sessions, process groups and such, which I’m hoping will in turn enable me to fork a process the way I’ve been intending to do. In the meanwhile, I’ve been working on further streamlining the recoding itself, and started writing the script that will run on each recoding node to oversee and synchronize mplayer, the encoder, and file transfers.

Specifically, I have further modified the encoder so that it can write a cache copy of the decompressed audio and video to disk as a command-line option. Combining this with my earlier work, not only will it be possible to do the recode of the first version of a file on the fly (streaming both directions from the file repository), but it will also be possible to make subsequent versions of the same file without redownloading or even re-decompressing it. My mentor says he had envisioned generating a few different versions (i.e., low/high bandwidth) for each file immediately at upload time, and so this scenario will be very common. Additionally, I have developed a way to have a single instance of MPlayer stay running and resident in memory which I can utilize for all jobs the node gets. It’s basically mplayer in slave mode, but with a little tweaking of mplayer.c to allow the command input stream to be re-attached to different controlling processes (a necessary step since the php control script will terminate when there are no more jobs.) Although actually, come to think of it, I could also daemonize that too, and have the script that receives new job notifications just send the control script a SIGUSR1 or somesuch, prompting it to listen for new job details on a named pipe. Then a “stock” MPlayer could be used, even in a persistently running fashion. (Oh, the things you think of when writing in your blog…glad I didn’t spend *too* long on the mplayer.c tweaks)

This suggests yet another queue system design: because it would be possible to notify a node at any time of a new job, even when it was processing another, you could just store status info on each node in the database, namely the number of jobs currently assigned, and then simply select the node with the lowest number LIMIT 1 when deciding which recode node to notify of the new upload. The daemonized recode script could maintain its own internal queue of its jobs. It’s simple and easy, but the clear downside is that if you have a node backed up by 10 jobs that drops off the network or somehow crashes, all those jobs simply disappear. I think any attempt to back up the job queues and reassign to other nodes as they go down and come back up would probably be overly complicated, and no better than my existing plan based on a centralized queue.
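The “lowest number LIMIT 1” selection would look roughly like this. It is a Python/SQLite sketch with hypothetical table, column, and URL names, and it adds a tie-break on the URL only so the result is deterministic.

```python
import sqlite3


def make_db(nodes):
    """Build an in-memory node-status table from (notify_url, assigned_jobs) pairs."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE recode_node ("
        " rn_notify_url TEXT PRIMARY KEY,"
        " rn_assigned_jobs INTEGER NOT NULL DEFAULT 0)"
    )
    db.executemany("INSERT INTO recode_node VALUES (?, ?)", nodes)
    return db


def pick_node(db):
    """Choose the node with the fewest assigned jobs and count the new job against it."""
    row = db.execute(
        "SELECT rn_notify_url FROM recode_node"
        " ORDER BY rn_assigned_jobs ASC, rn_notify_url ASC LIMIT 1"
    ).fetchone()
    if row is None:
        return None  # no nodes registered at all
    db.execute(
        "UPDATE recode_node SET rn_assigned_jobs = rn_assigned_jobs + 1"
        " WHERE rn_notify_url = ?",
        (row[0],),
    )
    return row[0]
```

Nothing here records which jobs a node holds, only how many, which is exactly why a crashed node’s backlog would be lost under this design.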

…Just thinking out loud.

In other news, as my faithful fans can see I got sick of the old layout, as I could not see the cursor when replying to comments. Also, I picked up my midterm payment today from western union, trouble-free, for once. My library book came in today too, all 1000+ pages of it, so hopefully I can make this an educational experience without getting too sidetracked.

After I got that encoder working on Friday, the rest of the weekend’s been a bit of a let-down. I attempted to start implementing the queue manager, which as per my previously discussed design must be able to accept and acknowledge new jobs synchronously, and then possibly asynchronously monitor the job as it gets processed on a recode server. In other words, I need to be able to create a new process that I can have bi-directional communication with for specification of job parameters*, and then can switch to an asynchronous profile and stay active beyond the lifetime of the parent php script. I had a few strategies in mind to do this purely in php, ranging from an apparently incorrect memory that there is a way to end the http request before the script terminates, to starting a new php script via proc_open and closing STDIN/OUT/ERR back to the parent when the asynchronous effect was desired, to starting a new php script via an http request and having the client script abort or close the connection when the synchronous communications were completed (the new script of course using ignore_user_abort()).

Unfortunately, none of these strategies works. While it is possible to close the pipes created with proc_open and have the two processes run concurrently with no interaction, the parent process still will not terminate until the child has. So, while it would be possible to output all the HTML associated with an upload confirmation immediately, the connection wouldn’t close until it timed out. (Using a combination of Connection: close and content-length headers theoretically compels the browser to close the connection at the appropriate time in a scenario like this, but there’s no guarantee…plus generating a content length really requires everything to be output-buffered 😦 ) The other method, starting a new php script via an http request, probably would work on some configurations, but falls apart when there are any web server modules buffering script output, i.e., output compression mods. Even when the client indicates the response may not be compressed, something is retaining the initial bytes of output until the entire script completes. flush() doesn’t work, nor do various other ideas I had like sending an EOF character through the lines. I tried padding the output up to 8k and got the same result, and decided a solution that required more padding than that would just suck.

So, this leaves me with few options. Because there seems to be simply no way to proc_open a new process that stays alive after the parent has terminated, I am left with starting the new process by making a web request. I am now seriously considering implementing the queue manager in Java, as an extension to one of the many lightweight Java http servers out there. In this way, I would have full control over when responses are pushed out to the client, and could finish a request and then continue doing some processing. The big downside, besides being a bit more difficult to create, is that it would require MediaWiki installations that want to asynch. recode a/v contributions to have Java and run this special http server for the purpose.


*I want this system to be able to support a means for some job parameters such as frame size to be plugged in by the user, just as it is possible for images to be sized according to where they will be used now. Probably this capability would mostly be used when a video clip appears in multiple articles/pages. Because of the more expensive processing and storage requirements associated with video, however, I don’t want to accept jobs that are very close to an existing version of a file. As a result, I am reluctant to simply run a command through the shell, because of the difficulty of communicating with those processes. Another possibility is to do all the parameter & job error checking in the main script, add it to the queue, and then launch a process through the shell that seeks out a recode server for the job and oversees it. I will talk with my mentor about this option versus a non-php solution.

At long last, success is mine! I have overcome my theora encoding difficulties by adding audio and video buffering layers to the reference encoder shipped with libtheora. It’s only about 300 lines, but took me many days because I had to teach myself much of what I needed to know along the way, at least as far as memory allocation, pointers, and related C syntax was concerned. (Thanks to my U’s subscription to Safari Books Online, and the all-powerful Google search for advanced Linux I/O stuff too)

As I discussed earlier, I established that the problem actually stems from MPlayer filling the capacity of one or the other of the named pipes (audio or video) before writing any data to the other. With one of MPlayer’s output pipes full, the decoding process is blocked, and with one of the encoder’s input pipes empty, encoding is blocked, so the two processes deadlock each other. The solution is to suck up data on the full pipe and just buffer it (a process that I found often must be repeated numerous times) before MPlayer starts giving out data on the desired pipe again.
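The core move, reading from whichever pipe currently has data instead of blocking on the empty one, can be illustrated with `select`. This is a Python sketch of the idea only, not the C code added to the encoder; the function name and the use of anonymous pipes in place of named ones are assumptions, and it requires a Unix-like system.

```python
import os
import select

PIPE_CAPACITY = 64 * 1024  # pipe size mentioned for the Linux kernels of the time


def drain_ready(read_fds, pending, timeout=0.0):
    """Read once from every descriptor that has data; never block on an empty one."""
    ready, _, _ = select.select(read_fds, [], [], timeout)
    for fd in ready:
        data = os.read(fd, PIPE_CAPACITY)
        if data:  # an empty read means the writer closed its end
            pending[fd].append(data)
    return ready


# Two anonymous pipes stand in for the video and audio FIFOs.
video_r, video_w = os.pipe()
audio_r, audio_w = os.pipe()
pending = {video_r: [], audio_r: []}

# The "decoder" has only produced video so far.
os.write(video_w, b"V" * 1000)
ready = drain_ready([video_r, audio_r], pending)
```

Because the video data is moved into memory, the writer is free to continue and eventually produce the audio the consumer is waiting for.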

Because the amount of data to be buffered is unknown and significantly large, dynamic allocation/creation of data structures was necessary. The buffers are basically linked lists with each node containing a pointer to a 64k memory block (chosen because of the current Linux kernel’s named pipe size). This allows me to avoid copying large amounts of data – usually it is left in the original memory location it was read into until some portion of it must be placed in a contiguous block of memory with existing data for consumption by libtheora.
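The shape of that buffer can be sketched as a queue of blocks that are only joined when a contiguous run is requested. This Python version is an illustration of the data structure, not the encoder patch: a `deque` stands in for the linked list, the class and method names are invented, and Python’s slicing copies bytes where the C version can avoid it.

```python
from collections import deque

BLOCK_SIZE = 64 * 1024  # one block per pipe-sized read


class ChunkBuffer:
    """FIFO byte buffer that stores whole blocks and joins them only on demand."""

    def __init__(self):
        self._blocks = deque()
        self._size = 0

    def __len__(self):
        return self._size

    def append(self, block: bytes) -> None:
        """Queue a block as-is; nothing is copied into a larger array."""
        if block:
            self._blocks.append(block)
            self._size += len(block)

    def take(self, count: int) -> bytes:
        """Remove and return up to `count` bytes as one contiguous object."""
        needed = min(count, self._size)
        taken = needed
        parts = []
        while needed > 0:
            block = self._blocks.popleft()
            if len(block) > needed:
                # Split the block and push the unread tail back on the front.
                parts.append(block[:needed])
                self._blocks.appendleft(block[needed:])
                needed = 0
            else:
                parts.append(block)
                needed -= len(block)
        self._size -= taken
        return b"".join(parts)
```

The buffer grows one block at a time, so its total size never has to be known in advance.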

Another difficulty was properly detecting and handling the case where data cannot be read from either pipe because MPlayer has fallen behind the encoder.

Performance-wise, I am very pleased with my initial tests – CPU utilization is virtually indistinguishable between my enhanced encoder and the original one when encoding data at realtime. And no memory leaks either 🙂

Since my first courses in Computer Science 5 years or so ago, I’ve been taught almost exclusively in languages that work like Java & PHP. I’ve gotta say, there’s definitely something cool about doing your own memory management, finding bits with pointer arithmetic, and producing fast machine code at the end. I feel quite accomplished to have figured everything out.

Here’s a high-level overview of the design I have decided will work best for queueing the recode jobs.

There will be three interacting components.

  1. The upload server. This is where original contributions reside.
  2. The queue manager, which can be on any machine or machines running MediaWiki.
  3. The recode servers.

 Here’s a standard use case:

  1. Video (or audio) upload is received
  2. Upload server identifies & validates upload as recodable
  3. Upload server contacts queue manager via http POST, including final url to the file.
  4. Queue manager replies with an acknowledgement of the job being added and closes the connection. At this point, the upload server can essentially forget about the job and provide upload confirmation to the user.
  5. The queue manager will track (via mysql for easy shared access among backup queue managers) the status of all recode servers. If there is an idle recode server at the time a job request comes in from the upload server, it is immediately forwarded to that recode server, perhaps with some additional parameters such as output frame size added. The recode manager then monitors periodic reports from the recode server as the job runs, see 6. If there are no idle servers, the job is added to a queue, also implemented as a mysql table.
  6. Whenever a recode server is given a new job, it keeps the http connection on which the job was given open, and sends status information about the job’s progress over this connection every several seconds. This has two purposes: the main one is to allow the recode manager to detect and handle recode server processes that die unexpectedly, but could also facilitate snazzy ajax reporting of recode status to the user at a later date.
  7. The recode server starts streaming the original file directly from the upload server via a standard HTTP GET. (Technically, it may be several ranged requests as MPlayer probes the file.) As soon as it has downloaded enough data to generate recoded frames, it opens a POST connection back to the upload server and starts uploading the recoded version. No video data is ever written to disk on the recode server.
  8. When the entire file has been processed, the recode server notifies the queue manager of success and closes the connection. The queue manager may elect to assign this recode server a new job from the queue table at this time, and processing restarts at 5.
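Steps 3 to 5 and 8 boil down to “hand the job to an idle server, otherwise queue it, and refill a server when it finishes.” A small in-memory Python sketch of that bookkeeping follows. The post keeps this state in mysql tables and talks to servers over http; the class, method names, and URLs here are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional


@dataclass
class QueueManager:
    # Maps each recode server to the file URL it is processing (None when idle).
    servers: Dict[str, Optional[str]] = field(default_factory=dict)
    # Jobs waiting for a server, oldest first.
    queue: Deque[str] = field(default_factory=deque)

    def submit(self, file_url: str) -> Optional[str]:
        """Give the job to an idle server if one exists, otherwise queue it.

        Returns the chosen server, or None if the job was queued.
        """
        for server, job in self.servers.items():
            if job is None:
                self.servers[server] = file_url
                return server
        self.queue.append(file_url)
        return None

    def finished(self, server: str) -> Optional[str]:
        """Record a completed job and hand the server the next queued job, if any."""
        next_job = self.queue.popleft() if self.queue else None
        self.servers[server] = next_job
        return next_job
```

Either way `submit` returns immediately, which matches step 4: the upload server gets its acknowledgement without waiting for the recode itself.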

The most interesting exceptional use case is if a recode server fails to recode a file or dies unexpectedly. In the event of a failure message, timed-out or unexpectedly closed connection from the recode server to the queue manager, the queue manager will notify the upload server to delete any recoded data that may have been uploaded, cancel the failed job, and attempt to assign a new job to that recode server. If it cannot, it will update the recode server status table to reflect the failed server.

The child process and connection control routines necessary to do the above can be implemented entirely in PHP. Particularly, the queue manager will not need to be anything more than a PHP script/web service.

So I was a little too hasty on my previous 2 AM post of several days ago…I assumed the RIFF header problem was the only problem. Alas, once the encoder accepts the input data, it still reads the decompressed audio and video streams not at all in parallel. In other words, it doesn’t just grab a few kb of video data, process it, and then grab a few kb of corresponding audio data. Instead, it reads a ton of video data or a ton of audio data before switching streams. This is the reason many other sites have pointed out that you can’t directly feed both the decompressed audio and video from mplayer to the encoder via two named pipes: the current version of Linux has these fifo buffers pegged at 64 kb, and it was even smaller in older kernels. So, it seemed time to build in some additional buffering.

I looked through encoder_example.c first — in theory, you could implement it there, by having it read and temporarily store data from one named pipe whenever the other was empty, until data started appearing on it again. Trouble is, my experience with C is limited to one course, and it was in C++, so I’d have to do a bit of self-teaching to implement the kind of data structure and memory allocation that is necessary.

I spent a while attempting that, and then decided I was in a bit over my head. I ran back home to PHP and created an external buffering script with it, mostly just to make absolutely sure that it would work. It reads in data from a fifo mplayer is writing to, buffers it up to 10 MB, and writes it out to another fifo that is being emptied by the encoder. I didn’t really expect it to operate with any semblance of efficiency, which it definitely does not, but it did succeed in proving to me that figuring out how to implement a buffer directly within the encoder would really make it all work and thus be worth the effort.

And that’s where I’m at now…I didn’t expect this project would involve any coding outside of php…or contributions to other projects…but hey, this is where my SoC journey is heading. Plus, it will be a fun challenge and a good achievement for me to improve the encoder.

I think I should also quickly comment on why I am so studiously ignoring ffmpeg2theora. Basically, because it was built to do something else. Since I’m clearly going to have to make modifications to encoder_example too, that reason may not be so valid, but sticking with mplayer provides a few extra benefits: obviously, people will be able to upload content in a few additional formats thanks to mplayer’s codec packs, and also this provides a tidy way to retain a copy of decompressed output that the encoder can reuse for, say, producing ogg vorbis files at both low and high bitrates, without having to decompress twice.

I’ll still need to work on how to get a single instance of mplayer to decode at faster than normal playback speed…I’m not sure how robust the -speed 100 option is, and it also presents more RIFF header problems.

The past 24 hours have been intense, and good things are happening. Last night, I literally sketched out a variety of diagrams depicting how a distributed recode queueing system might work. I’ve settled on a general design, and did some preliminary process stream and network socket code testing in php to confirm it is possible. This morning, I felt like just jumping into a core component of my project that I had yet to touch: the recode managing code itself, which each machine in the recoding cluster will run. This code’s job will be to supervise the forked processes it creates to decode, encode, and upload media files, and periodically send job status updates back to the machine that gave it its current job. The tools I will be using to get the job done will be mplayer, whose job will be decoding to PCM audio and YUV video, and the reference implementation of an encoder supplied as part of libtheora. I’m referring to examples/encoder_example.c. Its code is not especially resilient when faced with corrupt or malformed files, but there doesn’t really seem to be much else in the way of terminal-based front ends to libtheora out there, and it is suitable since I can be certain its input (generated by MPlayer) will be properly constructed.

Or not.

Turns out, MPlayer’s PCM output code writes malformed RIFF headers on the front of the stream, and then goes back and fixes them at the end when it knows the file’s total size. (No Google searches to help me on that one, just a full day’s worth of examining wav files in a hex editor and sleuthing in MPlayer source code 🙂 ) This works great if you’re going to write to a normal file, but is a problem if you’re writing to an unseekable stream. For my project, having this capability is essential: I DO want the recode boxes to be able to start downloading the file to be recoded, and be writing data right back to the file repository as fast as it can be downloaded or recoded, rather than downloading the entire thing before decoding can commence, and recoding the entire thing before uploading can commence. I don’t want to be writing & keeping track of extremely large temporary decompressed audio and video files on the local filesystem, either.

So this means the decompressed audio and video must be piped directly to the encoder – and these are unseekable streams. The result is sometimes that encoder_example detects MPlayer’s bogus RIFF header and produces an error; otherwise it passes the audio on to libvorbis as audio with a sample rate in the MHz (not kHz), and encoding dies there instead. (Hacking encoder_example to properly report sample rate doesn’t fix it, libvorbis still gets confused by the malformed RIFF header too.) This seems to be a long-standing problem that nobody has quite figured out before, so here’s hoping this entry finds its way to relevant Google searches.

As of right now, I unfortunately haven’t come up with any great ideas for how to calculate a correct RIFF header at the beginning, before the decompression itself has happened. The key missing piece of information within ao_pcm.c is the total (chronological) length of the file. I tried adding a (floating-point) “length” option to the -ao pcm set of suboptions so that length in seconds could be specified directly to this audio output plugin, but found it to be unrealistic to acquire a sufficiently accurate time in seconds and microseconds for the program to produce the correct header. So, nothing worthy of an MPlayer patch just yet. HOWEVER, as it turns out, encoder_example and libvorbis have excellent tolerance for somewhat miscalculated file and data length fields in the RIFF header. In fact, I’ve experimented with values that underestimated by 50%, and still the encoder has properly included all the audio in perfect sync in the outputted ogg.
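To make the header issue concrete, here is a Python sketch that builds the standard 44-byte PCM WAV header up front from an estimated duration, so it can be written to an unseekable stream without going back to patch it. This is not MPlayer’s ao_pcm.c code; the function name and the idea of passing in an estimate are assumptions for illustration.

```python
import struct


def wav_header(sample_rate: int, channels: int, bits_per_sample: int,
               estimated_seconds: float) -> bytes:
    """Build a canonical PCM WAV header whose size fields come from an estimate."""
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # bytes per second of audio
    data_size = int(estimated_seconds * byte_rate)
    data_size -= data_size % block_align            # keep whole sample frames
    # Both size fields are 32-bit; clamp so "36 + data_size" still fits.
    max_data = 0xFFFFFFFF - 36
    data_size = min(data_size, max_data - max_data % block_align)
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",
        b"fmt ", 16,                 # size of the PCM format chunk
        1,                           # format tag 1 = uncompressed PCM
        channels, sample_rate, byte_rate, block_align, bits_per_sample,
        b"data", data_size,
    )


header = wav_header(44100, 2, 16, 10.0)
```

The two length fields are the only parts that depend on the total duration; the sample rate, channel count, and sample width are known before any audio is decoded.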

So I’m probably going to chalk this up as being solved. And as it was the only missing link in a complete passthru from the http media download stream all the way to the ogg vorbis upload stream at the other end, I’m pretty happy to have gotten it figured out.

So it’s been a bit since I’ve posted…and since then, a lot has happened. Perhaps the most significant thing is my getting schooled pretty good on #mediawiki about the lack of merits of a pluggable mime detection system. Basically, it’s not that a pluggable media validation system is a bad idea, it’s that there’s no need to make any use of the existing MimeMagic module at all. My approach had been to start with it, and then use plugins to fine-tune its results as necessary. This is nice in that it doesn’t require a complete refactoring of uploading in MediaWiki, but it is overly complicated and has a few big drawbacks, one being that such a design wouldn’t really provide for extracting and caching file metadata, so that would have to be done at a later stage, requiring an in-depth analysis of the files twice.

A better design is to just have upload validation handlers register themselves for file extensions that they can analyze, and have no generic mime detection at all. This is a part of Daniel’s proposed design expressed in his blog. Another nice not-yet mentioned detail of this design is that it provides for a clean way to tailor the recommended maximum upload size to different media types. The obvious drawback to this is that it requires the development of plugins specializing in reading & extracting metadata from every file type you want to support before this design could be deployed.

Perhaps that wouldn’t actually be as much work as it sounds like. In the past 2 days, I’ve essentially created such a plugin that covers the entire audio and video arena, and I think it does a really good job, too. Currently, it can use the ffmpeg-php api, MPlayer’s little-known companion script midentify, or both. Adding additional analysis mechanisms would be a very straightforward process, but between those two you obtain pretty good overarching support for validating audio and video types. Actually, since MPlayer gives you ffmpeg’s abilities and then some, you can get along fine on midentify only, but I wrote support for ffmpeg-php for two reasons. The main one is that in my tests it sometimes finishes quicker, and sometimes by a lot. (Total runtimes for validation using a composite ffmpeg/mplayer solution are running between < .1 and ~.8 seconds on my test machine, depending on whether both end up needing to be invoked, so it’s in the ballpark.) Additionally, if MediaWiki moves to validating uploads using a number of dedicated plugins like this requiring some external utility or other, I can just hear the rumblings from private MediaWiki users. MPlayer is at least a truckload of RPM downloads, and at most a troublesome build/install, so for some that aren’t looking to do media recoding too, the php extension might make their installation experience easier.
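The “try the quick analyzer first, fall back to the thorough one” arrangement might be structured like this. It is a Python sketch with stand-in analyzer functions; it does not call ffmpeg-php or midentify, and the convention that an analyzer returns None when it cannot identify a file is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# An analyzer takes a path and returns a metadata dict, or None if it cannot tell.
Analyzer = Callable[[str], Optional[dict]]


def identify(path: str,
             analyzers: List[Tuple[str, Analyzer]]) -> Optional[Tuple[str, dict]]:
    """Run analyzers in order and return (analyzer name, metadata) from the first success."""
    for name, analyzer in analyzers:
        info = analyzer(path)
        if info is not None:
            return name, info
    return None  # nothing recognized the file; the upload fails validation


# Hypothetical stand-ins: the fast one only knows one container format.
def fast_analyzer(path: str) -> Optional[dict]:
    return {"format": "avi"} if path.endswith(".avi") else None


def thorough_analyzer(path: str) -> Optional[dict]:
    return {"format": "ogg"} if path.endswith((".avi", ".ogg")) else None


ANALYZERS = [("fast", fast_analyzer), ("thorough", thorough_analyzer)]
```

The slower analyzer only runs when the faster one gives up, which is what would produce a spread in runtimes depending on whether both are invoked.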

 And that brings you to where I am at this moment. Currently I’m invoking this code through the existing UploadVerification hook, which operates with no regard to the uploaded extension/type. It’d be nice if my code only got called on uploads of extensions it had registered with the upload validator as audio or video (a long list, I know…and probably some work to properly compile) but for now, I think I’ll just emulate this by providing a list of extensions that it can verify as an array or sommat, and immediately return execution to the calling script if the upload isn’t on the list. Hopefully, at some point in the future, that can easily be adapted to register my code as the handler for those audio and video types.
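The stopgap described here, checking the extension against a list and returning right away if it is not on it, is simple enough to sketch. This is Python for illustration; the extension list and function names are hypothetical, and returning True for files that are not ours is an assumed convention meaning “no objection.”

```python
import os

# Hypothetical list; the real one would be much longer.
RECODABLE_EXTENSIONS = frozenset({"avi", "mpg", "mov", "ogg", "wav", "mp3"})


def should_verify(filename: str) -> bool:
    """True if the upload's extension is one this handler claims to understand."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in RECODABLE_EXTENSIONS


def verify_upload(filename: str, analyze) -> bool:
    """Hook-style entry point: skip files that are not ours, analyze the rest."""
    if not should_verify(filename):
        return True  # not audio/video as far as we know; leave it to other checks
    return bool(analyze(filename))
```

If handlers later register for extensions directly, the set above is the piece that would turn into the registration call.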

And, if things start being done that way, it might also provide a good replacement for $wgFileExtensions, which otherwise is at risk of becoming tediously long as more file types get properly supported.

You can experiment with my code at http://mikeb.servehttp.com:8080/wiki/phase3/ (though this is the machine I’m working on, so it will occasionally be broken, have weird debugging output, etc.).