It’s been a while since there’s been any word from me on here, and I’ve got some news about the status of my SoC efforts and their ultimate fate. Basically, since SoC wrapped up, my schedule has been completely full with academics and a 17 – 19 hr/week job. Truth be told, I’d rather have been continuing my work with MediaWiki than either of the above, but such was life. Some time ago, Brion (aka my mentor) and I discussed my situation, and he offered to look into having the foundation hire me as a contractor to continue my work. This seems to have been approved, and we’re currently working out the specifics. I’ve dropped to 4 – 5 hours of work at my other job to make room in my schedule.

Basically, what I have now can handle a large volume of uploads for one wiki, but there’s no way to have one recoding farm share the recoding tasks generated by numerous wikis. This is of course necessary for use on Wikimedia’s sites, but it’s something I overlooked, being new to the ways of Wikimedia. So, expanding it to handle multiple wikis is priority one.

Well, I closed my books, Google searches, and my editor last Monday, and turned in my code.

The biggest thing I think I learned this summer is not to spend too long getting caught up on a single aspect/feature of your project if you can’t figure it out. In my case, I spent entirely too long in the beginning and middle stages of the project basically just reading documentation, trying to build a complete picture in my mind of how MW stores, locates, and processes uploaded files, without writing any test code myself. I did this partly because I was afraid I’d miss some existing work that would be useful, or else fail to make use of some battle-tested code someone had taken the time to design. My efforts were really pretty fruitless, though, because I was trying to grasp too much at the same time. I found later that documentation is much more useful when your cursor is sitting halfway through a line and you need to figure out what function to use next. Tackling the same problems in this way, one step at a time, was of course much easier.

I also simply didn’t realize how much more complicated it would be to start transcoding uploaded media asynchronously immediately at upload time rather than periodically polling for items in a queue. When I wrote my proposal, it said “A second script (running perhaps as a cron job) will routinely monitor the job queue…” I had done a project like that in the past, but once I was accepted and had started working, I didn’t think it would be acceptable for there to be a gap of wasted time where an upload had occurred and there was an idle transcoding node that could be processing it but wasn’t. (That project also didn’t have to simultaneously monitor and control an encoder and decoder, just one app that did both.)

My proposal also makes no commitment to distributing the transcoding load among numerous transcoding nodes, but this too I decided was a must-have feature if my work was going to be widely used. In the end, it turned out not to be that complicated to implement, but in concept it did cause me to do a lot of thinking about “how should I go about this” that I could have totally skipped otherwise. (Actually, the version of my code that will be evaluated for SoC can only be used in a distributed fashion if the nodes can all write to a network filesystem that the recoded media is to be stored on, a requirement I was hoping to lift to be prepared for the possible upcoming removal of NFS from Wikimedia’s private network…but as there was no existing documented facility to write to a file repository that is not locally accessible, this wasn’t done.)

I didn’t expect the existing open source MPlayer -> Theora solution to be as limited as it was, but the improvements I made to the Theora reference encoder overcame this unexpected roadblock, didn’t take that much time, and turned out to be a valuable opportunity to broaden my programming skills and give me some appreciation for the “power of open source.”

Finally, I spent tons of time working on MW’s support for audio and video media formats, as I was surprised to find out it really didn’t have any at the beginning of the summer. As my past posts discuss, I wrote code to more accurately detect the MIME types of these media files and to identify and validate them at upload time, plus the beginnings of support for extracting metadata. I didn’t think I’d be doing any of this when I wrote my proposal, but what good is a system to recode uploaded videos if you can’t upload them in the first place?

All these things caused me to be behind schedule for the majority of the summer. I did produce a usable implementation of everything in time for Google’s “pencils down” deadline, but the time crunch at the end (also contributed to by confusion about the program end date…) did cause the code to suffer somewhat. Mainly, I want to add better handling for a variety of exceptional situations within the recoding daemon and within the UploadComplete hook that does the initial pushing of bits to add the job to the queue and notify an idle node of the event. Media support in MW still sucks too, and I might want to help out with that. For example, in the existing MW notion of “media handlers,” a handler is bound to a file based on the file’s MIME type. This works alright for JPEG files vs. DjVu files, but not so much for video files, all of which can be examined by existing utilities like ffmpeg regardless of their exact type. Indeed, to test my code currently, one will need to map each video MIME type they want to test to the video media handler in a config file.

I’m still awaiting feedback from my mentor. I hope of course that I’ll get a passing evaluation, but even if I don’t, I don’t think I would consider my efforts a lost cause. Surely sooner or later, MediaWiki will have full audio and video support, and I want to continue to be a part of making that happen and ensure that my efforts will be made use of as much as possible. And I hope that sometime in the future, I will wake up, scan the day’s tech headlines, see “Wikipedia adds support for videos,” and know that I had something to do with it.

Since it’s been so long, I thought I’d just let everyone know I haven’t abandoned my work, died, or anything of the like. Blogging just doesn’t get the code written, so I haven’t been spending as much time on it.

As it’s now a week into August, I can say with certainty how the queue will be implemented. I did end up going with the idea that surfaced while authoring my last post, to have a daemon implemented in PHP running on each recoding node. The primary means of interfacing with it is an accompanying script that must run on the web server of a recode node (yes, they must run Apache), taking input via POST, forwarding it as commands to the daemon, and returning the outcome in an easily machine-processable format. This is accomplished over named pipes, the simplest means of interprocess communication that works for this task. All requests to and responses from the daemon are prefixed with their length to ensure complete transmission of messages. Notification of new jobs, as well as cancellation of the currently running job, is then achieved by simply contacting the node (selected with the aid of the DB) over HTTP. The daemon promptly provides the notify script with the results of a command so that the entire process can occur at upload time, specifically in an UploadComplete hook. (Cancellation of a current job is used if a re-upload occurs while the job generated by the old file is running.) The daemon uses a persistent instance of MPlayer for all assigned jobs, via MPlayer’s slave mode.
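The length-prefix idea can be sketched in a few lines. This is a minimal illustration in Python rather than the PHP the daemon is written in, and the 4-byte big-endian header and the function names are assumptions made for the example, not details of the actual daemon protocol.

```python
import struct

# Assumed header format: 4-byte unsigned big-endian payload length.
HEADER = struct.Struct("!I")


def send_message(stream, payload: bytes) -> None:
    """Write one length-prefixed message to a binary stream (e.g. a named pipe)."""
    stream.write(HEADER.pack(len(payload)) + payload)
    stream.flush()


def read_exactly(stream, count: int) -> bytes:
    """Keep reading until `count` bytes have arrived; pipes may return short reads."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(stream) -> bytes:
    """Read one complete message: the header first, then exactly that many bytes."""
    (length,) = HEADER.unpack(read_exactly(stream, HEADER.size))
    return read_exactly(stream, length)
```

Because the reader knows how many bytes to expect, a truncated command shows up as an error instead of being silently acted on as if it were complete.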

Although this isn’t quite as firm yet, I expect the following database design changes will facilitate operations:

  1. The image table will grow a new ENUM indicating whether the recoded editions of a particular file are available, in queue, or failed. That is, a file is marked available when all configured recoded versions of it have finished processing.
  2. A new table will serve as the queue, and will contain only job order and the file’s database key. (I did considerable performance testing here and concluded that both columns should be indexed…retrievals will be by both order and name, and individual inserts/deletes seem to keep constant time regardless of number of records. *Very* rarely the order could get reindexed just to keep the integers from running off to infinity, perhaps as a task through the existing job queue. This all only matters if the queue has significant length anyway, but this might be the case if a wiki decides it doesn’t like its existing resample quality or size and wants to redo them all.)
  3. A new table will contain the current status of each recode node, including address to the notify script and the current job if there is one.
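As a rough sketch of the three changes above, here is one way the tables could look, using Python’s built-in sqlite3 so the example runs anywhere. Every table and column name is hypothetical, SQLite’s CHECK constraint stands in for the MySQL ENUM, and the helper functions only illustrate retrieval by order and by name.

```python
import sqlite3

# All table and column names here are hypothetical illustrations.
SCHEMA = """
CREATE TABLE image (
    img_name TEXT PRIMARY KEY,
    -- SQLite has no ENUM, so a CHECK constraint stands in for it.
    img_recode_status TEXT NOT NULL
        CHECK (img_recode_status IN ('available', 'queued', 'failed'))
);
CREATE TABLE recode_queue (
    rq_order INTEGER NOT NULL,
    rq_img_name TEXT NOT NULL
);
CREATE UNIQUE INDEX rq_order_idx ON recode_queue (rq_order);
CREATE INDEX rq_name_idx ON recode_queue (rq_img_name);
CREATE TABLE recode_node (
    rn_notify_url TEXT PRIMARY KEY,
    rn_current_job TEXT
);
"""


def enqueue(db, name):
    """Append a file to the end of the queue."""
    row = db.execute(
        "SELECT COALESCE(MAX(rq_order), 0) + 1 FROM recode_queue"
    ).fetchone()
    db.execute(
        "INSERT INTO recode_queue (rq_order, rq_img_name) VALUES (?, ?)",
        (row[0], name),
    )


def dequeue(db):
    """Remove and return the oldest queued file name, or None if the queue is empty."""
    row = db.execute(
        "SELECT rq_order, rq_img_name FROM recode_queue ORDER BY rq_order LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("DELETE FROM recode_queue WHERE rq_order = ?", (row[0],))
    return row[1]


def cancel(db, name):
    """Drop a queued job by file name (e.g. after a re-upload); True if one existed."""
    cur = db.execute("DELETE FROM recode_queue WHERE rq_img_name = ?", (name,))
    return cur.rowcount > 0
```

The two indexes correspond to the two access paths described in item 2: `dequeue` looks jobs up by order, `cancel` by name.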

One of the recode daemon’s first tasks upon starting a new job is establishing which recoded versions of a given file are missing. I just like this better than storing such job details in the queue, as it greatly simplifies the queue as well as the adding of new formats in future. As a result, the daemon does need to interact with FileRepo and friends, so a good MediaWiki install must exist on recode nodes as well. In fact, much of the MW environment is actually loaded in the daemon at all times.

I’ve still got plenty to keep me busy here as the time starts getting tight.

Over the past few days I again put the queue implementation on a back burner, opting to wait for an interlibrary loan, “Advanced Unix Programming,” which I’m hoping will help me understand process sessions, process groups and such, and in turn enable me to fork a process the way I’ve been intending to do. In the meantime, I’ve been working on further streamlining the recoding itself, and started writing the script that will run on each recoding node to oversee and synchronize MPlayer, the encoder, and file transfers.

Specifically, I have further modified the encoder so that it can write a cache copy of the decompressed audio and video to disk as a command-line option. Combining this with my earlier work, not only will it be possible to do the recode of the first version of a file on the fly (streaming both directions from the file repository), but it will also be possible to make subsequent versions of the same file without redownloading or even re-decompressing it. My mentor says he had envisioned generating a few different versions (i.e. low/high bandwidth) for each file immediately at upload time, and so this scenario will be very common. Additionally, I have developed a way to have a single instance of MPlayer stay running and resident in memory, which I can utilize for all jobs the node gets. It’s basically MPlayer in slave mode, but with a little tweaking of mplayer.c to allow the command input stream to be re-attached to different controlling processes (a necessary step since the PHP control script will terminate when there are no more jobs). Although actually, come to think of it, I could also daemonize that too, and have the script that receives new job notifications just send the control script a SIGUSR1 or somesuch, prompting it to listen for new job details on a named pipe. Then a “stock” MPlayer could be used, even in a persistently running fashion. (Oh, the things you think of when writing in your blog…glad I didn’t spend *too* long on the mplayer.c tweaks.)

This suggests yet another queue system design: because it would be possible to notify a node at any time of a new job, even when it was processing another, you could just store status info on each node in the database, namely the number of jobs currently assigned, and then simply select the node with the lowest number LIMIT 1 when deciding which recode node to notify of the new upload. The daemonized recode script could maintain its own internal queue of its jobs. It’s simple and easy, but the clear downside is that if you have a node backed up by 10 jobs that drops off the network or somehow crashes, all those jobs simply disappear. I think any attempt to back up the job queues and reassign to other nodes as they go down and come back up would probably be overly complicated, and no better than my existing plan based on a centralized queue.
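For illustration, the “lowest number LIMIT 1” selection could look like the sketch below. It uses Python’s sqlite3 for portability, and the `node_status` table, its columns, and the tie-break on URL are all assumptions made for the example.

```python
import sqlite3


def make_node_table(db, nodes):
    """Create the hypothetical per-node status table from (url, job_count) pairs."""
    db.execute(
        "CREATE TABLE node_status ("
        "node_url TEXT PRIMARY KEY, assigned_jobs INTEGER NOT NULL)"
    )
    db.executemany("INSERT INTO node_status VALUES (?, ?)", nodes)


def pick_node(db):
    """Return the URL of the node with the fewest assigned jobs, or None."""
    row = db.execute(
        "SELECT node_url FROM node_status "
        "ORDER BY assigned_jobs ASC, node_url ASC LIMIT 1"
    ).fetchone()
    return None if row is None else row[0]


def assign_job(db):
    """Pick the least-loaded node and record one more job against it."""
    node = pick_node(db)
    if node is not None:
        db.execute(
            "UPDATE node_status SET assigned_jobs = assigned_jobs + 1 "
            "WHERE node_url = ?",
            (node,),
        )
    return node
```

Note that the table only stores a count: the jobs themselves live inside each node’s daemon, which is exactly why they vanish if the node does.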

…Just thinking out loud.

In other news, as my faithful fans can see I got sick of the old layout, as I could not see the cursor when replying to comments. Also, I picked up my midterm payment today from western union, trouble-free, for once. My library book came in today too, all 1000+ pages of it, so hopefully I can make this an educational experience without getting too sidetracked.

After I got that encoder working on Friday, the rest of the weekend’s been a bit of a let-down. I attempted to start implementing the queue manager, which as per my previously discussed design must be able to accept and acknowledge new jobs synchronously, and then possibly asynchronously monitor the job as it gets processed on a recode server. In other words, I need to be able to create a new process that I can have bi-directional communication with for specification of job parameters*, and then can switch to an asynchronous profile and stay active beyond the lifetime of the parent php script. I had a few strategies in mind to do this purely in php, ranging from an apparently incorrect memory that there is a way to end the http request before the script terminates, to starting a new php script via proc_open and closing STDIN/OUT/ERR back to the parent when the asynchronous effect was desired, to starting a new php script via an http request and having the client script abort or close the connection when the synchronous communications were completed (the new script of course using ignore_user_abort()).

Unfortunately, none of these strategies works. While it is possible to close the pipes created with proc_open and have the two processes run concurrently with no interaction, the parent process still will not terminate until the child has. So, while it would be possible to output all the HTML associated with an upload confirmation immediately, the connection wouldn’t close until it timed out. (Using a combination of Connection: close and Content-Length headers theoretically compels the browser to close the connection at the appropriate time in a scenario like this, but there’s no guarantee…plus generating a content length really requires everything to be output-buffered 😦 ) The other method, starting a new PHP script via an HTTP request, probably would work on some configurations, but falls apart when there are any web server modules buffering script output, i.e. output compression modules. Even when the client indicates the response may not be compressed, something is retaining the initial bytes of output until the entire script completes. flush() doesn’t work, nor do various other ideas I had like sending an EOF character down the line. I tried padding the output up to 8k and got the same result, and decided a solution that required more padding than that would just suck.

So, this leaves me with few options. Because there seems to be simply no way to proc_open a new process that stays alive after the parent has terminated, I am left with starting the new process by making a web request. I am now seriously considering implementing the queue manager in Java, as an extension to one of the many lightweight Java http servers out there. In this way, I would have full control over when responses are pushed out to the client, and could finish a request and then continue doing some processing. The big downside, besides being a bit more difficult to create, is that it would require MediaWiki installations that want to asynch. recode a/v contributions to have Java and run this special http server for the purpose.

*I want this system to be able to support a means for some job parameters such as frame size to be plugged in by the user, just as it is possible for images to be sized according to where they will be used now. Probably this capability would mostly be used when a video clip appears in multiple articles/pages. Because of the more expensive processing and storage requirements associated with video, however, I don’t want to accept jobs that are very close to an existing version of a file. As a result, I am reluctant to simply run a command through the shell, because of the difficulty of communicating with those processes. Another possibility is to do all the parameter & job error checking in the main script, add it to the queue, and then launch a process through the shell that seeks out a recode server for the job and oversees it. I will talk with my mentor about this option versus a non-php solution.

At long last, success is mine! I have overcome my Theora encoding difficulties by adding audio and video buffering layers to the reference encoder shipped with libtheora. It’s only about 300 lines, but took me many days because I had to teach myself much of what I needed to know along the way, at least as far as memory allocation, pointers, and related C syntax were concerned. (Thanks to my U’s subscription to Safari Books Online, and the all-powerful Google search for advanced Linux I/O stuff too.)

As I discussed earlier, I established that the problem actually stems from MPlayer filling the capacity of one or the other of the named pipes (audio or video) before writing any data to the other. With one of MPlayer’s output pipes full, the decoding process is blocked, and with one of the encoder’s input pipes empty, encoding is blocked, so the two processes deadlock each other. The solution is to suck up data on the full pipe and just buffer it (a step that I found often must be repeated numerous times) before MPlayer starts giving out data on the desired pipe again.

Because the amount of data to be buffered is unknown and significantly large, dynamic allocation/creation of data structures was necessary. The buffers are basically linked lists with each node containing a pointer to a 64k memory block (chosen because of the current Linux kernel’s named pipe size). This allows me to avoid copying large amounts of data – usually it is left in the original memory location it was read into until some portion of it must be placed in a contiguous block of memory with existing data for consumption by libtheora.
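The shape of such a buffer can be sketched briefly. This is a Python illustration only, not the C code added to the encoder: the class and method names are made up, a deque stands in for the hand-rolled linked list, and Python slices copy bytes where the C version would just move a pointer.

```python
from collections import deque

BLOCK_SIZE = 64 * 1024  # size of each read from the pipe, matching the 64k blocks above


class ChunkBuffer:
    """FIFO of byte blocks; blocks are only joined when a contiguous run is requested."""

    def __init__(self):
        self._chunks = deque()
        self._size = 0

    def __len__(self):
        return self._size

    def append(self, block: bytes) -> None:
        """Store a freshly read block as-is, without copying it into a larger array."""
        if block:
            self._chunks.append(block)
            self._size += len(block)

    def take(self, count: int) -> bytes:
        """Remove and return exactly `count` bytes as one contiguous string."""
        if count > self._size:
            raise ValueError("not enough buffered data")
        parts = []
        need = count
        while need > 0:
            head = self._chunks.popleft()
            if len(head) > need:
                # Split the block: hand out the front, keep the rest queued.
                parts.append(head[:need])
                self._chunks.appendleft(head[need:])
                need = 0
            else:
                parts.append(head)
                need -= len(head)
        self._size -= count
        return b"".join(parts)
```

In use, the reader would call something like `buf.append(os.read(fd, BLOCK_SIZE))` whenever the full pipe needs draining, and `buf.take(n)` when the encoder wants `n` bytes of audio or video.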

Another difficulty was properly detecting and handling the case where data cannot be read from either pipe because MPlayer has fallen behind the encoder.

Performance-wise, I am very pleased with my initial tests: CPU utilization is virtually indistinguishable between my enhanced encoder and the original one when encoding data in real time. And no memory leaks either 🙂

Since my first courses in Computer Science 5 years or so ago, I’ve been taught almost exclusively in languages that work like Java & PHP. I’ve gotta say, there’s definitely something cool about doing your own memory management, finding bits with pointer arithmetic, and producing fast machine code at the end. I feel quite accomplished to have figured everything out.

Here’s a high-level overview of the design I have decided will work best for queueing the recode jobs.

There will be three interacting components.

  1. The upload server. This is where original contributions reside.
  2. The queue manager, which can be on any machine or machines running MediaWiki.
  3. The recode servers.

Here’s a standard use case:

  1. Video (or audio) upload is received
  2. Upload server identifies & validates upload as recodable
  3. Upload server contacts queue manager via http POST, including final url to the file.
  4. Queue manager replies with an acknowledgement of the job being added and closes the connection. At this point, the upload server can essentially forget about the job and provide upload confirmation to the user.
  5. The queue manager will track (via MySQL for easy shared access among backup queue managers) the status of all recode servers. If there is an idle recode server at the time a job request comes in from the upload server, it is immediately forwarded to that recode server, perhaps with some additional parameters such as output frame size added. The queue manager then monitors periodic reports from the recode server as the job runs, see 6. If there are no idle servers, the job is added to a queue, also implemented as a MySQL table.
  6. Whenever a recode server is given a new job, it keeps the HTTP connection on which the job was given open, and sends status information about the job’s progress over this connection every several seconds. This has two purposes: the main one is to allow the queue manager to detect and handle recode server processes that die unexpectedly, but it could also facilitate snazzy ajax reporting of recode status to the user at a later date.
  7. The recode server starts streaming the original file directly from the upload server via a standard HTTP GET. (Technically, it may be several ranged requests as MPlayer probes the file.) As soon as it has downloaded enough data to generate recoded frames, it opens a POST connection back to the upload server and starts uploading the recoded version. No video data is ever written to disk on the recode server.
  8. When the entire file has been processed, the recode server notifies the queue manager of success and closes the connection. The queue manager may elect to assign this recode server a new job from the queue table at this time, and processing restarts at 5.

The most interesting exceptional use case is if a recode server fails to recode a file or dies unexpectedly. In the event of a failure message, timed-out or unexpectedly closed connection from the recode server to the queue manager, the queue manager will notify the upload server to delete any recoded data that may have been uploaded, cancel the failed job, and attempt to assign a new job to that recode server. If it cannot, it will update the recode server status table to reflect the failed server.
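The dispatch and failure rules above can be modelled in memory as follows. This is only a sketch of the bookkeeping, in Python rather than PHP: the class and method names are hypothetical, plain dicts stand in for the MySQL tables, and the `server_reachable` flag stands in for whether the attempt to hand the server a new job succeeds.

```python
from collections import deque


class QueueManager:
    """In-memory model of the queue manager's server and job bookkeeping."""

    def __init__(self, servers):
        self.status = {name: "idle" for name in servers}  # idle / busy / failed
        self.running = {}       # server name -> job it is currently processing
        self.pending = deque()  # jobs waiting for an idle server
        self.cleanups = []      # jobs whose partial output the upload server must delete

    def submit(self, job):
        """Forward the job to an idle server if there is one; otherwise queue it."""
        for server, state in self.status.items():
            if state == "idle":
                self._assign(server, job)
                return server
        self.pending.append(job)
        return None

    def _assign(self, server, job):
        self.status[server] = "busy"
        self.running[server] = job

    def _next(self, server):
        """Give the server the oldest pending job, or mark it idle."""
        if self.pending:
            self._assign(server, self.pending.popleft())
        else:
            self.status[server] = "idle"

    def job_succeeded(self, server):
        del self.running[server]
        self._next(server)

    def job_failed(self, server, server_reachable):
        """Cancel the failed job, request cleanup, then reuse or retire the server."""
        self.cleanups.append(self.running.pop(server))
        if server_reachable:
            self._next(server)
        else:
            self.status[server] = "failed"
```

A failed job is cancelled rather than retried here, matching the description above; whether it should be requeued on another server is a separate design question.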

The child process and connection control routines necessary to do the above can be implemented entirely in PHP. Particularly, the queue manager will not need to be anything more than a PHP script/web service.