Over the past few days I’ve again put the queue implementation on the back burner, opting to wait for an interlibrary loan, “Advanced Unix Programming,” which I’m hoping will help me understand process sessions, process groups, and the like, and in turn let me fork a process the way I’ve been intending to. In the meantime, I’ve been further streamlining the recoding itself, and have started writing the script that will run on each recoding node to oversee and synchronize MPlayer, the encoder, and file transfers.

Specifically, I have further modified the encoder so that it can write a cached copy of the decompressed audio and video to disk as a command-line option. Combined with my earlier work, this means it will be possible not only to recode the first version of a file on the fly (streaming both directions from the file repository), but also to make subsequent versions of the same file without re-downloading or even re-decompressing it. My mentor says he had envisioned generating a few different versions (i.e., low/high bandwidth) of each file immediately at upload time, so this scenario will be very common.

Additionally, I have worked out a way to keep a single instance of MPlayer running and resident in memory, which I can use for all the jobs a node gets. It’s basically MPlayer in slave mode, but with a little tweaking of mplayer.c to allow the command input stream to be re-attached to different controlling processes (a necessary step, since the PHP control script will terminate when there are no more jobs). Although actually, come to think of it, I could also daemonize that, and have the script that receives new job notifications just send the control script a SIGUSR1 or some such, prompting it to listen for new job details on a named pipe. Then a “stock” MPlayer could be used, even in a persistently running fashion. (Oh, the things you think of when writing in your blog… glad I didn’t spend *too* long on the mplayer.c tweaks.)
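
To make the daemonized idea a bit more concrete, here’s a rough sketch of what such a control script could look like in PHP. Everything specific in it is an assumption for illustration only: the FIFO path, the single-line job format, and the exact MPlayer flags are made up, and it assumes the pcntl and posix extensions are available on the CLI.

```php
<?php
// Rough sketch of the daemonized control-script idea (illustrative only):
// a single resident MPlayer in slave mode, woken by SIGUSR1, reading job
// details from a named pipe. Paths and job format are hypothetical.

// Block SIGUSR1 so we can wait for it synchronously instead of via a handler.
pcntl_sigprocmask(SIG_BLOCK, array(SIGUSR1));

// One "stock" MPlayer kept resident in slave mode; we only ever talk to its
// command input (stdin), so no mplayer.c tweaks are needed.
$descriptors = array(
    0 => array('pipe', 'r'),
    1 => array('file', '/dev/null', 'w'),
    2 => array('file', '/dev/null', 'w'),
);
$mplayer = proc_open('mplayer -slave -idle -quiet', $descriptors, $pipes);

$fifo = '/tmp/recode-jobs.fifo';   // hypothetical named pipe for job details
if (!file_exists($fifo)) {
    posix_mkfifo($fifo, 0600);
}

while (true) {
    // Sleep until the notification script sends SIGUSR1...
    pcntl_sigwaitinfo(array(SIGUSR1));

    // ...then pick up the job details (here, just a source path) from the pipe.
    $fh  = fopen($fifo, 'r');
    $job = trim(fgets($fh));
    fclose($fh);

    // Hand the source file to the resident MPlayer instance, then carry on
    // with the encoder and file-transfer steps for this job.
    fwrite($pipes[0], "loadfile \"$job\"\n");
}
```

The appealing part is that nothing here touches mplayer.c: the slave-mode stdin pipe stays attached to the long-lived daemon for its whole lifetime, so the re-attachment problem never comes up.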

This suggests yet another queue-system design: because it would be possible to notify a node of a new job at any time, even while it was processing another, you could just store status info for each node in the database (namely, the number of jobs currently assigned) and then simply select the node with the lowest count (LIMIT 1) when deciding which recode node to notify of a new upload. The daemonized recode script could maintain its own internal queue of jobs. It’s simple and easy, but the clear downside is that if a node backed up with 10 jobs drops off the network or crashes, all of those jobs simply disappear. I think any attempt to back up the per-node job queues and reassign them as nodes go down and come back up would be overly complicated, and no better than my existing plan based on a centralized queue.
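
For concreteness, the selection step might look something like the sketch below; the table and column names (recode_nodes, jobs_assigned) and the PDO connection details are hypothetical, not anything that exists yet.

```php
<?php
// Sketch of the "notify the least-loaded node" idea -- schema is made up.
$db = new PDO('mysql:host=localhost;dbname=repository', 'user', 'pass');

// On upload: find the node with the fewest assigned jobs.
$node = $db->query(
    'SELECT node_id, jobs_assigned
       FROM recode_nodes
      ORDER BY jobs_assigned ASC
      LIMIT 1'
)->fetch(PDO::FETCH_ASSOC);

// Bump its counter so the next upload sees the updated load...
$upd = $db->prepare(
    'UPDATE recode_nodes SET jobs_assigned = jobs_assigned + 1 WHERE node_id = ?'
);
$upd->execute(array($node['node_id']));

// ...then send that node's daemon its SIGUSR1 and write the job details
// to its named pipe.
```

A matching decrement when a node reports a job finished would keep the counts honest, but as noted above, those counts (and the jobs behind them) vanish with the node if it crashes.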

…Just thinking out loud.

In other news, as my faithful fans can see, I got sick of the old layout, since I could not see the cursor when replying to comments. Also, I picked up my midterm payment from Western Union today, trouble-free for once. My library book came in today too, all 1,000+ pages of it, so hopefully I can make this an educational experience without getting too sidetracked.