I’ve been unusually inspired to be productive over the past few days…but mostly that’s turned into a discovery of how hard it is to jump into things. There’s just so much content to keep track of, not only in code but also in ongoing discussions and documentation, that I don’t know where to start, or whether I’m missing key considerations because I haven’t reviewed absolutely everything out there.

For example, in hopes that video contributions become popular, I want to implement this so that it can scale across multiple recoding systems. But how to do that most easily, considering the underlying technology/protocols in use on Wikimedia’s private networks? The only real trick is preventing two recoding systems from simultaneously grabbing the same job from the queue before it is updated. One solution would be to just let this happen, and detect it by having all recode systems dump their output directly to a central file store using a predictable naming scheme. Each system would open its new output file with O_EXCL|O_CREAT, and if the open fails, someone else has already claimed that job, so go get another one.

But this requires a shared filesystem that supports that…currently, afaik, Wikimedia is using NFS, but does an open call with O_EXCL|O_CREAT work right under NFS? Heck if I know. And there’s discussion about removing the use of NFS anyway and switching to an API for private server-to-server file transfer, in development by Tim Starling. I’m afraid that if that route is taken, I won’t even be able to encode directly to the central file store (instead I’d have to encode locally, then copy, then delete…which takes a bit longer and is more complicated).
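To make the claim-by-create idea concrete, here is a minimal Python sketch of it. Everything named here is hypothetical: the `try_claim` and `claim_next` helpers, the `.recoded` suffix standing in for the "predictable naming scheme", and the assumption that the central file store is just a mounted directory. On the NFS question, the Linux open(2) man page says O_EXCL on NFS is only supported with NFSv3 or later on kernel 2.6 or later, so whether this is safe depends on the actual setup.

```python
import os


def try_claim(store_dir, job_id):
    """Try to claim a job by exclusively creating its output file.

    Returns an open file descriptor if this process won the claim,
    or None if the file already exists (someone else claimed the job).
    The ".recoded" suffix is a made-up naming scheme for illustration.
    """
    path = os.path.join(store_dir, f"{job_id}.recoded")
    try:
        # O_CREAT | O_EXCL makes the open fail if the file already exists.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return None


def claim_next(store_dir, job_ids):
    """Walk the queue and return (job_id, fd) for the first job we can claim.

    Returns None if every job in the list has already been claimed.
    """
    for job_id in job_ids:
        fd = try_claim(store_dir, job_id)
        if fd is not None:
            return job_id, fd
    return None
```

A recoding system would call `claim_next` with the current queue contents, write its encoder output to the returned descriptor, and close it when done. One thing this sketch does not handle is a system that claims a job and then crashes, which would leave an empty output file behind and the job permanently "claimed".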

Then there’s this whole concept of “media handlers” (which Tim’s also at work on). From the looks of older posts to wikitech-l, they’re supposed to be relevant. I haven’t found formal documentation, though, or any mention of them in includes/SpecialUpload.php, where uploads are handled. That makes me think they’re for the display side of things only, but the wikitech-l messages suggest otherwise. I could scour lots more code to figure out what’s going on, but I’m waiting for a chance to talk with Tim about this stuff right now (hence the blog entry), which hopefully will quickly straighten a lot of things out.

I have gotten a bit done…on the media compatibility front, for example, I’ve found at least one common codec in use today that I wasn’t able to decode with my previously discussed MPlayer/ffmpeg or VLC combinations. The bad news is that there’s nothing I can do about it: none of these can decode it because, as far as I could find, there is no open-source decoder at all. The good news is that I am getting pretty convinced that MPlayer will be an easy-to-use tool that will “just work” for just about anything the open-source world can decode. I suspected this all along from personal usage, but wanted to test a bit more extensively and systematically for this project. For those interested, the undecodable codec was AMR, a speech-optimized audio codec developed for use on GSM and 3G cell networks. It’s relevant because some phones with cams stick it into their video streams…presumably because they have hardware that optimizes AMR encoding and that’s all they can handle when recording live video. Interestingly, if you feed it into Windows Media Player, it works just fine. Guess Micro$oft licensed it. I’d be curious to know how Facebook, which actively encourages cellphone videos to be uploaded to their video sharing service, got around this. Considering Wiki*’s different usage/audience, I don’t think I’ll continue to pursue it, though.

That’s all for now.
