Monthly Archives: March 2011

Oxymoron

I went to “grand rounds” (that’s what they call medical lectures at the hospital) today. The topic was “Mild Traumatic Brain Injury”. This is apparently standard terminology in the field, to the point that it even has a standard mixed-caps initialism: “mTBI”. I just think there’s something darkly humorous about the name, especially now that I know that 20% of patients diagnosed with mTBI experience life-altering mental decline in the following months, including impairments to memory, language, and cognition. Someone has a funny definition of “mild”.

There’s also Complicated mTBI, and then Moderate Traumatic Brain Injury, which is seriously bad news. Don’t even ask what’s after “Moderate”.

Larynx

Those parties Friday and Saturday must have been really good, because I woke up today with a classic case of laryngitis. I’m going to bed before 11 PM for the first time in memory. Hopefully I’ll be able to speak when I wake up tomorrow … although I suppose I could do my work well enough without ever opening my mouth.

Live

I have a lot of deeply nerdy friends; it would be hugely surprising if it were otherwise. Our parties always tended to turn into detailed discussions of topics of academic interest. One of us recently had the keen insight to formalize this, and so we have begun a series of parties at which someone will be designated to give a full-length presentation on a subject of interest, preceded by a few shorter warmup talks.

Some people were interested in the subjects being presented last night but couldn’t attend, and so we decided to set up a live video stream over the internet.

Being the “video guy”, I took charge of the operation. I set up an Icecast instance on bemasc.net, which was easy, and then to start the stream I launched the following gstreamer pipeline:

gst-launch-0.10 v4l2src ! \
'video/x-raw-yuv,width=640,height=480,framerate=15/1,format=(fourcc)I420' \
! queue ! theoraenc bitrate=600 speed-level=2 ! queue ! oggmux name=mux \
alsasrc ! audio/x-raw-int,rate=48000,channels=1,depth=16 ! queue ! \
audioconvert ! vorbisenc bitrate=48000 ! queue ! mux. mux. ! queue ! \
shout2send ip=bemasc.net port=8000 password=… mount=/name.ogg

This worked … but when I announced the address, 7 people connected to the stream, and it promptly died. It seems I just don’t have anywhere close to the required 4.5 Mbps of upstream bandwidth. We had 3 or 4 people connected through all the talks, and to serve them reliably I had to get the total bandwidth down to about 200 kbps, which also required reducing the resolution and framerate.

At 320x240x5fps, the 150kbps video feed looked like 1999, but it did work, and people enjoyed being able to participate without traveling to be there in person.
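
For the curious, here is a rough back-of-the-envelope check of those numbers. It assumes every viewer pulls a separate copy of the stream from the server, that the stream costs just the video bitrate plus the audio bitrate (Ogg container overhead ignored), and that the audio stayed at 48 kbps after I shrank the video. The function name is made up for illustration.

```python
def upstream_kbps(viewers, video_kbps, audio_kbps):
    """Upstream needed if each viewer receives an independent copy of the stream."""
    return viewers * (video_kbps + audio_kbps)

# Original settings: theoraenc at 600 kbps, vorbisenc at 48 kbps, 7 viewers.
print(upstream_kbps(7, 600, 48))   # 4536 kbps, i.e. about 4.5 Mbps

# Reduced settings: 150 kbps video, audio assumed unchanged, 4 viewers.
print(upstream_kbps(4, 150, 48))   # 792 kbps
```

Under those assumptions, each reduced stream costs about 200 kbps, so even four simultaneous viewers stay under 1 Mbps.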

This is the first step toward something I’ve dreamed of for years: a teleparty, where attendees are in (at least) two different venues hundreds or thousands of miles apart, but videocameras and projection screens have been arranged so that the two spaces appear to share a wall, to which partygoers may sidle up for a conversation.

If you ask me, it’s only a matter of time.

Rule 34 of “the internet”

Whenever anyone references Rule 34 of the internet, my first reaction is to parse it as a reference to the elementary cellular automata from A New Kind of Science. I decided to find out for myself what that would look like, so I whipped up a tiny implementation of the automaton that accepts strings as initializer input:
$ python rule.py "the internet" 34 ti34.png
[Image caption: Not very interesting, unfortunately]
Too bad it’s so dull. Rule 30 of “the internet” is much more entertaining, of course:
[Image caption: Preetty]
Lest anyone get the impression that I was wasting time when I should have been preparing for my big presentation tomorrow morning, I wrote all the code on the bus.
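
For anyone who wants to play along, here is a minimal sketch of the idea. It is not the actual rule.py. Several things in it are assumptions: the string becomes the first row by unpacking each byte most-significant-bit first, the row wraps around at the edges, and the output is text rather than a PNG. All the function names are made up.

```python
def string_to_bits(text):
    """Assumed encoding: each UTF-8 byte of the text, most significant bit first."""
    bits = []
    for byte in text.encode("utf-8"):
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


def step(cells, rule):
    """Compute one generation of a Wolfram-numbered rule, wrapping at the edges."""
    n = len(cells)
    out = []
    for i in range(n):
        # cells[i - 1] wraps to the last cell when i == 0.
        neighborhood = (cells[i - 1] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        # Bit number `neighborhood` of the rule gives the cell's next state.
        out.append((rule >> neighborhood) & 1)
    return out


def run(text, rule, generations):
    """Return the initial row followed by `generations` successive rows."""
    rows = [string_to_bits(text)]
    for _ in range(generations):
        rows.append(step(rows[-1], rule))
    return rows


if __name__ == "__main__":
    for row in run("the internet", 34, 16):
        print("".join("#" if cell else "." for cell in row))
```

Under the standard numbering, 34 is 00100010 in binary, so a cell turns on only when it is currently off and its right-hand neighbor is on. Whatever survives the first step just slides left one cell per row, which would explain the dullness.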

Snow

Biking to work this morning, I found it hard to imagine the snowy mountains that had lined the streets. They felt like a distant memory from years ago, not from the less than two months that have actually passed.

Biking home tonight felt like winter again.

Reputation

I got to work this morning in time for our 11 AM meeting. I had scan time in the evening, and a lot of new things to try, so I didn’t leave until after midnight.

I guess this is how grad school gets its reputation.

FFTW vs. OpenMP

My motion compensation code in lab spends most of its CPU time doing length-8192 iFFTs using FFTW 3.2.2. Specifically, each motion-update event corresponds to a cross-correlation comparison of the input with 20-160 reference vectors. Each of these comparisons is totally independent, so the problem is “embarrassingly parallel”. This was irrelevant when I wrote the code on a uniprocessor Pentium 4, but when that machine began to fail I moved to our sparkling new 8-core Xeon i7. Just one of those cores is already faster than the original hardware, but hey, faster is always better, right?

Looking for the path of maximum laziness, my first idea was to let FFTW handle the parallelization. I replaced my fftw_plan_dft_c2r_1d(n, in, out, FFTW_MEASURE)*, which produces a plan that is used k times, with fftw_plan_many_dft_c2r(1, &n, k, in, NULL, 1, d, out, NULL, 1, n, FFTW_MEASURE)*, where d = n/2 + 1. Then I called fftw_init_threads and fftw_plan_with_nthreads to enable parallelization. This required refactoring the cross-correlations so that all the FFTs could be performed simultaneously, at a certain cost in cache performance, but I figured it would be a small price to pay.

I was wrong. Even just benchmarking the FFTs, the speedup from 1 thread to 4 threads was only 33%. The program actually ran slower with 8 threads than it did with just 1. I tried padding d so that each transform’s input started on a 16-byte boundary, but nothing helped. My best guess is that FFTW just doesn’t know it should parallelize by independent transforms in plans from fftw_plan_many_*. Instead, it seemed to be splitting each of the 160 transforms across all the threads, even if this was counterproductive.**
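
To spell out the alignment arithmetic: in single precision each complex input element is two 32-bit floats, or 8 bytes, so with d = n/2 + 1 every second transform starts at an offset that is not a multiple of 16 bytes. The snippet below is only an illustration of that calculation, with a made-up helper name. It assumes the buffer itself starts on a 16-byte boundary.

```python
BYTES_PER_COMPLEX_FLOAT = 8  # two 32-bit floats per single-precision complex value


def padded_dist(d, alignment=16, elem_bytes=BYTES_PER_COMPLEX_FLOAT):
    """Smallest element distance >= d whose byte size is a multiple of `alignment`."""
    while (d * elem_bytes) % alignment:
        d += 1
    return d


n = 8192
d = n // 2 + 1                             # complex inputs per c2r transform
print(d)                                   # 4097
print(d * BYTES_PER_COMPLEX_FLOAT % 16)    # 8, so consecutive inputs are misaligned
print(padded_dist(d))                      # 4098
```

So the padding amounts to one extra unused complex element per transform.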

Having exhausted this approach, I tried the next thing on my list, OpenMP***. This was much easier than expected. All I had to do was make my cross-correlation search thread-safe using fftw_execute_dft_c2r. After that, it just took one #pragma omp parallel for (and about an hour of fighting with Visual Studio configuration options … it turns out OpenMP only works in Release mode, not Debug). The resulting code is simple, clear, and gives nearly perfect linear speedup at 8 threads.

Eight times faster would be pretty good for two days’ work, especially if I actually needed the code to be any faster than it already was.

*: I only need single precision, so actually these calls are all fftwf_*.
**: This is a wild guess. If it’s true, it might be an undiscovered bug because I’m the only person who’s tried to combine plan_many, complex-to-real, Windows, single-precision mode, and multithreading.
***: Oddly enough, Intel’s Threading Building Blocks page has a great explanation of why you (if you’re me) should use OpenMP (and not TBB).

Catastrophe

Over the last 25 years I’ve seen plenty of disasters, natural and otherwise, reported in the news, so I’ve had some time to get used to the way the reporting flows. Usually there’s a newsflash, a period of confusion as the details get sorted out, and then coverage of the aftermath that quickly peters out. Usually.

This time it’s different. The meltdown at Fukushima Daiichi is the first time I can recall live coverage of an inexorably deepening catastrophe. It is almost like watching emergency response in reverse time-lapse. On the plant site, the radiation levels have now risen past the threshold of safety, and the cleanup crews have been hastily withdrawn.

Now the world is watching and waiting as the reactor cores get hotter and hotter, boiling off their remaining coolant, after which they are expected to melt down completely. There is a distinct chance that at least one of the reactors will destroy its surrounding containment vessel, turning the surrounding district into a radioactive wasteland for the next 100 years.

I often wondered if another Chernobyl-class event could happen in my lifetime. It never occurred to me that we might see it coming days ahead of time and yet be helpless to prevent it.

Murder Cafeteria

Last night I went out to “Murder Cafe” on Biophysics’ dime (part of the department recruiting weekend). I was hesitant. We had done this once before, maybe two years ago at the “Mobfather” interactive murder-mystery in Back Bay, and it was awful. The actors seemed profoundly unenthusiastic about the job, a position not helped by the morose characters many of them played. They wore full stage makeup while milling about amongst the audience making exceedingly awkward small talk in character. They were talented, taking any opportunity to burst into incongruously excellent song, and therefore all the more irritated to be in this stupid show.

The Murder Cafe at the Elephant and Castle* was much stupider, and therefore better. The cast members didn’t try too hard to interact with the audience, wore Halloween-grade costumes with little makeup, and put on sketches with only slightly more polish than improv. The singing was atrocious (and the food was just pub fare). Yet, somehow, it worked.

The trick, I think, is that the actors seemed to be enjoying themselves, more or less. They broke character if they thought it would help sell a joke, and didn’t pay too much attention to whether an accent was supposed to be Romanian or Scottish or New Yorker. Afterwards, one of them showed up in street clothes at the Karaoke bar upstairs and sang Lady Gaga, still out of tune.

*: I usually feel a little bit out of place at bars, but the Elephant and Castle was worse than usual. The geometry of the place is awful, and it was packed. The Karaoke master silently dropped us from the queue. But more than any of that, I just felt like I didn’t fit in with the Boston-accented crowd. This ain’t no Harvard Square establishment.