Craig’s Musings

Thoughts about software architecture, books and life

...nature speaks...

Does ‘seam carving’ generalize?

August 15th, 2008 · 2 Comments · Content management, Ideas, Reading

During a recent Charlie Rose interview with Peter Chernin, the subject of media formats came up, which reminded me of an earlier question I raised, “What is the natural unit of written collaboration?” When I mused upon this question it was in the context of documents and written content. I hadn’t really thought about so-called rich media (interactive) content.

So, what is the natural unit of such content?

Mr. Chernin remarked that the movie, the hour-long drama and the half-hour comedy are all resilient forms of content–and he would hope so given his responsibilities at News Corp. He also sees new forms emerging–all shorts, if you will: five-, ten- and 15-minute segments, depending on one’s context (e.g. watching on your smart phone while waiting for public transportation).

I recall the following quote from Mr. Chernin: “In a world of infinite choice, mediocrity is death.” (More quotes are captured here, for example.) Or, aim to stand out. Speaking of short films and excellence, Pixar’s Presto, which plays before WALL-E, is one of the funniest things I’ve ever seen in a movie theater.

Anyway, back to new forms of interactive content… Content reuse is something I build software to promote based on conversations with customers. During this interview, I couldn’t help but think that there is a real opportunity here for software to support the production of variously sized shorts from full originals in such a way as to retain, if not amplify, the essence of the first edition.

Enter seam carving.

I found a seam carving implementation by Mike Swanson called Seamonster. This led me to a paper by seam carving’s creators, Dr. Shai Avidan and Dr. Ariel Shamir: Seam Carving for Content-Aware Image Resizing.

As the authors note, without additional “higher level cues,” seam carving doesn’t work on all images. Nevertheless, the relatively straightforward nature of seam carving causes me to wonder if similar techniques can be applied to video, audio and even text.

This later work describes video application of seam carving. It appears, upon first glance, that a key to application is what Avidan and Shamir call energy criteria (e.g. the notion of forward energy in a video context).
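
For still images, the basic idea can be sketched in a few lines. What follows is a rough illustration only, not the authors’ code or Seamonster’s: it assumes a grayscale image stored as a list of rows, and a very simple energy measure (the sum of absolute differences to neighboring pixels) that I picked for brevity. The function names are hypothetical. A vertical seam is a top-to-bottom path that moves at most one column per row; dynamic programming finds the path with the lowest total energy, and removing it narrows the image by one pixel.

```python
def energy(img):
    """Assumed energy: sum of absolute differences to the 4 neighbors."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            p = img[y][x]
            e[y][x] = abs(p - left) + abs(right - p) + abs(p - up) + abs(down - p)
    return e


def find_vertical_seam(e):
    """Return one column index per row for the lowest-energy connected path."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Accumulate the cheapest cost of reaching each pixel from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop the seam pixel from every row, narrowing the image by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def carve_width(img, n):
    """Remove n vertical seams, recomputing the energy after each removal."""
    for _ in range(n):
        img = remove_vertical_seam(img, find_vertical_seam(energy(img)))
    return img
```

Calling `carve_width(img, 2)` on a small image with a flat background and one bright column should narrow it by two pixels while leaving the bright column in place, since that column carries the highest energy. Swapping in a different `energy` function is the obvious place to experiment.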

What are the key energy criteria for audio files? What are the key energy criteria for documents? What artifacts should be anticipated when applying seam carving to various media and how can they be mitigated, if not avoided altogether?

I need to investigate seam carving in more detail. All resource pointers are much appreciated.

-Craig
http://craigrandall.net/
@craigsmusings


2 Comments so far ↓

  • mhackney

    Craig, there is a comparable technique for audio called Time Stretching which allows altering the playback speed of the audio without affecting its pitch. It is commonly used by the radio and television industries to cram more audio into a 30 second commercial slot (like those annoying disclaimers at the end of auto commercials). Time Stretching is also used by amateur musicians (like me!) to slow down music in order to learn to play it. An open source implementation can be found here. I use a product called Amazing Slow Downer on the Mac.

    Time Stretching is a very effective technique for shortening audio playback times. I have been advocating capturing business meeting recordings for many years. (A pet peeve of mine is that a lot of corporate knowledge is created in meetings but is not always captured and shared.) This technique can be used (along with silence clipping) to significantly shorten the playback time of useful information from an audio recording – without doing speech to text.

    In the mobile delivery of audio, image and video content, these Time Stretching and Seam Carving techniques are very powerful.

  • Craig Randall

    Thanks for this, Michael. I frequently use a technique I know as “time shifting” to play back podcasts, for example, at 1.5x thereby reducing the time required for me to get the gist of various subjects. Cheers…
