The Shape of Stories: Analyzing Film Cutting Speed for Tableau’s IronViz

I often tell people I’m “sound and lights” when it comes to getting attention. Given a choice, you’ll find me at the sound booth or playing with the lights. So, when the final IronViz feeder came around, I already knew I’d focus not directly on films or shows, but on topics around them (blame my severe lack of TV-watching habits). Copyright came to mind, but so did script optioning (the process of buying stories that are already made, such as books). I also had the idea of remixing on my mind courtesy of a friend’s referral. Finding data for these ideas seemed insurmountable.

Act 1 – A Quest

[I’m frantically hitting the interwebs at this point. The browser may or may not have 132 tabs across 4 windows. Copyright and optioning are leading to dead ends. I text my friend the scriptwriter to see if he has ideas. He says no. I keep looking with no luck in sight.]

Then I found James Cutting. And this:

The magic lines went something like this:

In a 2010 study, Cutting found an average of 1,132 shots per film in a smaller sample of 150 movies made between 1935 and 2010; the King Kong remake, incidentally, had the most: A whopping 3,099 shots packed into 187 minutes.

A link. To a study. (Or really supplemental material, but I’ll take it!) I may have gotten lost in papers for a bit. Either way, they led me to Cinemetrics, an open research database. I may have fallen off my chair at this. After all my complaining about open data, here was a living, breathing researcher sharing his work and crowdsourcing ideas for additional research. (Seriously, go give them kudos!)

Now, I couldn’t find an API to get at the data and I didn’t want to hand copy it. Others, much smarter than me, know a thing or two about web scraping. Someday, these people will write posts and tag me on Twitter (hint, hint). So, like a newb Jedi with a lightsaber just chilling nearby, I went for it (this seems to be a classic Hollywood formula). I used Alteryx and made my very own Frankenstein. It took days because I have absolutely no clue and this is the second workflow I’ve started, and the first I’ve finished.

After getting data, merging this with IMDB data and kicking out a fair bit of test and partial data, I had my data set. I then started making sure I understood it by replicating some of the work Dr. Tsivian and others have done. These are not my ideas.
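For anyone who wants to try the merge-and-clean step without Alteryx, here is a rough sketch in plain Python. It is not my actual workflow. The field names (`title`, `year`, `shot_count`, `genre`) and the sample rows are made up for illustration, and matching on a normalized title plus year is just one assumed way to line the two sources up.

```python
def normalize(title):
    """Lowercase a title and collapse whitespace so small entry differences still match."""
    return " ".join(title.lower().split())


def merge_films(shot_rows, imdb_rows):
    """Join shot data to film metadata on (title, year), dropping partial rows.

    Both inputs are lists of dicts. The keys used here are hypothetical
    and are not taken from either real data source.
    """
    imdb_by_key = {(normalize(r["title"]), r["year"]): r for r in imdb_rows}
    merged = []
    for row in shot_rows:
        # Treat a missing or zero shot count as test/partial data and skip it.
        if not row.get("shot_count"):
            continue
        match = imdb_by_key.get((normalize(row["title"]), row["year"]))
        if match is None:
            continue  # no metadata found, so leave the film out
        merged.append({**match, **row})
    return merged


# Placeholder rows, not real measurements.
shots = [
    {"title": "Film A", "year": 2001, "shot_count": 1200},
    {"title": "film  a", "year": 2001, "shot_count": 0},    # partial entry
    {"title": "Film B", "year": 1950, "shot_count": 600},   # no metadata match
]
imdb = [{"title": "Film A", "year": 2001, "genre": "Drama"}]

result = merge_films(shots, imdb)
```

Only the first placeholder row survives: the second has no shots and the third has no metadata to join to.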

I also played around with some variations:

Now, Joseph Campbell talks about character archetypes and we (Tableau people) sometimes discuss “style” around vizzes. Could the camera act as a lens into this formula? What about by genre? Do directors favor certain cuts again and again? Like a top 40 remix, I had to queue this up for repeat.

Act 2 – Building Early!

[Creativity is a fickle muse. Like a toddler, it goes from jumping on the bed to demanding food with no other solution in sight. The ideas are coming at this point and a few ideas are written on a whiteboard. I’m both getting data still and exploring data. This is sooooo not what you’re supposed to do, I hear.]

I was insanely curious about this data. But, I was back and forth between getting the data, merging it with IMDB data, and analyzing it. Naturally, this is a my-bad.

A few things of note here – I’m not a researcher. So, for me, replicating existing work (to make sure I had data right) was crucial. Both of my data sets are crowd-sourced, which means I really need to check them.

To me, there are a few key parts to this data:

  • Shot length – Dr. Tsivian has focused heavily on this in his research, focusing on polynomial shapes. Others have used mean shot length and compared these in aggregate. I want to be able to look at this more.
  • Number of shots per minute (or other interval) – this helps me get a feel for pacing and allows me to begin standardizing. It too creates a shape, though I lose some information (read Dr. Tsivian’s work).
  • Genre – IMDB provides me a few options with films. There are several entries in each field and my data source is already huge.
  • Director – directors quite literally set the pace of the story. Can we find patterns in their work in shot length and number of shots? Does genre matter or influence their style?
  • Type is an interesting field, but there’s high variety in how it’s entered, if it’s entered. It may be useful or it may not be. I’ll find out.
  • This data is hand curated. How much does that influence it? Do I need to do something about it?
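The first two measures in the list above are simple enough to sketch in a few lines of Python. This is a minimal illustration, assuming each film arrives as a list of shot lengths in seconds. The function names and the sample shot list are made up, and this is not how the dashboard itself was built.

```python
from collections import Counter


def average_shot_length(shot_lengths):
    """Mean shot length in seconds."""
    return sum(shot_lengths) / len(shot_lengths)


def shots_per_minute(shot_lengths):
    """Count how many shots start in each minute of the film.

    Returns a list where index 0 is the first minute, index 1 the second,
    and so on. Plotting it gives the pacing "shape" of the film.
    """
    counts = Counter()
    start = 0.0
    for length in shot_lengths:
        counts[int(start // 60)] += 1
        start += length
    return [counts[minute] for minute in range(max(counts) + 1)]


def asl_from_totals(shot_count, runtime_minutes):
    """Average shot length in seconds when only the totals are known."""
    return runtime_minutes * 60 / shot_count


# Made-up shot lengths in seconds, not taken from any real film.
sample = [30, 30, 10, 20, 45, 45]
asl = average_shot_length(sample)    # 30.0
pacing = shots_per_minute(sample)    # [2, 3, 1]
```

Running the totals from the quote earlier through `asl_from_totals(3099, 187)` gives about 3.62 seconds per shot for the King Kong remake, or roughly 16.6 shots per minute.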

Act 3 – A Dead End

[It’s the weekend and the JSON API for my IMDB data won’t move. It’s sitting at 5% for days. This, kids, isn’t right.]

I ended up switching to the OMDB API. This meant changing up my Alteryx job and losing some data. You do what you have to. I started with a literal design around film, but didn’t make much headway. It wasn’t hitting it.

Frustrated, I look at changing it up. And I find this.

It makes me laugh and I consider an approach similar to Robert Rouse with Us vs Them. I need to add to it, so I go into iMovie and I find an option to make trailers. Silver screen, get ready for this dashboard debut!

Act 4 – A Surprise Ending

[Several hours later, the GIF is not a GIF. It’s a trailer for the dashboard. Hey, even B-rated films become cult classics, right?]

I decide to go literal, but in a different direction. Channeling iMovie, I make its Linux cousin, Cutting Room Pro. I played around through a few iterations. A big thank you goes to Mike Cisneros for his keen eye! Like any other Hollywood production, there will probably be a sequel to this.

Trailer:

Full Theatrical Release: (Warning – director edits may happen)


And in case you’re completely lost…I offer you smarter people than me.