I have some tentative comments on the original article Attention and Hollywood Films, by James Cutting, Jordan DeLong, and Christine Nothelfer. I don’t see any point in discussing the notices it has attracted in various newspapers, journals, and web sites.

Cutting, DeLong and Nothelfer do not quite claim, as others have done for them, that they have found the formula for box office success in the cinema, which is just as well, because their test sample could not possibly justify such a claim. The sample they are working with comprises 10 films from each of 15 years taken at five-year intervals from 1935 to 2005. The sixty films from 1980 to 2005 that they use are said to be “…among the highest grossing of their year…”, but I very much doubt that they are really the Top Ten box office champs for the years concerned. The films chosen from before 1980 “…were among those with the largest number of viewer ratings on the IMDb.” It should be obvious to anybody acquainted with this source that such a choice can have very little real relationship with the box office takings in the distant year in which these films were released. Most importantly, if the aim was to test the hypothesis that the formula the researchers arrived at is necessary for box office success, it is essential that the sample of highly successful films be compared with another sample of highly unsuccessful films, which should be shown to lack the formula. This was not done, as is so often the case in hypothesis testing in the softer sciences.

Turning to the meat of their work, I think that the regularities they are detecting are a product of two features of film construction, one basic, and one historical.

The first, which I commented on long ago in my books, and also in the piece The Numbers Speak, which is on the Cinemetrics website as an extract from my Moving Into Pictures, under articles by Barry Salt (get the preferred PDF form), is that the basic units of dramatic structure of films, which are the scenes, are put together in the film script so that the dramatic natures of successive scenes are contrasted. That is, in a typical film, a dialogue scene is followed by an action scene, which is followed by a comedy scene, which is followed by another action scene, which is followed by a romantic scene, and so on. At a slightly more basic level, this produces a “tension-release” pattern (or however you like to describe it), which is found in all traditional time-based arts, such as music, as well as in drama. And these two varieties of scene, the tense and the relaxed, are treated differently in their editing. Action scenes, and other dramatically tense ones, have fast cutting, and the other kinds have slower cutting to varying degrees.

You can see this illustrated in the films on the Cinemetrics database with the new moving average tool that has just been added. (I suggest starting with the 20 shot setting, but you can try other ranges.) You will see that the moving average trace usually has a wave-like shape, and that the peaks and troughs of it are often of approximately equal height and depth. As well, there is a tendency for there to be long stretches of roughly equal frequency in these waves. This is most evident in the films with faster cutting, as in the graph for Darby O’Gill and the Little People in my piece Speeding Up and Slowing Down in the Measurement Articles section on the Cinemetrics website. “Long take” films, with an ASL of greater than 15 seconds, do not show these cutting rate variations so clearly. Action films made in the last few decades largely dispense with romantic scenes and comedy scenes, so the contrast in the cutting rates of successive scenes is now even more marked.
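For readers who want to reproduce this kind of trace on their own shot-length lists, here is a minimal Python sketch of a moving average over consecutive shots. It is an illustration only: the function name `moving_average` and the sample data are invented here, and the Cinemetrics tool may well compute its trace differently (for example with a centred rather than a trailing window).

```python
def moving_average(shot_lengths, window=20):
    """Return the mean shot length over each run of `window` consecutive shots.

    Assumption: a plain trailing window; this is not claimed to match
    the Cinemetrics implementation.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(shot_lengths) < window:
        return []
    running_total = sum(shot_lengths[:window])
    averages = [running_total / window]
    for i in range(window, len(shot_lengths)):
        # Slide the window one shot forward: add the new shot, drop the oldest.
        running_total += shot_lengths[i] - shot_lengths[i - window]
        averages.append(running_total / window)
    return averages


# Invented shot lengths in seconds: a slow scene, a fast scene, a slow scene.
example = [8.0] * 4 + [2.0] * 4 + [8.0] * 4
trace = moving_average(example, window=4)
print(trace)
```

With real data the trace dips where a fast-cut scene begins and rises again when a slower scene follows, which is the wave-like shape described above.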

The historical part of the phenomenon is the increase in the cutting rate since the nineteen-forties. This is illustrated by a graph covering the ASLs of 7,448 American films made between 1930 and 2006, which is in the new 3rd. edition of my Film Style and Technology on page 378, and also in “The Shape of 1959” in The New Review of Film and Television Studies (Vol. 7, No. 4, 2009). This increase has certainly been intentional on the part of the film-makers over the last thirty years.

The result of this speeding up is a constriction in the range of shot lengths used in films, as illustrated by the following two graphs of the shot length distributions for Catherine the Great (1934):

[Graph: Catherine the Great (1934). Number of shots of the given lengths (in seconds)]

And for Derailed (2002):

[Graph: Derailed (2002). Number of shots of the given lengths (in seconds)]

In Catherine the Great about 12% of the shots are less than three seconds long, while in Derailed, about 90% of the shots are less than three seconds, and nearly 50% of the shots are less than one second. Hence successive shots in Derailed are much more likely to be nearly the same length.
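The percentages quoted here are simple proportions, and can be checked on any list of shot lengths. The following Python sketch shows the calculation; the function name `share_shorter_than` and the sample values are invented for illustration and are not data from either film.

```python
def share_shorter_than(shot_lengths, threshold_seconds):
    """Return the fraction of shots strictly shorter than the threshold."""
    if not shot_lengths:
        raise ValueError("need at least one shot")
    short = sum(1 for length in shot_lengths if length < threshold_seconds)
    return short / len(shot_lengths)


# Invented shot lengths in seconds, not taken from any real film.
example = [0.5, 0.8, 1.5, 2.0, 2.9, 3.0, 4.5, 6.0, 9.0, 12.0]
print(share_shorter_than(example, 3.0))
print(share_shorter_than(example, 1.0))
```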

This is shown by the difference between the autocorrelation coefficients of lag 1 for the two films. For Catherine the Great it is 0.1164, while for Derailed it is 0.1976. That is, about 70% bigger. (The autocorrelation coefficient of lag 1 measures the correlation between the lengths of any shot and the next shot, taken in succession down the length of the film.)
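As a rough guide to how such a coefficient is obtained, the Python sketch below uses the common textbook estimator of the lag 1 autocorrelation: the sum of products of successive deviations from the mean, divided by the total sum of squared deviations. This is an assumption about the estimator. The figures quoted above may have been computed with a slightly different formula, and the name `lag1_autocorrelation` is invented here.

```python
def lag1_autocorrelation(shot_lengths):
    """Estimate the lag 1 autocorrelation of a series of shot lengths.

    Assumption: the standard estimator that uses the overall mean and
    the overall sum of squares for the whole series.
    """
    n = len(shot_lengths)
    if n < 2:
        raise ValueError("need at least two shots")
    mean = sum(shot_lengths) / n
    deviations = [x - mean for x in shot_lengths]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("all shots have the same length")
    # Pair each shot with the one that follows it.
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


# Invented series: steadily lengthening shots give a positive value,
# strictly alternating short and long shots give a negative one.
print(lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0]))
print(lag1_autocorrelation([1.0, 3.0, 1.0, 3.0]))
```

A value near zero means that the length of a shot says little about the length of the next one; a positive value means that short shots tend to follow short shots and long shots follow long ones.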

James Cutting and his collaborators work in terms of a series of more complicated elaborations of autocorrelation, which I will not discuss in detail. The quantities they use are the Autoregressive index AR and its modified form, mAR.

They assert that mAR is not an artefact of the decreases in mean shot length (ASL) over the last 50 years. This seems dubious, since I find that for their sample, the correlation of mAR with ASL is slightly better (r = 0.49) than its correlation with release year (r = 0.43), for the films under consideration.
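The comparison made here rests on the ordinary Pearson correlation coefficient r. A minimal Python sketch of that calculation follows; the name `pearson_r` and all the per-film numbers are invented for illustration, and are not values from the Cutting, DeLong and Nothelfer sample.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least two items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("one of the lists is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical per-film values (NOT real measurements): release year,
# ASL in seconds, and an mAR-like index.
years = [1935, 1950, 1965, 1980, 1995, 2005]
asls = [9.5, 10.2, 7.8, 6.1, 4.9, 3.6]
mars = [1.2, 1.1, 1.6, 1.5, 2.3, 2.0]

print(pearson_r(mars, asls))   # correlation of the index with ASL
print(pearson_r(mars, years))  # correlation of the index with release year
```

Only the magnitudes of the two coefficients are being compared in the argument above; with falling ASLs the correlation with ASL would be expected to have the opposite sign to the correlation with year.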

The other path of their investigation is through the spectral decomposition of the series of shot lengths in the films using Fourier analysis, and they point out that this is mathematically related to the results of their autoregression analysis, and hence gives similar results.
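To make the idea of spectral decomposition concrete, here is a minimal Python sketch of a plain periodogram of a shot-length series, computed with a direct discrete Fourier transform. It is not the procedure used by Cutting et al., whose details are not reproduced here; the function name `power_spectrum` and the test series are invented. The link with autocorrelation is the standard one: the power spectrum of a series is the Fourier transform of its autocovariance function.

```python
import cmath
import math


def power_spectrum(shot_lengths):
    """Return (k, power) pairs for frequencies k = 1 .. n//2 cycles per film.

    Assumption: a simple periodogram of the mean-removed series, with no
    windowing or smoothing.
    """
    n = len(shot_lengths)
    if n < 2:
        raise ValueError("need at least two shots")
    mean = sum(shot_lengths) / n
    centred = [x - mean for x in shot_lengths]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append((k, abs(coefficient) ** 2 / n))
    return spectrum


# Invented series of 16 shots whose lengths rise and fall four times.
series = [5.0 + 3.0 * math.cos(2 * math.pi * 4 * t / 16) for t in range(16)]
peak = max(power_spectrum(series), key=lambda pair: pair[1])
print(peak[0])  # the frequency (in cycles per series) with the most power
```

A regular alternation of fast-cut and slow-cut scenes shows up as a peak at the corresponding frequency, which is the same wave-like variation that a moving average makes visible in the time domain.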

Cutting, DeLong and Nothelfer give the impression that the editor of a film has complete freedom to make a shot any length they like. This is certainly not the case in general, as anyone who has edited a film knows. In particular, for dialogue scenes, the length of the speeches has a strong influence on shot length. Even when there are cutaways to reaction shots in the middle of a speech, these tend to follow the length of sentences within the individual speeches. There is more freedom for an editor to choose shot length in action scenes, but the need for cuts on action at certain points again exerts some control on shot length, and even the length of the actions of the actors themselves, as staged, has to be respected. It is these and other independent causes simultaneously acting to determine the lengths of shots that produce the usual Lognormal distribution of shot lengths, as I have said so often before. However, the trend towards using more jump cuts within scenes, which is demonstrated in the “Statistical Style Analysis – Part 4” section of my Film Style and Technology: History and Analysis (3rd. edition), does give some more freedom to choose the shot length independently of the content of the shot, if editors want to. This freedom is inherent in a montage sequence, and accounts for the very large chains of similar length shots in Rocky IV, as revealed in the analysis by Cutting et al. Rocky IV has many (too many) training montage sequences cut to the regular beat of music, which is what pushes it right up to the top of the shot correlation stakes.
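If shot lengths follow a Lognormal distribution, their logarithms follow a normal distribution, so the whole distribution can be summarised by the mean and standard deviation of the logged shot lengths. The Python sketch below shows that calculation. It is a simple moment-based fit under that assumption, not a test of whether a given film really is Lognormal, and the names `fit_lognormal` and `lognormal_median` are invented here.

```python
import math
import statistics


def fit_lognormal(shot_lengths):
    """Return (mu, sigma): mean and sample standard deviation of log shot lengths."""
    if len(shot_lengths) < 2:
        raise ValueError("need at least two shots")
    if any(length <= 0 for length in shot_lengths):
        raise ValueError("shot lengths must be positive")
    logs = [math.log(length) for length in shot_lengths]
    return statistics.mean(logs), statistics.stdev(logs)


def lognormal_median(mu):
    """The median of a Lognormal distribution is exp(mu)."""
    return math.exp(mu)


# Invented shot lengths in seconds, chosen so the logs are 0, 1 and 2.
example = [1.0, math.e, math.e ** 2]
mu, sigma = fit_lognormal(example)
print(mu, sigma, lognormal_median(mu))
```

Because the ASL is pulled upwards by a few very long takes, the median given by exp(mu) is usually smaller than the ASL for a film with this kind of distribution.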

If audiences could be satisfied with nothing but action movies that have NOTHING but action in them, then the ASL could get shorter than 1.5 seconds, where it has halted at present, and the various shot length correlations studied by Cutting et al. could attain the maximum all the time, but I don’t think this is likely.

There are some further doubts about the postulated fundamental connection between the (hypothetical) basic psychological processes of attention, and film shot lengths. One alternative view of the matter is that the film audience is primarily attending to the succession of things represented in the film scenes, not the cuts between the shots. These cuts were certainly intended by film-makers to be “invisible” up until recent times. The counter-example to think about here is the most popular variety of videogame, while also remembering that videogames make more money than films nowadays. This is the “first person shooter”, which consists of just a continuous Point of View shot of what is meant to be in the protagonist’s sight, with no cuts in the scene shown on the computer screen. In first person shooters the player’s attention is totally gripped, completely without the benefit of editing. 

Finally, as all Cinemetricians should know, the method used by Cutting, DeLong and Nothelfer to get frame-accurate shot lengths is not efficient, with their work times being 15 to 36 hours per film. Using an NLE (non-linear editor), and putting some sort of mark on each cut to get the shot lengths, takes me (and no doubt others) only 3 to 12 hours per film, depending on the number of shots in it.

All the above is obviously far from being the last word on this work by Cutting et al., but what they have done is certainly usefully provocative.

Barry Salt