A Statistical Analysis of Tonal Harmony
By David Temperley
2009
————
Overview
It is generally believed that harmony in common-practice music (i.e. 18th and 19th century Western art music) is characterized by certain basic principles. Dominant harmonies (V and vii) go to tonics (I), predominants (IV and ii) go to dominants, root motion by descending fifth is especially favored, and so on. But to what extent are these principles actually followed in common-practice composition? There has been surprisingly little empirical study of this question. [1]
This page presents a statistical analysis of harmonic progressions in a corpus of common-practice music. The data files and programs mentioned below are available in this zipped file.
The data comes from the workbook accompanying Stefan Kostka and Dorothy Payne’s theory textbook Tonal Harmony, 3rd edition (McGraw-Hill, 1995). The workbook contains a number of excerpts of common-practice pieces, to be analyzed by the student; an accompanying instructor’s manual contains “correct” analyses done by the textbook authors, in conventional Roman numeral notation. The analyses also show modulations, and represent each chord in relation to the local key.
I created a corpus consisting of all of the analyzed excerpts in the workbook of 8 measures or more in length; there were 46 such excerpts. I call this the “Kostka-Payne corpus.” (A list of the excerpts is at the end of this page, and also in the file kp-corpus-info.) I created midifiles and “notefiles” (textfiles listing the notes with pitches and on/off times) of all the excerpts. (This was done in connection with the testing of the Melisma music analysis system; the notefiles and midifiles are available at the Melisma download site.) The harmonic analyses of the excerpts were computationally encoded by Bryan Pardo, and added to the midifiles (these midifiles are available at Pardo’s website). I then converted Pardo’s analyses into another format, which I call “chord-list” format. The beginning of a chord-list (for the opening of the Minuet in G major from the Notebook for Anna Magdalena Bach) is shown here:
0.000 2.608 - 0 1 7 7
2.608 3.913 - 5 4 7 0
3.913 5.217 - 0 1 7 7
5.217 6.521 - 11 7 7 6
Each line represents a chord segment. The first number indicates the beginning of the segment, in seconds. (For each excerpt, I chose a tempo that I thought was reasonable, and then generated times for the chord segments using this tempo.) The second number represents the end time of the segment. Following this are four integers. The first is the “chromatic relative root”: the chromatic interval from the tonic up to the root. I use the usual pitch-class notation for intervals: I = 0, bII (or #I) = 1, II = 2, etc. The second integer indicates the “diatonic relative root” – the Roman numeral number (I = 1, bII = 2, II = 2, etc.). The third number indicates the tonic (assuming the usual pitch-class notation: C = 0, Db/C# = 1, etc.), and the fourth number indicates the _absolute_ root (again assuming the usual pitch-class notation). So the first chord statement above indicates I in the key of G major – a G major chord, in absolute terms. (Applied chords were relabeled in relation to the local key: for example, V/V was converted to II.)
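As a rough illustration of the format, a single chord statement could be parsed as sketched below. This is Python written for this page, not part of tally.pl; the names ChordSegment, parse_chord_statement and PITCH_CLASSES are hypothetical, and the third field (“-” in the example) is simply skipped, since its meaning is not described above.

```python
from dataclasses import dataclass

# Pitch-class names, C = 0 (the spellings chosen here are arbitrary)
PITCH_CLASSES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


@dataclass
class ChordSegment:
    start: float          # beginning of the segment, in seconds
    end: float            # end of the segment, in seconds
    chromatic_root: int   # semitones of the root above the tonic (I = 0)
    diatonic_root: int    # Roman numeral number (I = 1, II = 2, ...)
    tonic: int            # pitch class of the local tonic (C = 0)
    absolute_root: int    # pitch class of the chord root (C = 0)


def parse_chord_statement(line: str) -> ChordSegment:
    """Parse one chord statement such as '0.000 2.608 - 0 1 7 7'."""
    fields = line.split()
    start, end = float(fields[0]), float(fields[1])
    # fields[2] is the '-' placeholder; it is skipped (assumption)
    chromatic, diatonic, tonic, absolute = (int(x) for x in fields[3:7])
    return ChordSegment(start, end, chromatic, diatonic, tonic, absolute)


segment = parse_chord_statement("0.000 2.608 - 0 1 7 7")
print(PITCH_CLASSES[segment.tonic], PITCH_CLASSES[segment.absolute_root])  # G G
```

For a chord with a root, the absolute root should equal (tonic + chromatic relative root) mod 12; the four statements in the example satisfy this.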
Note that this format contains no information about the quality of chords (major/minor/diminished) or extensions (e.g. sevenths, ninths). This information is available in Pardo’s midifiles, but I did not encode it. [2]
The file kp-chord-list contains the chord-lists for the complete KP corpus. The title of each excerpt (using the short names shown in kp-corpus-info) is indicated at the beginning of the excerpt. Dotted lines separate one key section from another. (“Pivot chords” – chords at key boundaries that function in both the previous key and the following one – are represented in both key sections.) I also separated the corpus into major-key and minor-key key sections; the file kp-chord-list-ma includes just the major-key ones, and kp-chord-list-mi includes just the minor-key ones.
A few chords in the corpus were given chord symbols for which there is no widely accepted root, such as “German 6th”. For such chords, the label -1 is used for the chromatic, diatonic, and absolute roots.
Some Aggregate Statistics
Once I had the KP corpus in “chord-list” form, I wrote a Perl script, tally.pl, which extracts various kinds of aggregate statistics.
The corpus contains 919 chords, and a total time of 1354.116 seconds.
First I extracted the total count of each chromatic relative root, and the total amount of time spent on that root.
                        proportion        total
                         excluding         time
Root  count  proportion      tonic       (secs)  proportion
I       318       0.346        ---      553.792       0.409
bII      17       0.018      0.029       29.805       0.022
II      104       0.113      0.180      118.766       0.088
bIII     10       0.011      0.017       16.668       0.012
III      21       0.023      0.036       25.104       0.019
IV       70       0.076      0.121       91.622       0.068
#IV      17       0.018      0.029       18.652       0.014
V       214       0.233      0.370      302.102       0.223
bVI      34       0.037      0.059       44.383       0.033
VI       50       0.054      0.087       76.706       0.057
bVII      6       0.007      0.010        8.301       0.006
VII      35       0.038      0.061       37.552       0.028
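(The first “proportion” column shows the count of the chord as a proportion of the total count; the “proportion excluding tonic” column shows the count as a proportion of all chords with roots other than I; the final “proportion” column shows the time spent on the chord as a proportion of the total time.)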
There were also 23 “miscellaneous” chords, not assigned any explicit root (such as augmented-sixth chords), taking a total time of 30.663 seconds. (These are assigned chromatic root of -1 in the chord list; diatonic root and absolute root are also -1.)
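The tallies above amount to a simple accumulation over chord segments. The following Python sketch shows the idea on the four chord statements of the Minuet example; it is not the code of tally.pl, and the function names tally_roots and proportions are hypothetical.

```python
from collections import defaultdict

# (start, end, chromatic relative root) for the four Minuet chord statements
segments = [
    (0.000, 2.608, 0),
    (2.608, 3.913, 5),
    (3.913, 5.217, 0),
    (5.217, 6.521, 11),
]


def tally_roots(segments):
    """Return (count per root, total seconds per root)."""
    counts = defaultdict(int)
    times = defaultdict(float)
    for start, end, root in segments:
        counts[root] += 1
        times[root] += end - start
    return dict(counts), dict(times)


def proportions(values, exclude=None):
    """Express each value as a share of the total, optionally leaving one root out."""
    total = sum(v for k, v in values.items() if k != exclude)
    return {k: v / total for k, v in values.items() if k != exclude}


counts, times = tally_roots(segments)
count_prop = proportions(counts)                # share of all chords
nontonic_prop = proportions(counts, exclude=0)  # share of non-tonic chords
time_prop = proportions(times)                  # share of total time

print(counts)         # {0: 2, 5: 1, 11: 1}
print(nontonic_prop)  # {5: 0.5, 11: 0.5}
```

The figures in the table can be checked the same way by hand: 318/919 = 0.346 for I, and 104/578 = 0.180 for II excluding the tonic, where 578 is the number of chords with a root other than I.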
Then I looked at the “chord transitions” — the number of times each chord moves to each other chord. “Antecedent” chords are shown on the vertical axis, “consequent” chords on the horizontal; for example, the number of occurrences of I moving to II is 31. (The data only reflects transitions within a single key section; no transition is recorded for moves from one key section to another.)
CHROMATIC ROOT TRANSITION COUNTS

Cons     I  bII   II bIII  III   IV  #IV    V  bVI   VI bVII  VII
Ant
I        0    7   31    1    4   45    2  116   11   17    3   19
bII      3    0    8    0    0    0    1    2    0    0    0    1
II      22    3    0    1    4    1    7   45    2    8    0    6
bIII     1    1    0    0    0    0    0    4    4    0    0    0
III      1    0    2    0    0    7    0    1    0    7    0    1
IV      32    2   10    0    4    0    3   11    0    1    1    4
#IV      7    0    0    0    0    0    0    9    0    0    0    0
V      167    0    8    1    2    4    0    0    7    6    0    2
bVI      5    2    8    0    1    3    0    2    0    3    2    0
VI       4    2   28    0    1    4    2    1    0    0    0    1
bVII     0    0    0    5    0    0    0    1    0    0    0    0
VII     27    0    0    0    3    0    1    1    1    0    0    0
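Counting transitions within key sections can be sketched as follows. This is an illustrative Python version, not tally.pl itself; the input layout (one list of chromatic relative roots per key section) and the decision to skip pairs involving an unrooted chord (-1) are assumptions.

```python
from collections import Counter

ROOT_NAMES = ["I", "bII", "II", "bIII", "III", "IV", "#IV", "V",
              "bVI", "VI", "bVII", "VII"]


def count_transitions(key_sections):
    """Count antecedent -> consequent root pairs inside each key section."""
    counts = Counter()
    for roots in key_sections:
        # Pairing each chord with its successor never crosses a section boundary
        for ant, cons in zip(roots, roots[1:]):
            # Assumption: pairs involving an unrooted chord (-1) are skipped
            if ant == -1 or cons == -1:
                continue
            counts[(ant, cons)] += 1
    return counts


# Hypothetical toy input: two key sections
sections = [[0, 5, 0, 11, 0], [0, 2, 7, 0]]
transitions = count_transitions(sections)
for (ant, cons), n in sorted(transitions.items()):
    print(f"{ROOT_NAMES[ant]}-{ROOT_NAMES[cons]}: {n}")
```

Because the pairs are formed separately for each section, the I that ends the first toy section and the I that begins the second do not produce a transition.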
It is useful to represent this data in two other ways. First, we represent chromatic root transitions as a proportion of the total count for the consequent chord. The values in each column sum to 1; thus one can see, for example, that I is approached by V 62.1% of the time.
CHROMATIC ROOT TRANSITIONS AS PROPORTION OF COUNT FOR CONSEQUENT CHORD

Cons       I    bII     II   bIII    III     IV    #IV      V    bVI     VI   bVII    VII
Ant
I      0.000  0.412  0.326  0.125  0.211  0.703  0.125  0.601  0.440  0.405  0.500  0.559
bII    0.011  0.000  0.084  0.000  0.000  0.000  0.062  0.010  0.000  0.000  0.000  0.029
II     0.082  0.176  0.000  0.125  0.211  0.016  0.438  0.233  0.080  0.190  0.000  0.176
bIII   0.004  0.059  0.000  0.000  0.000  0.000  0.000  0.021  0.160  0.000  0.000  0.000
III    0.004  0.000  0.021  0.000  0.000  0.109  0.000  0.005  0.000  0.167  0.000  0.029
IV     0.119  0.118  0.105  0.000  0.211  0.000  0.188  0.057  0.000  0.024  0.167  0.118
#IV    0.026  0.000  0.000  0.000  0.000  0.000  0.000  0.047  0.000  0.000  0.000  0.000
V      0.621  0.000  0.084  0.125  0.105  0.062  0.000  0.000  0.280  0.143  0.000  0.059
bVI    0.019  0.118  0.084  0.000  0.053  0.047  0.000  0.010  0.000  0.071  0.333  0.000
VI     0.015  0.118  0.295  0.000  0.053  0.062  0.125  0.005  0.000  0.000  0.000  0.029
bVII   0.000  0.000  0.000  0.625  0.000  0.000  0.000  0.005  0.000  0.000  0.000  0.000
VII    0.100  0.000  0.000  0.000  0.158  0.000  0.062  0.005  0.040  0.000  0.000  0.000
Second, we represent the transitions as a proportion of the total count for the antecedent chord, so that the values in each row sum to 1. For example, I moves to V 45.3% of the time.
CHROMATIC ROOT TRANSITIONS AS PROPORTION OF COUNT FOR ANTECEDENT CHORD

Cons       I    bII     II   bIII    III     IV    #IV      V    bVI     VI   bVII    VII
Ant
I      0.000  0.027  0.121  0.004  0.016  0.176  0.008  0.453  0.043  0.066  0.012  0.074
bII    0.200  0.000  0.533  0.000  0.000  0.000  0.067  0.133  0.000  0.000  0.000  0.067
II     0.222  0.030  0.000  0.010  0.040  0.010  0.071  0.455  0.020  0.081  0.000  0.061
bIII   0.100  0.100  0.000  0.000  0.000  0.000  0.000  0.400  0.400  0.000  0.000  0.000
III    0.053  0.000  0.105  0.000  0.000  0.368  0.000  0.053  0.000  0.368  0.000  0.053
IV     0.471  0.029  0.147  0.000  0.059  0.000  0.044  0.162  0.000  0.015  0.015  0.059
#IV    0.438  0.000  0.000  0.000  0.000  0.000  0.000  0.562  0.000  0.000  0.000  0.000
V      0.848  0.000  0.041  0.005  0.010  0.020  0.000  0.000  0.036  0.030  0.000  0.010
bVI    0.192  0.077  0.308  0.000  0.038  0.115  0.000  0.077  0.000  0.115  0.077  0.000
VI     0.093  0.047  0.651  0.000  0.023  0.093  0.047  0.023  0.000  0.000  0.000  0.023
bVII   0.000  0.000  0.000  0.833  0.000  0.000  0.000  0.167  0.000  0.000  0.000  0.000
VII    0.818  0.000  0.000  0.000  0.091  0.000  0.030  0.030  0.030  0.000  0.000  0.000
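Both proportion tables follow from the count table by dividing each entry by its column total (consequent view) or its row total (antecedent view). The Python sketch below redoes that arithmetic from the counts given above; the function names are hypothetical and this is not the code of tally.pl.

```python
ROOTS = ["I", "bII", "II", "bIII", "III", "IV", "#IV", "V",
         "bVI", "VI", "bVII", "VII"]

# Transition counts from the count table: rows = antecedents, columns = consequents
COUNTS = [
    [0, 7, 31, 1, 4, 45, 2, 116, 11, 17, 3, 19],
    [3, 0, 8, 0, 0, 0, 1, 2, 0, 0, 0, 1],
    [22, 3, 0, 1, 4, 1, 7, 45, 2, 8, 0, 6],
    [1, 1, 0, 0, 0, 0, 0, 4, 4, 0, 0, 0],
    [1, 0, 2, 0, 0, 7, 0, 1, 0, 7, 0, 1],
    [32, 2, 10, 0, 4, 0, 3, 11, 0, 1, 1, 4],
    [7, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0],
    [167, 0, 8, 1, 2, 4, 0, 0, 7, 6, 0, 2],
    [5, 2, 8, 0, 1, 3, 0, 2, 0, 3, 2, 0],
    [4, 2, 28, 0, 1, 4, 2, 1, 0, 0, 0, 1],
    [0, 0, 0, 5, 0, 0, 0, 1, 0, 0, 0, 0],
    [27, 0, 0, 0, 3, 0, 1, 1, 1, 0, 0, 0],
]


def normalize_rows(matrix):
    """Divide each entry by its row total (rows with total 0 stay all zero)."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def normalize_columns(matrix):
    """Divide each entry by its column total."""
    return transpose(normalize_rows(transpose(matrix)))


by_antecedent = normalize_rows(COUNTS)     # each row sums to 1
by_consequent = normalize_columns(COUNTS)  # each column sums to 1

i, v = ROOTS.index("I"), ROOTS.index("V")
print(round(by_consequent[v][i], 3))  # 0.621: I is approached by V
print(round(by_antecedent[i][v], 3))  # 0.453: I moves to V
```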
As a final analysis, we consider the counts of different root interval motions. The left column below shows each chromatic interval (+m2 = ascending minor second, +M2 = ascending major second, etc.) along with its count. The right column groups these into diatonic intervals. (Each interval is represented by its smallest possible form; so a descending fifth is represented as an ascending fourth, +P4.)
INTERVAL COUNTS

Chromatic      Diatonic
+m2    72      +M/m2  127
+M2    55
+m3     7      +M/m3   32
+M3    25
+P4   308      +P4    308
-TT    25      TT      25
-P4   167      -P4    167
-M3    21      -M/m3   64
-m3    43
-M2    34      -M/m2   65
-m2    31
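The interval labels can be derived from a pair of chromatic relative roots by taking the difference modulo 12 and naming it in its smallest form. The Python sketch below illustrates this and regroups the chromatic counts from the table into diatonic categories; the dictionaries and function name are hypothetical, and the tritone is labeled simply “TT”.

```python
from collections import Counter

# Ascending semitone distance (mod 12) -> smallest-form interval label
CHROMATIC_NAMES = {
    1: "+m2", 2: "+M2", 3: "+m3", 4: "+M3", 5: "+P4", 6: "TT",
    7: "-P4", 8: "-M3", 9: "-m3", 10: "-M2", 11: "-m2",
}

# Chromatic label -> diatonic category
DIATONIC_NAMES = {
    "+m2": "+M/m2", "+M2": "+M/m2", "+m3": "+M/m3", "+M3": "+M/m3",
    "+P4": "+P4", "TT": "TT", "-P4": "-P4",
    "-M3": "-M/m3", "-m3": "-M/m3", "-M2": "-M/m2", "-m2": "-M/m2",
}


def root_interval(antecedent, consequent):
    """Label the root motion between two different chromatic relative roots."""
    return CHROMATIC_NAMES[(consequent - antecedent) % 12]


# Roots of the Minuet example: I, IV, I, VII
roots = [0, 5, 0, 11]
print([root_interval(a, b) for a, b in zip(roots, roots[1:])])
# ['+P4', '-P4', '-m2']

# Chromatic counts from the table, regrouped into diatonic categories
CHROMATIC_COUNTS = {
    "+m2": 72, "+M2": 55, "+m3": 7, "+M3": 25, "+P4": 308, "TT": 25,
    "-P4": 167, "-M3": 21, "-m3": 43, "-M2": 34, "-m2": 31,
}
diatonic_counts = Counter()
for name, count in CHROMATIC_COUNTS.items():
    diatonic_counts[DIATONIC_NAMES[name]] += count
print(diatonic_counts["+M/m2"], diatonic_counts["-M/m3"])  # 127 64
```

Under this scheme V-I (7 to 0) is labeled +P4, which is why descending fifths appear as ascending fourths in the table.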
Discussion
To a considerable extent, the conventional rules of harmony are supported by this data. This is perhaps most clearly seen in the table of root transition counts. The most common root motions, in order, are V-I, I-V, ii-V, and I-IV (the last two are equally common). All of these are standard, “correct” progressions of tonal harmony. “Incorrect” progressions such as V-IV are generally less common.
A few things are surprising. In particular, the frequencies of ii-I and IV-I are surprisingly high. Both of these represent “predominant-to-tonic” motions and are generally considered undesirable. IV-I progressions do occur in certain circumstances (such as plagal cadences and I-IV-I motions expanding an opening I) but their frequency here seems high. This appears to be largely due to cadential 6/4 chords; this is discussed further below.
The interval counts are also of interest. Traditional theory holds that certain intervallic root motions are preferred over others: descending fifths are most preferred (strongly favored over ascending fifths), descending thirds over ascending thirds, and ascending seconds over descending seconds. This data clearly shows all three of these preferences: descending fifths (+P4, 308) are much more common than ascending fifths (-P4, 167), descending thirds (64) are more common than ascending (32), and ascending seconds (127) are more common than descending (65). Overall, fourths are by far the most common (475); seconds (192) are much more common than thirds (96), and tritones least common of all (25).
Aggregate Statistics (with Cadential 6/4’s Reanalyzed)
A close inspection of the data revealed that the oddities noted above — the high frequency of ii-I and IV-I — were largely due to cadential 6/4 chords. Cadential 6/4’s, which are extremely common in the KP corpus (and in common-practice music generally), are analyzed in the Kostka-Payne text in a “two-level” fashion: A I6/4-V is placed inside a larger V. (This is in fact a common convention; under this convention, the cadential 6/4 is labeled as V6/4.) The encoding of the data by Pardo reflected the lower level (I6/4-V), and the data presented above reflects that as well. However, cadential 6/4’s are frequently (indeed normally) preceded by II or IV; thus it seemed likely that this largely accounted for the high frequency of II-I and IV-I motions. I thought that using the “V6/4” analysis might permit the conventional principles of tonal harmony to emerge more strongly. (This is surely one reason why many people prefer the V6/4 analysis.)
The data was therefore recoded, using the higher-level (V) analysis of cadential 6/4’s. That is, every two chord statements representing a cadential I6/4 followed by a V were replaced by a single statement representing V. The modified chord-list is kp-chord-list-2. Consider just the transition table:
Cons     I  bII   II bIII  III   IV  #IV    V  bVI   VI bVII  VII
Ant
I        0    7   31    1    4   45    2   84   11   17    3   19
bII      2    0    8    0    0    0    1    3    0    0    0    1
II       5    3    0    1    4    1    7   62    2    8    0    6
bIII     1    1    0    0    0    0    0    4    4    0    0    0
III      1    0    2    0    0    7    0    1    0    7    0    1
IV      27    2   10    0    4    0    3   16    0    1    1    4
#IV      3    0    0    0    0    0    0   13    0    0    0    0
V      166    0    8    1    2    4    0    0    7    6    0    2
bVI      3    2    8    0    1    3    0    4    0    3    2    0
VI       4    2   28    0    1    4    2    1    0    0    0    1
bVII     0    0    0    5    0    0    0    1    0    0    0    0
VII     26    0    0    0    3    0    1    2    1    0    0    0
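The recoding step (replacing each cadential I6/4 plus following V with a single V) can be sketched as below. This is only an illustration: the chord-list format carries no inversion information, so the sketch assumes a hypothetical cadential_64 flag marking the relevant I chords, and it says nothing about how the recoding was actually carried out for kp-chord-list-2.

```python
from typing import NamedTuple


class Chord(NamedTuple):
    start: float
    end: float
    chromatic_root: int
    cadential_64: bool = False  # hypothetical flag, not part of the chord-list format


def merge_cadential_64(chords):
    """Replace each flagged I chord plus the V that follows it with one V."""
    merged = []
    index = 0
    while index < len(chords):
        current = chords[index]
        following = chords[index + 1] if index + 1 < len(chords) else None
        if (current.cadential_64 and current.chromatic_root == 0
                and following is not None and following.chromatic_root == 7):
            # The merged V spans both original segments
            merged.append(Chord(current.start, following.end, 7))
            index += 2
        else:
            merged.append(current)
            index += 1
    return merged


# Hypothetical progression II - I6/4 - V - I with made-up times
progression = [
    Chord(0.0, 2.0, 2),
    Chord(2.0, 3.0, 0, cadential_64=True),
    Chord(3.0, 4.0, 7),
    Chord(4.0, 6.0, 0),
]
recoded = merge_cadential_64(progression)
print([c.chromatic_root for c in recoded])  # [2, 7, 0]
```

In the toy progression the II-I transition disappears and a II-V transition takes its place, which is the effect visible in the table above.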
The recoding of cadential 6/4’s has a significant effect. The count of II-I is reduced from 22 to 5; the count of IV-I is reduced from 32 to 27. The top 10 transitions are now V-I; I-V; II-V; I-IV; I-II; VI-II; IV-I; VII-I; I-VII; I-VI.
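The ranking can be checked directly from the recoded transition table. The Python sketch below copies the counts from that table and sorts the nonzero transitions by frequency; the function name top_transitions is hypothetical.

```python
ROOTS = ["I", "bII", "II", "bIII", "III", "IV", "#IV", "V",
         "bVI", "VI", "bVII", "VII"]

# Recoded transition counts: rows = antecedents, columns = consequents
RECODED = [
    [0, 7, 31, 1, 4, 45, 2, 84, 11, 17, 3, 19],
    [2, 0, 8, 0, 0, 0, 1, 3, 0, 0, 0, 1],
    [5, 3, 0, 1, 4, 1, 7, 62, 2, 8, 0, 6],
    [1, 1, 0, 0, 0, 0, 0, 4, 4, 0, 0, 0],
    [1, 0, 2, 0, 0, 7, 0, 1, 0, 7, 0, 1],
    [27, 2, 10, 0, 4, 0, 3, 16, 0, 1, 1, 4],
    [3, 0, 0, 0, 0, 0, 0, 13, 0, 0, 0, 0],
    [166, 0, 8, 1, 2, 4, 0, 0, 7, 6, 0, 2],
    [3, 2, 8, 0, 1, 3, 0, 4, 0, 3, 2, 0],
    [4, 2, 28, 0, 1, 4, 2, 1, 0, 0, 0, 1],
    [0, 0, 0, 5, 0, 0, 0, 1, 0, 0, 0, 0],
    [26, 0, 0, 0, 3, 0, 1, 2, 1, 0, 0, 0],
]


def top_transitions(matrix, n):
    """Return the n most frequent transitions as 'Ant-Cons' labels."""
    pairs = [
        (matrix[a][c], ROOTS[a], ROOTS[c])
        for a in range(len(ROOTS))
        for c in range(len(ROOTS))
        if matrix[a][c] > 0
    ]
    pairs.sort(key=lambda pair: -pair[0])  # most frequent first
    return [f"{ant}-{cons}" for _, ant, cons in pairs[:n]]


print(top_transitions(RECODED, 10))
# ['V-I', 'I-V', 'II-V', 'I-IV', 'I-II', 'VI-II', 'IV-I', 'VII-I', 'I-VII', 'I-VI']
```

The tenth transition (I-VI, 17) is only just ahead of the eleventh (IV-V, 16).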
Once the “V6/4” analysis of cadential 6/4’s is assumed, the conventional principles of tonal harmony appear to be very strongly confirmed. Not a very earth-shattering conclusion (which is why I decided to put this in a web page rather than trying to publish it!) but I think it’s good to know.
A number of other comments could be made about this data. For example, compare the transitional frequency of IV-II (10) to II-IV (1); IV-II is much more common, again confirming a conventional rule. But I will leave further explorations to the reader. The reader could also use tally.pl to reproduce these statistics, and to gather further statistics from the chord lists provided — for example, analyzing major and minor key sections separately. (In fact, the differences between the major and minor key distributions are fairly modest. Perhaps this should not surprise us, since the primary tonic/dominant/predominant harmonies – I, V, II, IV – are the same in both modes, and function similarly.)
Notes
1. A few sources deserve mention. Helen Budge’s (1943) dissertation, “A Study of Chord Frequencies Based on the Music of Representative Composers of the Eighteenth and Nineteenth Centuries,” presents an interesting statistical analysis of tonal harmony, systematically gathered from analyses by experts. But only data on the frequency of individual (diatonic) chords is provided; there is no data about transitions (motions from chord to chord). Allen Irvine McHose’s (1947) study “The Contrapuntal Harmonic Technique of the 18th Century” offers occasional statistics about the frequency of various chords and progressions, but presents no complete data (such as tables of chord or progression frequencies). Philip Norman’s 1945 study “A Quantitative Study of Harmonic Similarities in Certain Specified Works of Bach, Beethoven, and Wagner” has statistics about chord progressions, but he assumes a new chord on every note – that is, he makes no allowance for non-chord-tones; this goes against the modern practice of harmonic analysis. Dmitri Tymoczko’s paper “Root Motion, Function, Scale Degree” (Musurgia 2005, available in English at Tymoczko’s website) analyzes a set of progressions from major-key Bach chorales. Finally, David Huron, in his book Sweet Anticipation (2006), presents data about chord transitions for “a sample of Baroque music” (pp. 250-1; no further information is given about the sample).
2. The mftext program (available at the Melisma website) can be used to extract the chord labels from Pardo’s midifiles. While I have not analyzed the labels in detail with regard to mode and inversion, I did extract a few basic statistics. There are 949 chord labels total (this is slightly greater than my count, since in Pardo’s annotations, there may be two chords of the same root and key in succession). Chords built on major triads (including seventh chords that contain major triads, e.g. dominant sevenths) are 68.3% of the total; those built on minor triads, 21.2%; those built on diminished triads, 9.9%. Root-position chords are 60.7% of the total; first-inversion, 23.3%; second inversion, 12.9%; third inversion, 3.1%.
Downloads
(All in the zipped file kp-corpus-files.zip)
kp-corpus-info: List of excerpts in the Kostka-Payne corpus
kp-nbck: This directory contains “note-beat-chord-key” files for all excerpts in the corpus: A list of notes (“Note [ontime] [offtime] [pitch]”), beats (“Beat [time] [level]”), chords (“Chord [ontime] [offtime] [root]”) and key sections (“Key [start time] [end time] [tonic] [mode:ma=0,mi=1]”). I made these as an intermediate step towards making the “chord-lists” below. These files bring together the “beat list” and “note list” formats that I used with the Melisma system (see the Melisma website for explanation) with the harmonic and key information from the Kostka-Payne analyses.
kp-chord-list: Chord list (list of chord statements) for the KP corpus
kp-chord-list-ma: Chord list for the KP corpus, major key sections only
kp-chord-list-mi: Chord list for the KP corpus, minor key sections only
kp-chord-list-2: Chord list for the KP corpus with the “V6/4” analysis of cadential 6/4 chords
kp-chord-list-2-ma: The “V6/4” chord-list, major-key sections only
kp-chord-list-2-mi: The “V6/4” chord-list, minor-key sections only
tally.pl: a perl script for extracting aggregate data from chord lists. (The tables presented above are all outputs of tally.pl.)
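For readers who want to work with the kp-nbck files, the four line types described above could be read with a small parser along these lines. This is a Python sketch, not part of the distributed files; the field types (floats for times, integers for pitch, level, root, tonic and mode) are assumptions, and the sample lines are invented for illustration.

```python
# Field names follow the descriptions of the four line types; types are assumed
SCHEMAS = {
    "Note": [("ontime", float), ("offtime", float), ("pitch", int)],
    "Beat": [("time", float), ("level", int)],
    "Chord": [("ontime", float), ("offtime", float), ("root", int)],
    "Key": [("start", float), ("end", float), ("tonic", int), ("mode", int)],
}

MODE_NAMES = {0: "major", 1: "minor"}


def parse_nbck_line(line):
    """Turn one note-beat-chord-key line into a dict, or None if unrecognized."""
    fields = line.split()
    if not fields or fields[0] not in SCHEMAS:
        return None
    record = {"type": fields[0]}
    for (name, convert), raw in zip(SCHEMAS[fields[0]], fields[1:]):
        record[name] = convert(raw)
    return record


# Invented sample lines, for illustration only
sample = ["Note 0 500 67", "Beat 0 2", "Chord 0 2000 7", "Key 0 20000 7 0"]
records = [parse_nbck_line(line) for line in sample]
key = records[-1]
print(key["tonic"], MODE_NAMES[key["mode"]])  # 7 major
```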
List of excerpts in the Kostka-Payne Corpus
The Kostka-Payne corpus is a set of 46 excerpts from the workbook and instructor's manual for _Tonal Harmony_ (1995, 3rd edition) by Stefan Kostka and Dorothy Payne. The excerpts in the corpus are as follows.

Name of file       p.# in  p.# in inst.  Composer, Title, Measure numbers
                   wkbk.   manual
bach.annamin       49      79            Anonymous (but often attributed to Bach), minuet in G, mm. 1-16
bach.jesu          104     163           Bach, Chorale, "Jesu, der du meine Seele"
bach.kindlein      103     162           Bach, Chorale, "Uns ist ein Kindlein heut' geborn"
beet.rondo         129     212           Beethoven, Rondo Op. 51, no. 1, mm. 103-120
beet.son10-1.II    62      92            Beethoven, Sonata Op. 10 No. 1, II, mm. 1-8
beet.son10-3.II    106     168           Beethoven, Sonata Op. 10 No. 3, II, mm. 9-17
beet.son13.II      85      135           Beethoven, Sonata Op. 13, II, mm. 1-8
beet.son14-1.III   101     -             Beethoven, Sonata Op. 14 No. 1, III
beet.son2-3.III    45      72            Beethoven, Sonata Op. 2 No. 3, III, mm. 81-88
beet.son27-2.I     129     209           Beethoven, Sonata Op. 27 No. 2, I, mm. 1-9
beet.sq135.III     87      138           Beethoven, String Quartet Op. 135, III, mm. 1-10
beet.strio         134     225           Beethoven, String Trio Op. 9 No. 3, II, mm. 1-10
brahms.undgehst    95      152           Brahms, "Und gehst du ueber den Kirchhof", Op. 44, mm. 29-37
campbell.barb      150     249           Ayer (arr. by Campbell), "Oh! You Beautiful Doll!", mm. 1-9
chop.maz63-2       149     248           Chopin, Mazurka Op. 63, No. 2, mm. 1-16
chop.maz67-2       86      137           Chopin, Mazurka Op. 67, No. 2, mm. 1-16
chop.noc27-1       144     238           Chopin, Nocturne Op. 27 No. 1, mm. 41-52
grieg.mountain     150     255           Grieg, "The Mountain Maid", Op. 67 No. 2, mm. 1-11
haydn.son22.III    120-1   -             Haydn, Sonata No. 22, III, mm. 1-8
haydn.son30.I      81      125           Haydn, Sonata No. 30, I, mm. 84-96
haydn.sq20-4.I     76      118           Haydn, String Quartet Op. 20 No. 4, I, mm. 13-24
haydn.sq50-6.II    76      117           Haydn, String Quartet Op. 50 No. 6, II, mm. 55-63
haydn.sq74-3.II    133     223           Haydn, String Quartet Op. 74 No. 3, II, mm. 30-7
haydn.sq.76-6.II   144     241           Haydn, String Quartet Op. 76 No. 6, II, mm. 31-9
mzt.bsnconc        87      139-40        Mozart, Bassoon Concerto K. 191, II, mm. 42-50
mzt.ekn.II         61      92            Mozart, "Eine Kleine Nachtmusik", K. 525, II, mm. 1-8
mzt.pc488.II       131     -             Mozart, Piano Concerto K. 488, II, mm. 1-12
mzt.son330.II      104     164           Mozart, Sonata K. 330, II, mm. 21-8
mzt.son333.III     116     -             Mozart, Sonata K. 333, III, mm. 91-8
mzt.trio           123     199           Mozart, Piano Trio K. 542, I, mm. 210-229
mzt.voiche         105     167           Mozart, Marriage of Figaro, "Voi che sapete", mm. 41-52
schub.bfson.I      144     239-40        Schubert, Sonata in Bb, D. 960, I, mm. 149-68
schub.erlkonig.I   129     210           Schubert, "Erlkonig", mm. 113-23
schub.erlkonig.II  129     211           Schubert, "Erlkonig", mm. 134-48
schub.flusse       112     179           Schubert, "Auf dem Flusse", mm. 14-21
schub.imp1         124     201           Schubert, Impromptu Op. 90 No. 1, mm. 42-55
schub.strio        138     233           Schubert, String Trio D. 471, mm. 187-201
schub.tanze        145     242           Schubert, Originaltanze Op. 9 No. 14, mm. 1-24
schum.grenadiere   133     221           Schumann, "Die beiden Grenadiere", mm. 23-37
schum.sehnsucht    133     219           Schumann, "Sehnsucht", mm. 2-11
schum.thranen      96      154           Schumann, "Aus meinen Thranen spriessen", mm. 1-17
schum.tragodie     134     224           Schumann, "Tragodie", mm. 1-9
schum.wennich      105     165           Schumann, "Wenn ich in deine Augen seh'", mm. 1-21
tchaik.morning     95      -             Tchaikovsky, "Morning Prayer", mm. 1-17
tchaik.nurse       138     232           Tchaikovsky, "The Nurse's Tale", mm. 5-15
tchaik.symph6      150     251-3         Tchaikovsky, Symphony No. 6, I, mm. 89-97