Underlying structures of music (and language?)

Written By: Youki on December 13, 2008 13 Comments

In an earlier post, Usree linked a youtube video comparing Coldplay’s “Viva La Vida” to Joe Satriani’s “If I Could Fly.”  A background in music theory/composition will tell you that the majority of modern, mainstream music is based off a few pretty common chord progressions (copying the melody and tempo, however, is a completely different issue).  This has even been the source of many music-related skits.  One of my favorites:

A few medleys have been made that combine two or more songs.  You have probably heard Israel Kamakawiwo’ole’s medley of  “Over the Rainbow” and “What a Wonderful World”:

And here’s one I found amazing:


The vast majority of songs written nowadays use a pretty standard set of chords, and it’s difficult to produce an original chord progression that sounds “good” (obviously a subjective term).

The same way that music has underlying structures that make it “sound good,” does language?  Well, the first step is to identify what I mean by “sound good.”  To me, music that sounds good is music that fits a harmonic pattern or works in contrast to that pattern.  By pattern, I am referring to the notes that fall within the key of a song.  Songs are typically set to a key (like C major, or A minor) and this determines the notes which fit in the key.  So if you’re in the key of C major, and you play the notes C-E-G (a C major chord), everything fits the key and it sounds pleasant.  If you randomly change some notes around you’re likely to make the song sound horrible and dissonant.

The same idea can be applied to folk tales.  Vladimir Propp, in “Morphology of the Folk Tale,” breaks down Russian fairy tales into 31 functions (note that he doesn’t claim that all folk tales contain all 31 functions, and the order may vary).  For example, here are the first seven functions:

  • A member of a family leaves home (the hero is introduced);
  • An interdiction is addressed to the hero (‘don’t go there’, ‘don’t do this’);
  • The interdiction is violated (villain enters the tale);
  • The villain makes an attempt at reconnaissance (either villain tries to find the children/jewels etc; or intended victim questions the villain);
  • The villain gains information about the victim;
  • The villain attempts to deceive the victim to take possession of victim or victim’s belongings (trickery; villain disguised, tries to win confidence of victim);
  • Victim taken in by deception, unwittingly helping the enemy;

When reading a story, the reader expects a certain pattern of events to occur, and deviations are typically in contrast to the expected pattern.  This is not to say that everything is predictable; surprises will occur, but for the most part, they will be logically integrated into the plot of the story.  For example, in a murder mystery, you may be surprised by who the murderer is, but there is typically an explanation (even the lack of an explanation itself could be an integral part of the character’s identity).  Of course, more creative writers will leave some things unexplained, but this also fits into the logic of the narrative (one can hardly expect the reader to be omniscient about all events and motives in a story).  Within a Proppian analysis, each potential function of a story always fits within a very logical order, an expectation of what events can happen and why they happen.

So, to bring it back to music theory, a song in the key of C major will be based on the following notes: C, D, E, F, G, A, B, C.  The available chords will typically fall within this set of notes: C major (C-E-G), D minor (D-F-A), E minor (E-G-B), F major (F-A-C), G major (G-B-D), and A minor (A-C-E).  Similarly, a story in the fairy tale genre will usually have the following plot (paraphrased from Propp): introduction of hero, villain enters tale, villain does something harmful to hero or hero’s family, hero leaves home to seek revenge/fortune/solution, hero is tested by a helper, hero succeeds at test, hero receives gift/reward which allows the villain to be defeated, villain is defeated and order is restored.  Each event in a story is linked to the story’s genre, and there is only a certain set of events that can occur which fit appropriately within the expectations of the genre.  For example, in a standard fairy tale story, you can’t have the hero willingly give up and let the villain win, or have the villain come to realize that he/she was wrong (without any external influence) and set things right.  What makes a hero “good” and a villain “bad” is a logic to their actions and motives, a logic that fits within the overarching genre of the story.
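To make the "notes in the key" idea concrete, here is a minimal Python sketch that builds a major scale from its whole-step/half-step pattern and stacks thirds on each note to get the chords that fit the key. The function names (`major_scale`, `diatonic_triads`) are made up for this illustration, and note spelling is simplified to sharps only. Stacking thirds on B also yields a seventh chord, B diminished (B-D-F), which the list above leaves out.

```python
# Note names for the 12 semitones, starting on C (sharps only, for simplicity).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Whole-step/half-step pattern of a major scale, in semitones.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]

# Triad quality, keyed by (root-to-third, third-to-fifth) distances in semitones.
QUALITIES = {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished"}


def major_scale(tonic):
    """Return the seven pitch classes (0-11) of the major scale on `tonic`."""
    pitch = NOTE_NAMES.index(tonic)
    scale = []
    for step in MAJOR_STEPS:
        scale.append(pitch % 12)
        pitch += step
    return scale


def diatonic_triads(tonic):
    """Stack two thirds on every scale degree and name the resulting triad."""
    scale = major_scale(tonic)
    triads = []
    for degree in range(7):
        root, third, fifth = (scale[(degree + k) % 7] for k in (0, 2, 4))
        lower = (third - root) % 12   # semitones from root to third
        upper = (fifth - third) % 12  # semitones from third to fifth
        notes = "-".join(NOTE_NAMES[p] for p in (root, third, fifth))
        triads.append((NOTE_NAMES[root], QUALITIES[(lower, upper)], notes))
    return triads


for root, quality, notes in diatonic_triads("C"):
    print(f"{root} {quality} ({notes})")
```

Every chord it prints uses only notes from the scale, which is the sense in which a chord "fits the key."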

Of course, you can have more complex characters, the same way you can have more complex chord progressions.  You can throw in more textured chords, like a basic jazz chord can throw in a minor 7th note in the scale (so instead of just C-E-G, you can have C-E-G-B flat).  You can make exquisitely complex chord progressions, but if the music sounds good, there’s more than likely a very logical placement of each and every single note.
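As a small illustration of "textured" chords, a chord can be described as a set of semitone offsets above its root. The sketch below (the names `chord`, `MAJOR_TRIAD`, and `DOMINANT_SEVENTH` are just for this example) spells out the plain triad and the version with the added minor 7th. Note that B flat lies outside the C major scale, which is part of what gives the chord its color.

```python
# Note names for the 12 semitones, starting on C (flats only, for simplicity).
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Chord shapes as semitone offsets above the root.
MAJOR_TRIAD = (0, 4, 7)
DOMINANT_SEVENTH = (0, 4, 7, 10)  # major triad plus a minor 7th


def chord(root, shape):
    """Return the note names of `shape` built on `root`."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + offset) % 12] for offset in shape]


print(chord("C", MAJOR_TRIAD))
print(chord("C", DOMINANT_SEVENTH))
```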

Perhaps studying language is like composing music — it helps to understand the underlying structure of the composition.  The same way a song will be in a certain key, words will have a certain frame.  The same way a song will have notes that are better aligned with the key, there are words that are better aligned within the frame.  The same way a song has a chord progression, words have a flow to them, a feeling of connectedness that links not only the words to each other, but to other words said in a different time and place by different people, but with similar meanings and intentions.


  1. Usree Bhattacharya on: 13 December 2008 at 11:34 am

    An absolutely outstanding post, Youki, hands down one of the best posts I have read in my life (here or anywhere else). Thank you for your thoughts.

    Very neat comparative analysis… My best friend, a guitarist, told me, when I sent him the Satriani/Coldplay link, that it’s a very common chord progression. In that particular instance, Satriani’s response came AFTER Coldplay’s song was nominated for a Grammy, and after the song became a worldwide sensation and massive commercial hit, so I am somewhat wary anyway. (Though I still think he rocks)

    Your post made me think of how Indian classical (some non-classical, light classical, Bollywood) music is based on rāgas. We’re given a series of notes, and that’s what we’re allowed to use for particular songs. Of course there IS room for play, but as a child I was taught that certain notes “work” musically at certain times in the day, or with different seasons. The room for play lay strictly within those notes, and deviations were interesting and permitted sometimes, but experimental in nature, for just the right injection of surprise in the melody. Of course, we also have “blended” rāgas, and they’re very interesting too, but even then, the boundaries became defined, if not limited. As a harmonium player, I find it very useful; if I know a rāga well, I can learn to play any particular song easily, and doing vocal variations, as a singer, becomes very easy, for some common note sequences are encouraged, and anticipated by the audience. This is a rather reductive summary of Indian music, but I thought it would be interesting to bring that in as well.

    As far as plagiarism and Indian music is concerned, my third blog post-ever-last year-dealt with this issue. Indian MTV was airing this song that I found to be lifted from Britney Spears and from “Addictive” by Truth Hurts. Months later after posting that, I found that the makers of the song “Addictive” had been sued and taken to the cleaners for having used an Indian song without giving credit. Yikes. It tells you why these things are so tricky.

    Finally, I really enjoyed your comparison-drawing across studying music and studying language. However, here’s the deal: I feel that in your mind you bridged the language (grammar etc)/literature (folktales) divide too quickly, and I didn’t understand that. Would love to hear more…

  2. Usree Bhattacharya on: 13 December 2008 at 11:51 am

    Wow, I was on Youtube just now, and listen to this Cat Stevens song from 1973…

    as the text says, you could make the claim against Satriani as well.

  3. Youki on: 13 December 2008 at 10:56 pm

    The same way a folk tale will have a predictable structure: 1) establishment of setting, 2) introduction of hero, 3) introduction of villain, 4) problem, 5) solution, and 6) restoration, written and spoken genres tend to have a somewhat predictable structure (that is not to say that the actual utterance fits the conventions of the genre, nor that the conventions are bounded and fixed).

    So if you’re writing a paper for a class, your paper may have a similar structure to folk tales. Here’s an example: Setting: this is an academic paper in which you’re addressing an issue/problem. Introduction of hero: that would be you, or more specifically, your hypothesis. Introduction of villain: other theories you’re trying to counter or modify. Problem: all academic papers need a bit (or a lot!) of conflict/tension to push the hypothesis forward. There has to be a point to your paper, otherwise why would anyone read it? Solution: validation of your hypothesis, or reevaluation if data proves it incorrect. Restoration: incorporation of hypothesis into theory to produce new theoretical model.

    Although folk tales tend to have a “happily ever after” ending, academic papers often have a “plans for the future” paragraph that just strengthens the idea of the iterative nature of academic discourse.

    The same way that music can “sound good” (refer to original post) to us because the notes and chord progressions are based on predictable/expected patterns, language itself can “sound good” based on predictable/expected patterns.

    For example, let’s say I wrote an academic paper examining the relationship between exercise and health. I construct the following argument:

    1. People who exercise daily tend to also have pets.
    2. People who exercise daily tend to be healthier.
    3. Therefore, having pets makes you healthier.

    Each element in the paper needs to be constructed in such a way that it fits in with the overall logic of the paper. Reading the above 3 points, people are probably thinking “correlation is not causation” and they’d be right, there is a clear misalignment with point 3. We all have a pretty standard structure of logic that we apply to academic papers — that the conclusion must logically match the hypothesis and every point in-between. There is a progression to the elements in an academic paper, and while you can be creative with how the progression unfolds, you always have to be “in the right key” or else the paper loses its structural/logical integrity.

    You can also apply this to political speeches. Think about how Obama used the phrase “yes we can” in his speeches. Had a very musical element to it:

    That small phrase points to the overall frame of Obama’s speeches: bringing change to Washington, a counter to Bush/neocon ideology, hope and optimism about America. Every single word in his speech needs to align to the overall frame, every single sentence needs to be constructed in such a way that it supports the frame.

    I am reminded of Lincoln’s phrase “mystic chords of memory” in his inaugural address:

    “I am loath to close. We are not enemies, but friends. We must not be enemies. Though passion may have strained, it must not break our bonds of affection. The mystic chords of memory, stretching from every battlefield and patriot grave to every living heart and hearthstone all over this broad land, will yet swell the chorus of the Union, when again touched, as surely they will be, by the better angels of our nature.”

    What made Obama so inspirational? Among many other things, I’d say that he knew how to play those “mystic chords of memory.”

  4. daveski on: 14 December 2008 at 1:48 am

    Very, very cool stuff here! Though I have to agree with Usree, I’m going to have to think over a little more the levels at which folk tales and essay writing and musical composition are intuited as being ‘in key’ by their listeners, viewers, and readers.

    What came to mind more immediately for me, though, was the memory of being in my first graduate seminar at Cal, when I was an undergraduate from a completely different major but with aspirations in electronic music. The seminar was at the Center for New Music and Audio Technology up on Arch Street north of campus, and it blew me away.

    What made me think of it when reading this post specifically was an illustration in that seminar of how easily fooled our sense of the ‘musicality’ of sequences of notes over time can be. One of the researchers there connected a synthesizer to a computer which was running a program that had calculated the precise ratios of notes in given keys as they appear in the compositions of various classical composers. The exact percentages of time on the I, the IV, the V, etc. in any key turn out to follow certain patterns as you might expect: the most time is spent on the I and second-most on the V generally, but beyond that there are differences depending on the composer, especially with the notes that are more dissonant or further away from the I.

    So the researcher turns on the computer and it starts spitting out notes of various frequencies and lengths, generated randomly, uniquely, but always preserving the ratio on a larger scale that was used (unconsciously, I assume) by Mozart, then by Bach, etc. And guess what?

    It sounded good! I knew I was being fed synthetic auditory candy, but it was a pleasant auditory experience. And not only that, but my mind seemed to assemble meaningful phrases, to remember previous turns and to build expectations (on shorter timescales) as I listened.
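    For the curious, the kind of generator described here can be sketched in a few lines of Python. This is only a guess at the general idea, not the CNMAT program: the weights in `DEGREE_WEIGHTS` are invented for illustration (they are not measured from Mozart, Bach, or anyone else), and the sketch picks notes only, ignoring note lengths.

```python
import random
from collections import Counter

# Hypothetical share of time per scale degree; NOT measured from any composer.
DEGREE_WEIGHTS = {"I": 0.30, "II": 0.10, "III": 0.12, "IV": 0.14,
                  "V": 0.20, "VI": 0.09, "VII": 0.05}

# Scale degrees mapped to the notes of C major.
DEGREE_TO_NOTE = {"I": "C", "II": "D", "III": "E", "IV": "F",
                  "V": "G", "VI": "A", "VII": "B"}


def random_melody(length, seed=None):
    """Draw `length` notes at random, each degree chosen with its weight."""
    rng = random.Random(seed)
    degrees = rng.choices(list(DEGREE_WEIGHTS),
                          weights=list(DEGREE_WEIGHTS.values()),
                          k=length)
    return [DEGREE_TO_NOTE[d] for d in degrees]


melody = random_melody(10_000, seed=1)
counts = Counter(melody)
for note in "CDEFGAB":
    # Observed share of each note; it should land close to its weight.
    print(note, round(counts[note] / len(melody), 3))
```

    Any individual run of notes is unpredictable, but over a long enough stretch the proportions settle near the chosen weights, which is the "ratio on a larger scale" being preserved.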

    And it made me wonder, like I wondered when listening to Obama’s “Yes we can” (which I found moving almost to the point of tears), like I wondered when listening to George W’s “Stay the course” (which moved me the opposite way, also to the point of tears)…..can myth trump narrative? Does it, at the level of the iterative phrase, create plots of its own accord?

    By the way, writing this comment I just discovered the CNMAT blogs!

  5. Youki on: 14 December 2008 at 10:56 am

    It’s not random if the ratio is preserved, because music is really an aural expression of mathematical frequencies. You can graph out the frequency of notes, and it’s an exponential graph. As long as the ratio is preserved, you can start at any random point and the music will sound the same (just in a different key).

    For example, middle C (C4 – the 4 refers to the octave number. C5 is one octave higher, C3 one lower) has a frequency of about 262 Hz. C5 is about 524 Hz (double), and C3 is half of C4, about 131 Hz. This holds true for all octaves. A1 is 55 Hz, A2 is 110, A3 is 220, A4 is 440, A5 is 880, etc.

    Every note can actually be expressed with the following equation:
    frequency = 440 * 2^(n/12)
    where n is the number of half-steps from A4 (positive going up, negative going down). So if you want to know the frequency of C5, it’s 3 half-steps up, and you plug in 3:
    frequency = 440 * 2^(3/12) = 523.25
    A3 is 12 half-steps below A4:
    frequency = 440 * 2^(-12/12) = 220
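    Here is the same equation as a tiny Python function, in case anyone wants to plug in their own numbers. The name `note_frequency` is just for this sketch, and it assumes the usual A4 = 440 Hz reference and counts half-steps up from A4 as positive, down as negative.

```python
A4_HZ = 440.0  # reference pitch used throughout this comment


def note_frequency(half_steps_from_a4):
    """Equal-tempered frequency of the note n half-steps away from A4."""
    return A4_HZ * 2 ** (half_steps_from_a4 / 12)


# A few notes and their distance from A4 in half-steps.
for name, n in [("A3", -12), ("C4", -9), ("A4", 0), ("C5", 3), ("A5", 12)]:
    print(f"{name}: {note_frequency(n):.2f} Hz")
```

    Note the exponent is n/12 as a whole: each half-step multiplies the frequency by the twelfth root of 2, so twelve of them double it.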

    If you randomize the note but preserve the ratio, all you’re really doing is adjusting the equation to be this:
    frequency = r * 440 * 2^(n/12)
    where r is a random number. It’s basically the same as changing the key; the ratio between the notes is preserved.

    I can actually demonstrate this. Let’s take an A minor chord:
    440 * 2^(0/12) = 440 Hz = A
    440 * 2^(3/12) ≈ 523 Hz = C
    440 * 2^(7/12) ≈ 659 Hz = E

    If I pick a random number and apply it to each note (let’s say I pick the not-so-random 1.1225, which is about 2^(2/12), just so the frequencies match actual notes), I get:
    1.1225 * 440 * 2^(0/12) ≈ 494 Hz = B
    1.1225 * 440 * 2^(3/12) ≈ 587 Hz = D
    1.1225 * 440 * 2^(7/12) ≈ 740 Hz = F#
    which is a B minor chord. It doesn’t matter what number I pick randomly, it will always sound pleasant, because the ratio (or in musical terms, the interval) is preserved.
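    The same demonstration as a short Python sketch, for anyone who wants to try other multipliers. The helper names (`chord_frequencies`, `ratios_to_root`) are made up for this example; the point is only that multiplying every note by the same factor leaves the ratios between them untouched.

```python
import math

MINOR_TRIAD = (0, 3, 7)  # half-steps above the root


def chord_frequencies(root_hz, half_steps):
    """Equal-tempered frequencies of a chord built on `root_hz`."""
    return [root_hz * 2 ** (n / 12) for n in half_steps]


def ratios_to_root(freqs):
    """Ratio of every note to the lowest note of the chord."""
    return [f / freqs[0] for f in freqs]


a_minor = chord_frequencies(440.0, MINOR_TRIAD)
factor = 2 ** (2 / 12)                    # two half-steps up, about 1.1225
b_minor = [factor * f for f in a_minor]   # same chord shape, new key

print([round(f) for f in a_minor])
print([round(f) for f in b_minor])

# The interval ratios are the same before and after scaling.
same = all(math.isclose(x, y)
           for x, y in zip(ratios_to_root(a_minor), ratios_to_root(b_minor)))
print(same)
```

    Swapping `factor` for any other positive number changes the pitches but not the result of the final check.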

    Music is very mathematical. Did you notice the relationship between A (440) and E (about 660)? 440+(440/2) = 660, or 3:2 (660:440). It’s called a perfect fifth. No matter the starting frequency, a pure perfect fifth is always a 3:2 ratio; the equal-tempered formula above gives 659.26 Hz for that E, a very close approximation.

    You can basically express an entire song through ratios. A whole step (like going from C to D) is 9:8, a third (like from C to E) is 5:4, a fourth (C to F) 4:3, a fifth (C to G) is 3:2, a sixth (C to A) is 5:3, a seventh (C to B) is 15:8. An octave is 2:1.
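    One caveat worth checking with a few lines of Python: the small-integer ratios listed here are the pure (just intonation) values, while the 2^(n/12) formula from earlier in this comment only approximates them. The sketch below compares the two; the half-step counts assume major intervals (C to E, C to A, C to B), and the names `INTERVALS`, `cents`, and `compare` are just for this example.

```python
import math
from fractions import Fraction

# (interval name, half-steps, pure ratio as listed in this comment)
INTERVALS = [
    ("whole step", 2, Fraction(9, 8)),
    ("third", 4, Fraction(5, 4)),
    ("fourth", 5, Fraction(4, 3)),
    ("fifth", 7, Fraction(3, 2)),
    ("sixth", 9, Fraction(5, 3)),
    ("seventh", 11, Fraction(15, 8)),
    ("octave", 12, Fraction(2, 1)),
]


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def compare():
    """Return (name, pure ratio, equal-tempered ratio, difference in cents)."""
    rows = []
    for name, half_steps, pure in INTERVALS:
        equal = 2 ** (half_steps / 12)
        rows.append((name, float(pure), equal, cents(equal) - cents(float(pure))))
    return rows


for name, pure, equal, diff in compare():
    print(f"{name:10s} pure={pure:.4f} equal={equal:.4f} diff={diff:+.1f} cents")
```

    The fourth and fifth come out within a couple of cents of the pure ratios, while the third, sixth, and seventh drift further, so "3:2" is a very good description of an equal-tempered fifth but "5:4" is a rougher one for the third.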

    Look at the ratios with the smallest numerators. 1:1 and 2:1, those correspond to the I chords you saw in the percentage of time spent on chords data (in the above post). What’s the next smallest numerator? 3:2, or V, the 2nd most frequent chord. Next smallest is 4:3, or IV, the 3rd most common chord. Not a coincidence — these ratios, above all, are the main factor in what makes a song pleasant to hear.

    Ratios are the foundation of how music sounds. You can do whatever you want to the tempo and duration of notes, but if you preserve the ratios, you preserve one of the most essential aspects of music.

    “Can myth trump narrative?”

    ooooooohhhhhhh nice question. What do you think? I’d love to see a new blog post on the subject. I unfortunately have a paper due tomorrow that I need to work on.

  6. Usree Bhattacharya on: 14 December 2008 at 1:01 pm

    Applause all around for the excellent discussion underway here. I am learning such interesting stuff from the comments!

    Okay, one thing we’re not clearly addressing, that I can see, is the underlying cultural context, and how that might affect things. Y said: ‘The same way that music can “sound good” (refer to original post) to us because the notes and chord progressions are based on predictable/expected patterns, language itself can “sound good” based on predictable/expected patterns.’ Where does culture fit into these claims? There’s acculturation to musical/narrative patterns, right? For example, I grew up on Bengali “nonsense poetry,” Abol Tabol (no, it doesn’t consist of just “nonsense” words). In nonsense poetry, the everyday becomes exoticized, and outrageously exaggerated in very unique and unexpected ways. Through humor (“nonsense” words and storylines), the everyday becomes atypical, in a genre that I haven’t really found a parallel to. It’s a very unusual genre (post coming) that I became acculturated to finding predictable in some ways. If I hadn’t been reared on those poems, I would find them very strange, and not at all predictable in the sense you seem to be indicating.

    So here’s a question…your focus here seems specifically on “Western” music, and broader literary genres…and while I think that sure, there are larger claims one could make across cultures about music and narrative, there is the interesting question of what harmonies, melodies, and patterns (literary or musical) become predictable, for whom, how and why. How does the acculturation process occur? How does it become a part of our habitus (btw, I never thought I’d bring that word into a comment!)?

    By the way, you guys will LOVE the work of one of my best friends, David Monacchi. He creates music using the natural sound in the world around us.

    Apologia: I am a little frightened of looking a little stupid here, given the depth of Dave’s comments and Youki’s rejoinder. So bear with me.

  7. Youki on: 14 December 2008 at 11:29 pm

    good questions! I think the Obama “yes, we can” video is a good entry point for discussing culture. I got sidetracked by the more technical aspects of music, but you’re right, this discussion should do a better job with culture. In the first few lines of his speech, Obama brings in a history to the phrase “yes, we can,” expanding the phrase beyond just the speech to a larger set of cultural and historical meanings:

    “It was a creed written into the founding documents that declared the destiny of a nation.

    Yes we can.

    It was whispered by slaves and abolitionists as they blazed a trail toward freedom.

    Yes we can.

    It was sung by immigrants as they struck out from distant shores and pioneers who pushed westward against an unforgiving wilderness.

    Yes we can.

    It was the call of workers who organized; women who reached for the ballots; a President who chose the moon as our new frontier; and a King who took us to the mountaintop and pointed the way to the Promised Land.

    Yes we can to justice and equality.

    Yes we can to opportunity and prosperity.

    Yes we can heal this nation.

    Yes we can repair this world.

    Yes we can.”

    This is precisely what I meant when I said that Obama knows how to play those “mystic chords of memory.” He’s drawing on centuries of culture to produce a speech that extends far beyond its own textual boundaries. It’s an emotional speech because it draws so heavily upon culture, upon our shared experiences and memories.

  8. Usree Bhattacharya on: 15 December 2008 at 11:23 am

    The “mystic chords of memory” are playing a beautiful melody in my mind. (G)Obama.

  9. daveski on: 15 December 2008 at 8:28 pm

    Just saw this post linked to on one of my favorite tech blogs, TechCrunch, about an MIT Master’s student’s creation, called “MusicBox”; it looks like an amazing way to visualize music. Going along with the translation (transduction?) of meaning across different modes: music, speech, visuals, etc…Gotta check this out!

  10. Usree Bhattacharya on: 16 January 2009 at 10:32 pm

    Hi Youki, have you heard of the Axis of Awesome? Check out their video, Songs in 4 Chords

  11. Youki on: 17 January 2009 at 4:58 am

    hah yeah I considered linking that vid but thought people wouldn’t be interested in another “songs that have the same chords” vid. Guess I was wrong!

  12. Usree Bhattacharya on: 21 January 2009 at 6:09 pm

    This is making the rounds of the Net…thought of your post as I watched it…

  13. Tim on: 22 September 2009 at 12:54 pm

    Nice, love the Axis of Awesome!!!

