
データから言葉を生かし・・ Traces from the Japanese database

Written By: daveski on July 17, 2010

The past few weeks I’ve been helping out in a small hidden corner of one of the BLC’s bigger endeavors of the past few years: the Library of Foreign Language Film Clips, a collection of clips and linguistic and cultural metadata from oodles of movies in non-English languages. The LFLFC page I just linked to explains the project better than I can here, and Berkeley students who are studying Chinese, Finnish, French, German, Italian, Japanese, Korean, Russian, or Spanish can check it out by logging in with Calnet ID here (more languages coming, is the official word).

These are top-down views, though, and for me, what this or any other language learning attempt is about has always been the bottom-up. And there isn’t much more of a bottom-up view than that seen from the designer of tags–the unique linguistic elements that are attached to film clips so that they can be searched for, retrieved, and made use of by language learners and teachers. Any one clip can be identified by descriptions of the setting, the characters who appear, what they’re doing, the cultural practices in which they’re engaged, the speech acts they’re using as they talk, and (of course) the actual words they’re saying. And it’s a list of about 10,000 of these words–words that the actors in these films actually utter in the course of the film–that I’ve been staring at, scrutinizing, looking up, and preparing for entry into the database these last few weeks.
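For readers who think in data structures, the tagging scheme described above could be pictured roughly like this. It is only a minimal sketch: the `Clip` class, its field names, the `find_clips_with_word` helper, and the sample clips are all hypothetical illustrations, not the actual LFLFC schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """One film clip plus the kinds of tags described above (hypothetical schema)."""
    title: str
    setting: str = ""
    characters: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    cultural_practices: list = field(default_factory=list)
    speech_acts: list = field(default_factory=list)
    words: set = field(default_factory=set)  # words actually uttered in the clip


def find_clips_with_word(clips, word):
    """Return every clip whose word tags include `word` (exact match only)."""
    return [clip for clip in clips if word in clip.words]


# Made-up sample data, only to show the lookup.
sample_clips = [
    Clip(title="Clip A", setting="riverbank", words={"向こう岸", "形"}),
    Clip(title="Clip B", setting="fast food counter", words={"ダブル", "チーズ"}),
]
print([clip.title for clip in find_clips_with_word(sample_clips, "チーズ")])
```

The exact-match lookup is the reason normalization matters so much here: a word tag only finds a clip if it is stored in the same form the learner searches for.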

I wrote above that these words are unique, but of course that’s not true. The beauty of language-in-use is how people and films make words their own every time they use them; but at the same time they must be at least somewhat recognizable to their hearers and readers in order for meaning to be made. These words are part of a living language, spoken and reproduced by hundreds, thousands, millions of people and media channels. And what makes the work of checking, editing, and entering them in a database interesting is remembering, imagining, and tracing the paths of life that each one incites.

Of course, there’s a lot to say about the challenge–I might even say violence–of standardizing language, dividing it into discrete elements and classifying them, eliminating the regional dialects and individual inflections from each utterance so that the ‘words’ can be isolated and normalized for the database, much like the dictionary. I’ll save that for another post, though (or, whoops, did I say it already?!), and here just offer some of the paths I’ve followed for some choice words and phrases from the list, since they’ve been a resource for me as a past-and-still learner of Japanese…

根を詰める (kon o tsumeru) comes up early in the list. Staring at it, I draw a blank. I look it up in the dictionary I’m using, and get the sample sentence, 根を詰めて勉強する (kon o tsumete benkyou suru), “to concentrate on studying”. For the first few seconds, I’m still floundering. What is 根, and why does it have to be 詰めた? I guess that I know “根” mostly from its reading as ね (ne), meaning “root (of a plant, tree, etc.)”. And then the metaphorical meaning of 根 as something like “resolve” starts to make sense. Stick-to-it-ness is like the root of the spirit. Nice.

Now I see たてがみ (tategami) in the list, the word for “mane”, all in hiragana. Does it need to be associated with a 漢字 (kanji, Chinese character)? And what does it mean again? Dang, this is the second time I’ve seen it in the list and I didn’t remember it until I looked it up again. Maybe because I haven’t known any horses or zebras or the like in Japanese. 漢字がないみたい。まあ「鬣」って確かに辞書に載ってはいるけどこんなに複雑な字が実際にどのくらいつかわれているだろうか。(I don’t think it needs a kanji. I mean, the dictionary shows “鬣” but how much is such a complicated character actually used anyway?)

Scanning further through the list,「決まったも同然」(kimatta mo douzen) comes up. This looks like a collocation to me, and true to form, I can’t find it in the dictionary. What is “同然”? It apparently means “practically the same as…” Gonna bookmark this one for later.

「形になる」(katachi ni naru) is another phrase that is probably going to have to be broken up for the database, into 形 (“shape” or “form”) and 〜になる (“becomes”, “turns into”). But that’s too bad. If you let yourself get lost in each word and how they play together in the mind, “becoming a shape”, or “taking shape” as it would more colloquially be in English, just sounds pretty cool. “Form” and “shape” have such great metaphorical potential, and 形 does too. (And, by the way, whose or what shape is “taken” when something “takes shape” in English?)

One of my all-time favorites from the list of candidate tags: ダブルチーズバーガーセット (daburu chiizu baagaa setto, “Double Cheese Burger Set”). It’s a mouthful of katakana (カタカナ), but if you frequent McDonald’s or In-N-Out or any other fast food haven it’ll be immediately clear what this is. It reminds me of my all-time favorite hamburger experiences in Japan at モスバーガー (MOS Burger). The question here is, from a learner’s perspective, is it better to save the whole thing as one unit? Or should ダブル (daburu) be separated from チーズ (chiizu), separated from バーガー (baagaa), separated from セット (setto)? I’m voting for the latter, since all kinds of other combinations come to mind for each word: ダブルプレー (double play), ハンバーガー (hamburger), etc. And one really interesting thing that just came up when looking for collocations with チーズ was the image results for チーズ as a Japanese word compared to “cheese” as an English word. Hmm…. “fromage”, anyone?
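Splitting a compound like this into known pieces can be sketched with a greedy longest-match pass over a word list. This is just one simple approach, not how the database actually does it; the `segment` function and the tiny `LEXICON` below are assumptions made up for illustration.

```python
def segment(text, lexicon):
    """Greedy longest-match segmentation.

    At each position, take the longest lexicon entry that matches;
    if nothing matches, emit a single character and move on.
    """
    pieces = []
    max_len = max(map(len, lexicon), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                pieces.append(text[i:j])
                i = j
                break
        else:
            # No lexicon entry starts here: fall back to one character.
            pieces.append(text[i])
            i += 1
    return pieces


# Hypothetical mini word list drawn from the examples in this post.
LEXICON = {"ダブル", "チーズ", "バーガー", "セット", "プレー", "ハンバーガー"}

print(segment("ダブルチーズバーガーセット", LEXICON))
print(segment("ダブルプレー", LEXICON))
```

Greedy matching also shows the cost of the choice: because ハンバーガー is in the word list as a whole, it stays one piece rather than being cut at バーガー, so what counts as a "word" depends entirely on what was entered first.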

Then there are words where you wonder, who has said this before? Who ‘owned’ the word and its emotions and actions and consequences and memories, before it got stripped down and put into a dictionary or a database as just one more bug in the collection? One of these is 掛け替えのない (kakegae no nai, “irreplaceable”; “precious”), like in 「掛け替えのない子」, “one’s dearest child”. I don’t have any memories of ever having heard or read this, but now I’m curious. What scenes, on-film and in-life, would this appear in?

How cool that 向こう岸 (mukougishi, “the opposite bank” (of a river, etc.)) is its own vocabulary item. Is there a single dictionary entry for “this bank”?

白粉 (oshiroi), the word for white face powder, is an interesting case–reading the hiragana “おしろい” I would have thought it might be written 御白い, but the “白い” (shiroi, “white”) appears to have migrated to the beginning of the word. And the second character, 粉, usually read kona or fun, rounds out the meaning of the 2-character compound (white – powder) but the pronunciation seems quite random. What is the history of this word? It seems to be an example of a Chinese written lexical item transplanted right onto a spoken Japanese word, like a clam married to its shell out of historical circumstance and convenience.

満更でもない (manzara demo nai), apparently, means “not completely bad.” The negative 〜でもない ending means that 満更 must be “completely bad”, but I’m still figuring this one out. “満”, as far as I know, means something like “full”, while “更” can mean “beyond”, “even more”, etc. Hmmm. My encounter with this phrase makes me wonder if we have a dark little place in our mind full of vocabulary bookmarks, waiting for new language experiences to bring them to light.

Suddenly, out of nowhere, もんじゃ焼き (monja yaki) appears on the list. There’s a good breaking point.



Copyright ©2009 Found in Translation, All rights reserved.