Friday, February 12, 2010

Onyomi Reading Analysis

I conducted a quick analysis of Kanji Onyomi readings.  The first thing I wanted to know was how many unique sounds or readings there are and how frequently they occur.


Readings used by    9056 Kanji in the WWWJDIC    2500 Most Frequent Kanji
                    Kanji Dictionary             in the Asahi Newspaper
1 or more Kanji     421                          348
2+ Kanji            328                          273
5+ Kanji            266                          181
10+ Kanji           208                          112

According to my results there are 421 unique Kanji Onyomi readings.  If you narrow down the 9056 Kanji that are listed in the WWWJDIC Kanji Dictionary file to the 2500 most frequently printed Kanji in the Asahi Newspaper the number drops to 348.  Furthermore, if you focus on the readings that are used by 5 or more of those Kanji the number drops to 181.
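A count like this can be reproduced with a few lines of Python.  The sketch below is only an illustration: the tiny SAMPLE_ONYOMI dictionary and both function names are made up for the example, and a real run would first have to parse the dictionary file into the same kanji-to-readings shape.

```python
from collections import Counter

# Hypothetical sample data: each kanji mapped to its onyomi readings.
# A real analysis would build this mapping from the full dictionary file.
SAMPLE_ONYOMI = {
    "高": ["コウ"],
    "校": ["コウ"],
    "工": ["コウ", "ク"],
    "小": ["ショウ"],
    "少": ["ショウ"],
    "送": ["ソウ"],
}


def count_kanji_per_reading(onyomi_by_kanji):
    """Return a Counter mapping each reading to the number of kanji using it."""
    counts = Counter()
    for readings in onyomi_by_kanji.values():
        # set() guards against the same reading being listed twice for one kanji
        counts.update(set(readings))
    return counts


def readings_with_at_least(counts, minimum):
    """Count the readings that are used by at least `minimum` kanji."""
    return sum(1 for n in counts.values() if n >= minimum)


counts = count_kanji_per_reading(SAMPLE_ONYOMI)
for threshold in (1, 2, 3):
    print(f"{threshold}+ Kanji: {readings_with_at_least(counts, threshold)} readings")

# The same Counter also gives the most frequent readings
print(counts.most_common(2))
```

With the full data set the thresholds would be 1, 2, 5, and 10 to match the table.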

What I've realized in my studies is that it is sometimes difficult to learn a new reading.  That is to say that until you've learned one or more Kanji that use a particular reading it can be difficult to learn.  I think that this is again related to the theory of association value.  The more characters you know with a particular reading the easier it becomes to learn a new character with the same reading.

Next I wanted to know which readings appear most frequently.  Here are the top Onyomi readings by frequency.

9056 Kanji in the WWWJDIC Kanji Dictionary    2500 Most Frequent Kanji in the Asahi Newspaper
Onyomi     Frequency                          Onyomi     Frequency
コウ       292                                コウ       106
ショウ     236                                ショウ     97
ソウ       175                                [?]        70
[?]        165                                ソウ       64
カン       163                                トウ       58
トウ       162                                [?]        57
[?]        148                                カン       55
セン       147                                キョウ     55
キョウ     137                                チョウ     48
ケン       122                                [?]        47

The tables are pretty similar.  Readings that are common in the large set of Kanji are also common in the most frequently used newspaper Kanji.  It's interesting to note that the 4 most frequent readings (kou, shou, so, sou) all sound very similar to an English speaker.  I think this is just one of the many reasons why it takes Westerners a long time to master listening skills.  It must take hundreds and possibly thousands of hours of listening to develop deep Japanese listening ability.

Thursday, February 11, 2010

97 bottles of beer on the wall

Today is day 23 of my challenge marking word 1150 and just over 19% complete.  My actual word count minus leeches (words I couldn't learn this time around) is 2577 or 32% of the complete JLPT vocabulary list.  I'm very satisfied with my progress after about 16 months of Japanese study.  I don't mean to sound like a braggart because I believe that this is simply evidence of the number of hours I've studied and not some silly metric of intelligence.  I truly hope my progress reports encourage other people to study language as well.

I have a lot of ideas for new posts but I've been completely engrossed by the challenge.  This weekend should allow me the time to write about something other than progress reports.  My last post was on an interesting topic but I lost steam at the end and feel like I failed to make a solid argument.  I'm looking forward to trying again.

This week's calculations are about Kanji readings.  The question I asked myself was how many new Kanji readings I need to learn at different intervals of my challenge.  Basically I looked at where I currently stand with readings and the number of readings I'll have when I enter the JLPT 2 list.

Until this year there were 4 levels in the JLPT test.  There are now 5 levels.  The changes are mainly to more evenly distribute the difficulty curve because there were many complaints about the gap between levels 3 and 2.  Personally I don't think there's much use in having an official test easier than the level 2 exam.  It won't help you get a job nor will it get you into any Japanese Universities.  I won't digress any further into the structure or utility value of the exam except to say that I'm using a list published from a few years back.

When I restarted my computer I forgot to save the stats I calculated so here are the estimates from memory.  There are roughly 2,000 Kanji in the complete list of JLPT vocabulary words.  When I started my challenge I had learned all 1406 words on the level 3 and 4 exams plus a few hundred words from levels 1 and 2.  I never calculated my number of known readings before the challenge.  According to my analysis of the level 3, 4, and partial level 1 list that I've completed I can now read somewhere around 1,300 characters.  This leaves me with around 700 new readings in order to complete my challenge.  I'm currently 52% of the way through the level 1 word list.  In order to complete the remaining 48% I will learn about 350 new readings.  Once I've completed the level 1 list the final leg of my challenge is to tackle the level 2 list.  The reason for completing the level 1 list first was pretty arbitrary.  It's supposed to be more difficult although I'm not sure if that's true or not.

The JLPT 2 list contains 3690 words.  I've calculated that in order to complete this list I will only have to learn an additional 350 Kanji readings.  I felt very relieved when I derived this number.  I'm rapidly approaching the point of diminishing returns on Kanji readings!  I figure that once I've reached 3,000 readings I'll be able to read 99% of printed material and 100% of newspaper articles.  If you guessed that my primary objective with Japanese is to read fluently then you guessed correctly.  I can't wait to sit down and read a newspaper!
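The estimates above fit together as simple arithmetic.  Here is a quick sanity check in Python, using only the rough figures quoted from memory; the variable names are just for illustration.

```python
# Rough figures from the post (estimates recalled from memory, not exact counts)
total_kanji_in_jlpt_vocab = 2000   # Kanji in the complete JLPT vocabulary list
readings_known_now = 1300          # characters I can currently read

# New readings still needed to finish the whole challenge
remaining_readings = total_kanji_in_jlpt_vocab - readings_known_now

# About 350 of those come from the rest of the level 1 list
level1_new_readings = 350
# Whatever is left over has to come from the level 2 list
level2_new_readings = remaining_readings - level1_new_readings

print(remaining_readings)   # 700
print(level2_new_readings)  # 350
```

The leftover for the level 2 list matches the 350 additional readings calculated for it.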

Monday, February 8, 2010

Remembering Nonsense Syllables

The Psychology of Forgetting

Hermann Ebbinghaus was a famous 19th century psychologist best known for his discovery of the forgetting curve and the spacing effect in learning.  Modern day flashcard systems known as SRS (spaced-repetition systems) can thank Ebbinghaus for these discoveries.  The underlying spacing algorithms are based on his research findings.  Without SRS flashcard systems I fear I wouldn't have developed such a fondness for the Japanese writing system.

Ebbinghaus conducted a number of experiments where he memorized lists of nonsense syllables.  Nonsense syllables are syllables that have little association with real words.  For example the syllable PED would not be considered nonsense because of its association with real words like pedantic and pedal.  Syllables such as DAX are considered to be nonsense because they have little to no association with real English words.

This research would help Ebbinghaus establish his theory of the forgetting curve.  Those familiar with SRS flashcard systems are already well acquainted with the theory.  Time is relative to forgetting.  The longer we remember something, the longer it takes to forget.  No memory is permanent.  Given a long enough period of time anything is forgettable including your mother tongue.

Organic Association Value

I suggest you read the Wikipedia article on association value instead of reading my poorly regurgitated synopsis.

Association value is a theory in cognitive psychology that supposedly allows us to remember things more easily if we can associate them with something else that we already know.  This is what causes us to learn and acquire foreign language vocabularies at different rates.  The less association value a word has the more abstract it becomes making it more difficult to learn.

Examples of this in language are numerous.  I once read that there are more than 6,000 nouns that are nearly identical in English and Spanish.  In other words the association value is high.  These word-pairs are much more easily acquired when one of the words is already known.  As language families become more distant so do association values.  Japanese for instance borrowed the majority of its modern lexicon from Chinese along with a good portion of the writing system.  Although the languages are quite different phonetically and morphologically, the distance in writing and vocabulary is much less than in other language families.  The association value between the two languages is higher.

Creating Association Value

It's my belief that if we want to learn a foreign language like Japanese that is very different from our mother tongue, we can speed up the process by creating association value.  Abstractions can be bridged and association value created if we use our creative thinking ability.  Ideas, concepts, and words that are abstract to us can be made less abstract.  The speed of learning increases.

I've published several instances over the past few weeks of how I use mnemonics to create association value.  Here I'll break down the process.

Take the word 産婦人科 (サンフジンカ or san-fu-jin-ka) which means maternity and gynecology department.  If I had tried to learn this word last year I would have quit studying Japanese.  The truth now is that the word is quite simple to read and learn.

Here are the Kanji definitions.

産 サン products/childbirth

婦 フ lady

人 ジン person

科 カ department

Given the meanings of individual Kanji the word is simple and logical.  If you know the sounds for each Kanji you have no trouble reading the word.  The problem is that learning the meaning, writing, and phonology for these characters requires a lot of elbow-grease and a different strategy altogether.  The characters have no association value so you have to create it.  Only after you create enough association value is this word easily learned (semantics and phonology for each character).  Thanks to Ebbinghaus there's no reason to memorize nonsense.  Break down the language into discrete parts and create your own association value.

Saturday, February 6, 2010

Day 18 - Progress Report (900 words, 15% complete)

It's been several days since I started memorizing Kanji readings before learning my daily word list.  I've found this strategy to be immensely easier than before.  The numbers below show my daily card progress.

Day 18 new word statistics

1:42 50
1:52 32
2:00 21
2:05 8
2:09 3

As you can see I remembered 94% of the word list by the 5th review.  The final 3 cards may or may not be remembered by tonight.  I'm not really concerned.  Today was my most successful day so far.  I attribute this to learning the 46 new card readings in advance.
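The 94% figure falls out of the table directly.  A small sketch, assuming the first column is the time of each review and the second is the number of cards still not remembered at that point (the table itself doesn't label them); the function name is made up for the example.

```python
# (review time, cards still unremembered), copied from the Day 18 table.
# Assumption: the second column counts cards remaining at each review.
reviews = [("1:42", 50), ("1:52", 32), ("2:00", 21), ("2:05", 8), ("2:09", 3)]


def retention_by_review(reviews):
    """Return (time, fraction of the day's cards remembered) for each review."""
    total_cards = reviews[0][1]  # every card is still unlearned at the first pass
    return [(time, (total_cards - left) / total_cards) for time, left in reviews]


for time, rate in retention_by_review(reviews):
    print(f"{time}  {rate:.0%}")
```

The last line of output is the 5th review, where 47 of the 50 cards have stuck.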

I've been recording the number of new Kanji readings that I've deliberately learned preceding the learning of new words.  The number of new readings should drop over time making the learning of new words even easier. Although I estimate that I can read over 1,000 Kanji, I currently have 382 cards in my onyomi reading deck.

New Kanji Readings (382 total)
2/7 - 46
2/6 - 17
2/5 - 38
2/4 - 48

Ideally new readings should be learned the day before the words that use them.  However I've been doing them about an hour to several hours before learning words.  It seems to be working.  If I have the time I will try to increase my number of new readings per day.

Thursday, February 4, 2010

What does success look like?

I define success as a daily perseverance towards a goal.  My goal for today is to memorize 50 new Japanese words and maintain my growing vocabulary (current est. 2,373 words and 1k+ Kanji readings).

I'm wearing an eye-patch today because there's something wrong with my eyes.  I occasionally get severe headaches.  Although I can't attribute it entirely to EJS (extreme Japanese studying) it does seem somewhat related to extensive monitor viewing.  Maybe my right eye compensates for the left.  Either way, the patch is helping.

What does success look like?  Well, I consider myself a success today and here's what I look like.  Eye-patch, black beanie, headset, and a pretty serious facial expression.

I will not fail today.  I will persevere.  What does your success look like?

Tuesday, February 2, 2010

Day 13 (10.83% complete)

Until today I've been learning new Kanji readings from words themselves.  The process works but it's still very difficult and time consuming.  I've decided to augment my approach so that the acquisition of new words goes a little more smoothly.

I have a new deck of Kanji onyomi readings.  These are onyomi recognition cards only.  My new strategy is to add new cards to this deck as I learn new words.  The day before I learn a new set of words I'll add any new Kanji to this deck and study the readings.  This will give me some exposure to the reading before I get to the new words in the vocabulary deck.  It seems obvious that if you already know the meanings and the readings of the Kanji in a word, learning the word's definition is just the final step.
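Picking out which Kanji need new reading cards is easy to automate.  This is only a sketch under a few assumptions: the function names are hypothetical, the deck is represented as a plain set of characters rather than a real flashcard file, and a character counts as Kanji if it falls in the basic CJK Unified Ideographs range.

```python
def is_kanji(ch):
    """Rough check: is the character in the basic CJK Unified Ideographs block?"""
    return "\u4e00" <= ch <= "\u9fff"


def new_kanji_for_deck(new_words, deck_kanji):
    """Return Kanji from new_words that are not yet in the reading deck.

    The result keeps first-appearance order and contains no duplicates.
    """
    seen = set(deck_kanji)
    fresh = []
    for word in new_words:
        for ch in word:
            if is_kanji(ch) and ch not in seen:
                seen.add(ch)
                fresh.append(ch)
    return fresh


# Hypothetical example: the deck already has cards for 人 and 科
deck = {"人", "科"}
tomorrows_words = ["産婦人科", "衛星", "衛生"]
print(new_kanji_for_deck(tomorrows_words, deck))
```

Each character in the result would become one new onyomi recognition card to study before the words themselves.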

Today was my first day trying this approach.  It was time consuming to manually create nearly 50 new onyomi reading cards but I think it was worth it.  This morning I memorized the new onyomi readings before proceeding onto the vocabulary deck.  Although I used to be generally opposed to the idea of memorizing readings I think this is a fantastic way to prepare for the brute-force method I'm employing for learning a massive word list.

There is a great side-effect to this approach.  Learning readings is generally pretty easy.  I attribute the relative ease to my prior completion of Heisig's Remembering the Kanji book.  When I'm through with my challenge I will have failed/suspended a large number of word cards.  However, I will have the ability to read almost all of them during my next run.  When I go through the list of remaining words at the end of my challenge I predict that they will be easily acquired.

That's all for now.  I've got to get to calligraphy class!

Tying your shoelaces

Two weeks before I moved from San Francisco to Japan I decided to challenge myself.  I was going to walk 20 miles per day for 10 days.  The distance from my apartment across the Golden Gate Bridge and back was roughly 20 miles.  The challenge was to do this for 10 days.  It would end up costing almost 5 hours per day but my iPod was speaking Japanese, my running shoes were equipped, and the sandwiches and Clif Bars were all packed.  I was ready for my challenge!

When I started walking I never imagined that it would become a physical challenge.  I'm no athlete but I'm in really good condition for a guy that spends most of his life sitting.  My intuition was that distance walking was relatively easy and that 5 consecutive hours of it everyday would simply be a psychological challenge.  If I quit it would be from boredom and not from the physical stress.

As it turns out walking 5 hours by yourself is boring.  Luckily I had the Golden Gate Bridge to keep me company.  If you've ever visited or lived in San Francisco you're probably familiar with the strange weather patterns.  One minute it can be sunny and hot.  The next minute it can be cold, rainy, and cloudy.  The quickly shifting weather patterns and fog rolling in and out of the bay made for some breathtaking views that helped pass the time.  It was an experience that makes me happy when I think about it.  Every day was unique.  It feels strange for me to say now but there was something spiritual about it.

By the 6th day I could barely walk.  Wearing a brand new knee and ankle brace I limped to the bus stop to meet a friend for brunch.  I'm fortunate that he convinced me that it would be foolish to continue my challenge.  I'm a pretty stubborn guy but he asserted that I risked permanent injury and I believed him.  As it turned out, I had tied my shoelaces too tight, causing the tendons in my ankle to become inflamed.

My ankle was full of fluid and I could hear it creak every time I moved it.  The change was so gradual over the course of 100 miles that I hardly noticed until it was too late.  The soreness I experienced after walking felt natural to me at the time.  I can sleep it off, I thought.  It's probably just muscle soreness.  It took over a month for my ankle to completely heal and I enjoyed my last week in San Francisco with a limp.

Sometimes I can only see the big picture.  The details of a journey are simply variables for a future postmortem analysis.  It's important to monitor the details, especially when we're tired and everything seems to be going according to plan.  The warning signs of a problem can be so subtle that our intuition tells us to ignore it.  I think it's true in many aspects of our lives: health, business, relationships, and even language learning.  Everything we experience gives us subtle cues about the condition of our future.  My older sister taught me how to tie my shoelaces over 20 years ago.  As I discovered, tying your shoelaces for 20 years doesn't always make you an expert shoelacer.

I feel very nostalgic about the week I walked across the Golden Gate Bridge.  I look forward to doing it again one day.  Next time I'll check my laces.

Monday, February 1, 2010

Homophones and the Death Star

I read a post on Victory Manual today that talks about the use of mnemonics for remembering words.  Alex brings up the words 衛星 and 衛生 which mean satellite and hygiene respectively.  Both words are pronounced エイセイ.  Coincidentally I learned the word for satellite just yesterday.

Here's the mnemonic I used.

Word: 衛星
Kanji Meaning: defense + star
Mnemonic: Death Star

When I saw that the Kanji for defense and star were used the image of the Death Star from the movie Star Wars popped into my head immediately.  There's no getting rid of it now.  The cool thing about mnemonics is that they work so well sometimes that you don't even get a choice in their design.  It's automatic.

At the time I didn't know that there was a homophone for エイセイ that meant hygiene.  As soon as I read this I updated my mnemonic to include Darth Vader washing his hands.  Lexicon +1.

While I was typing the word エイセイ into this post I realized that there are actually three other homophones.  Here they are.

永世 eternity; perpetuity; immortality; permanence
永生 eternal life; immortality
永逝 death; dying

Death Star.  It was purely coincidental but now I can add the meanings immortality and death to my mnemonic.  Every now and then it's nice to learn 5 words for the price of 1!