This program accepts input in ローマ字 (rōmaji) and produces the equivalent ひらがな (hiragana) and カタカナ (katakana) output. This program does not map arbitrary English phonemes into the Japanese sound system, nor does it translate from English to Japanese.

Transliteration uses an associative array (hash table) that maps syllables written in Latin letters to 仮名 (kana). Entering lowercase letters produces ひらがな; entering uppercase letters produces カタカナ. Some punctuation is also allowed, though the Romanization for the lesser-used brackets is quite non-standard.
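The table lookup described above can be sketched as follows. This is only an illustration under assumptions: the table is a tiny hypothetical excerpt, the names `HIRAGANA`, `TABLE`, `to_katakana`, and `transliterate` are invented for the example, and the greedy longest-match strategy is one plausible way to walk the input, not necessarily what this program does.

```python
# Hypothetical, heavily truncated table; a real one would cover every syllable.
HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "si": "し", "shi": "し", "su": "す", "se": "せ", "so": "そ",
}


def to_katakana(hiragana):
    """Shift each hiragana code point (U+3041..U+3096) up by 0x60."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in hiragana
    )


# Lowercase keys give hiragana, uppercase keys give katakana.
TABLE = dict(HIRAGANA)
TABLE.update({key.upper(): to_katakana(kana) for key, kana in HIRAGANA.items()})
LONGEST_KEY = max(len(key) for key in TABLE)


def transliterate(text):
    """Greedy longest-match lookup; unknown characters pass through unchanged."""
    out = []
    i = 0
    while i < len(text):
        for size in range(min(LONGEST_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += size
                break
        else:
            out.append(text[i])  # not in the table: return it unprocessed
            i += 1
    return "".join(out)


print(transliterate("sushi"))  # hiragana
print(transliterate("SUSHI"))  # katakana
```

Listing both "si" and "shi" as keys for し shows how several Romanization systems can share one table.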

Some attempt has been made to simultaneously support the most common Romanization systems (修正ヘボン式ローマ字 (Modified Hepburn), 訓令式ローマ字 (Kunrei-siki), and 日本式ローマ字 (Nihon-siki)), though completely meeting that goal is impossible. Where different systems conflict, precedence has been given to 日本式ローマ字 since it is the most regular and has a 1-to-1 relation between 仮名 and rōmaji.

Unrecognized characters in the input will be returned unprocessed but highlighted in a different color.

Quotes

Apostrophes (single quotes) are used to differentiate ん (syllabic n) from a doubled consonant when it appears before one of なにぬねの. They are similarly used to prevent ん from binding with vowels and y syllables. For example, 「んな」, 「んあ」, and 「んや」 ("n'na", "n'a", and "n'ya") all require an apostrophe, while 「っな」, 「な」, and 「にゃ」 ("nna", "na", and "nya") do not.
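The apostrophe rules above can be illustrated with a small sketch. The table `KANA` and the function `convert` are hypothetical names with only enough entries to cover the six examples, and the fallback rules (a lone "n" before another consonant becomes ん, a doubled consonant becomes っ) are assumptions about how such a converter might resolve the ambiguity.

```python
VOWELS = set("aiueo")

# Hypothetical excerpt: only the syllables needed for the examples.
KANA = {"a": "あ", "na": "な", "ya": "や", "nya": "にゃ", "ga": "が", "n'": "ん"}


def convert(text):
    out, i = [], 0
    while i < len(text):
        # Try the longest table entry first so "nya" wins over "n" + "ya".
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            ch, nxt = text[i], text[i + 1:i + 2]
            if ch == "n" and nxt not in VOWELS | {"y", "n"}:
                out.append("ん")  # "n" before another consonant, or at the end
            elif ch == nxt and ch.isalpha() and ch not in VOWELS:
                out.append("っ")  # first half of a doubled consonant
            else:
                out.append(ch)    # unrecognized: pass through
            i += 1
    return "".join(out)


for romaji in ("n'na", "n'a", "n'ya", "nna", "na", "nya"):
    print(romaji, convert(romaji))
```

Because "n'" is an ordinary table key in this sketch, the apostrophe needs no special case in the loop.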

An attempt has been made to translate Latin quotes properly, so that regular double and single quotes as used in English are converted into left and right corner brackets (「」『』) as appropriate. However, there is a peculiarity in the way that white corner brackets (『 and 』) are represented:

Normally single quotes would be used to represent white corner brackets, but this is not possible since single quotes have already been used as described previously. Instead, white corner brackets are represented by doubled double quotes (""). This is admittedly non-standard and may change if a better solution is discovered.

A limitation of the quote translation is that quotes must be balanced. That is, if a set of regular corner brackets is opened, it is necessary to close it before the next set of the same type of bracket may be opened. For example, this ordering of brackets is possible: 「『』『』」「」 (and may be generated with this rather improbable sequence: "'""'""'""'""'"'"'") while this one is not: 「『」』.
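One way to picture the balancing rule is a small stack-based sketch. Everything here is an assumption made for illustration: the names `PAIRS` and `convert_quotes`, the use of a stack, the choice to raise an error on an impossible ordering, and the treatment of an apostrophe as a separator that emits nothing (inferred from the example sequence above, where apostrophes sit between the quote tokens).

```python
# A single double quote maps to 「」, a doubled double quote to 『』.
PAIRS = {'"': ("「", "」"), '""': ("『", "』")}


def convert_quotes(text):
    out, stack, i = [], [], 0
    while i < len(text):
        token = '""' if text.startswith('""', i) else text[i]
        i += len(token)
        if token == "'":
            continue  # assumed separator: keeps adjacent quote tokens apart
        if token not in PAIRS:
            out.append(token)
            continue
        opener, closer = PAIRS[token]
        if stack and stack[-1] == opener:
            stack.pop()            # innermost open bracket of this type: close it
            out.append(closer)
        elif opener not in stack:
            stack.append(opener)   # this type is not open yet: open it
            out.append(opener)
        else:
            # This type is open but another bracket was opened after it.
            raise ValueError("quotes are not balanced")
    return "".join(out)


sequence = "'".join(['"', '""', '""', '""', '""', '"', '"', '"'])
print(convert_quotes(sequence))
```

Under these assumptions the ordering 「『」』 cannot be produced: the attempt to close 「 while 『 is still open is rejected.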

X Characters

Some characters are difficult to represent in rōmaji. These characters' Romanization has been preceded with an "x". The cases where "x" is used are:

Small versions of characters may be accessed using "x". For example, to produce a 促音 (sokuon, or "little tsu") enter "xtu" or "XTU", which will output っ or ッ. Of course, double consonants will be replaced by the 促音 without having to explicitly specify "xtu" or "XTU". A number of small カタカナ that are only used by the Ainu language are supported, though they are included solely for completeness as the intention of this program is for transliteration of Japanese, not Ainu.
The カタカナ syllables ヷ (va), ヸ (vi), ヹ (ve), ヺ (vo) have been replaced in modern usage with ヴ (vu) plus a small vowel. The older single-symbol versions of these syllables may be accessed with "XVA", "XVI", etc.
The very rare "ŋ" or "ng" sounds are produced this way using strings such as "xnga". This was done to avoid collision with the far more prevalent "nga" sequence that should more commonly be rendered as 「んが」.
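The relationship between doubled consonants and the explicit "xtu" form can be sketched as a rewrite step followed by an ordinary table lookup. The names `KANA`, `DOUBLED`, `expand_sokuon`, and `transliterate` are hypothetical, the table holds only a few entries, "XVE" and "XVO" are assumed to follow the same pattern as "XVA" and "XVI", and rewriting with a regular expression is an assumption about one possible approach.

```python
import re

# Hypothetical excerpt of the table, including a few "x" entries.
KANA = {
    "ki": "き", "te": "て", "xtu": "っ",
    "KI": "キ", "TE": "テ", "XTU": "ッ",
    "XVA": "ヷ", "XVI": "ヸ", "XVE": "ヹ", "XVO": "ヺ",
}

# A consonant that is immediately followed by the same consonant.
DOUBLED = re.compile(r"([bcdfghjklmnpqrstvwyzBCDFGHJKLMNPQRSTVWYZ])(?=\1)")


def expand_sokuon(romaji):
    """Rewrite the first half of a doubled consonant as an explicit xtu/XTU."""
    return DOUBLED.sub(
        lambda m: "XTU" if m.group(1).isupper() else "xtu", romaji
    )


def transliterate(romaji):
    text = expand_sokuon(romaji)
    out, i = [], 0
    while i < len(text):
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unrecognized: pass through
            i += 1
    return "".join(out)


print(transliterate("kitte"), transliterate("kixtute"))
print(transliterate("KITTE"), transliterate("XVA"))
```

With this design, "kitte" and "kixtute" take the same path through the table, so the doubled form is only a shorthand for the explicit one.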

Output

Several output encodings are available. UTF-8 is the preferred encoding, and all modern web browsers should support it natively. ISO-8859-1 with XHTML numeric character references (&#nnnn;) is provided as a second option for older browsers that do not handle Unicode properly. In addition, the following legacy encodings are provided: EUC-JP, Shift_JIS, and ISO-2022-JP. Please note that the legacy encodings have not been tested as thoroughly.
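For readers curious how the same kana end up as different bytes, here is a minimal sketch using Python's standard codecs. The function `encode_output` and the `CODECS` mapping are hypothetical names; the codec identifiers and the `xmlcharrefreplace` error handler are part of Python's standard library, but nothing here is claimed to match how this program produces its output.

```python
# Map the encoding names used on this page to Python codec names.
CODECS = {
    "UTF-8": "utf-8",
    "EUC-JP": "euc_jp",
    "Shift_JIS": "shift_jis",
    "ISO-2022-JP": "iso2022_jp",
}


def encode_output(text, encoding="UTF-8"):
    """Return the bytes for `text` in the requested output encoding."""
    if encoding == "ISO-8859-1":
        # Kana do not exist in ISO-8859-1, so each one becomes a numeric
        # character reference such as &#12354; for あ (U+3042).
        return text.encode("latin-1", errors="xmlcharrefreplace")
    return text.encode(CODECS[encoding])


sample = "ひらがな"
for name in ("UTF-8", "ISO-8859-1", "EUC-JP", "Shift_JIS", "ISO-2022-JP"):
    print(name, encode_output(sample, name))
```

The ISO-8859-1 branch shows why that option works in browsers without Unicode-aware encodings: the page bytes stay within Latin-1 and the browser resolves the references itself.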

All pages except the Source Code page inherit the encoding set on the main page. The Source Code page always uses UTF-8 since that is what the script is actually encoded in and a different encoding may break it.
