[vlc-devel] ISO 639-3

John Cowan cowan at ccil.org
Sun Jan 31 09:04:11 CET 2010

Bruno Haible scripsit:

> Well, that's theory. In reality [1], in ISO 639-3, in one year, they have
> merged 8 language identifiers and split 9 language identifiers.

Quite so.  So 8 language identifiers still refer to the same entities
as before, but those entities aren't considered languages, but rather
dialects of the language they are merged into.  Their denotation has
not changed.

Likewise, the 9 identifiers still refer to the same entities as before,
but those entities aren't considered languages, but rather groups or
collections of languages into which they were split.  Their denotation
has not changed.

> Whereas in ISO 639-2, the official rules are equally weak [2], but far
> fewer changes are made in practice [3]:
>   - In 2008, Croatian and Serbian have received new 3-letter codes, but this
>     is irrelevant for Unix since they already had 2-letter codes and these were
>     not changed.
>   - In 2001, Javanese has received a new 2-letter code and a new 3-letter code.
>   - Deprecated identifiers (mo, jw, sh, in, ji) will not be reused.

Deprecated identifiers in 639-3 will not be reused either.
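
Because deprecated identifiers are never reassigned, a fixed alias table
for catalog names stays valid indefinitely.  A minimal sketch in Python:
the names DEPRECATED_ALIASES and canonical_code are illustrative, not
taken from any library, and leaving "sh" unmapped is an assumption (it
has no single two-letter successor).

```python
# Illustrative alias table for the deprecated two-letter codes listed above.
# None marks a code with no single two-letter successor (an assumption here).
DEPRECATED_ALIASES = {
    "in": "id",   # Indonesian
    "ji": "yi",   # Yiddish
    "jw": "jv",   # Javanese
    "mo": "ro",   # Moldavian, now covered by Romanian
    "sh": None,   # Serbo-Croatian
}


def canonical_code(code):
    """Return the current code for a possibly deprecated two-letter code."""
    code = code.lower()
    # Fall back to the input when there is no alias or no successor.
    return DEPRECATED_ALIASES.get(code) or code
```

Since a retired code is never given a new meaning, the table only ever
grows; an entry never has to be changed to point somewhere else.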

> In summary, it is fine to use identifiers from ISO 639-1 and ISO 639-2 for
> translation catalog names. But when you use ISO 639-3 identifiers, you should
> be prepared for surprises.

A change in denotation would be surprising.  A change in scope to or
from dialect, individual language, macrolanguage, and language collection
should not be.
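
The distinction can be sketched as a check over two snapshots of a code
table.  Everything below is hypothetical: the Entry record, the
is_surprising function, and the placeholder code "xxa" are made up for
illustration and do not reflect how the 639-3 tables are actually
published.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    DIALECT = "dialect"
    INDIVIDUAL = "individual language"
    MACROLANGUAGE = "macrolanguage"
    COLLECTION = "language collection"


@dataclass(frozen=True)
class Entry:
    code: str         # the identifier itself
    denotation: str   # the entity the identifier refers to
    scope: Scope      # how that entity is currently classified


def is_surprising(old, new):
    """True only if the same code now refers to a different entity."""
    return old.code == new.code and old.denotation != new.denotation


# Hypothetical merge: "xxa" still names the same variety, reclassified.
before = Entry("xxa", "variety A", Scope.INDIVIDUAL)
after = Entry("xxa", "variety A", Scope.DIALECT)
merge_is_surprising = is_surprising(before, after)
```

Here merge_is_surprising is False: only the scope changed, so software
keyed on "xxa" still refers to the same thing.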

John Cowan       http://www.ccil.org/~cowan        <cowan at ccil.org>
        You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
        You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
                Clear all so!  `Tis a Jute.... (Finnegans Wake 16.5)
