[Mediawiki-i18n] [Wikitech-l] Language codes vs site codes

Siebrand Mazeland (WMF) smazeland at wikimedia.org
Wed May 9 07:46:31 UTC 2012

Hi Denny,

On Wed, May 9, 2012 at 3:36 AM, Denny Vrandečić
<denny.vrandecic at wikimedia.de> wrote:
> (Siebrand, I am unsure if this will arrive at mediawiki-i18n, feel free to
> forward it if you consider it interesting to them).

It does after the list admin approves it, but you may just want to subscribe[1].

> OK, I've written a few lines of Python [1] which actually helped me answer
> my questions. Sorry to bother.
> And the answers are yes, yes, no, but close, and I hope so.

No problem. We like people answering their own questions. More time for
us to do other things :).

> There are a small number of wikis whose language code differs from
> their site code, namely:
> crh -> crh-latn
> als -> gsw
> be-x-old -> be-tarask
> roa-rup -> rup
> simple -> en

There are a few more, actually. See includes/DefaultSettings.php,
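
To make the relationship concrete, here is a minimal Python sketch of the
site code to language code lookup. It only contains the five pairs Denny
lists above, so it is not the full list, and the names
SITE_TO_LANGUAGE and language_code_for_site are made up for this
illustration; they are not MediaWiki identifiers.

```python
# Site code -> language code, limited to the five pairs quoted in this
# thread. Hypothetical names; the real list lives in MediaWiki's settings
# and has more entries.
SITE_TO_LANGUAGE = {
    "crh": "crh-latn",
    "als": "gsw",
    "be-x-old": "be-tarask",
    "roa-rup": "rup",
    "simple": "en",
}


def language_code_for_site(site_code):
    """Return the language code for a site code.

    Site codes without a special case are assumed to be their own
    language code (for example "de" -> "de").
    """
    return SITE_TO_LANGUAGE.get(site_code, site_code)


if __name__ == "__main__":
    for site in ("als", "simple", "de"):
        print(site, "->", language_code_for_site(site))
```

Any site code that is not in the table simply maps to itself, which is
the common case.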

> But, at the same time, the given *site* codes exist as *language* codes as
> well, i.e. the languages/messages files exist for them, but they just
> fallback to the given language code (i.e. MessagesAls.php just names gsw as
> a fallback).

That's an issue with the Wikimedia setup, I guess. If the language
code could be different from the subdomain name, that is what should
be done. It's probably not as simple as it looks. See next item.

> I would not be surprised if each of these five examples would have an
> anecdote to explain why they are the way they are :)

That, and a past in which less attention was paid to sticking to a
particular standard. It doesn't really matter: we're stuck with it
for now, should try not to make it worse, and should fix it in the long
run. In the 5+ years that I've been involved in MediaWiki development,
there have been requests to rename wikis to more appropriate subdomain
names, but for some reason no progress has been made on it yet. These
10+ requests are tracked in https://bugzilla.wikimedia.org/19986.

> P.S.: There is one thing I do not understand though. According
> to https://simple.wikipedia.org/w/api.php?action=query&meta=siteinfo the
> language of simple.wp is "en", but MessagesSimple.php seems to be taken into
> account (instead of "edit" it has "change" in the UI, one of only two
> differences between MessagesSimple and MessagesEn).

Thanks for mentioning. That shouldn't have been there. Fixed in

> So it seems that the language is
> "simple" -- why does it say "en" in the siteinfo?

Because it *is* English. It is just meant to be English with a reduced
vocabulary. There have been many debates in the past over its
usefulness, and over whether other languages should also get a
simple-vocabulary Wikimedia project (and subdomain). This is just for
reference; please do not comment on it in this thread, but start a
new one if you wish to discuss simple language versions.
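
For anyone who wants to check the siteinfo value themselves, here is a
rough Python sketch. It assumes the JSON output format, in which the
content language sits under query -> general -> lang. The function names
and the User-Agent string are placeholders made up for this example.

```python
import json
import urllib.parse
import urllib.request


def siteinfo_url(host):
    """Build the siteinfo API URL for a wiki host, asking for JSON."""
    query = urllib.parse.urlencode(
        {"action": "query", "meta": "siteinfo", "format": "json"}
    )
    return f"https://{host}/w/api.php?{query}"


def content_language(siteinfo):
    """Pick the language code out of a parsed siteinfo response."""
    return siteinfo["query"]["general"]["lang"]


def fetch_content_language(host):
    """Fetch siteinfo over the network and return the language code."""
    request = urllib.request.Request(
        siteinfo_url(host),
        # Placeholder User-Agent; use one that identifies your script.
        headers={"User-Agent": "siteinfo-lang-check/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return content_language(json.load(response))


if __name__ == "__main__":
    print(fetch_content_language("simple.wikipedia.org"))
```

Only the last function touches the network; the other two can be used
on a saved response.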

Not sure if this made things clearer for you, but at least I hope I was
able to add some details :).

[1] https://lists.wikimedia.org/mailman/listinfo/mediawiki-i18n


Siebrand Mazeland
Product Manager Localisation
Wikimedia Foundation

M: +31 6 50 69 1239
Skype: siebrand

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
