[Mediawiki-i18n] Please view and comment CAPTCHA images in 154 languages

praveenp me.praveen at gmail.com
Sun Mar 30 18:16:44 UTC 2014


On Sunday 30 March 2014 08:40 PM, Federico Leva (Nemo) wrote:
> As said, if you find <https://en.wiktionary.org/?oldid=23646739> or 
> others offensive, please just edit and add {{context|vulgar}} or 
> |obscene or whatever appropriate.
I'll try.
> Yes, all the problems you mention seem just to be consequences of 
> this. The entry in question is <https://en.wiktionary.org/wiki/%E0%B4%BE>

Could you also check for rendering issues? In this image, 
image_077ebd23_d890a7083e967d92.png, the vowel sign appears after the 
letter, as മുംബ ൈ (without the space); the correct form is മുംബൈ. Images 
image_00896685_3f5db13f53a2f352.png, 
image_35628971_fbfc5b67d488e883.png, 
image_b5d2be0d_7223dc2282b35e15.png etc. share a similar problem.


> Is there some generalisable learning here? Exclude letters? 
> (Wiktionary experts should tell us if they're all tagged as such.) 
> Only use "words" of at least two unicode characters?

Vowel signs should not start a captcha (or any of the words in a captcha), 
and no two vowel signs should appear side by side.

Vowel signs for Malayalam: ാ, ി, ീ, ു, ൂ, ൃ, െ, േ, ൈ, ൊ, ോ, ൗ, ൌ
Other signs (the same rules above should also be applied to these signs): ്, ം, ഃ

Vowel letters should not appear in the middle of a word (or captcha).
Vowel letters: അ, ആ, ഇ, ഈ, ഉ, ഊ, ഋ, ഌ, എ, ഏ, ഐ, ഒ, ഔ

(These rules possibly apply to other Indic languages as well, because 
their vowel letters and vowel signs behave very similarly to Malayalam's.)
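
To make the rules above concrete, here is a rough Python sketch of a filter 
for candidate captcha words. The function names are made up for 
illustration, and the character lists are the ones given above, written as 
code points. Two readings are my own assumptions: adjacency is checked 
within each list of signs separately (a vowel sign followed by ം, as in 
മുംബൈ, is accepted), and "middle of a word" is taken to mean any position 
other than the first. Input is normalized to NFC first so that a two-part 
vowel sign stored in decomposed form is not counted as two signs.

```python
import unicodedata

# Vowel signs as listed above (U+0D3E..U+0D4C, plus U+0D57).
VOWEL_SIGNS = set(
    "\u0d3e\u0d3f\u0d40\u0d41\u0d42\u0d43\u0d46\u0d47\u0d48\u0d4a\u0d4b\u0d57\u0d4c"
)
# Other signs as listed above: virama, anusvara, visarga.
OTHER_SIGNS = set("\u0d4d\u0d02\u0d03")
# Vowel letters exactly as listed above.
VOWEL_LETTERS = set(
    "\u0d05\u0d06\u0d07\u0d08\u0d09\u0d0a\u0d0b\u0d0c\u0d0e\u0d0f\u0d10\u0d12\u0d14"
)


def word_is_acceptable(word: str) -> bool:
    """Hypothetical check of one word against the rules in this message."""
    word = unicodedata.normalize("NFC", word)
    if not word:
        return False
    # Rule: a sign must not start a word.
    if word[0] in VOWEL_SIGNS or word[0] in OTHER_SIGNS:
        return False
    for prev, cur in zip(word, word[1:]):
        # Rule: no two vowel signs side by side.
        if prev in VOWEL_SIGNS and cur in VOWEL_SIGNS:
            return False
        # Assumed reading: the same adjacency rule within the other signs.
        if prev in OTHER_SIGNS and cur in OTHER_SIGNS:
            return False
        # Rule: a vowel letter only at the start (assumed reading of "middle").
        if cur in VOWEL_LETTERS:
            return False
    return True


def captcha_is_acceptable(text: str) -> bool:
    """Every whitespace-separated word in the captcha must pass."""
    words = text.split()
    return bool(words) and all(word_is_acceptable(w) for w in words)
```

With this sketch, മുംബൈ passes, while the broken form with a space before 
the vowel sign is rejected because its second "word" starts with a sign.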

If possible, do not include Malayalam chillu characters [1] in captchas 
(at least for now), because two encodings have been possible for them 
since Unicode 5.1.0. Normalization is enabled only on the ml wikis, and the 
bug to enable normalization on all Wikimedia wikis is still pending [2].
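
As an illustration of why the two encodings matter, here is a small Python 
sketch that flags chillus in either form. The helper name is hypothetical. 
It treats U+0D7A to U+0D7F as the atomic chillus and any virama followed by 
ZWJ (U+0D4D U+200D) as the older sequence form; that second test is 
deliberately over-inclusive and is my simplification, not an exact list of 
chillu sequences.

```python
import unicodedata

# Atomic chillu code points added in Unicode 5.1 (U+0D7A..U+0D7F).
ATOMIC_CHILLUS = {chr(cp) for cp in range(0x0D7A, 0x0D80)}
VIRAMA = "\u0d4d"
ZWJ = "\u200d"


def contains_chillu(text: str) -> bool:
    """Hypothetical helper: True if text has a chillu in either encoding."""
    if any(ch in ATOMIC_CHILLUS for ch in text):
        return True
    # Older form: consonant + virama + ZWJ. Matching on virama + ZWJ alone
    # is a conservative simplification.
    return (VIRAMA + ZWJ) in text


# Chillu N in both encodings: atomic, and NA + virama + ZWJ.
atomic_n = "\u0d7b"
sequence_n = "\u0d28\u0d4d\u200d"

# Standard NFC normalization does not unify the two forms, so a typed
# answer in one encoding will not compare equal to the other.
same_after_nfc = (
    unicodedata.normalize("NFC", atomic_n)
    == unicodedata.normalize("NFC", sequence_n)
)
```

Here same_after_nfc comes out False, which is the reason a word containing 
a chillu could be rejected even when the user typed it correctly.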

If possible, limit the Malayalam block to U+0D02 to U+0D57, because the 
other characters (except the chillu characters) are not popular and are 
probably not even mapped on keyboards. Within that range, U+0D3A, U+0D3D 
and U+0D4E should also be avoided, as they face similar uncertainty.
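
A minimal Python sketch of that range restriction, with hypothetical names, 
could look like the following. It only encodes the limits stated above and 
allows whitespace between words; it does not check whether a code point 
inside the range is actually assigned.

```python
# Range and exclusions as suggested in this message.
RANGE_START = 0x0D02
RANGE_END = 0x0D57
EXCLUDED = {0x0D3A, 0x0D3D, 0x0D4E}


def code_point_allowed(cp: int) -> bool:
    """Hypothetical check: inside U+0D02..U+0D57 and not excluded."""
    return RANGE_START <= cp <= RANGE_END and cp not in EXCLUDED


def text_in_allowed_range(text: str) -> bool:
    """True if every non-whitespace character passes code_point_allowed."""
    return all(ch.isspace() or code_point_allowed(ord(ch)) for ch in text)
```

Since the atomic chillus sit at U+0D7A to U+0D7F, this range check also 
leaves them out without any extra rule.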

[1]: http://unicode.org/versions/Unicode5.1.0/#Malayalam_Chillu_Characters
[2]: https://bugzilla.wikimedia.org/show_bug.cgi?id=45476

