[Translators-l] TranslateSvg: beta version available for testing

Philippe Verdy verdy_p at wanadoo.fr
Sun Aug 5 10:46:27 UTC 2012


For an example of the problem, look at this typical image, where the
existing source labels (here in French) are split into multiple
<tspan> elements: the translation interface (here, for example, to
Russian) offers no easy way to group these tspans according to their
parent <text> element, or to let a translation gather its tspans in a
<g> group for each language, these <g> groups being the actual members
of a generated <switch> element that becomes the new child of the
<text> element.

http://translatesvg.wmflabs.org/w/index.php?language=ru&title=Special%3ATranslate&taction=translate&group=Sample7308a.svg&limit=100&task=view

Here, a structure like:
   <text>
     <tspan>...</tspan>
     <tspan>...</tspan>
   </text>

will first need to be converted into:
   <text>
     <g lang="fr">
       <tspan>...</tspan>
       <tspan>...</tspan>
     </g>
   </text>
using some meta-information requested from the first user who
converts the SVG to a translatable version, so that the correct
source language is recorded.

And then the multilingual version will be created by inserting the switch:
   <text>
     <switch>
       <g systemLanguage="fr">
         <tspan>...</tspan>
         <tspan>...</tspan>
       </g>
     </switch>
   </text>
(the source language is normally kept as the default fallback
language, the last in the switch). With this kind of layout, a
translation can add or remove tspans as necessary and decide where
line breaks really appear, since translations may not match one for
one at the <tspan> level, but only at the <text> level.
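
The two conversion steps above can be sketched in Python with the
standard xml.etree.ElementTree module. This is only a sketch: the
function names wrap_tspans_in_switch and add_translation are made up
for the illustration, they are not part of TranslateSvg, and the code
only mirrors the structure described above without checking how
renderers handle it.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def wrap_tspans_in_switch(text_el, source_lang):
    """Move the <tspan> children of one <text> into <switch><g systemLanguage=...>.

    Hypothetical helper: returns the new <switch>, or None if the <text>
    has no <tspan> children.
    """
    tspans = [child for child in list(text_el) if child.tag == f"{{{SVG_NS}}}tspan"]
    if not tspans:
        return None
    switch = ET.Element(f"{{{SVG_NS}}}switch")
    group = ET.SubElement(switch, f"{{{SVG_NS}}}g", {"systemLanguage": source_lang})
    for tspan in tspans:
        text_el.remove(tspan)
        group.append(tspan)
    text_el.append(switch)
    return switch


def add_translation(switch, lang, lines):
    """Add one <g> for `lang`, with one <tspan> per translated line.

    The new group is inserted before the last child, so the source
    language stays last in the switch, as the default fallback.
    """
    group = ET.Element(f"{{{SVG_NS}}}g", {"systemLanguage": lang})
    for line in lines:
        tspan = ET.SubElement(group, f"{{{SVG_NS}}}tspan")
        tspan.text = line
    switch.insert(len(switch) - 1, group)
    return group
```

Because each language owns its whole <g> group, a translation is free
to use a different number of tspans than the source, for example one
line in Russian where the French source uses two.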

Note also that the IDs generated in the SVG for each successive
fallback among the child elements of the <switch> gain one more
"fallback-" part before the new language code each time. This creates
overlong IDs like:

tspan3065-fallback-fallback-fallback-fallback-fallback-fallback-fr

(see http://translatesvg.wmflabs.org/w/images/5/53/Picturebook_3.svg,
and look at the SVG source to compare the generated IDs, which get
longer each time for the child tspan of each <text> part of a switch)

The "fallback-" part should appear in the generated ID only on the
last child element of the switch; in my opinion this is a bug in the
ID generator. But even there, the "fallback-" part is never needed:
look at the correctly generated IDs for the parent <text>, which just
append the distinct language code to the original IDs. Note also that
this simpler method does not guarantee that the generated IDs will
really be distinct, unless the original IDs were chosen, before the
translation started, so that they do not create collisions elsewhere.
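
As a rough sketch of the simpler method (in Python, with a made-up
function name translated_id), the language code is appended once to
the original ID, with no "fallback-" part at all. The numeric suffix
used when a collision is found is my own assumption, added only to
show where the uniqueness check would go.

```python
def translated_id(original_id, lang, used_ids):
    """Return a new ID built from the original ID and the language code.

    `used_ids` is the set of IDs already present in the document; the
    returned ID is added to it. The "-2", "-3", ... suffix on collision
    is an assumption of this sketch, not an existing convention.
    """
    candidate = f"{original_id}-{lang}"
    counter = 2
    while candidate in used_ids:
        candidate = f"{original_id}-{lang}-{counter}"
        counter += 1
    used_ids.add(candidate)
    return candidate
```

With this, the ID length no longer depends on the position of the
language in the switch.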

Note also that the ":" character is allowed in an ID. To keep an ID
like "text528" unique after translation, a naming convention similar
to Wikimedia's would generate IDs like "en:text528", with a language
code prefix. If an original ID already uses a colon, the simplest
method to avoid collisions after the translation is to convert these
original IDs by doubling their original colons. So if an original ID
(in an SVG file without the language switch) is something like ":xyz",
it is first changed into "::xyz", which is then prefixed with the
language code, giving ":fr:::xyz". But if the original ID is "xyz:t",
it is directly translatable without other preprocessing, by just
prepending the prefix ":langcode:", giving ":fr:xyz:t"; this causes no
problem because the original ID did not contain a leading colon. There
are many other possible schemes to ensure that IDs remain unique while
preserving much of the original IDs in the members of the generated
switches.
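
The colon scheme can be sketched as follows in Python. The function
names prefixed_id and split_prefixed_id are invented for the
illustration, and doubling only the leading colons is my reading of
the two examples above (":xyz" and "xyz:t"), not a settled rule.

```python
def prefixed_id(original_id, lang):
    """Build ":lang:" + original ID, doubling any leading colons first."""
    stripped = original_id.lstrip(":")
    leading = len(original_id) - len(stripped)
    return f":{lang}:" + ":" * (2 * leading) + stripped


def split_prefixed_id(prefixed):
    """Recover (lang, original ID) from an ID built by prefixed_id."""
    if not prefixed.startswith(":"):
        raise ValueError("not a language-prefixed ID")
    lang, sep, rest = prefixed[1:].partition(":")
    if not sep:
        raise ValueError("missing colon after the language code")
    stripped = rest.lstrip(":")
    leading = len(rest) - len(stripped)
    # Leading colons were doubled on the way in, so halve them here.
    return lang, ":" * (leading // 2) + stripped


print(prefixed_id(":xyz", "fr"))   # :fr:::xyz
print(prefixed_id("xyz:t", "fr"))  # :fr:xyz:t
```

The inverse function is there only to show that the original ID stays
recoverable from the prefixed one.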

2012/8/5 Philippe Verdy <verdy_p at wanadoo.fr>:
> Tried it, but to be really usable for producing translated images,
> the MediaWiki thumbnail generator should be able to generate
> translated versions of the images in distinct paths, by recognizing
> the method used in the saved SVG file: the <switch> statement, which
> also uses a property of its member elements that is matched against
> the "systemlocale".
>
> Another problem with this method: the systemlocale is not necessarily
> the one used to render pages in MediaWiki. Notably, it will not match
> the language selected in MediaWiki by the user account settings or by
> navigating any page with the "uselang" query parameter.
>
> Is there a possibility of creating multiple distinct versions of the
> SVG within the same HTTP "folder", using the untranslated image (or
> the autotranslated page for the name of the parent folder, and the
> same name of the language code for the child element, for both its
> description page and the media page)? Could it cause problems with
> how images are currently named and saved in Wikimedia sites? (For now
> the "File:" namespace does not support folders, so the "/" in an
> image name is part of the name, and this could create conflicts.)
>
> Are there other solutions for supporting translated images (also with
> a "uselang" query parameter to get an SVG image without the language
> switch but selecting the correct labels, as well as a way to get the
> PNG thumbnail generator to include the language code in the thumbnail
> names, as well as creating separate histories, or a history filter
> for each language)?
>
> This will remain a beta for now, as long as the <switch> SVG feature
> is not correctly supported in renderers and there is no easy way to
> indicate that by default the image should be rendered in the wiki's
> default language if there are no user preferences for selecting the
> correct language, nor a way to navigate between languages (for
> example, the history of these images shows thumbnails of the English
> version only, differences between versions are not visible when they
> affect only one language which is not English, there is no way to see
> the other versions in the thumbnails of the history, and there is no
> standard way in browsers to select which language to display in a
> multilingual SVG).
>
> Full support of multilingual images, with a derived and cached
> collection of autogenerated monolingual images, seems to be the best
> long-term solution.
>
> Finally, it seems OK to allow a translation to change the display
> font size or even the font family (and the label positions), or to
> drop some font styles (notably bold and italic), but it does not look
> safe to allow a translation to add bold and italic styles, or to add
> or remove underlines.
>
> Other attributes may also be necessary for RTL languages, because
> sometimes a single label will be translated using multiple scripts
> requiring bidirectional processing plus Bidi overrides (new <tspan>
> elements will be needed to split an existing label when translating
> from English to Arabic for example, and in some cases it will also be
> necessary to allow the tspans to include more or fewer line breaks,
> with <br> elements between the <tspan> elements).
>
> Note that changing font styles may impact applications like
> cartographic maps, because all font styles have their own semantics
> (for example making distinctions between major cities and minor
> cities): if a label is too long, it may overlap other labels, and a
> choice will be required to hide some other labels.
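
The derived monolingual images mentioned in the quoted message could
be produced along these lines in Python. This is a sketch under
assumptions: flatten_to_language is a made-up name, the language
comparison is a plain string match on each comma-separated
systemLanguage entry (simpler than what an SVG renderer does), and
the last child of each <switch> is taken as the fallback, as described
above.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def flatten_to_language(root, lang):
    """Return a copy of `root` with every <switch> replaced by the
    children of the branch chosen for `lang` (last branch as fallback)."""
    root = copy.deepcopy(root)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for switch in list(root.iter(f"{{{SVG_NS}}}switch")):
        branches = list(switch)
        parent = parent_of.get(switch)
        if not branches or parent is None:
            continue

        def matches(branch):
            codes = (branch.get("systemLanguage") or "").split(",")
            return lang in [code.strip() for code in codes]

        chosen = next((b for b in branches if matches(b)), branches[-1])
        index = list(parent).index(switch)
        parent.remove(switch)
        # Put the chosen branch's children where the <switch> was.
        for offset, node in enumerate(list(chosen)):
            parent.insert(index + offset, node)
    return root
```

Each flattened copy could then be rendered and cached under a name
that includes the language code.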


