573 lines
20 KiB
Plaintext
573 lines
20 KiB
Plaintext
PEP: 545
|
||
Title: Python Documentation Translations
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Victor Stinner <victor.stinner@gmail.com>,
|
||
Inada Naoki <songofacandy@gmail.com>,
|
||
Julien Palard <julien@palard.fr>
|
||
Status: Draft
|
||
Type: Process
|
||
Content-Type: text/x-rst
|
||
Created: 04-Mar-2017
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
The intent of this PEP is to make existing translations of the Python
|
||
Documentation more accessible and discoverable. By doing so,
|
||
attracting and motivating new translators and new translations.
|
||
|
||
Translated documentation will be hosted on python.org. Examples of
|
||
two active translation teams:
|
||
|
||
* http://docs.python.org/fr/: French
|
||
* http://docs.python.org/jp/: Japanese
|
||
|
||
http://docs.python.org/en/ will redirect to http://docs.python.org/.
|
||
|
||
Sources of translated documentation will be hosted in the Python
|
||
Documentation organization on GitHub: https://github.com/python-docs/.
|
||
Contributors will have to sign the Python Contributor Agreement (CLA)
|
||
and the license will be the PSF License.
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
On the french ``#python-fr`` IRC channel on freenode, it's not rare to
|
||
meet people who don't speak english and so are unable to read the
|
||
Python official documentation. Python wants to be widely available,
|
||
to all users, in any language: that's also why Python 3 now allows
|
||
any non-ASCII identifiers:
|
||
https://www.python.org/dev/peps/pep-3131/#rationale
|
||
|
||
There are a least 3 groups of people who are translating the Python
|
||
documentation in their mother language (french [16]_ [17]_ [18]_,
|
||
japanese [19]_ [20]_, spanish [21]_), even though their translation
|
||
are not visible on d.p.o. Other less visible and less organized
|
||
groups are also translating in their mother language, we heard of
|
||
Russian, Chinese, Korean, maybe some others we didn't found yet. This
|
||
PEP defines rules to move translations on docs.python.org so they can
|
||
easily be found by developers, newcomers and potential translators.
|
||
|
||
The Japanese team currently (March 2017) translated ~80% of the
|
||
documentation, french team ~20%. French translation went from 6% to
|
||
23% in 2016 [13]_ with 7 contributors [14]_, proving a translation
|
||
team can be faster than documentation mutates.
|
||
|
||
|
||
Quoting Xiang Zhang about Chinese translations:
|
||
|
||
I have seen several groups trying to translate part of our official
|
||
doc. But their efforts are disperse and quickly become lost because
|
||
they are not organized to work towards a single common result and
|
||
their results are hold anywhere on the Web and hard to find. An
|
||
official one could help ease the pain.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Translation
|
||
-----------
|
||
|
||
Issue tracker
|
||
'''''''''''''
|
||
|
||
Considering that issues opened about translations may be written in
|
||
the translation language, which can be considered noise but at least
|
||
is inconsistent, issues should be placed outside `bugs.python.org
|
||
<https://bugs.python.org/>`_ (b.p.o).
|
||
|
||
As all translation must have their own github project (see `Repository
|
||
for Po Files`_), they must use the associated github issue tracker.
|
||
|
||
Considering the noise induced by translation issues redacted in any
|
||
languages which may beyond every warnings land in b.p.o, triage will
|
||
have to be done. Considering that translations already exist and are
|
||
not actually a source of noise in b.p.o, an unmanageable amount of
|
||
work is not to be expected. Considering that Xiang Zhang and Victor
|
||
Stinner are already triaging, and Julien Palard is willing to help on
|
||
this task, noise on b.p.o is not to be expected.
|
||
|
||
Also, language team coordinators (see `Language Team`_) should help
|
||
triaging b.p.o by properly indicating, in the issue author language if
|
||
needed, the right issue tracker.
|
||
|
||
|
||
Branches
|
||
''''''''
|
||
|
||
Translation teams should focus on last stable versions, and use tools
|
||
(scripts, translation memory, …) to automatically translate what is
|
||
done in one branch to other branches.
|
||
|
||
.. note::
|
||
Translation memories are a kind of database of previously translated
|
||
paragraphs, even removed ones. See also `Sphinx Internationalization
|
||
<http://www.sphinx-doc.org/en/stable/intl.html>`_.
|
||
|
||
The three stable branches will be translated [12]_: 2.7, 3.5, and 3.6.
|
||
The scripts to build the documentation of older branches have to be
|
||
modified to support translation [12]_, whereas these branches now only
|
||
accept security-only fixes.
|
||
|
||
The development branch (master) should have a lower translation priority
|
||
than stable branches. But docsbuild-scripts should build it anyway so
|
||
it is possible for a team to work on it to be ready for the next
|
||
release.
|
||
|
||
|
||
Hosting
|
||
-------
|
||
|
||
Domain Name, Content negociation and URL
|
||
''''''''''''''''''''''''''''''''''''''''
|
||
|
||
Different translation can be told appart by changing one of:
|
||
Country Code Top Level Domain (CCTLD),
|
||
path segment, subdomain, or by content negociation.
|
||
|
||
Buying a CCTLD for each translations is expensive, time-consuming, and
|
||
sometimes almost impossible when already registered, this solution
|
||
should be avoided.
|
||
|
||
Using subdomains like "es.docs.python.org" or "docs.es.python.org" is
|
||
possible but confusing ("is it `es.docs.python.org` or `docs.es.python.org`?").
|
||
Hyphens in subdomains like
|
||
`pt-br.doc.python.org` in uncommon and SEOMoz [23]_ correlated the presence of
|
||
hyphens as a negative factor. Usage of underscores in subdomain is
|
||
prohibited by the RFC1123 [24]_, section 2.1. Finally using subdomains
|
||
means creating TLS certificates for each languages, which is more
|
||
maintenance, and will probably causes us troubles in language pickers
|
||
if, like for version picker, we want a preflight to check if the
|
||
translation exists in the given version: preflight will probably be
|
||
blocked by same-origin-policy. Wildcard TLS certificates are very
|
||
expensive.
|
||
|
||
Using content negociation (HTTP headers ``Accept-Language`` in the
|
||
request and ``Vary: Accept-Language``) leads to a bad user experience
|
||
where they can't easily change the language. According to Mozilla:
|
||
"This header is a hint to be used when the server has no way of
|
||
determining the language via another way, like a specific URL, that is
|
||
controlled by an explicit user decision." [25]_. As we want to be
|
||
able to easily change the language, we should not use the content
|
||
negociation as a main language determination, so we need somthing
|
||
else.
|
||
|
||
Last solution is to use the URL path, which looks readable, allows
|
||
for an easy switch from a language to another, and nicely accepts
|
||
hyphens. Typically something like: "docs.python.org/de/". Example
|
||
with a hyphen: "docs.python.org/pt-BR/"
|
||
|
||
As for version, sphinx-doc does not support compiling for multiple
|
||
languages, so we'll have full builds rooted under a path, exactly like
|
||
we're already doing with versions.
|
||
|
||
So we can have "docs.python.org/de/3.6/" or
|
||
"docs.python.org/3.6/de/". Question is "Does the language contains
|
||
multiple version or does version contains multiple languages?" As
|
||
versions exists in any cases, and translations for a given version may
|
||
or may not exists, we may prefer "docs.python.org/3.6/de/", but doing
|
||
so scatter languages everywhere. Having "/de/3.6/" is clearer about
|
||
"everything under /de/ is written in deutch". Having the version at
|
||
the end is also an habit taken by readers of the documentation: they
|
||
like to easily change the version by changing the end of the path.
|
||
|
||
So we should use the following pattern:
|
||
"docs.python.org/LANGUAGE_TAG/VERSION/".
|
||
|
||
Current documentation is not moved to "/en/", but "docs.python.org/en/"
|
||
will redirect to "docs.python.org/en/".
|
||
|
||
|
||
Language Tag
|
||
''''''''''''
|
||
|
||
A common notation for language tags is the IETF Language Tag [3]_
|
||
[4]_ based on ISO 639, alghough gettext uses ISO 639 tags with
|
||
underscores (ex: ``pt_BR``) instead of dashes to join tags [5]_
|
||
(ex: ``pt-BR``). Examples of IETF Language Tags: ``fr`` (French),
|
||
``jp`` (Japanese), ``pt-BR`` (Orthographic formulation of 1943 -
|
||
Official in Brazil).
|
||
|
||
It is more common to see dashes instead of underscores in URLs [6]_,
|
||
so we should use IETF language tags, even if sphinx uses gettext
|
||
internally: URLs are not meant to leak the underlying implementation.
|
||
|
||
It's uncommon to see capitalized letters in URLs, and docs.python.org
|
||
don't use any, so it may hurt readability by attracting the eye on it,
|
||
like in: "https://docs.python.org/pt-BR/3.6/library/stdtypes.html".
|
||
RFC 5646 (Tags for Identifying Languages (IETF)) section-2.1 [7]_
|
||
tells the tags are not case sensitive. As the RFC allows lower case,
|
||
and it enhances readability, we should use lowercased tags like
|
||
``pt-br``.
|
||
|
||
It's redundant to display both language and country code if they're
|
||
the same, typically "de-DE", "fr-FR", although it make sense,
|
||
respectively "Deutch as spoken in Germany" and "French as spoken in
|
||
France", it's not a usefull information for the reader. So we may drop
|
||
those redundencies. We should obviously keep the country part when it
|
||
make sense like "pt-BR" for "Portuguese as spoken in Brazil".
|
||
|
||
So we should use IETF language tags, lowercased, like ``/fr/``,
|
||
``/pt-br/``, ``/de/`` and so on.
|
||
|
||
|
||
Fetching And Building Translations
|
||
''''''''''''''''''''''''''''''''''
|
||
|
||
Currently docsbuild-scripts are building the documentation [8]_. These scripts
|
||
should be modified to fetch and build translations.
|
||
|
||
Building new translations is like building new versions, so we're
|
||
adding complexity, but not that much.
|
||
|
||
Two steps should be configurable distinctively: Build a new language,
|
||
and add it to the language picker. This allows a transition step
|
||
between "we accepted the language" and "it is translated enough to be
|
||
made public", during this step, translators can review their
|
||
modifications on d.p.o without having to build the documentation
|
||
locally.
|
||
|
||
From the translations repositories, only the ``.po`` files should be
|
||
opened by the docsbuild-script to keep the attack surface and probable
|
||
bugs sources at a minimum. This mean no translation can patch sphinx
|
||
to advertise their translation tool. (This specific feature should be
|
||
handled by sphinx anyway [9]_).
|
||
|
||
|
||
Community
|
||
---------
|
||
|
||
Mailing List
|
||
''''''''''''
|
||
|
||
The `doc-sig`_ mailing list will be used to discuss cross-language
|
||
changes on translated documentations.
|
||
|
||
There is also the i18n-sig list but it's more oriented towards i18n APIs
|
||
[1]_, than translation the Python documentation.
|
||
|
||
.. _i18n-sig: https://mail.python.org/mailman/listinfo/i18n-sig
|
||
.. _doc-sig: https://mail.python.org/mailman/listinfo/doc-sig
|
||
|
||
|
||
Chat
|
||
''''
|
||
|
||
Python community being highly active on IRC, we should create a new
|
||
IRC channel on freenode, typically #python-doc for consistency with
|
||
the mailing list name.
|
||
|
||
Each language coordinator can organize its own team, even by choosing
|
||
another chat system if the local usage asks for it. As local teams
|
||
will write in their native languages, we don't want each team in a
|
||
single channel, and it's also natural for the local teams to reuse
|
||
their local channels like "#python-fr" for french translators.
|
||
|
||
|
||
Repository for PO Files
|
||
'''''''''''''''''''''''
|
||
|
||
Considering that each translation teams may want to use different
|
||
translation tools, and that those tools should easily be synchronized
|
||
with git, all translations should expose their ``.po`` files via a git
|
||
repository.
|
||
|
||
Considering that each translation will be exposed via git
|
||
repositories, and that Python has migrated to GitHub, translations
|
||
will be hosted on github.
|
||
|
||
For consistency and discoverability, all translations should be in the
|
||
same github organization and named according to a common pattern.
|
||
|
||
Considering that we want translations to be official, and that Python
|
||
already have a github organization, translations should be hosted as
|
||
projects of the `Python documentation GitHub organization`_.
|
||
|
||
For consistency, translations repositories should be called
|
||
``python-docs-LANGUAGE_TAG`` [22]_.
|
||
|
||
The docsbuild-scripts may enforce this rule by refusing to fetch
|
||
outside of the Python organization or a wrongly named repository.
|
||
|
||
The CLA bot may be used on the translation repositories, but with a
|
||
limited effect as local coordinators may synchronize themselves
|
||
translations from an external tool like transifex, loosing in the
|
||
process who translated what.
|
||
|
||
Version can be hosted on different repositories, different directories
|
||
or different branches. Storing them on different repositories will
|
||
probably pollute the Python documentation github organization. As it
|
||
is typical and natural to use branches to separate versions, branches
|
||
should be used to do so.
|
||
|
||
.. _Python documentation GitHub organization: https://github.com/python-docs/
|
||
|
||
|
||
Translation tools
|
||
'''''''''''''''''
|
||
|
||
Most of the translation work is actually done on Transifex [15]_.
|
||
|
||
Other tools may be used later https://pontoon.mozilla.org/
|
||
and http://zanata.org/
|
||
|
||
|
||
Contributor Agreement
|
||
'''''''''''''''''''''
|
||
|
||
Contributions to translated documentation will be requested to sign the
|
||
Python Contributor Agreement (CLA):
|
||
|
||
https://www.python.org/psf/contrib/contrib-form/
|
||
|
||
|
||
Language Team
|
||
'''''''''''''
|
||
|
||
Each language team should have one coordinator responsible to:
|
||
|
||
- Manage the team
|
||
- Choose and manage the tools its team will use (chat, mailing list, …)
|
||
- Ensure contributors understand and agree with the CLA
|
||
- Ensure quality (grammar, vocabulary, consistency, filtering spam, ads, …)
|
||
- Do redirect to GitHub issue tracker issues related to its
|
||
language on bugs.python.org
|
||
|
||
The license will be the `PSF License <https://docs.python.org/3/license.html>`_,
|
||
and copyright should be transferable to PSF later.
|
||
|
||
|
||
Alternatives
|
||
------------
|
||
|
||
Simplified English
|
||
''''''''''''''''''
|
||
|
||
It would be possible to introduce a "simplified english" version like
|
||
wikipedia did [10]_, as discussed on python-dev [11]_, targetting
|
||
english learners and childrens.
|
||
|
||
Pros: It yields a single other translation, theorically readable by
|
||
everyone, and reviewable by current maintainers.
|
||
|
||
Cons: Subtle details may be lost, and translators from english to english
|
||
may be hard to find as stated by Wikipedia:
|
||
|
||
> The main English Wikipedia has 5 million articles, written by nearly
|
||
140K active users; the Swedish Wikipedia is almost as big, 3M articles
|
||
from only 3K active users; but the Simple English Wikipedia has just
|
||
123K articles and 871 active users. That's fewer articles than
|
||
Esperanto!
|
||
|
||
|
||
Changes
|
||
=======
|
||
|
||
Migrate GitHub Repositories
|
||
---------------------------
|
||
|
||
We (authors of this PEP) already own french and japanese Git
|
||
repositories, so moving them to the Python documentation organization will not be a
|
||
problem. We'll however follow the `New Translation Procedure`_.
|
||
|
||
|
||
Patch docsbuild-scripts to Compile Translations
|
||
-----------------------------------------------
|
||
|
||
Docsbuild-script must be patched to:
|
||
|
||
- List the languages tags to build along with the branches to build.
|
||
- List the languages tags to display in the language picker.
|
||
- Find translation repositories by formatting
|
||
``github.com:python-docs/python-docs-{language_tag}.git`` (See
|
||
`Repository for Po Files`_)
|
||
- Build translations for each branches and each languages
|
||
|
||
Patched docsbuild-scripts must only open ``.po`` files from
|
||
translation repositories.
|
||
|
||
|
||
List coordinators in the devguide
|
||
---------------------------------
|
||
|
||
Add a page or a section with an empty list of coordinators to the
|
||
devguide, each new coordinators will be added to this list.
|
||
|
||
|
||
Create sphinx-doc Language Picker
|
||
---------------------------------
|
||
|
||
Highly similar to the version picker, a language picker must be
|
||
implemented. This language picker must be configurable to hide or
|
||
show a given language.
|
||
|
||
|
||
Enhance rendering of untranslated fuzzy translations
|
||
----------------------------------------------------
|
||
|
||
It's an opened sphinx issue [9]_, but we'll need it so we'll have to
|
||
work on it. Translated, fuzzy, and untranslated paragraphs should be
|
||
differentiated. (Fuzzy paragraphs have to warn the reader what he's
|
||
reading may be out of date.)
|
||
|
||
|
||
New Translation Procedure
|
||
=========================
|
||
|
||
Designate a Coordinator
|
||
-----------------------
|
||
|
||
The first step is to designate a coordinator, see `Language Team`_,
|
||
The coordinator must sign the CLA.
|
||
|
||
The coordinator should be added to a list of translation coordinator
|
||
on the devguide.
|
||
|
||
|
||
Create github repository
|
||
------------------------
|
||
|
||
Create a repository named "python-docs-{LANGUAGE_TAG}" on the Python
|
||
documentation github organization (See `Repository For Po Files`_.), and grant the
|
||
language coordinator push rights to this repository.
|
||
|
||
|
||
Add support for translations in docsbuild-scripts
|
||
-------------------------------------------------
|
||
|
||
As soon as the translation hits its firsts commits, update the
|
||
docsbuild-scripts configuration to build the translation (but not
|
||
displaying it in the language picker).
|
||
|
||
|
||
Add translation to the language picker
|
||
--------------------------------------
|
||
|
||
As soon as the translation hits:
|
||
|
||
- 100% of bugs.html with proper links to the language repository
|
||
issue tracker.
|
||
- 100% of tutorial
|
||
- 100% of library/functions (builtins)
|
||
|
||
the translation can be added to the language picker.
|
||
|
||
|
||
Previous discussions
|
||
====================
|
||
|
||
- `[Python-ideas] Cross link documentation translations (January, 2016)`_
|
||
- `[Python-ideas] Cross link documentation translations (January, 2016)`_
|
||
- `[Python-ideas] https://docs.python.org/fr/ ? (March 2016)`_
|
||
|
||
|
||
.. _[Python-ideas] Cross link documentation translations (January, 2016):
|
||
https://mail.python.org/pipermail/python-ideas/2016-January/038010.html
|
||
|
||
.. _[Python-Dev] Translated Python documentation (Febrary 2016):
|
||
https://mail.python.org/pipermail/python-dev/2017-February/147416.html
|
||
|
||
.. _[Python-ideas] https://docs.python.org/fr/ ? (March 2016):
|
||
https://mail.python.org/pipermail/python-ideas/2016-March/038879.html
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] [I18n-sig] Hello Python members, Do you have any idea about
|
||
Python documents?
|
||
(https://mail.python.org/pipermail/i18n-sig/2013-September/002130.html)
|
||
|
||
.. [2] [Doc-SIG] Localization of Python docs
|
||
(https://mail.python.org/pipermail/doc-sig/2013-September/003948.html)
|
||
|
||
.. [3] Tags for Identifying Languages
|
||
(http://tools.ietf.org/html/rfc5646)
|
||
|
||
.. [4] IETF language tag
|
||
(https://en.wikipedia.org/wiki/IETF_language_tag)
|
||
|
||
.. [5] GNU Gettext manual, section 2.3.1: Locale Names
|
||
(https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html)
|
||
|
||
.. [6] Semantic URL: Slug
|
||
(https://en.wikipedia.org/wiki/Semantic_URL#Slug)
|
||
|
||
.. [7] Tags for Identifying Languages: Formatting of Language Tags
|
||
(https://tools.ietf.org/html/rfc5646#section-2.1.1)
|
||
|
||
.. [8] Docsbuild-scripts github repository
|
||
(https://github.com/python/docsbuild-scripts/)
|
||
|
||
.. [9] i18n: Highlight untranslated paragraphs
|
||
(https://github.com/sphinx-doc/sphinx/issues/1246)
|
||
|
||
.. [10] Wikipedia: Simple English
|
||
(https://simple.wikipedia.org/wiki/Main_Page)
|
||
|
||
.. [11] Python-dev discussion about simplified english
|
||
(https://mail.python.org/pipermail/python-dev/2017-February/147446.html)
|
||
|
||
.. [12] Passing options to sphinx from Doc/Makefile
|
||
(https://github.com/python/cpython/commit/57acb82d275ace9d9d854b156611e641f68e9e7c)
|
||
|
||
.. [13] French translation progression
|
||
(https://mdk.fr/pycon2016/#/11)
|
||
|
||
.. [14] French translation contributors
|
||
(https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c)
|
||
|
||
.. [15] Python-doc on Transifex
|
||
(https://www.transifex.com/python-doc/)
|
||
|
||
.. [16] French translation
|
||
(https://www.afpy.org/doc/python/)
|
||
|
||
.. [17] French translation github
|
||
(https://github.com/AFPy/python_doc_fr)
|
||
|
||
.. [18] French mailing list
|
||
(http://lists.afpy.org/mailman/listinfo/traductions)
|
||
|
||
.. [19] Japanese translation
|
||
(http://docs.python.jp/3/)
|
||
|
||
.. [20] Japanese github
|
||
(https://github.com/python-doc-ja/python-doc-ja)
|
||
|
||
.. [21] Spanish translation
|
||
(http://docs.python.org.ar/tutorial/3/index.html)
|
||
|
||
.. [22] [Python-Dev] Translated Python documentation: doc vs docs
|
||
(https://mail.python.org/pipermail/python-dev/2017-February/147472.html)
|
||
|
||
.. [23] Domains - SEO Best Practices | Moz
|
||
(https://moz.com/learn/seo/domain)
|
||
|
||
.. [24] Requirements for Internet Hosts -- Application and Support
|
||
(https://www.ietf.org/rfc/rfc1123.txt)
|
||
|
||
.. [25] Accept-Language
|
||
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language)
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|