602 lines
21 KiB
Plaintext
602 lines
21 KiB
Plaintext
PEP: 545
|
||
Title: Python Documentation Translations
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Julien Palard <julien@palard.fr>
|
||
Inada Naoki <songofacandy@gmail.com>,
|
||
Victor Stinner <victor.stinner@gmail.com>,
|
||
Status: Draft
|
||
Type: Process
|
||
Content-Type: text/x-rst
|
||
Created: 04-Mar-2017
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
The intent of this PEP is to make existing translations of the Python
|
||
Documentation more accessible and discoverable. By doing so, we hope
|
||
to attract and motivate new translators and new translations.
|
||
|
||
Translated documentation will be hosted on python.org. Examples of
|
||
two active translation teams:
|
||
|
||
* http://docs.python.org/fr/: French
|
||
* http://docs.python.org/ja/: Japanese
|
||
|
||
http://docs.python.org/en/ will redirect to http://docs.python.org/.
|
||
|
||
Sources of translated documentation will be hosted in the Python
|
||
organization on GitHub: https://github.com/python/. Contributors will
|
||
have to accept a Documentation Contribution Agreement (to be written).
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
On the French ``#python-fr`` IRC channel on freenode, it's not rare to
|
||
meet people who don't speak English and so are unable to read the
|
||
Python official documentation. Python wants to be widely available
|
||
to all users in any language: this is also why Python 3 supports
|
||
any non-ASCII identifiers:
|
||
https://www.python.org/dev/peps/pep-3131/#rationale
|
||
|
||
There are at least 4 groups of people who are translating the Python
|
||
documentation to their native language (French [16]_ [17]_ [18]_,
|
||
Japanese [19]_ [20]_, Spanish [21]_, Hungarian [27]_ [28]_) even
|
||
though their translations are not visible on d.p.o. Other, less
|
||
visible and less organized groups, are also translating the
|
||
documentation, we've heard of Russian [26]_, Chinese and
|
||
Korean. Others we haven't found yet might also exist. This PEP
|
||
defines rules describing how to move translations on docs.python.org
|
||
so they can easily be found by developers, newcomers and potential
|
||
translators.
|
||
|
||
The Japanese team has (as of March 2017) translated ~80% of the
|
||
documentation, the French team ~20%. French translation went from 6%
|
||
to 23% in 2016 [13]_ with 7 contributors [14]_, proving a translation
|
||
team can be faster than the rate the documentation mutates.
|
||
|
||
|
||
Quoting Xiang Zhang about Chinese translations:
|
||
|
||
I have seen several groups trying to translate part of our official
|
||
doc. But their efforts are disperse and quickly become lost because
|
||
they are not organized to work towards a single common result and
|
||
their results are hold anywhere on the Web and hard to find. An
|
||
official one could help ease the pain.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Translation
|
||
-----------
|
||
|
||
Issue tracker
|
||
'''''''''''''
|
||
|
||
Considering that issues opened about translations may be written in
|
||
the translation language, which can be considered noise but at least
|
||
is inconsistent, issues should be placed outside `bugs.python.org
|
||
<https://bugs.python.org/>`_ (b.p.o).
|
||
|
||
As all translation must have their own github project (see `Repository
|
||
for Po Files`_), they must use the associated github issue tracker.
|
||
|
||
Considering the noise induced by translation issues redacted in any
|
||
languages which may beyond every warnings land in b.p.o, triage will
|
||
have to be done. Considering that translations already exist and are
|
||
not actually a source of noise in b.p.o, an unmanageable amount of
|
||
work is not to be expected. Considering that Xiang Zhang and Victor
|
||
Stinner are already triaging, and Julien Palard is willing to help on
|
||
this task, noise on b.p.o is not to be expected.
|
||
|
||
Also, language team coordinators (see `Language Team`_) should help
|
||
with triaging b.p.o by properly indicating, in the language of the
|
||
issue author if required, the right issue tracker.
|
||
|
||
|
||
Branches
|
||
''''''''
|
||
|
||
Translation teams should focus on last stable versions, and use tools
|
||
(scripts, translation memory, …) to automatically translate what is
|
||
done in one branch to other branches.
|
||
|
||
.. note::
|
||
Translation memories are a kind of database of previously translated
|
||
paragraphs, even removed ones. See also `Sphinx Internationalization
|
||
<http://www.sphinx-doc.org/en/stable/intl.html>`_.
|
||
|
||
The three currently stable branches that will be translated are [12]_:
|
||
2.7, 3.5, and 3.6. The scripts to build the documentation of older
|
||
branches needs to be modified to support translation [12]_, whereas
|
||
these branches now only accept security-only fixes.
|
||
|
||
The development branch (master) should have a lower translation priority
|
||
than stable branches. But docsbuild-scripts should build it anyway so
|
||
it is possible for a team to work on it to be ready for the next
|
||
release.
|
||
|
||
|
||
Hosting
|
||
-------
|
||
|
||
Domain Name, Content negotiation and URL
|
||
''''''''''''''''''''''''''''''''''''''''
|
||
|
||
Different translations can be identified by changing one of the
|
||
following: Country Code Top Level Domain (CCTLD),
|
||
path segment, subdomain or content negotiation.
|
||
|
||
Buying a CCTLD for each translations is expensive, time-consuming, and
|
||
sometimes almost impossible when already registered, this solution
|
||
should be avoided.
|
||
|
||
Using subdomains like "es.docs.python.org" or "docs.es.python.org" is
|
||
possible but confusing ("is it `es.docs.python.org` or
|
||
`docs.es.python.org`?"). Hyphens in subdomains like
|
||
`pt-br.doc.python.org` is uncommon and SEOMoz [23]_ correlated the
|
||
presence of hyphens as a negative factor. Usage of underscores in
|
||
subdomain is prohibited by the RFC1123 [24]_, section 2.1. Finally,
|
||
using subdomains means creating TLS certificates for each
|
||
language. This not only requires more maintenance but will also cause
|
||
issues in language switcher if, as for version switcher, we want a
|
||
preflight to check if the translation exists in the given version:
|
||
preflight will probably be blocked by same-origin-policy. Wildcard
|
||
TLS certificates are very expensive.
|
||
|
||
Using content negotiation (HTTP headers ``Accept-Language`` in the
|
||
request and ``Vary: Accept-Language``) leads to a bad user experience
|
||
where they can't easily change the language. According to Mozilla:
|
||
"This header is a hint to be used when the server has no way of
|
||
determining the language via another way, like a specific URL, that is
|
||
controlled by an explicit user decision." [25]_. As we want to be
|
||
able to easily change the language, we should not use the content
|
||
negotiation as a main language determination, so we need something
|
||
else.
|
||
|
||
Last solution is to use the URL path, which looks readable, allows
|
||
for an easy switch from a language to another, and nicely accepts
|
||
hyphens. Typically something like: "docs.python.org/de/" or, by
|
||
using a hyphen: "docs.python.org/pt-BR/".
|
||
|
||
As for the version, sphinx-doc does not support compiling for multiple
|
||
languages, so we'll have full builds rooted under a path, exactly like
|
||
we're already doing with versions.
|
||
|
||
So we can have "docs.python.org/de/3.6/" or
|
||
"docs.python.org/3.6/de/". A question that arises is:
|
||
"Does the language contain multiple versions or does the version contain
|
||
multiple languages?". As versions exist in any case and translations
|
||
for a given version may or may not exist, we may prefer
|
||
"docs.python.org/3.6/de/", but doing so scatters languages everywhere.
|
||
Having "/de/3.6/" is clearer, meaning: "everything under /de/ is written
|
||
in German". Having the version at the end is also a habit taken by
|
||
readers of the documentation: they like to easily change the version
|
||
by changing the end of the path.
|
||
|
||
So we should use the following pattern:
|
||
"docs.python.org/LANGUAGE_TAG/VERSION/".
|
||
|
||
The current documentation is not moved to "/en/", instead
|
||
"docs.python.org/en/" will redirect to "docs.python.org".
|
||
|
||
|
||
Language Tag
|
||
''''''''''''
|
||
|
||
A common notation for language tags is the IETF Language Tag [3]_
|
||
[4]_ based on ISO 639, although gettext uses ISO 639 tags with
|
||
underscores (ex: ``pt_BR``) instead of dashes to join tags [5]_
|
||
(ex: ``pt-BR``). Examples of IETF Language Tags: ``fr`` (French),
|
||
``ja`` (Japanese), ``pt-BR`` (Orthographic formulation of 1943 -
|
||
Official in Brazil).
|
||
|
||
It is more common to see dashes instead of underscores in URLs [6]_,
|
||
so we should use IETF language tags, even if sphinx uses gettext
|
||
internally: URLs are not meant to leak the underlying implementation.
|
||
|
||
It's uncommon to see capitalized letters in URLs, and docs.python.org
|
||
doesn't use any, so it may hurt readability by attracting the eye on it,
|
||
like in: "https://docs.python.org/pt-BR/3.6/library/stdtypes.html".
|
||
RFC 5646 (Tags for Identifying Languages (IETF)) section-2.1 [7]_
|
||
states that tags are not case sensitive. As the RFC allows lower case,
|
||
and it enhances readability, we should use lowercased tags like
|
||
``pt-br``.
|
||
|
||
We may drop the region subtag when it does does not add
|
||
distinguishing information, for example: "de-DE" or "fr-FR".
|
||
(Although it might make sense, respectively meaning "German as
|
||
spoken in Germany" and "French as spoken in France"). But when
|
||
the region subtag actually adds information, for example "pt-BR"
|
||
for "Portuguese as spoken in Brazil", it should be kept.
|
||
|
||
So we should use IETF language tags, lowercased, like ``/fr/``,
|
||
``/pt-br/``, ``/de/`` and so on.
|
||
|
||
|
||
Fetching And Building Translations
|
||
''''''''''''''''''''''''''''''''''
|
||
|
||
Currently docsbuild-scripts are building the documentation [8]_.
|
||
These scripts should be modified to fetch and build translations.
|
||
|
||
Building new translations is like building new versions so, while we're
|
||
adding complexity it is not that much.
|
||
|
||
Two steps should be configurable distinctively: Building a new language,
|
||
and adding it to the language switcher. This allows a transition step
|
||
between "we accepted the language" and "it is translated enough to be
|
||
made public". During this step, translators can review their
|
||
modifications on d.p.o without having to build the documentation
|
||
locally.
|
||
|
||
From the translation repositories, only the ``.po`` files should be
|
||
opened by the docsbuild-script to keep the attack surface and probable
|
||
bug sources at a minimum. This means no translation can patch sphinx
|
||
to advertise their translation tool. (This specific feature should be
|
||
handled by sphinx anyway [9]_).
|
||
|
||
|
||
Community
|
||
---------
|
||
|
||
Mailing List
|
||
''''''''''''
|
||
|
||
The `doc-sig`_ mailing list will be used to discuss cross-language
|
||
changes on translated documentation.
|
||
|
||
There is also the i18n-sig list but it's more oriented towards i18n APIs
|
||
[1]_ than translating the Python documentation.
|
||
|
||
.. _i18n-sig: https://mail.python.org/mailman/listinfo/i18n-sig
|
||
.. _doc-sig: https://mail.python.org/mailman/listinfo/doc-sig
|
||
|
||
|
||
Chat
|
||
''''
|
||
|
||
Due to the Python community being highly active on IRC, we should
|
||
create a new IRC channel on freenode, typically #python-doc for
|
||
consistency with the mailing list name.
|
||
|
||
Each language coordinator can organize their own team, even by choosing
|
||
another chat system if the local usage asks for it. As local teams
|
||
will write in their native languages, we don't want each team in a
|
||
single channel. It's also natural for the local teams to reuse
|
||
their local channels like "#python-fr" for French translators.
|
||
|
||
|
||
Repository for PO Files
|
||
'''''''''''''''''''''''
|
||
|
||
Considering that each translation team may want to use different
|
||
translation tools, and that those tools should easily be synchronized
|
||
with git, all translations should expose their ``.po`` files via a git
|
||
repository.
|
||
|
||
Considering that each translation will be exposed via git
|
||
repositories, and that Python has migrated to GitHub, translations
|
||
will be hosted on github.
|
||
|
||
For consistency and discoverability, all translations should be in the
|
||
same github organization and named according to a common pattern.
|
||
|
||
Given that we want translations to be official, and that Python
|
||
already has a github organization, translations should be hosted as
|
||
projects of the `Python GitHub organization`_.
|
||
|
||
For consistency, translation repositories should be called
|
||
``python-docs-LANGUAGE_TAG`` [22]_, using the language tag used in
|
||
paths: without region subtag if redundant, and lowercased.
|
||
|
||
The docsbuild-scripts may enforce this rule by refusing to fetch
|
||
outside of the Python organization or a wrongly named repository.
|
||
|
||
The CLA bot may be used on the translation repositories, but with a
|
||
limited effect as local coordinators may synchronize themselves with
|
||
translations from an external tool, like transifex, and lose track
|
||
of who translated what in the process.
|
||
|
||
Versions can be hosted on different repositories, different directories
|
||
or different branches. Storing them on different repositories will
|
||
probably pollute the Python github organization. As it
|
||
is typical and natural to use branches to separate versions, branches
|
||
should be used to do so.
|
||
|
||
.. _Python GitHub organization: https://github.com/python/
|
||
|
||
|
||
Translation tools
|
||
'''''''''''''''''
|
||
|
||
Most of the translation work is actually done on Transifex [15]_.
|
||
|
||
Other tools may be used later like https://pontoon.mozilla.org/
|
||
and http://zanata.org/.
|
||
|
||
|
||
Contributor Agreement
|
||
'''''''''''''''''''''
|
||
|
||
Contributions to translated documentation will be requested to sign
|
||
the Python Contributor Agreement (CLA):
|
||
|
||
https://www.python.org/psf/contrib/contrib-form/
|
||
|
||
|
||
Language Team
|
||
'''''''''''''
|
||
|
||
Each language team should have one coordinator responsible for:
|
||
|
||
- Managing the team.
|
||
- Choosing and managing the tools the team will use (chat, mailing list, …).
|
||
- Ensure contributors understand and agree with the CLA.
|
||
- Ensure quality (grammar, vocabulary, consistency, filtering spam, ads, …).
|
||
- Redirect issues posted on b.p.o to the correct GitHub issue tracker
|
||
for the language.
|
||
|
||
The license will be a Documentation Contribution Agreement (to be
|
||
written).
|
||
|
||
|
||
Alternatives
|
||
------------
|
||
|
||
Simplified English
|
||
''''''''''''''''''
|
||
|
||
It would be possible to introduce a "simplified English" version like
|
||
wikipedia did [10]_, as discussed on python-dev [11]_, targeting
|
||
English learners and children.
|
||
|
||
Pros: It yields a single translation, theoretically readable by
|
||
everyone and reviewable by current maintainers.
|
||
|
||
Cons: Subtle details may be lost, and translators from English to English
|
||
may be hard to find as stated by Wikipedia:
|
||
|
||
> The main English Wikipedia has 5 million articles, written by nearly
|
||
140K active users; the Swedish Wikipedia is almost as big, 3M articles
|
||
from only 3K active users; but the Simple English Wikipedia has just
|
||
123K articles and 871 active users. That's fewer articles than
|
||
Esperanto!
|
||
|
||
|
||
Changes
|
||
=======
|
||
|
||
Migrate GitHub Repositories
|
||
---------------------------
|
||
|
||
We (authors of this PEP) already own French and Japanese Git repositories,
|
||
so moving them to the Python documentation organization will not be a
|
||
problem. We'll however be following the `New Translation Procedure`_.
|
||
|
||
|
||
Patch docsbuild-scripts to Compile Translations
|
||
-----------------------------------------------
|
||
|
||
Docsbuild-script must be patched to:
|
||
|
||
- List the language tags to build along with the branches to build.
|
||
- List the language tags to display in the language switcher.
|
||
- Find translation repositories by formatting
|
||
``github.com:python/python-docs-{language_tag}.git`` (See
|
||
`Repository for Po Files`_)
|
||
- Build translations for each branch and each language.
|
||
|
||
Patched docsbuild-scripts must only open ``.po`` files from
|
||
translation repositories.
|
||
|
||
|
||
List coordinators in the devguide
|
||
---------------------------------
|
||
|
||
Add a page or a section with an empty list of coordinators to the
|
||
devguide, each new coordinator will be added to this list.
|
||
|
||
|
||
Create sphinx-doc Language Switcher
|
||
-----------------------------------
|
||
|
||
Highly similar to the version switcher, a language switcher must be
|
||
implemented. This language switcher must be configurable to hide or
|
||
show a given language.
|
||
|
||
The language switcher will only have to update or add the language
|
||
segment to the path like the current version switcher does. Unlike
|
||
the version switcher, no preflight are required as destination page
|
||
always exists (translations does not add or remove pages).
|
||
Untranslated (but existing) pages still exists, they should however be
|
||
rendered as so, see `Enhance Rendering of Untranslated and Fuzzy
|
||
Translations`_.
|
||
|
||
|
||
Update sphinx-doc Version Switcher
|
||
----------------------------------
|
||
|
||
The ``patch_url`` function of the version switcher in
|
||
``version_switch.js`` have to be updated to understand and allow the
|
||
presence of the language segment in the path.
|
||
|
||
|
||
Enhance Rendering of Untranslated and Fuzzy Translations
|
||
--------------------------------------------------------
|
||
|
||
It's an opened sphinx issue [9]_, but we'll need it so we'll have to
|
||
work on it. Translated, fuzzy, and untranslated paragraphs should be
|
||
differentiated. (Fuzzy paragraphs have to warn the reader what he's
|
||
reading may be out of date.)
|
||
|
||
|
||
New Translation Procedure
|
||
=========================
|
||
|
||
Designate a Coordinator
|
||
-----------------------
|
||
|
||
The first step is to designate a coordinator, see `Language Team`_,
|
||
The coordinator must sign the CLA.
|
||
|
||
The coordinator should be added to the list of translation coordinators
|
||
on the devguide.
|
||
|
||
|
||
Create Github Repository
|
||
------------------------
|
||
|
||
Create a repository named "python-docs-{LANGUAGE_TAG}" (IETF language
|
||
tag, without redundent region subtag, with a dash, and lowercased.) on
|
||
the Python github organization (See `Repository For Po Files`_.), and
|
||
grant the language coordinator push rights to this repository.
|
||
|
||
|
||
Add support for translations in docsbuild-scripts
|
||
-------------------------------------------------
|
||
|
||
As soon as the translation hits its first commits, update the
|
||
docsbuild-scripts configuration to build the translation (but not
|
||
displaying it in the language switcher).
|
||
|
||
|
||
Add Translation to the Language Switcher
|
||
----------------------------------------
|
||
|
||
As soon as the translation hits:
|
||
|
||
- 100% of bugs.html with proper links to the language repository
|
||
issue tracker.
|
||
- 100% of tutorial.
|
||
- 100% of library/functions (builtins).
|
||
|
||
the translation can be added to the language switcher.
|
||
|
||
|
||
Previous Discussions
|
||
====================
|
||
|
||
- `[Python-ideas] Cross link documentation translations (January, 2016)`_
|
||
- `[Python-ideas] Cross link documentation translations (January, 2016)`_
|
||
- `[Python-ideas] https://docs.python.org/fr/ ? (March 2016)`_
|
||
|
||
|
||
.. _[Python-ideas] Cross link documentation translations (January, 2016):
|
||
https://mail.python.org/pipermail/python-ideas/2016-January/038010.html
|
||
|
||
.. _[Python-Dev] Translated Python documentation (February 2016):
|
||
https://mail.python.org/pipermail/python-dev/2017-February/147416.html
|
||
|
||
.. _[Python-ideas] https://docs.python.org/fr/ ? (March 2016):
|
||
https://mail.python.org/pipermail/python-ideas/2016-March/038879.html
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] [I18n-sig] Hello Python members, Do you have any idea about
|
||
Python documents?
|
||
(https://mail.python.org/pipermail/i18n-sig/2013-September/002130.html)
|
||
|
||
.. [2] [Doc-SIG] Localization of Python docs
|
||
(https://mail.python.org/pipermail/doc-sig/2013-September/003948.html)
|
||
|
||
.. [3] Tags for Identifying Languages
|
||
(http://tools.ietf.org/html/rfc5646)
|
||
|
||
.. [4] IETF language tag
|
||
(https://en.wikipedia.org/wiki/IETF_language_tag)
|
||
|
||
.. [5] GNU Gettext manual, section 2.3.1: Locale Names
|
||
(https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html)
|
||
|
||
.. [6] Semantic URL: Slug
|
||
(https://en.wikipedia.org/wiki/Semantic_URL#Slug)
|
||
|
||
.. [7] Tags for Identifying Languages: Formatting of Language Tags
|
||
(https://tools.ietf.org/html/rfc5646#section-2.1.1)
|
||
|
||
.. [8] Docsbuild-scripts github repository
|
||
(https://github.com/python/docsbuild-scripts/)
|
||
|
||
.. [9] i18n: Highlight untranslated paragraphs
|
||
(https://github.com/sphinx-doc/sphinx/issues/1246)
|
||
|
||
.. [10] Wikipedia: Simple English
|
||
(https://simple.wikipedia.org/wiki/Main_Page)
|
||
|
||
.. [11] Python-dev discussion about simplified English
|
||
(https://mail.python.org/pipermail/python-dev/2017-February/147446.html)
|
||
|
||
.. [12] Passing options to sphinx from Doc/Makefile
|
||
(https://github.com/python/cpython/commit/57acb82d275ace9d9d854b156611e641f68e9e7c)
|
||
|
||
.. [13] French translation progression
|
||
(https://mdk.fr/pycon2016/#/11)
|
||
|
||
.. [14] French translation contributors
|
||
(https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c)
|
||
|
||
.. [15] Python-doc on Transifex
|
||
(https://www.transifex.com/python-doc/)
|
||
|
||
.. [16] French translation
|
||
(https://www.afpy.org/doc/python/)
|
||
|
||
.. [17] French translation github
|
||
(https://github.com/AFPy/python_doc_fr)
|
||
|
||
.. [18] French mailing list
|
||
(http://lists.afpy.org/mailman/listinfo/traductions)
|
||
|
||
.. [19] Japanese translation
|
||
(http://docs.python.jp/3/)
|
||
|
||
.. [20] Japanese github
|
||
(https://github.com/python-doc-ja/python-doc-ja)
|
||
|
||
.. [21] Spanish translation
|
||
(http://docs.python.org.ar/tutorial/3/index.html)
|
||
|
||
.. [22] [Python-Dev] Translated Python documentation: doc vs docs
|
||
(https://mail.python.org/pipermail/python-dev/2017-February/147472.html)
|
||
|
||
.. [23] Domains - SEO Best Practices | Moz
|
||
(https://moz.com/learn/seo/domain)
|
||
|
||
.. [24] Requirements for Internet Hosts -- Application and Support
|
||
(https://www.ietf.org/rfc/rfc1123.txt)
|
||
|
||
.. [25] Accept-Language
|
||
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language)
|
||
|
||
.. [26] Документация Python 2.7!
|
||
(http://python-lab.ru/documentation/index.html)
|
||
|
||
.. [27] Python-oktató
|
||
(http://harp.pythonanywhere.com/python_doc/tutorial/index.html)
|
||
|
||
.. [28] The Python-hu Archives
|
||
(https://mail.python.org/pipermail/python-hu/)
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|