Add PEP 545: Python Documentation Translations
Commit 66d8117f86 (parent 978a93cfc9)

@@ -0,0 +1,572 @@

PEP: 545
Title: Python Documentation Translations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>,
        Inada Naoki <songofacandy@gmail.com>,
        Julien Palard <julien@palard.fr>
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 04-Mar-2017


Abstract
========

The intent of this PEP is to make existing translations of the Python
documentation more accessible and discoverable. Doing so should also
attract and motivate new translators and new translations.

Translated documentation will be hosted on python.org. Examples of
two active translation teams:

* http://docs.python.org/fr/: French
* http://docs.python.org/ja/: Japanese

http://docs.python.org/en/ will redirect to http://docs.python.org/.

Sources of translated documentation will be hosted in the Python
Documentation organization on GitHub: https://github.com/python-docs/.
Contributors will have to sign the Python Contributor Agreement (CLA)
and the license will be the PSF License.


Motivation
==========

On the French ``#python-fr`` IRC channel on freenode, it's not rare to
meet people who don't speak English and so are unable to read the
official Python documentation. Python wants to be widely available
to all users, in any language: that's also why Python 3 now allows
non-ASCII identifiers:
https://www.python.org/dev/peps/pep-3131/#rationale

There are at least 3 groups of people who are translating the Python
documentation into their native language (French [16]_ [17]_ [18]_,
Japanese [19]_ [20]_, Spanish [21]_), even though their translations
are not visible on docs.python.org (d.p.o). Other less visible and
less organized groups are also translating into their native language:
we have heard of Russian, Chinese and Korean, and there may be others
we have not found yet. This PEP defines rules to move translations to
docs.python.org so they can easily be found by developers, newcomers
and potential translators.

The Japanese team has currently (March 2017) translated ~80% of the
documentation, the French team ~20%. The French translation went from
6% to 23% in 2016 [13]_ with 7 contributors [14]_, proving that a
translation team can move faster than the documentation changes.

Quoting Xiang Zhang about Chinese translations:

    I have seen several groups trying to translate part of our official
    doc. But their efforts are disperse and quickly become lost because
    they are not organized to work towards a single common result and
    their results are hold anywhere on the Web and hard to find. An
    official one could help ease the pain.


Rationale
=========

Translation
-----------

Issue tracker
'''''''''''''

Issues opened about translations may be written in the translation
language. On `bugs.python.org <https://bugs.python.org/>`_ (b.p.o)
this can be considered noise, or at the very least an inconsistency,
so these issues should be tracked elsewhere.

As all translations must have their own GitHub project (see
`Repository for PO Files`_), they must use the associated GitHub
issue tracker.

Translation issues, written in any language, may still land on b.p.o
despite every warning, so some triage will have to be done. As
translations already exist and are not currently a source of noise on
b.p.o, an unmanageable amount of work is not to be expected. Xiang
Zhang and Victor Stinner are already triaging, and Julien Palard is
willing to help with this task, so noise on b.p.o should stay under
control.

Also, language team coordinators (see `Language Team`_) should help
triage b.p.o by properly indicating the right issue tracker, in the
issue author's language if needed.


Branches
''''''''

Translation teams should focus on the latest stable versions, and use
tools (scripts, translation memory, …) to automatically carry over
what is translated in one branch to other branches.

.. note::
   Translation memories are a kind of database of previously translated
   paragraphs, even removed ones. See also `Sphinx Internationalization
   <http://www.sphinx-doc.org/en/stable/intl.html>`_.

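As a rough illustration of the translation memory idea, the sketch
below reuses translations made in one branch for identical source
strings in another branch. Plain dictionaries stand in for parsed
``.po`` catalogs; the function names and the sample strings are
hypothetical and do not come from any existing tool.

```python
def build_memory(*catalogs):
    """Collect every known msgid -> msgstr pair, skipping untranslated entries."""
    memory = {}
    for catalog in catalogs:
        for msgid, msgstr in catalog.items():
            if msgstr:
                memory[msgid] = msgstr
    return memory


def apply_translation_memory(memory, target):
    """Return a copy of ``target`` with empty entries filled from ``memory``."""
    return {
        msgid: msgstr or memory.get(msgid, "")
        for msgid, msgstr in target.items()
    }


# Hypothetical catalogs: msgid (English source) -> msgstr (translation).
catalog_36 = {"Built-in Functions": "Fonctions natives", "Glossary": "Glossaire"}
catalog_27 = {"Built-in Functions": "", "Glossary": "", "Old section": ""}

memory = build_memory(catalog_36)
updated_27 = apply_translation_memory(memory, catalog_27)
```

Strings that exist only in the older branch stay untranslated, so
translators still have to handle them by hand.
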
The three stable branches will be translated [12]_: 2.7, 3.5, and 3.6.
The scripts to build the documentation of older branches have to be
modified to support translation [12]_, even though these branches now
only accept security fixes.

The development branch (master) should have a lower translation priority
than stable branches. But docsbuild-scripts should build it anyway so
it is possible for a team to work on it to be ready for the next
release.


Hosting
-------

Domain Name, Content Negotiation and URL
''''''''''''''''''''''''''''''''''''''''

Different translations can be told apart by changing one of: the
country code top-level domain (ccTLD), a path segment, or a subdomain,
or by content negotiation.

Buying a ccTLD for each translation is expensive, time-consuming, and
sometimes almost impossible when the domain is already registered, so
this solution should be avoided.

Using subdomains like "es.docs.python.org" or "docs.es.python.org" is
possible but confusing ("is it ``es.docs.python.org`` or
``docs.es.python.org``?"). Hyphens in subdomains like
``pt-br.docs.python.org`` are uncommon, and SEOMoz [23]_ correlated
the presence of hyphens as a negative factor. Usage of underscores in
subdomains is prohibited by RFC 1123 [24]_, section 2.1. Finally,
using subdomains means creating TLS certificates for each language,
which is more maintenance, and will probably cause us trouble in
language pickers if, as for the version picker, we want a preflight
request to check whether the translation exists in the given version:
the preflight will probably be blocked by the same-origin policy.
Wildcard TLS certificates are very expensive.

Using content negotiation (HTTP headers ``Accept-Language`` in the
request and ``Vary: Accept-Language``) leads to a bad user experience
where users can't easily change the language. According to Mozilla:
"This header is a hint to be used when the server has no way of
determining the language via another way, like a specific URL, that is
controlled by an explicit user decision." [25]_. As we want to be
able to easily change the language, we should not use content
negotiation as the main way to determine the language, so we need
something else.

The last solution is to use the URL path, which looks readable, allows
for an easy switch from one language to another, and nicely accepts
hyphens. Typically something like: "docs.python.org/de/". Example
with a hyphen: "docs.python.org/pt-BR/"

As with versions, sphinx-doc does not support compiling for multiple
languages, so we'll have full builds rooted under a path, exactly like
we're already doing with versions.

So we can have "docs.python.org/de/3.6/" or
"docs.python.org/3.6/de/". The question is "Does the language contain
multiple versions or does the version contain multiple languages?" As
versions exist in any case, and translations for a given version may
or may not exist, we may prefer "docs.python.org/3.6/de/", but doing
so scatters languages everywhere. Having "/de/3.6/" is clearer about
"everything under /de/ is written in German". Having the version at
the end is also a habit taken by readers of the documentation: they
like to easily change the version by changing the end of the path.

So we should use the following pattern:
"docs.python.org/LANGUAGE_TAG/VERSION/".

The current documentation is not moved to "/en/", but
"docs.python.org/en/" will redirect to "docs.python.org/".


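The URL scheme chosen above can be sketched as follows. This is only
an illustration under stated assumptions: the helper names
``docs_url`` and ``redirect_target`` are hypothetical, the ``https``
scheme is assumed, and the real redirect would more likely live in
the web server configuration than in Python.

```python
BASE_URL = "https://docs.python.org"  # assumed scheme and host
DEFAULT_LANGUAGE = "en"


def docs_url(language_tag, version, page=""):
    """Build a documentation URL following LANGUAGE_TAG/VERSION/."""
    if language_tag == DEFAULT_LANGUAGE:
        # English documentation is not moved under /en/.
        return f"{BASE_URL}/{version}/{page}"
    return f"{BASE_URL}/{language_tag}/{version}/{page}"


def redirect_target(path):
    """Return the path to redirect to, or None if no redirect applies."""
    prefix = "/en/"
    if path == "/en" or path.startswith(prefix):
        # Strip the /en/ prefix so /en/3.6/ becomes /3.6/.
        return "/" + path[len(prefix):]
    return None
```

With this layout, switching language or version only requires
changing one path segment of the URL.
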
Language Tag
''''''''''''

A common notation for language tags is the IETF Language Tag [3]_
[4]_ based on ISO 639, although gettext uses ISO 639 tags with
underscores (ex: ``pt_BR``) instead of dashes (ex: ``pt-BR``) to join
tags [5]_. Examples of IETF Language Tags: ``fr`` (French),
``ja`` (Japanese), ``pt-BR`` (Portuguese as used in Brazil).

It is more common to see dashes instead of underscores in URLs [6]_,
so we should use IETF language tags, even if Sphinx uses gettext
internally: URLs are not meant to leak the underlying implementation.

It's uncommon to see capitalized letters in URLs, and docs.python.org
doesn't use any, so they may hurt readability by attracting the eye,
like in: "https://docs.python.org/pt-BR/3.6/library/stdtypes.html".
RFC 5646 (Tags for Identifying Languages (IETF)), section 2.1.1 [7]_,
states that tags are not case sensitive. As the RFC allows lower case,
and it enhances readability, we should use lowercased tags like
``pt-br``.

It's redundant to display both language and country code if they're
the same, typically "de-DE" or "fr-FR". Although these make sense
(respectively "German as spoken in Germany" and "French as spoken in
France"), the country code is not useful information for the reader,
so we may drop those redundancies. We should obviously keep the
country part when it makes sense, like "pt-BR" for "Portuguese as
spoken in Brazil".

So we should use IETF language tags, lowercased, like ``/fr/``,
``/pt-br/``, ``/de/`` and so on.


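The tag rules above could be applied mechanically when mapping a
gettext locale name to a URL path segment. The function below is a
minimal sketch with a hypothetical name. Comparing the language and
country codes literally is a simplification: it catches "de_DE" and
"fr_FR" but not pairs whose codes differ, so such cases would still
need a manual decision.

```python
def url_language_tag(locale_name):
    """Convert a gettext locale name such as ``pt_BR`` to a URL tag such as ``pt-br``."""
    # Use dashes as in IETF language tags, and lowercase for readability.
    parts = locale_name.replace("_", "-").lower().split("-")
    # Drop the country part when it merely repeats the language code.
    if len(parts) == 2 and parts[0] == parts[1]:
        parts = parts[:1]
    return "-".join(parts)
```

For example, ``url_language_tag("pt_BR")`` gives ``pt-br`` and
``url_language_tag("de_DE")`` gives ``de``.
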
Fetching And Building Translations
''''''''''''''''''''''''''''''''''

Currently docsbuild-scripts are building the documentation [8]_. These scripts
should be modified to fetch and build translations.

Building new translations is like building new versions, so we're
adding complexity, but not that much.

Two steps should be configurable independently: building a new
language, and adding it to the language picker. This allows a
transition step between "we accepted the language" and "it is
translated enough to be made public". During this step, translators
can review their modifications on d.p.o without having to build the
documentation locally.

From the translation repositories, only the ``.po`` files should be
opened by docsbuild-scripts, to keep the attack surface and the
likely sources of bugs to a minimum. This means no translation can
patch Sphinx to advertise its translation tool. (This specific feature
should be handled by Sphinx anyway [9]_).


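One possible way to restrict the build to ``.po`` files is sketched
below. The function name ``po_files`` is hypothetical, and skipping
symbolic links is an extra precaution assumed here, not a requirement
stated by this PEP.

```python
from pathlib import Path


def po_files(checkout_dir):
    """List the regular ``.po`` files found in a translation checkout."""
    root = Path(checkout_dir).resolve()
    selected = []
    for path in sorted(root.rglob("*.po")):
        # Ignore symbolic links and anything that is not a regular file.
        if path.is_symlink() or not path.is_file():
            continue
        selected.append(path)
    return selected
```

Everything else in the repository (scripts, configuration, templates)
is simply never read by the build.
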
Community
---------

Mailing List
''''''''''''

The `doc-sig`_ mailing list will be used to discuss cross-language
changes on translated documentation.

There is also the `i18n-sig`_ list, but it's more oriented towards
i18n APIs [1]_ than towards translating the Python documentation.

.. _i18n-sig: https://mail.python.org/mailman/listinfo/i18n-sig
.. _doc-sig: https://mail.python.org/mailman/listinfo/doc-sig


Chat
''''

As the Python community is highly active on IRC, we should create a new
IRC channel on freenode, typically #python-doc for consistency with
the mailing list name.

Each language coordinator can organize their own team, even by
choosing another chat system if local usage calls for it. As local
teams will write in their native languages, we don't want every team
in a single channel. It's also natural for local teams to reuse their
local channels, like "#python-fr" for French translators.


Repository for PO Files
'''''''''''''''''''''''

Considering that each translation team may want to use different
translation tools, and that those tools should easily be synchronized
with git, all translations should expose their ``.po`` files via a git
repository.

Considering that each translation will be exposed via git
repositories, and that Python has migrated to GitHub, translations
will be hosted on GitHub.

For consistency and discoverability, all translations should be in the
same GitHub organization and named according to a common pattern.

Considering that we want translations to be official, and that Python
already has a GitHub organization, translations should be hosted as
projects of the `Python documentation GitHub organization`_.

For consistency, translation repositories should be called
``python-docs-LANGUAGE_TAG`` [22]_.

The docsbuild-scripts may enforce this rule by refusing to fetch
from outside of the Python documentation organization or from a
wrongly named repository.

The CLA bot may be used on the translation repositories, but with
limited effect, as local coordinators may themselves synchronize
translations from an external tool like Transifex, losing in the
process the record of who translated what.

Versions can be hosted in different repositories, different
directories or different branches. Storing them in different
repositories would probably pollute the Python documentation GitHub
organization. As it is typical and natural to use branches to separate
versions, branches should be used to do so.

.. _Python documentation GitHub organization: https://github.com/python-docs/


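The naming rule and its enforcement could look like the sketch
below. The HTTPS clone URL form is an assumption derived from the
organization URL, the tag pattern is a deliberately simplified
approximation of lowercased IETF tags, and both function names are
hypothetical.

```python
import re

# Assumed clone URL prefix, derived from the organization URL above.
ORGANIZATION_URL = "https://github.com/python-docs/"
# Simplified pattern for lowercased tags such as "fr" or "pt-br".
TAG_PATTERN = re.compile(r"[a-z]{2,3}(?:-[a-z0-9]{2,8})*")


def repository_url(language_tag):
    """Build the expected clone URL for a language tag, or raise ValueError."""
    if not TAG_PATTERN.fullmatch(language_tag):
        raise ValueError(f"unexpected language tag: {language_tag!r}")
    return f"{ORGANIZATION_URL}python-docs-{language_tag}.git"


def is_allowed_repository(url):
    """Accept only URLs that follow the naming pattern inside the organization."""
    prefix = ORGANIZATION_URL + "python-docs-"
    if not (url.startswith(prefix) and url.endswith(".git")):
        return False
    tag = url[len(prefix):-len(".git")]
    return TAG_PATTERN.fullmatch(tag) is not None
```

Building the URL from the tag, rather than accepting arbitrary URLs
from configuration, keeps the set of fetched repositories predictable.
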
Translation tools
'''''''''''''''''

Most of the translation work is currently done on Transifex [15]_.

Other tools may be used later, such as https://pontoon.mozilla.org/
and http://zanata.org/.


Contributor Agreement
'''''''''''''''''''''

Contributors to translated documentation will be asked to sign the
Python Contributor Agreement (CLA):

https://www.python.org/psf/contrib/contrib-form/


Language Team
'''''''''''''

Each language team should have one coordinator responsible for:

- Managing the team
- Choosing and managing the tools their team will use (chat, mailing list, …)
- Ensuring contributors understand and agree with the CLA
- Ensuring quality (grammar, vocabulary, consistency, filtering spam, ads, …)
- Redirecting issues about their language that were posted on
  bugs.python.org to the appropriate GitHub issue tracker

The license will be the `PSF License <https://docs.python.org/3/license.html>`_,
and copyright should be transferable to PSF later.


Alternatives
------------

Simplified English
''''''''''''''''''

It would be possible to introduce a "simplified English" version like
Wikipedia did [10]_, as discussed on python-dev [11]_, targeting
English learners and children.

Pros: It yields a single other translation, theoretically readable by
everyone, and reviewable by current maintainers.

Cons: Subtle details may be lost, and translators from English to
English may be hard to find, as stated by Wikipedia:

    The main English Wikipedia has 5 million articles, written by nearly
    140K active users; the Swedish Wikipedia is almost as big, 3M articles
    from only 3K active users; but the Simple English Wikipedia has just
    123K articles and 871 active users. That's fewer articles than
    Esperanto!


Changes
=======

Migrate GitHub Repositories
---------------------------

We (authors of this PEP) already own French and Japanese Git
repositories, so moving them to the Python documentation organization
will not be a problem. We'll however follow the
`New Translation Procedure`_.


Patch docsbuild-scripts to Compile Translations
-----------------------------------------------

docsbuild-scripts must be patched to:

- List the language tags to build along with the branches to build.
- List the language tags to display in the language picker.
- Find translation repositories by formatting
  ``github.com:python-docs/python-docs-{language_tag}.git`` (See
  `Repository for PO Files`_)
- Build translations for each branch and each language.

Patched docsbuild-scripts must only open ``.po`` files from
translation repositories.


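The two language lists and the resulting set of builds might be
expressed as in the sketch below. All names and values are
hypothetical placeholders and not the real docsbuild-scripts
configuration; in particular, the language lists are only examples.

```python
from itertools import product

# Hypothetical configuration values, not the real docsbuild-scripts settings.
BRANCHES = ["2.7", "3.5", "3.6"]
LANGUAGES_TO_BUILD = ["fr", "ja", "es"]
LANGUAGES_IN_PICKER = ["fr", "ja"]


def build_jobs(languages, branches):
    """Return one (language_tag, branch) pair per build to run."""
    return list(product(languages, branches))


def picker_entries(built, visible):
    """Keep only languages that are both built and marked as visible."""
    return [tag for tag in built if tag in visible]


jobs = build_jobs(LANGUAGES_TO_BUILD, BRANCHES)
entries = picker_entries(LANGUAGES_TO_BUILD, LANGUAGES_IN_PICKER)
```

Keeping the two lists separate is what allows a translation to be
built and reviewed on d.p.o before it is advertised in the picker.
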
List coordinators in the devguide
---------------------------------

Add a page or a section with an empty list of coordinators to the
devguide. Each new coordinator will be added to this list.


Create sphinx-doc Language Picker
---------------------------------

Highly similar to the version picker, a language picker must be
implemented. This language picker must be configurable to hide or
show a given language.


Enhance rendering of untranslated and fuzzy translations
--------------------------------------------------------

It's an open Sphinx issue [9]_, but we'll need it so we'll have to
work on it. Translated, fuzzy, and untranslated paragraphs should be
differentiated. (Fuzzy paragraphs have to warn readers that what
they are reading may be out of date.)


New Translation Procedure
=========================

Designate a Coordinator
-----------------------

The first step is to designate a coordinator, see `Language Team`_.
The coordinator must sign the CLA.

The coordinator should be added to the list of translation
coordinators in the devguide.


Create GitHub repository
------------------------

Create a repository named "python-docs-{LANGUAGE_TAG}" in the Python
documentation GitHub organization (see `Repository for PO Files`_), and
grant the language coordinator push rights to this repository.


Add support for translations in docsbuild-scripts
-------------------------------------------------

As soon as the translation receives its first commits, update the
docsbuild-scripts configuration to build the translation (but do not
display it in the language picker).


Add translation to the language picker
--------------------------------------

As soon as the translation hits:

- 100% of bugs.html with proper links to the language repository
  issue tracker.
- 100% of tutorial
- 100% of library/functions (builtins)

the translation can be added to the language picker.


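The completion thresholds above could be checked automatically, as in
this minimal sketch. The mapping keys and the function name are
hypothetical, and the completion percentages are assumed to be
computed elsewhere from the ``.po`` files. Whether ``bugs.html``
links to the right issue tracker still needs a human review.

```python
# Documents that must be fully translated before a language is shown.
REQUIRED_FULLY_TRANSLATED = ("bugs", "tutorial", "library/functions")


def ready_for_picker(completion):
    """Check a mapping of document name -> translated percentage (0 to 100)."""
    return all(
        completion.get(name, 0) >= 100
        for name in REQUIRED_FULLY_TRANSLATED
    )
```

A missing entry counts as 0%, so a translation without any tutorial
files is never reported as ready.
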
Previous discussions
====================

- `[Python-ideas] Cross link documentation translations (January, 2016)`_
- `[Python-Dev] Translated Python documentation (February 2017)`_
- `[Python-ideas] https://docs.python.org/fr/ ? (March 2016)`_

.. _[Python-ideas] Cross link documentation translations (January, 2016):
   https://mail.python.org/pipermail/python-ideas/2016-January/038010.html

.. _[Python-Dev] Translated Python documentation (February 2017):
   https://mail.python.org/pipermail/python-dev/2017-February/147416.html

.. _[Python-ideas] https://docs.python.org/fr/ ? (March 2016):
   https://mail.python.org/pipermail/python-ideas/2016-March/038879.html


References
==========

.. [1] [I18n-sig] Hello Python members, Do you have any idea about
   Python documents?
   (https://mail.python.org/pipermail/i18n-sig/2013-September/002130.html)

.. [2] [Doc-SIG] Localization of Python docs
   (https://mail.python.org/pipermail/doc-sig/2013-September/003948.html)

.. [3] Tags for Identifying Languages
   (http://tools.ietf.org/html/rfc5646)

.. [4] IETF language tag
   (https://en.wikipedia.org/wiki/IETF_language_tag)

.. [5] GNU Gettext manual, section 2.3.1: Locale Names
   (https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html)

.. [6] Semantic URL: Slug
   (https://en.wikipedia.org/wiki/Semantic_URL#Slug)

.. [7] Tags for Identifying Languages: Formatting of Language Tags
   (https://tools.ietf.org/html/rfc5646#section-2.1.1)

.. [8] Docsbuild-scripts github repository
   (https://github.com/python/docsbuild-scripts/)

.. [9] i18n: Highlight untranslated paragraphs
   (https://github.com/sphinx-doc/sphinx/issues/1246)

.. [10] Wikipedia: Simple English
   (https://simple.wikipedia.org/wiki/Main_Page)

.. [11] Python-dev discussion about simplified english
   (https://mail.python.org/pipermail/python-dev/2017-February/147446.html)

.. [12] Passing options to sphinx from Doc/Makefile
   (https://github.com/python/cpython/commit/57acb82d275ace9d9d854b156611e641f68e9e7c)

.. [13] French translation progression
   (https://mdk.fr/pycon2016/#/11)

.. [14] French translation contributors
   (https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c)

.. [15] Python-doc on Transifex
   (https://www.transifex.com/python-doc/)

.. [16] French translation
   (https://www.afpy.org/doc/python/)

.. [17] French translation github
   (https://github.com/AFPy/python_doc_fr)

.. [18] French mailing list
   (http://lists.afpy.org/mailman/listinfo/traductions)

.. [19] Japanese translation
   (http://docs.python.jp/3/)

.. [20] Japanese github
   (https://github.com/python-doc-ja/python-doc-ja)

.. [21] Spanish translation
   (http://docs.python.org.ar/tutorial/3/index.html)

.. [22] [Python-Dev] Translated Python documentation: doc vs docs
   (https://mail.python.org/pipermail/python-dev/2017-February/147472.html)

.. [23] Domains - SEO Best Practices | Moz
   (https://moz.com/learn/seo/domain)

.. [24] Requirements for Internet Hosts -- Application and Support
   (https://www.ietf.org/rfc/rfc1123.txt)

.. [25] Accept-Language
   (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language)

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: