Spell checked

This commit is contained in:
David Goodger 2002-12-21 19:51:05 +00:00
parent 200959cb61
commit fb0080cbcb
1 changed files with 32 additions and 31 deletions

View File

@ -3,7 +3,7 @@ Title: New Import Hooks
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Just van Rossum <just@letterror.com>, Author: Just van Rossum <just@letterror.com>,
Paul Moore <gustav@morpheus.demon.co.uk> Paul Moore <gustav@morpheus.demon.co.uk>
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/plain Content-Type: text/plain
@ -24,7 +24,7 @@ Abstract
Motivation Motivation
The only way to customize the import mechanism is currently to The only way to customize the import mechanism is currently to
override the builtin __import__ function. However, overriding override the built-in __import__ function. However, overriding
__import__ has many problems. To begin with: __import__ has many problems. To begin with:
- An __import__ replacement needs to *fully* reimplement the entire - An __import__ replacement needs to *fully* reimplement the entire
@ -68,7 +68,7 @@ Use cases
database over a network. database over a network.
The work on this PEP was partly triggered by the implementation of The work on this PEP was partly triggered by the implementation of
PEP 273 [2], which adds imports from Zip archives as a builtin PEP 273 [2], which adds imports from Zip archives as a built-in
feature to Python. While the PEP itself was widely accepted as a feature to Python. While the PEP itself was widely accepted as a
must-have feature, the implementation left a few things to desire. must-have feature, the implementation left a few things to desire.
For one thing it went through great lengths to integrate itself with For one thing it went through great lengths to integrate itself with
@ -102,28 +102,29 @@ Use cases
emulation code. emulation code.
Before work on the design and implementation of this PEP was Before work on the design and implementation of this PEP was
started, a new BuildApplication-like tool for MacOSX prompted one of started, a new BuildApplication-like tool for MacOS X prompted one
the authors of this PEP (JvR) to expose the table of frozen modules of the authors of this PEP (JvR) to expose the table of frozen
to Python, in the imp module. The main reason was to be able to use modules to Python, in the imp module. The main reason was to be
the freeze import hook (avoiding fancy __import__ support), yet to able to use the freeze import hook (avoiding fancy __import__
also be able to supply a set of modules at runtime. This resulted support), yet to also be able to supply a set of modules at
in sf patch #642578 [6], which was mysteriously accepted (mostly runtime. This resulted in sf patch #642578 [6], which was
because nobody seemed to care either way ;-). Yet it is completely mysteriously accepted (mostly because nobody seemed to care either
superfluous when this PEP gets accepted, as it offers a much nicer way ;-). Yet it is completely superfluous when this PEP gets
and general way to do the same thing. accepted, as it offers a much nicer and general way to do the same
thing.
Rationale Rationale
While experimenting with alternative implementation ideas to get While experimenting with alternative implementation ideas to get
builtin Zip import, it was discovered that achieving this is built-in Zip import, it was discovered that achieving this is
possible with only a fairly small amount of changes to import.c. possible with only a fairly small amount of changes to import.c.
This allowed to factor out the Zip-specific stuff into a new source This allowed to factor out the Zip-specific stuff into a new source
file, while at the same time creating a *general* new import hook file, while at the same time creating a *general* new import hook
scheme: the one you're reading about now. scheme: the one you're reading about now.
An earlier design allowed non-string objects on sys.path. Such an An earlier design allowed non-string objects on sys.path. Such an
object would have the neccesary methods to handle an import. This object would have the necessary methods to handle an import. This
has two disadvantages: 1) it breaks code that assumes all items on has two disadvantages: 1) it breaks code that assumes all items on
sys.path are strings; 2) it is not compatible with the PYTHONPATH sys.path are strings; 2) it is not compatible with the PYTHONPATH
environment variable. The latter is directly needed for Zip environment variable. The latter is directly needed for Zip
@ -146,22 +147,22 @@ Rationale
To minimize the impact on import.c as well as to avoid adding extra To minimize the impact on import.c as well as to avoid adding extra
overhead, it was chosen to not add an explicit hook and importer overhead, it was chosen to not add an explicit hook and importer
object for the existing file system import logic (as iu.py has), but object for the existing file system import logic (as iu.py has), but
to simply fall back to the builtin logic if no hook on to simply fall back to the built-in logic if no hook on
sys.path_hooks could handle the path item. If this is the case, a sys.path_hooks could handle the path item. If this is the case, a
None value is stored in sys.path_importer_cache, again to avoid None value is stored in sys.path_importer_cache, again to avoid
repeated lookups. (Later we can go further and add a real importer repeated lookups. (Later we can go further and add a real importer
object for the builtin mechanism, for now, the None fallback scheme object for the built-in mechanism, for now, the None fallback scheme
should suffice.) should suffice.)
A question was raised: what about importers that don't need *any* A question was raised: what about importers that don't need *any*
entry on sys.path? (Builtin and frozen modules fall into that entry on sys.path? (Built-in and frozen modules fall into that
category.) Again, Gordon McMillan to the rescue: iu.py contains a category.) Again, Gordon McMillan to the rescue: iu.py contains a
thing he calls the "metapath". In this PEP's implementation, it's a thing he calls the "metapath". In this PEP's implementation, it's a
list of importer objects that is traversed *before* sys.path. This list of importer objects that is traversed *before* sys.path. This
list is yet another new object in the sys.module: sys.meta_path. list is yet another new object in the sys.module: sys.meta_path.
Currently, this list is empty by default, and frozen and builtin Currently, this list is empty by default, and frozen and built-in
module imports are done after traversing sys.meta_path, but still module imports are done after traversing sys.meta_path, but still
before sys.path. (Again, later we can add real frozen, builtin and before sys.path. (Again, later we can add real frozen, built-in and
sys.path importer objects on sys.meta_path, allowing for some extra sys.path importer objects on sys.meta_path, allowing for some extra
flexibility, but this could be done as a "phase 2" project, possibly flexibility, but this could be done as a "phase 2" project, possibly
for Python 2.4. It would be the finishing touch as then *every* for Python 2.4. It would be the finishing touch as then *every*
@ -184,12 +185,12 @@ Specification part 1: The Importer Protocol
mechanism. mechanism.
When an import statement is encountered, the interpreter looks up When an import statement is encountered, the interpreter looks up
the __import__ function in the builtin name space. __import__ is the __import__ function in the built-in name space. __import__ is
then called with four arguments, amongst which are the name of the then called with four arguments, amongst which are the name of the
module being imported (may be a dotted name) and a reference to the module being imported (may be a dotted name) and a reference to the
current global namespace. current global namespace.
The builtin __import__ function (known as PyImport_ImportModuleEx in The built-in __import__ function (known as PyImport_ImportModuleEx in
import.c) will then check to see whether the module doing the import import.c) will then check to see whether the module doing the import
is a package by looking for a __path__ variable in the current is a package by looking for a __path__ variable in the current
global namespace. If it is indeed a package, it first tries to do global namespace. If it is indeed a package, it first tries to do
@ -231,7 +232,7 @@ Specification part 1: The Importer Protocol
module name, for example "spam.eggs.ham". As explained above, when module name, for example "spam.eggs.ham". As explained above, when
importer.find_module("spam.eggs.ham") is called, "spam.eggs" has importer.find_module("spam.eggs.ham") is called, "spam.eggs" has
already been imported and added to sys.modules. However, the already been imported and added to sys.modules. However, the
find_module() method isn't neccesarily always called during an find_module() method isn't necessarily always called during an
actual import: meta tools that analyze import dependencies (such as actual import: meta tools that analyze import dependencies (such as
freeze, Installer or py2exe) don't actually load modules, so an freeze, Installer or py2exe) don't actually load modules, so an
importer shouldn't *depend* on the parent package being available in importer shouldn't *depend* on the parent package being available in
@ -252,8 +253,8 @@ Specification part 1: The Importer Protocol
worst case and multiple loading in the best. worst case and multiple loading in the best.
- The __file__ attribute must be set. This must be a string, but it - The __file__ attribute must be set. This must be a string, but it
may be a dummy value, for example "<frozen>". The priviledge of may be a dummy value, for example "<frozen>". The privilege of
not having a __file__ attribute at all is reserved for builtin not having a __file__ attribute at all is reserved for built-in
modules. modules.
- If it's a package, the __path__ variable must be set. This must - If it's a package, the __path__ variable must be set. This must
@ -262,10 +263,10 @@ Specification part 1: The Importer Protocol
- It should add an __importer__ attribute to the module, set to the - It should add an __importer__ attribute to the module, set to the
loader object. This is mostly for introspection, but can be used loader object. This is mostly for introspection, but can be used
for importer-specific extra's, for example getting data associated for importer-specific extras, for example getting data associated
with an importer. with an importer.
If the module is a Python module (as opposed to a builtin module or If the module is a Python module (as opposed to a built-in module or
an dynamically loaded extension), it should execute the module's an dynamically loaded extension), it should execute the module's
code in the module's global name space (module.__dict__). code in the module's global name space (module.__dict__).
@ -288,7 +289,7 @@ Specification part 2: Registering Hooks
There are two types of import hooks: Meta hooks and Path hooks. There are two types of import hooks: Meta hooks and Path hooks.
Meta hooks are called at the start of import processing, before any Meta hooks are called at the start of import processing, before any
other import processing (so that meta hooks can override sys.path other import processing (so that meta hooks can override sys.path
processing, or frozen modules, or even builtin modules). To processing, or frozen modules, or even built-in modules). To
register a meta hook, simply add the importer object to register a meta hook, simply add the importer object to
sys.meta_path (the list of registered meta hooks). sys.meta_path (the list of registered meta hooks).
@ -341,7 +342,7 @@ Packages and the role of __path__
also consulted when pkg.__path__ is traversed and importer objects also consulted when pkg.__path__ is traversed and importer objects
as path items are also allowed (yet, are discouraged for the same as path items are also allowed (yet, are discouraged for the same
reasons as they are discouraged on sys.path, at least for general reasons as they are discouraged on sys.path, at least for general
purpose code). Meta importers don't neccesarily use sys.path at all purpose code). Meta importers don't necessarily use sys.path at all
to do their work and therefore may also ignore the value of to do their work and therefore may also ignore the value of
pkg.__path__. In this case it is still advised to set it to list, pkg.__path__. In this case it is still advised to set it to list,
which can be empty. which can be empty.
@ -354,8 +355,8 @@ Integration with the 'imp' module
whether it's possible at all without breaking code; it is better to whether it's possible at all without breaking code; it is better to
simply add a new function to the imp module. The meaning of the simply add a new function to the imp module. The meaning of the
existing imp.find_module() and imp.load_module() calls changes from: existing imp.find_module() and imp.load_module() calls changes from:
"they expose the builtin import mechanism" to "they expose the basic "they expose the built-in import mechanism" to "they expose the basic
*unhooked* builtin import mechanism". They simply won't invoke any *unhooked* built-in import mechanism". They simply won't invoke any
import hooks. A new imp module function is proposed under the name import hooks. A new imp module function is proposed under the name
"find_module2", with is used like the following pattern: "find_module2", with is used like the following pattern:
@ -404,7 +405,7 @@ Open Issues
sys.prefix (or sys.exec_prefix). For example, looking in sys.prefix (or sys.exec_prefix). For example, looking in
os.path.join(sys.prefix, "data", package_name). os.path.join(sys.prefix, "data", package_name).
- Import hooks could offer a standard way of getting at datafiles - Import hooks could offer a standard way of getting at data files
relative to the module file. The standard zipimport object relative to the module file. The standard zipimport object
provides a method get_data(name) which returns the content of the provides a method get_data(name) which returns the content of the
"file" called name, as a string. To allow modules to get at the "file" called name, as a string. To allow modules to get at the