PEP: 246 Title: Object Adaptation Version: $Revision$ Author: aleax@aleax.it (Alex Martelli), cce@clarkevans.com (Clark C. Evans) Status: Draft Type: Standards Track Created: 21-Mar-2001 Python-Version: 2.5 Post-History: 29-Mar-2001, 10-Jan-2005 Abstract This proposal puts forth an extensible cooperative mechanism for the adaptation of an incoming object to a context which expects an object supporting a specific protocol (say a specific type, class, or interface). This proposal provides a built-in "adapt" function that, for any object X and any protocol Y, can be used to ask the Python environment for a version of X compliant with Y. Behind the scenes, the mechanism asks object X: "Are you now, or do you know how to wrap yourself to provide, a supporter of protocol Y?". And, if this request fails, the function then asks protocol Y: "Does object X support you, or do you know how to wrap it to obtain such a supporter?" This duality is important, because protocols can be developed after objects are, or vice-versa, and this PEP lets either case be supported non-invasively with regard to the pre-existing component[s]. Lastly, if neither the object nor the protocol know about each other, the mechanism may check a registry of adapter factories, where callables able to adapt certain objects to certain protocols can be registered dynamically. This part of the proposal is optional: the same effect could be obtained by ensuring that certain kinds of protocols and/or objects can accept dynamic registration of adapter factories, for example via suitable custom metaclasses. However, this optional part allows adaptation to be made more flexible and powerful in a way that is not invasive to either protocols or other objects, thereby gaining for adaptation much the same kind of advantage that Python standard library's "copy_reg" module offers for serialization and persistence. This proposal does not specifically constrain what a protocol _is_, what "compliance to a protocol" exactly _means_, nor what precisely a wrapper is supposed to do. These omissions are intended to leave this proposal compatible with both existing categories of protocols, such as the existing system of type and classes, as well as the many concepts for "interfaces" as such which have been proposed or implemented for Python, such as the one in PEP 245 [1], the one in Zope3 [2], or the ones discussed in the BDFL's Artima blog in late 2004 and early 2005 [3]. However, some reflections on these subjects, intended to be suggestive and not normative, are also included. Motivation Currently there is no standardized mechanism in Python for checking if an object supports a particular protocol. Typically, existence of certain methods, particularly special methods such as __getitem__, is used as an indicator of support for a particular protocol. This technique works well for a few specific protocols blessed by the BDFL (Benevolent Dictator for Life). The same can be said for the alternative technique based on checking 'isinstance' (the built-in class "basestring" exists specifically to let you use 'isinstance' to check if an object "is a [built-in] string"). Neither approach is easily and generally extensible to other protocols, defined by applications and third party frameworks, outside of the standard Python core. Even more important than checking if an object already supports a given protocol can be the task of obtaining a suitable adapter (wrapper or proxy) for the object, if the support is not already there. For example, a string does not support the file protocol, but you can wrap it into a StringIO instance to obtain an object which does support that protocol and gets its data from the string it wraps; that way, you can pass the string (suitably wrapped) to subsystems which require as their arguments objects that are readable as files. Unfortunately, there is currently no general, standardized way to automate this extremely important kind of "adaptation by wrapping" operations. Typically, today, when you pass objects to a context expecting a particular protocol, either the object knows about the context and provides its own wrapper or the context knows about the object and wraps it appropriately. The difficulty with these approaches is that such adaptations are one-offs, are not centralized in a single place of the users code, and are not executed with a common technique, etc. This lack of standardization increases code duplication with the same adapter occurring in more than one place or it encourages classes to be re-written instead of adapted. In either case, maintainability suffers. It would be very nice to have a standard function that can be called upon to verify an object's compliance with a particular protocol and provide for a wrapper if one is readily available -- all without having to hunt through each library's documentation for the incantation appropriate to that particular, specific case. Requirements When considering an object's compliance with a protocol, there are several cases to be examined: a) When the protocol is a type or class, and the object has exactly that type or is an instance of exactly that class (not a subclass). In this case, compliance is automatic. b) When the object knows about the protocol, and either considers itself compliant, or knows how to wrap itself suitably. c) When the protocol knows about the object, and either the object already complies or the protocol knows how to suitably wrap the object. d) When the protocol is a type or class, and the object is a member of a subclass. This is distinct from the first case (a) above, since inheritance (unfortunately) does not necessarily imply substitutability, and thus must be handled carefully. e) When the context knows about the object and the protocol and knows how to adapt the object so that the required protocol is satisfied. This could use an adapter registry or similar approaches. The fourth case above is subtle. A break of substitutability can occur when a subclass changes a method's signature, or restricts the domains accepted for a method's argument ("co-variance" on arguments types), or extends the co-domain to include return values which the base class may never produce ("contra-variance" on return types). While compliance based on class inheritance _should_ be automatic, this proposal allows an object to signal that it is not compliant with a base class protocol. If Python gains some standard "official" mechanism for interfaces, however, then the "fast-path" case (a) can and should be extended to the protocol being an interface, and the object an instance of a type or class claiming compliance with that interface. For example, if the "interface" keyword discussed in [3] is adopted into Python, the "fast path" of case (a) could be used, since instantiable classes implementing an interface would not be allowed to break substitutability. Specification This proposal introduces a new built-in function, adapt(), which is the basis for supporting these requirements. The adapt() function has three parameters: - `obj', the object to be adapted - `protocol', the protocol requested of the object - `alternate', an optional object to return if the object could not be adapted A successful result of the adapt() function returns either the object passed `obj', if the object is already compliant with the protocol, or a secondary object `wrapper', which provides a view of the object compliant with the protocol. The definition of wrapper is deliberately vague, and a wrapper is allowed to be a full object with its own state if necessary. However, the design intention is that an adaptation wrapper should hold a reference to the original object it wraps, plus (if needed) a minimum of extra state which it cannot delegate to the wrapper object. An excellent example of adaptation wrapper is an instance of StringIO which adapts an incoming string to be read as if it was a textfile: the wrapper holds a reference to the string, but deals by itself with the "current point of reading" (from _where_ in the wrapped strings will the characters for the next, e.g., "readline" call come from), because it cannot delegate it to the wrapped object (a string has no concept of "current point of reading" nor anything else even remotely related to that concept). A failure to adapt the object to the protocol raises an AdaptationError (which is a subclass of TypeError), unless the alternate parameter is used, in this case the alternate argument is returned instead. To enable the first case listed in the requirements, the adapt() function first checks to see if the object's type or the object's class are identical to the protocol. If so, then the adapt() function returns the object directly without further ado. To enable the second case, when the object knows about the protocol, the object must have a __conform__() method. This optional method takes two arguments: - `self', the object being adapted - `protocol, the protocol requested Just like any other special method in today's Python, __conform__ is meant to be taken from the object's class, not from the object itself (for all objects, except instances of "classic classes" as long as we must still support the latter). This enables a possible 'tp_conform' slot to be added to Python's type objects in the future, if desired. The object may return itself as the result of __conform__ to indicate compliance. Alternatively, the object also has the option of returning a wrapper object compliant with the protocol. If the object knows it is not compliant although it belongs to a type which is a subclass of the protocol, then __conform__ should raise a LiskovViolation exception (a subclass of AdaptationError). Finally, if the object cannot determine its compliance, it should return None to enable the remaining mechanisms. If __conform__ raises any other exception, "adapt" just propagates it. To enable the third case, when the protocol knows about the object, the protocol must have an __adapt__() method. This optional method takes two arguments: - `self', the protocol requested - `obj', the object being adapted If the protocol finds the object to be compliant, it can return obj directly. Alternatively, the method may return a wrapper compliant with the protocol. If the protocol knows the object is not compliant although it belongs to a type which is a subclass of the protocol, then __adapt__ should raise a LiskovViolation exception (a subclass of AdaptationError). Finally, when compliance cannot be determined, this method should return None to enable the remaining mechanisms. If __adapt__ raises any other exception, "adapt" just propagates it. The fourth case, when the object's class is a sub-class of the protocol, is handled by the built-in adapt() function. Under normal circumstances, if "isinstance(object, protocol)" then adapt() returns the object directly. However, if the object is not substitutable, either the __conform__() or __adapt__() methods, as above mentioned, may raise an LiskovViolation (a subclass of AdaptationError) to prevent this default behavior. If none of the first four mechanisms worked, as a last-ditch attempt, 'adapt' falls back to checking a registry of adapter factories, indexed by the protocol and the type of `obj', to meet the fifth case. Adapter factories may be dynamically registered and removed from that registry to provide "third party adaptation" of objects and protocols that have no knowledge of each other, in a way that is not invasive to either the object or the protocols. Intended Use The typical intended use of adapt is in code which has received some object X "from the outside", either as an argument or as the result of calling some function, and needs to use that object according to a certain protocol Y. A "protocol" such as Y is meant to indicate an interface, usually enriched with some semantics constraints (such as are typically used in the "design by contract" approach), and often also some pragmatical expectation (such as "the running time of a certain operation should be no worse than O(N)", or the like); this proposal does not specify how protocols are designed as such, nor how or whether compliance to a protocol is checked, nor what the consequences may be of claiming compliance but not actually delivering it (lack of "syntactic" compliance -- names and signatures of methods -- will often lead to exceptions being raised; lack of "semantic" compliance may lead to subtle and perhaps occasional errors [imagine a method claiming to be threadsafe but being in fact subject to some subtle race condition, for example]; lack of "pragmatic" compliance will generally lead to code that runs ``correctly'', but too slowly for practical use, or sometimes to exhaustion of resources such as memory or disk space). When protocol Y is a concrete type or class, compliance to it is intended to mean that an object allows all of the operations that could be performed on instances of Y, with "comparable" semantics and pragmatics. For example, a hypothetical object X that is a singly-linked list should not claim compliance with protocol 'list', even if it implements all of list's methods: the fact that indexing X[n] takes time O(n), while the same operation would be O(1) on a list, makes a difference. On the other hand, an instance of StringIO.StringIO does comply with protocol 'file', even though some operations (such as those of module 'marshal') may not allow substituting one for the other because they perform explicit type-checks: such type-checks are "beyond the pale" from the point of view of protocol compliance. While this convention makes it feasible to use a concrete type or class as a protocol for purposes of this proposal, such use will often not be optimal. Rarely will the code calling 'adapt' need ALL of the features of a certain concrete type, particularly for such rich types as file, list, dict; rarely can all those features be provided by a wrapper with good pragmatics, as well as syntax and semantics that are really the same as a concrete type's. Rather, once this proposal is accepted, a design effort needs to start to identify the essential characteristics of those protocols which are currently used in Python, particularly within the standard library, and to formalize them using some kind of "interface" construct (not necessarily requiring any new syntax: a simple custom metaclass would let us get started, and the results of the effort could later be migrated to whatever "interface" construct is eventually accepted into the Python language). With such a palette of more formally designed protocols, the code using 'adapt' will be able to ask for, say, adaptation into "a filelike object that is readable and seekable", or whatever else it specifically needs with some decent level of "granularity", rather than too-generically asking for compliance to the 'file' protocol. Adaptation is NOT "casting". When object X itself does not conform to protocol Y, adapting X to Y means using some kind of wrapper object Z, which holds a reference to X, and implements whatever operation Y requires, mostly by delegating to X in appropriate ways. For example, if X is a string and Y is 'file', the proper way to adapt X to Y is to make a StringIO(X), *NOT* to call file(X) [which would try to open a file named by X]. Numeric types and protocols may need to be an exception to this "adaptation is not casting" mantra, however. Guido's "Optional Static Typing: Stop the Flames" Blog Entry A typical simple use case of adaptation would be: def f(X): X = adapt(X, Y) # continue by using X according to protocol Y In [4], the BDFL has proposed introducing the syntax: def f(X: Y): # continue by using X according to protocol Y to be a handy shortcut for exactly this typical use of adapt, and, as a basis for experimentation until the parser has been modified to accept this new syntax, a semantically equivalent decorator: @arguments(Y) def f(X): # continue by using X according to protocol Y These BDFL ideas are fully compatible with this proposal, as are other of Guido's suggestions in the same blog. Reference Implementation and Test Cases The following reference implementation does not deal with classic classes: it consider only new-style classes. If classic classes need to be supported, the additions should be pretty clear, though a bit messy (x.__class__ vs type(x), getting boundmethods directly from the object rather than from the type, and so on). ----------------------------------------------------------------- adapt.py ----------------------------------------------------------------- class AdaptationError(TypeError): pass class LiskovViolation(AdaptationError): pass _adapter_factory_registry = {} def registerAdapterFactory(objtype, protocol, factory): _adapter_factory_registry[objtype, protocol] = factory def unregisterAdapterFactory(objtype, protocol): del _adapter_factory_registry[objtype, protocol] def _adapt_by_registry(obj, protocol, alternate): factory = _adapter_factory_registry.get((type(obj), protocol)) if factory is None: adapter = alternate else: adapter = factory(obj, protocol, alternate) if adapter is AdaptationError: raise AdaptationError else: return adapter def adapt(obj, protocol, alternate=AdaptationError): t = type(obj) # (a) first check to see if object has the exact protocol if t is protocol: return obj try: # (b) next check if t.__conform__ exists & likes protocol conform = getattr(t, '__conform__', None) if conform is not None: result = conform(obj, protocol) if result is not None: return result # (c) then check if protocol.__adapt__ exists & likes obj adapt = getattr(type(protocol), '__adapt__', None) if adapt is not None: result = adapt(protocol, obj) if result is not None: return result except LiskovViolation: pass else: # (d) check if object is instance of protocol if isinstance(obj, protocol): return obj # (e) last chance: try the registry return _adapt_by_registry(obj, protocol, alternate) ----------------------------------------------------------------- test.py ----------------------------------------------------------------- from adapt import AdaptationError, LiskovViolation, adapt from adapt import registerAdapterFactory, unregisterAdapterFactory import doctest class A(object): ''' >>> a = A() >>> a is adapt(a, A) # case (a) True ''' class B(A): ''' >>> b = B() >>> b is adapt(b, A) # case (d) True ''' class C(object): ''' >>> c = C() >>> c is adapt(c, B) # case (b) True >>> c is adapt(c, A) # a failure case Traceback (most recent call last): ... AdaptationError ''' def __conform__(self, protocol): if protocol is B: return self class D(C): ''' >>> d = D() >>> d is adapt(d, D) # case (a) True >>> d is adapt(d, C) # case (d) explicitly blocked Traceback (most recent call last): ... AdaptationError ''' def __conform__(self, protocol): if protocol is C: raise LiskovViolation class MetaAdaptingProtocol(type): def __adapt__(cls, obj): return cls.adapt(obj) class AdaptingProtocol: __metaclass__ = MetaAdaptingProtocol @classmethod def adapt(cls, obj): pass class E(AdaptingProtocol): ''' >>> a = A() >>> a is adapt(a, E) # case (c) True >>> b = A() >>> b is adapt(b, E) # case (c) True >>> c = C() >>> c is adapt(c, E) # a failure case Traceback (most recent call last): ... AdaptationError ''' @classmethod def adapt(cls, obj): if isinstance(obj, A): return obj class F(object): pass def adapt_F_to_A(obj, protocol, alternate): if isinstance(obj, F) and issubclass(protocol, A): return obj else: return alternate def test_registry(): ''' >>> f = F() >>> f is adapt(f, A) # a failure case Traceback (most recent call last): ... AdaptationError >>> registerAdapterFactory(F, A, adapt_F_to_A) >>> f is adapt(f, A) # case (e) True >>> unregisterAdapterFactory(F, A) >>> f is adapt(f, A) # a failure case again Traceback (most recent call last): ... AdaptationError >>> registerAdapterFactory(F, A, adapt_F_to_A) ''' doctest.testmod() Relationship To Microsoft's QueryInterface Although this proposal has some similarities to Microsoft's (COM) QueryInterface, it differs by a number of aspects. First, adaptation in this proposal is bi-directional, allowing the interface (protocol) to be queried as well, which gives more dynamic abilities (more Pythonic). Second, there is no special "IUnknown" interface which can be used to check or obtain the original unwrapped object identity, although this could be proposed as one of those "special" blessed interface protocol identifiers. Third, with QueryInterface, once an object supports a particular interface it must always there after support this interface; this proposal makes no such guarantee, since, in particular, adapter factories can be dynamically added to the registried and removed again later. Fourth, implementations of Microsoft's QueryInterface must support a kind of equivalence relation -- they must be reflexive, symmetrical, and transitive, in specific senses. The equivalent conditions for protocol adaptation according to this proposal would also represent desirable properties: # given, to start with, a successful adaptation: X_as_Y = adapt(X, Y) # reflexive: assert adapt(X_as_Y, Y) is X_as_Y # transitive: X_as_Z = adapt(X, Z, None) X_as_Y_as_Z = adapt(X_as_Y, Z, None) assert (X_as_Y_as_Z is None) == (X_as_Z is None) # symmetrical: X_as_Z_as_Y = adapt(X_as_Z, Y, None) assert (X_as_Y_as_Z is None) == (X_as_Z_as_Y is None) However, while these properties are desirable, it may not be possible to guarantee them in all cases. QueryInterface can impose their equivalents because it dictates, to some extent, how objects, interfaces, and adapters are to be coded; this proposal is meant to be not necessarily invasive, usable and to "retrofit" adaptation between two frameworks coded in mutual ignorance of each other without having to modify either framework. Transitivity of adaptation is in fact somewhat controversial, as is the relationship (if any) between adaptation and inheritance. The latter would not be controversial if we knew that inheritance always implies Liskov substitutability, which, unfortunately we don't. If some special form, such as the interfaces proposed in [4], could indeed ensure Liskov substitutability, then for that kind of inheritance, only, we could perhaps assert that if X conforms to Y and Y inherits from Z then X conforms to Z... but only if substitutability was taken in a very strong sense to include semantics and pragmatics, which seems doubtful. (For what it's worth: in QueryInterface, inheritance does not require nor imply conformance). This proposal does not include any "strong" effects of inheritance, beyond the small ones specifically detailed above. Similarly, transitivity might imply multiple "internal" adaptation passes to get the result of adapt(X, Z) via some intermediate Y, intrinsically like adapt(adapt(X, Y), Z), for some suitable and automatically chosen Y. Again, this may perhaps be feasible under suitably strong constraints, but the practical implications of such a scheme are still unclear to this proposal's authors. Thus, this proposal does not include any automatic or implicit transitivity of adaptation, under whatever circumstances. For an implementation of the original version of this proposal which performs more advanced processing in terms of transitivity, and of the effects of inheritance, see Phillip J. Eby's PyProtocols [5]. The documentation accompanying PyProtocols is well worth studying for its considerations on how adapters should be coded and used, and on how adaptation can remove any need for typechecking in application code. Questions and Answers Q: What benefit does this proposal provide? A: The typical Python programmer is an integrator, someone who is connecting components from various suppliers. Often, to interface between these components, one needs intermediate adapters. Usually the burden falls upon the programmer to study the interface exposed by one component and required by another, determine if they are directly compatible, or develop an adapter. Sometimes a supplier may even include the appropriate adapter, but even then searching for the adapter and figuring out how to deploy the adapter takes time. This technique enables suppliers to work with each other directly, by implementing __conform__ or __adapt__ as necessary. This frees the integrator from making their own adapters. In essence, this allows the components to have a simple dialogue among themselves. The integrator simply connects one component to another, and if the types don't automatically match an adapting mechanism is built-in. Moreover, thanks to the adapter registry, a "fourth party" may supply adapters to allow interoperation of frameworks which are totally unaware of each other, non-invasively, and without requiring the integrator to do anything more than install the appropriate adapter factories in the registry at start-up. As long as libraries and frameworks cooperate with the adaptation infrastructure proposed here (essentially by defining and using protocols appropriately, and calling 'adapt' as needed on arguments received and results of call-back factory functions), the integrator's work thereby becomes much simpler. For example, consider SAX1 and SAX2 interfaces: there is an adapter required to switch between them. Normally, the programmer must be aware of this; however, with this adaptation proposal in place, this is no longer the case -- indeed, thanks to the adapter registry, this need may be removed even if the framework supplying SAX1 and the one requiring SAX2 are unaware of each other. Q: Why does this have to be built-in, can't it be standalone? A: Yes, it does work standalone. However, if it is built-in, it has a greater chance of usage. The value of this proposal is primarily in standardization: having libraries and frameworks coming from different suppliers, including the Python standard library, use a single approach to adaptation. Furthermore: 0. The mechanism is by its very nature a singleton. 1. If used frequently, it will be much faster as a built-in. 2. It is extensible and unassuming. 3. Once 'adapt' is built-in, it can support syntax extensions and even be of some help to a type inference system. Q: Why the verbs __conform__ and __adapt__? A: conform, verb intransitive 1. To correspond in form or character; be similar. 2. To act or be in accord or agreement; comply. 3. To act in accordance with current customs or modes. adapt, verb transitive 1. To make suitable to or fit for a specific use or situation. Source: The American Heritage Dictionary of the English Language, Third Edition Backwards Compatibility There should be no problem with backwards compatibility unless someone had used the special names __conform__ or __adapt__ in other ways, but this seems unlikely, and, in any case, user code should never use special names for non-standard purposes. This proposal could be implemented and tested without changes to the interpreter. Credits This proposal was created in large part by the feedback of the talented individuals on the main Python mailing lists and the type-sig list. To name specific contributors (with apologies if we missed anyone!), besides the proposal's authors: the main suggestions for the proposal's first versions came from Paul Prescod, with significant feedback from Robin Thomas, and we also borrowed ideas from Marcin 'Qrczak' Kowalczyk and Carlos Ribeiro. Other contributors (via comments) include Michel Pelletier, Jeremy Hylton, Aahz Maruch, Fredrik Lundh, Rainer Deyke, Timothy Delaney, and Huaiyu Zhu. The current version owes a lot to discussions with (among others) Phillip J. Eby, Guido van Rossum, Bruce Eckel, Jim Fulton, and Ka-Ping Yee, and to study and reflection of their proposals, implementations, and documentation about use and adaptation of interfaces and protocols in Python. References and Footnotes [1] PEP 245, Python Interface Syntax, Pelletier http://www.python.org/peps/pep-0245.html [2] http://www.zope.org/Wikis/Interfaces/FrontPage [3] http://www.artima.com/weblogs/index.jsp?blogger=guido [4] http://www.artima.com/weblogs/viewpost.jsp?thread=87182 [5] http://peak.telecommunity.com/PyProtocols.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: