Safe Lazy Upgrades in Persistent Object Stores

Principal Investigator Barbara Liskov

Co-investigator Liuba Shrira

This project is concerned with the problem of how to do efficient automatic upgrades for objects in a distributed persistent object store. A persistent object store contains conventional objects similar to what one might find in an object-oriented language such as Java. Applications access persistent objects within atomic transactions, since this is necessary to ensure consistency for the stored objects; transactions allow for concurrent access and they mask failures. Persistent object stores provide long term storage for objects and therefore we can expect that changes in implementations of persistent objects will sometimes be needed. Upgrades are needed to to improve object implementations, to correct errors, or even to change interfaces in the face of changing application requirements. This includes incompatible changes to interfaces where the new interface does not support the same methods as the old one. Providing a satisfactory solution for upgrades in persistent object stores has been a long-standing challenge. Upgrades must be performed in a way that does not interfere with application access to the store, and must be done efficiently in both space and time. In addition, upgrades must be done safely so that important persistent state is not corrupted.

It is desirable to have a way to reason statically about the correctness of transform functions; this issue has been the focus of our recent work. Our work on static reasoning has focused on the issue of static enforcement of object encapsulation. Object encapsulation is important because it provides the ability to reason locally about program correctness. Reasoning about a class in an object-oriented program involves reasoning about the behavior of objects belonging to the class. Typically objects point to other subobjects, which are used to represent the containing object. Local reasoning about class correctness is possible if the subobjects are fully encapsulated, that is, if all subobjects are accessible only within the containing object. This condition supports local reasoning because it ensures that outside objects cannot interact with the subobjects without calling methods of the containing object. And therefore the containing object is in control of its subobjects. However, full encapsulation is often more than is needed. Encapsulation is only required for subobjects that the containing object depends on. An object x depends on subobject y if x calls methods of y and furthermore these calls expose mutable behavior of y in a way that affects the invariants of x. Thus, a Stack object s implemented using a linked list depends on the list but not on the items contained in the list. If code outside could manipulate the list, it could invalidate the correctness of the Stack implementation. But code outside can safely use the items contained in s because s doesn’t call their methods; it only depends on the identities of the items and the identities never change. Similarly, a Set of immutable elements does not depend on the elements even if it invokes a.equals(b) to ensure that no two elements a and b in the Set are equal, because the elements are immutable. Local reasoning about a class is possible if objects of that class encapsulate every object they depend on. But strict object encapsulation is too constraining: it prevents efficient implementation of important constructs like iterators. For example, to run efficiently, an iterator over the above-mentioned Stack object s needs access to the list nodes in s. To provide this access, we have to allow objects like iterators to violate encapsulation. Local reasoning is still possible provided all violations of encapsulation are limited to code contained in the same module. For example, if both the Stack and its iterator are implemented in the same module, we can still reason about their correctness locally, by examining the code of that module. To enable static reasoning, we have developed a variant of ownership types. The ownership type system allows object encapsulation to be defined statically: if x is supposed to encapsulate y, this information can be captured as part of defining the class of x. Unlike earlier work, our ownership-type system is both sufficiently constrained and expressive: it ensures local reasoning, yet allows the definition of constructs like iterators that require special access to the representation of the object being iterated over. The ownership type system enables static analysis to detect when transform functions are defined in such a way that the lazy scheme works correctly. An additional benefit is that in this case transform functions can be reasoned about modularly: they can be thought of as an extra method of the old-class.