ANGEL APPLICATION Overview
Current storage systems are generally not limited by storage capacity, but rather by reliability (i.e. protection against data loss) as well as scalability (storage capacity grows with requirements), accessibility (data remains accessible to increasingly mobile users) and security considerations (data is protected from unauthorized access). Backup policies implemented to address these issues typically fail to account for one or more of them, and tend to be either ad hoc (irregular, manual backups) or expensive (centralized backup services in large organizations), generally requiring human intervention at regular intervals.
Due to the widespread use of broadband internet access, a new class of storage solutions has recently received significant attention from the academic community: redundant, self-replicating, distributed, community-driven (peer-to-peer) file systems [1-3]. Here, the central idea is that users can collaborate to form a shared storage space, consuming the storage they need and providing the storage they don't need to other users (see Figure 1). By ensuring that multiple (encrypted, checksummed) copies of each file are stored in different locations at any given time, and autonomously replacing corrupted copies, expected data lifetimes can be extended to arbitrary time scales (de facto hundreds to thousands of years) while incurring a storage and communication overhead that scales only logarithmically (with regard to the number of copies). Additionally, a suitable algorithm should be able to autonomously balance requirements and capacity between individual users: the system capacity (storage space and bandwidth) and the requirements placed on it should grow in step.
- Figure 1: Redundancy and distribution enhance storage reliability.
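The replication idea can be made concrete with a minimal Python sketch. It is not the ANGEL APPLICATION's actual algorithm: the function names, the in-memory `replicas` dictionary, the peer names and the choice of SHA-256 are all assumptions made for illustration, and encryption is left out. `copies_needed` additionally assumes that copies fail independently with probability `p_fail` between two repair rounds, so that the chance of losing all n copies at once is `p_fail ** n`; under that assumption, squaring the tolerated loss probability only doubles the number of copies.

```python
import hashlib


def copies_needed(p_fail, target_loss):
    """Smallest n such that p_fail ** n <= target_loss.

    Assumption (not from the original text): copies fail independently
    with probability p_fail between two repair rounds.
    """
    if not (0.0 < p_fail < 1.0 and 0.0 < target_loss < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    n, loss = 1, p_fail
    while loss > target_loss:
        n += 1
        loss *= p_fail
    return n


def checksum(data):
    """Hex digest used to detect corrupted copies (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def repair(replicas, expected_digest):
    """Overwrite corrupted or missing copies with a verified intact copy.

    replicas maps a hypothetical peer name to the bytes that peer holds,
    or to None if its copy is missing. Returns the list of repaired peers.
    """
    good = next(
        (data for data in replicas.values()
         if data is not None and checksum(data) == expected_digest),
        None,
    )
    if good is None:
        raise RuntimeError("no intact copy left; the data is lost")
    repaired = []
    for peer, data in list(replicas.items()):
        if data is None or checksum(data) != expected_digest:
            replicas[peer] = good
            repaired.append(peer)
    return repaired


if __name__ == "__main__":
    original = b"example payload"
    replicas = {"peer-a": original, "peer-b": b"bit rot", "peer-c": None}
    print(repair(replicas, checksum(original)))
    print(copies_needed(0.5, 0.01), copies_needed(0.5, 0.0001))
```

With `p_fail = 0.5`, tightening the tolerated loss from 0.01 to 0.0001 raises the required number of copies from 7 to 14, which is one way to read the logarithmic overhead mentioned above. A real system would also have to account for correlated failures, which this sketch ignores.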
However, no readily accessible, easy-to-use implementation of such a system currently exists. It seems plausible to assume that this is at least partially due to the following requirements: (i) the system must not provide a single point of control (e.g. a server of some sort), since that would correspond to a single point of failure, (ii) the source code of the system must be accessible in order for the system to be sufficiently trustworthy to be used for long-term storage (it seems unreasonable to rely on e.g. a company which might go bankrupt at any time), (iii) the system must be extremely easy to use in order to spread virally and attract a sufficient number of users. Together, these requirements effectively prohibit the development of such a system in a commercial environment, since it is virtually impossible to sell the system as a shrink-wrapped software package (it must be open source), or to provide a commercial on-line service (there is no centralized server). Academia, on the other hand, seems to show only a moderate interest in producing a sufficiently polished product after having described the key concepts in great detail.
In contrast, the motivation behind the development of MISSION ETERNITY's ANGEL APPLICATION is purely feature-based: the primary interest is access to secure, affordable, easy-to-use, ultra-long-term storage. Considerations such as commercial success or academic credit, while certainly attractive, are clearly secondary. Given these special circumstances, we are in the process of finishing a functional prototype of the ANGEL APPLICATION (source code available under an open source license), in which users can collaborate via two-party contracts, storing MISSION ETERNITY's ARCANUM-CAPSULES (as well as each other’s data) in encrypted form, effectively forming a social file system which provides safe, reliable, globally accessible storage for anyone willing to participate. Ease of use is provided by hiding most of the complex, networked environment behind standard network-storage interfaces (WebDAV at this time, see Figure 2).
- Figure 2: Screenshot of the ANGEL APPLICATION prototype in action. The data stored in the system is encrypted (RSA; see "Terminal" window). However, if the required cryptographic keys are installed on the system, it may be accessed transparently via a "shared folder" (the "localhost" folder on the desktop). The necessary cryptographic and networking processes are invisible to the end-user.
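Because the storage is exposed through WebDAV, any WebDAV-capable client should be able to browse it. The following Python sketch shows what a generic directory listing request (PROPFIND, as defined by the WebDAV standard) looks like. The host, port and path are assumptions: the text only shows a "localhost" folder, and the port on which the prototype listens is not stated. The helper names are likewise made up for illustration.

```python
import http.client

# Generic WebDAV request body asking for all properties of a resource.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)


def build_propfind(depth="1"):
    """Return (headers, body) for a WebDAV PROPFIND request.

    depth "0" addresses the resource itself, "1" also its direct children.
    """
    if depth not in ("0", "1", "infinity"):
        raise ValueError("Depth must be '0', '1' or 'infinity'")
    body = PROPFIND_BODY.encode("utf-8")
    headers = {
        "Depth": depth,
        "Content-Type": 'application/xml; charset="utf-8"',
        "Content-Length": str(len(body)),
    }
    return headers, body


def list_folder(host, port, path="/"):
    """Send PROPFIND to a WebDAV server; return (status, response XML).

    host and port are placeholders; a WebDAV server normally answers a
    successful PROPFIND with status 207 (Multi-Status).
    """
    headers, body = build_propfind("1")
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("PROPFIND", path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8", "replace")
    finally:
        conn.close()
```

A call such as `list_folder("localhost", 8080)` would then return the raw multi-status XML describing the folder's entries; the port number here is purely hypothetical.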
The ANGEL APPLICATION prototype is implemented in Python 2.4, using the Twisted library for networking and extended filesystem attributes (xattr) for metadata storage (cryptographic keys, signatures, file references). It currently runs on Mac OS X and Linux.
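To illustrate how metadata can travel with a file in extended attributes, here is a minimal sketch. It is written for a current Python 3 on Linux and uses the standard library's `os.setxattr`, `os.getxattr` and `os.listxattr` (which are not available on Mac OS X) rather than whatever xattr binding the Python 2.4 prototype uses. The attribute names under `user.angel.` and the choice of SHA-256 are hypothetical; the prototype's real attribute layout is not described here.

```python
import hashlib
import os

# Hypothetical attribute namespace; the prototype's real names are not given.
PREFIX = "user.angel."


def build_metadata(content, signature, references):
    """Map attribute names to byte values for one stored file.

    content and signature are bytes; references is a list of strings
    naming other files (the exact reference format is an assumption).
    """
    return {
        PREFIX + "sha256": hashlib.sha256(content).hexdigest().encode("ascii"),
        PREFIX + "signature": signature,
        PREFIX + "references": "\n".join(references).encode("utf-8"),
    }


def write_metadata(path, metadata):
    """Attach each entry to the file at path as an extended attribute.

    Requires Linux and a file system that supports user attributes.
    """
    for name, value in metadata.items():
        os.setxattr(path, name, value)


def read_metadata(path):
    """Read back every extended attribute stored under PREFIX."""
    return {
        name: os.getxattr(path, name)
        for name in os.listxattr(path)
        if name.startswith(PREFIX)
    }
```

Keeping keys, signatures and references in attributes rather than in side files means that the metadata stays attached to the file it describes, provided the underlying file system and any copy tools preserve extended attributes.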
Outlook: The primary goal for the ANGEL APPLICATION has been to provide easy-to-use, redundant storage. This goal is within reach. However, it should be noted that the current implementation may be seen as the first step towards a distributed, scalable, transparently accessible (i.e. with full support for native file system semantics) data storage and distribution system. The decoupling from the underlying hardware introduced with the ANGEL APPLICATION implies that, in principle, unlimited amounts of data can be stored in and retrieved from the system without any a priori bandwidth limitation (it is similar to BitTorrent in this regard, i.e. limited by the number of participants rather than by the hardware infrastructure of any individual participant). It is therefore extremely well suited as an infrastructure for tasks such as open content archival and analysis (e.g. a community-operated search engine). We are currently soliciting support (from developers and potential investors) to further investigate these opportunities.
[1] Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage", U.C. Berkeley Technical Report, 2000, http://oceanstore.cs.berkeley.edu/publications/papers/pdf/asplos00.pdf (project page: http://oceanstore.cs.berkeley.edu/).
[2] L. P. Cox and B. D. Noble, "Pastiche: Making Backup Cheap and Easy", Fifth USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, December 2002, http://mobility.eecs.umich.edu/pastiche/pastiche.html.
[3] L. P. Cox and B. D. Noble, "Samsara: Honor Among Thieves in Peer-to-Peer Storage", 19th ACM Symposium on Operating Systems Principles, Bolton Landing, NY, October 2003, http://mobility.eecs.umich.edu/pastiche/papers/sosp03.pdf.