Traditionally, peer-to-peer (P2P) applications have been associated with pirated media content. But serious developers and architects have been looking at P2P architectures for several years as a way of improving scalability and resiliency and lowering the costs of deploying large-scale distributed applications.
The BitTorrent protocol and infrastructure, which has emerged as one of the most popular approaches for disseminating files, leverages the Internet connection of each client. The company behind the effort recently announced Project Maelstrom as an infrastructure for hosting websites. Van Jacobson, a primary contributor to the TCP/IP protocol stack, worked on the Named Data Networking (NDN) and Content Centric Networking (CCN) protocols during the past decade. These protocols address content directly by name rather than by the IP address of the host that holds it. Preliminary work continues on CCN and NDN.
In another effort, Juan Benet, founder of Protocol Labs, is developing a novel P2P server infrastructure for Web applications called the InterPlanetary File System (IPFS). IPFS is meant as an alternative transport to HTTP. Benet said, "IPFS could be a way to generate need for CCN/NDN in the long term." IPFS can replace traditional file servers with a distributed file system that can be secured, updated and located using a dynamic naming system. Beta code for setting up IPFS clients and servers is available online.
Benet said that although BitTorrent has risen in popularity, no general file system has emerged that offers global, low-latency and decentralized distribution. IPFS blends together elements from the BitTorrent architecture, Git, self-certifying file systems, and Bitcoin-like blockchains. This approach could make it easier to protect content from censorship, improve resiliency and allow Web applications to run in the presence of Web firewalls.
A new approach required for naming
Content addressing, rather than location-based addressing, is important for making distributed content accessible with less need for a consolidated backbone. With content addressing, content could be posted once, and when the network goes down, users could still access it from a mirror.
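As a rough illustration of the idea, and not of IPFS's actual identifiers or wire format, the Python sketch below derives each address from a SHA-256 digest of the content. Any mirror holding the same bytes can then serve them, and the client can check what it received. The names Mirror, put, get and fetch are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def address_of(data: bytes) -> str:
    """Derive the address from the content itself (here, a SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class Mirror:
    """Hypothetical node that stores blobs keyed by their content address."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.online = True

    def put(self, data: bytes) -> str:
        addr = address_of(data)
        self.blobs[addr] = data
        return addr

    def get(self, addr: str) -> Optional[bytes]:
        # An offline node answers nothing, whatever it has stored.
        return self.blobs.get(addr) if self.online else None


def fetch(addr: str, mirrors: List[Mirror]) -> bytes:
    """Ask each mirror in turn; accept only bytes that hash to the address."""
    for mirror in mirrors:
        data = mirror.get(addr)
        if data is not None and address_of(data) == addr:
            return data
    raise LookupError("no reachable mirror holds " + addr)


origin, mirror = Mirror(), Mirror()
addr = origin.put(b"hello, distributed web")
mirror.put(b"hello, distributed web")  # same bytes, so the same address

origin.online = False                  # the original host goes down
recovered = fetch(addr, [origin, mirror])
```

Because the address is computed from the bytes, the client does not need to trust whichever mirror answers; it only needs to recompute the hash.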
If someone wants to publish something, they just have to create a record and sign it with a private key. This is important even when the data itself does not need to be protected. "It protects that content from unwanted changes, and provides a chain of custody between creators and consumers of data," Benet said.
IPFS already has support for mutability, through the InterPlanetary Naming System (IPNS). An example of IPNS CLI usage is available on GitHub. Benet said, "IPNS is a simple solution to a very complicated problem, whose utility seems now much larger than we anticipated."
One of the key concepts is the use of a Merkle directed acyclic graph, which is drawn from the Git architecture. Another is the use of a self-certifying file system that uses public key cryptography to protect content from unauthorized changes. At a bare minimum, IPFS allows the creation of a global, mounted, versioned file system and namespace.
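A minimal Python sketch of a Merkle directed acyclic graph follows, assuming a simple JSON node encoding and SHA-256 hashes rather than the formats Git or IPFS really use. The helper put_node and the in-memory store are hypothetical. Each node is identified by the hash of its data plus the hashes of its children.

```python
import hashlib
import json


def put_node(store: dict, data: str, links: dict) -> str:
    """Store a node and return its hash. `links` maps names to child hashes."""
    encoded = json.dumps({"data": data, "links": links}, sort_keys=True).encode()
    node_hash = hashlib.sha256(encoded).hexdigest()
    store[node_hash] = encoded
    return node_hash


store = {}

# Version 1: a directory node linking to two file nodes.
readme_v1 = put_node(store, "hello", {})
notes = put_node(store, "todo: write docs", {})
root_v1 = put_node(store, "", {"README": readme_v1, "notes": notes})

# Version 2: only README changes; the unchanged file is reused by hash.
readme_v2 = put_node(store, "hello, world", {})
root_v2 = put_node(store, "", {"README": readme_v2, "notes": notes})
```

Since a parent's hash covers its children's hashes, editing one file produces a new root while unchanged nodes are shared between versions. That property is what makes a versioned file system cheap to build on this structure.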
Ensuring permanence and data sovereignty
Benet believes that a solid P2P server infrastructure needs to support a notion of permanence so that if one node is lost, the data can be recreated from other nodes. It is also important to allow support for data sovereignty so that users could take ownership of information posted to a social media ecosystem.
Bandwidth and latency become a concern when every request for the same content has to cross a backbone and pull down yet another copy. With video, for example, each user might be pulling down large amounts of data from the server. "That's silly if our devices cannot talk to each other," Benet said.
Another problem lies in collaborative environments in which users are near each other, but disconnected from the Internet backbone. This could be critical in cases where a local government tries to disconnect the backbone, such as occurred when Egypt shut down the Internet.
Stepping into the future of P2P
In the future, servers in a P2P network might be decoupled and still provide access. Someone might host a centralized copy, but content like Wikipedia could be re-hosted by others. Going forward, Benet's team is working on a pub/sub (or multicast) type system on IPFS, live chat, and a conflict-free replicated data type (CRDT) infrastructure for applications like Google Docs.
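To give a sense of what a conflict-free replicated data type does, here is a small Python sketch of a grow-only counter, one of the simplest such types. It is a generic textbook example, not anything taken from the IPFS codebase, and the class name GCounter and the replica names are hypothetical.

```python
class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        """Take the per-replica maximum; safe to apply in any order, any number of times."""
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())


alice, bob = GCounter("alice"), GCounter("bob")
alice.increment(2)   # updates made while the replicas cannot reach each other
bob.increment(3)

alice.merge(bob)     # once connected, replicas exchange state
bob.merge(alice)
```

Both replicas end up with the same total without a central server deciding the order of updates. Collaborative editors need richer types than a counter, but they rely on the same merge properties.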
IPFS could provide the transport for any sophisticated application, with eventual consistency across the wider Internet. "In practice," Benet said, "this can be very fast, except for users who are totally disconnected for periods of time."