Peer-to-peer (P2P) networking technologies are commonly associated with sharing movies and music on the internet. But the core techniques also show promise for developing novel architectures for enterprise applications. An Israeli startup called Hola is applying P2P principles to rethink content delivery network (CDN) server applications by building a free anonymous browsing service.
"There are a lot of use cases for using the Hola app to eliminate geographic restrictions on the Internet," said Ofer Vilenski, Hola CEO and founder. The company started with a free, anonymous browsing consumer application called Hola that allows people to bypass censorship and controls. This has quickly grown into a distributed HTTP overlay used by more than 86 million people. This infrastructure is funded by Luminati.io, a SaaS product that allows businesses to conduct brand monitoring from millions of consumer IP addresses. The service lets enterprises monitor distributor pricing, identify ad fraud, see how their own websites are performing globally and check prices on competitor sites.
Enterprise use cases go well beyond anonymous browsing. Major retailers commonly block competitors' IP addresses from doing price checks, or serve them fake pricing data. Luminati.io allows a company to crawl a website such that each query appears to come from a different IP address. This approach also helps to maintain honesty in retail distribution networks. In some cases, a retailer will show one price to requests from a manufacturer's IP addresses and another to consumers. Luminati.io allows a company to see the same pricing that consumers see. Over 500 enterprises are now using this service.
P2P provides back-end efficiency
Other CDNs typically attempt to direct clients to the closest server. As a result, servers in a given market tend to become overloaded during that region's peak viewing hours. In addition, traditional CDNs commonly maintain a static connection between a single CDN endpoint and the client. Using P2P protocols among servers, and between servers and clients, allows the distribution to adjust dynamically to changing loads.
Start time, the delay between clicking a link and the video beginning to play, is critical at the beginning of a streaming session. But it is less critical after a client has buffered some video. The P2P distribution approach allows the client to download packets from the quickest servers at the beginning of a session and from the most cost-effective servers later, once the buffer is full.
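The switch from the quickest servers to the most cost-effective ones can be pictured with a short Python sketch. This is an illustration only, not Hola's implementation: the Server fields, the pick_server function and the 10-second buffer target are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    latency_ms: float   # hypothetical: measured round-trip time to the client
    cost_per_gb: float  # hypothetical: delivery cost for this server


def pick_server(servers, buffered_seconds, target_buffer_seconds=10.0):
    """Prefer the fastest server while the buffer fills, the cheapest after."""
    if not servers:
        raise ValueError("no servers available")
    if buffered_seconds < target_buffer_seconds:
        # Startup phase: start time matters most, so minimize latency.
        return min(servers, key=lambda s: s.latency_ms)
    # Steady state: the buffer hides latency, so minimize cost.
    return min(servers, key=lambda s: s.cost_per_gb)


servers = [
    Server("edge-near", latency_ms=20, cost_per_gb=0.08),
    Server("edge-far", latency_ms=120, cost_per_gb=0.02),
]
print(pick_server(servers, buffered_seconds=0).name)   # edge-near
print(pick_server(servers, buffered_seconds=30).name)  # edge-far
```

A real client would likely blend the two criteria rather than flip between them at a single threshold, but the threshold keeps the idea visible.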
When serving content and performing anonymous browsing via a distributed network, servers need to propagate content among themselves with minimal overhead, while ensuring the content served is not stale (e.g., an expired video). In addition, algorithms need to ensure that the selection of content stored on these servers maximizes the cache hit rate. This usually means keeping frequently requested content for long periods and purging content that is requested infrequently. In the Hola app's P2P CDN, servers propagate content using P2P algorithms in the cloud. Granular control of objects maximizes the cache hit rate.
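One way to picture these two requirements together, keeping popular content while never serving stale content, is a small cache that evicts the least-requested object when full and drops anything past its expiry time. The Python sketch below is a generic illustration under those assumptions; the FrequencyCache class and its methods are hypothetical and do not describe Hola's actual caching algorithm.

```python
import time


class FrequencyCache:
    """Hypothetical cache: evicts the least-requested item, never serves stale items."""

    def __init__(self, capacity, clock=time.monotonic):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.clock = clock
        self._items = {}  # key -> [value, expires_at, hit_count]

    def put(self, key, value, ttl_seconds):
        self._purge_expired()
        if key not in self._items and len(self._items) >= self.capacity:
            # Purge the object that has been requested least often.
            coldest = min(self._items, key=lambda k: self._items[k][2])
            del self._items[coldest]
        hits = self._items[key][2] if key in self._items else 0
        self._items[key] = [value, self.clock() + ttl_seconds, hits]

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        if entry[1] <= self.clock():
            # Stale content (e.g., an expired video) is dropped, not served.
            del self._items[key]
            return None
        entry[2] += 1
        return entry[0]

    def _purge_expired(self):
        now = self.clock()
        for key in [k for k, e in self._items.items() if e[1] <= now]:
            del self._items[key]
```

The clock parameter exists only so the expiry logic can be exercised with a fake clock; production caches typically also age out old hit counts so that formerly popular content does not linger.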
Use P2P to load balance from the client
"If a CDN is implemented with a load balancer, then that balancer needs to be in the same geography as the end user, and thus all the traffic needs to run through that balancer as it's getting it from multiple servers," said Vilenski. This in turn requires deploying enough servers in each point of presence (POP) to serve that geography at peak hours. In the Hola app's software architecture, the video manager on the client does the load balancing among the various servers around the world. As a result, the back-end infrastructure requires only a few servers in each region to initiate quick starts.
A load balancer dividing a load among multiple servers operating in the same location requires enough servers to handle peak loads. During off-peak hours, these servers sit idle. Distributing the same content across a network of servers located in different time zones allows for higher utilization of these servers because peak hours in France are off-peak hours in the U.S., for example. This improved utilization means that, on average, fewer servers are needed.
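The arithmetic behind this claim can be checked with a toy calculation. The Python sketch below uses invented numbers (a four-hour evening peak of 1,000 concurrent streams, a baseline of 100, 100 streams per server and a six-hour offset between regions) and assumes any server can serve any region; none of these figures come from Hola.

```python
def servers_needed(hourly_demand, capacity_per_server):
    """Servers required to cover the busiest hour (ceiling division)."""
    return -(-max(hourly_demand) // capacity_per_server)


def shift(demand, hours):
    """Rotate a 24-hour demand curve later by the given number of hours."""
    hours %= len(demand)
    return demand[-hours:] + demand[:-hours]


# Invented demand curve: 100 streams all day, 1,000 during hours 19-22.
france = [1000 if 19 <= hour <= 22 else 100 for hour in range(24)]
us = shift(france, 6)  # same local-time pattern, six hours later

capacity = 100  # invented: concurrent streams one server can handle

# Each region provisions for its own peak.
separate = servers_needed(france, capacity) + servers_needed(us, capacity)

# One shared pool provisions for the peak of the combined demand.
combined = [f + u for f, u in zip(france, us)]
pooled = servers_needed(combined, capacity)

print(f"separate: {separate}, pooled: {pooled}")  # separate: 20, pooled: 11
```

The saving shrinks as the regional peaks overlap more, and it ignores the extra latency of serving a viewer from another continent.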
This approach also means a client can retrieve fragments of content from multiple locations in parallel. Instead of downloading from a single source, the client has multiple servers to choose from. Because a single user receives fragments from several servers, client-side algorithms reassemble the fragments into a single stream while compensating for variations in latency and bandwidth among the sources.
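A minimal Python sketch of this pattern follows: fragments are requested from several servers at once and stitched back together in order, regardless of which response arrives first. The fetch_fragment function is a stand-in that fabricates bytes instead of making a network request, and the round-robin assignment of fragments to servers is an assumption made for simplicity, not a description of Hola's client.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_fragment(server, index):
    """Stand-in for a network request; returns the bytes of one fragment."""
    return f"[fragment {index} from {server}]".encode()


def download_stream(servers, fragment_count, fetch=fetch_fragment):
    """Fetch fragments from several servers in parallel and reassemble them."""
    if not servers:
        raise ValueError("no servers available")
    fragments = {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # Hypothetical policy: assign fragments to servers round-robin.
        futures = {
            pool.submit(fetch, servers[i % len(servers)], i): i
            for i in range(fragment_count)
        }
        # Responses may complete in any order; store each by fragment index.
        for future in as_completed(futures):
            fragments[futures[future]] = future.result()
    # Reassemble the fragments into a single ordered stream.
    return b"".join(fragments[i] for i in range(fragment_count))


stream = download_stream(["paris", "new-york", "tokyo"], fragment_count=4)
print(stream.decode())
```

A client that actually compensates for latency and bandwidth differences would go further, for example by giving more fragments to the servers that have been responding fastest.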