Monday, December 29, 2008

The United Federation of Cloud Providers

A fundamental challenge in creating and managing a globally decentralized cloud computing environment is that of maintaining consistent connectivity between various untrusted components that are capable of self-organization while remaining fault tolerant. In the next few years the a key opportunity for the emerging cloud industry will be on defining a federated cloud ecosystem by connecting multiple cloud computing providers using an agreeing upon standard or interface. In this post I will examine some of work being done in cloud federation ranging from adaptive authentication to modern P2P botnets.

Cloud Computing is undoubtedly a hot topic these days, lately it seems just about everyone is claiming to be a cloud of some sort. At Enomaly our focus is on the supposed "cloud enabler" Those daring enough to go out and create their very own computing clouds, either privately or publicly. In our work it has become obvious the the real problems are not in building these large clouds, but in maintaining them. Let me put it this way, deploying 50,000 machines is relatively straight forward, updating 50,000 machines or worst yet taking back control after a security exploit is not.

There are a number of organizations looking into solving the problem of cloud federation. Traditionally, there has been a lot of work done in the grid space. More recently, a notable research project being conducted by Microsoft called the “Geneva Framework" has been focusing on some the issues surrounding cloud federation. Geneva is described as a Claims Based Access Platform and is said to help simplify access to applications and other systems with an open and interoperable claims-based model.

In case you're not familiar with the claims authentication model, the general idea is using claims about a user, such as age or group membership, that are passed to obtain access to the cloud environment and to systems integrated with that environment. Claims could be built dynamically, picking up information about users and validating existing claims via a trusted source as the user traverses a multiple cloud environments. More simply, the concept allows for multiple providers to seamlessly interact with another. The model enables developers to incorporate various authentication models that works with any corporate identity system, including Active Directory, LDAPv3-based directories, application-specific databases and new user-centric identity models, such as LiveID, OpenID and InfoCard systems, including Microsoft’s CardSpace and Novell's Digital Me. For Microsoft, Authentication seems to be at heart of their interoperability focus. For anyone more microsoft inclined, Geneva is certainly worth a closer look.

For the more academically focused, I recommend reading a recent paper titled Decentralized Overlay for Federation of Enterprise Clouds published by Rajiv Ranjan and Rajkumar Buyya at the The University of Melbourne. The team outlines the need for cloud decentralization & federation to create a globalized cloud platform. In the paper they say that distributed cloud configuration should be considered to be decentralized if none of the components in the system are more important than the others, in case that one of the component fails, then it is neither more nor less harmful to the system than caused by the failure of any other component in the system. The paper also outlines the opportunities to use Peer2Peer (P2P) protocols as the basis for these decentralized systems.

The paper is very relevant given the latest discussions occurring in the cloud interoperability realm. The paper outlines several key problems areas:
  • Large scale – composed of distributed components (services, nodes, applications,users, virtualized computers) that combine together to form a massive environment. These days enterprise Clouds consisting of hundreds of thousands of computing nodes are common (Amazon EC2, Google App Engine,Microsoft Live Mesh) and hence federating them together leads to a massivescale environment;
  • Resource contention - driven by the resource demand pattern and a lack of
    cooperation among end-user’s applications, particular set of resources can get
    swamped with excessive workload, which significantly undermines the overall
    utility delivered by the system;
  • Dynamic – the components can leave and join the system at will.
Another topic of the paper is on the challenges in regards to the design and development of decentralized, scalable, self-organizing, and federated Cloud computing system as well as a applying the the characteristics of a peer-to-peer resource protocols, which they call Aneka-Federation. (I've tried to find any other references to Aneka, but it seems to be a term used solely withing the university of Melbourne, interesting none the less)

Also interesting was the problems they outline with earlier distributed computing projects such as Seti@home saying they these systems do not provide any support for multi-application and programming models. A major factors driving some of the more traditional users of grid technologies to the use of cloud computing.

One the of questions large scale cloud computing opens is not about how to many a few thousand machines, but how do you manage a few hundred thousand machines? A lot of the work being done in decentralized cloud computing can be traced back to the emergence of modern botnets. A recent paper titled "An Advanced Hybrid Peer-to-Peer Botnet" Ping Wang, Sherri Sparks, Cliff C. Zou at The University of Central Florida outlines some of the "opportunities" by examining the creation of a hybrid P2P botnet.

In the paper the UCF team outlines the problems encountered by P2P botnets which appear surprisingly similar to the problems being encountered by the cloud computing community. The paper lays out the following practical challenges faced by botmasters; (1). How to generate a robust botnet capable of maintaining control of its remaining bots even after a substantial portion of the botnet population has been removed by defenders? (2). How to prevent significant exposure of the network topology when some bots are captured by defenders? (3). How to easily monitor and obtain the complete information of a botnet by its botmaster? (4). How to prevent (or make it harder) defenders from detecting bots via their communication traffic patterns? In addition, the design should also consider many network related issues such as dynamic or private IP addresses and the diurnal online/offline property of bots. A very interesting read.

I am not condoning the use of botnets, but architecturally speaking we can learn a lot from our more criminally focused colleagues. Don't kid yourselves, they're already looking at ways to take control of your cloud and federation will be a key aspect in how you protect yourself and your users from being taken for a ride.

#DigitalNibbles Podcast Sponsored by Intel

If you would like to be a guest on the show, please get in touch.

Instagram