Wednesday, February 4, 2009

Cloud Control with Google Talk (Botnets)

In digging through Google's XMPP and Google Talk developer documentation, I may have stumbled upon the biggest threat/opportunity in Google's cloud toolbox. Before I go any deeper, I will warn you that this technology could also be used for a host of illicit activities, none of which I condone.

Google Talk is very possibly the most advanced open communication platform on the globe. It's an extremely flexible instant messaging platform built on an open protocol called XMPP. This protocol allows users of other XMPP clients to communicate securely with Google Talk users over Google's extensive global infrastructure. It also uses an innovative P2P approach within its VoIP library, based around the Jingle protocol. A major focus of the Google Talk service is interoperability, or what Google describes as Open Communications.

The Google Talk network supports open interoperability with hundreds of other communications service providers through a process known as federation. This means that a user on one service can communicate with users on another service without needing to sign up for, or sign in with, each service. At its heart, this is where some of the biggest opportunities for Google Talk are. Any other XMPP-based platform just needs to support the XMPP standard for server-to-server federation, and its users will be able to talk to Google's users (Gmail and Google Apps).
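To make the federation idea concrete, here is a minimal Python sketch of the routing decision it implies: two addresses on different domains mean the message has to cross a server-to-server link. The helper names `parse_jid` and `needs_federation` and the example addresses are my own illustrations, not part of any Google or XMPP library, and the parsing is deliberately simplified (real XMPP address handling also normalizes and validates each part).

```python
def parse_jid(jid):
    """Split 'local@domain/resource' into (local, domain, resource).

    Simplified for illustration: no normalization beyond lowercasing the
    domain, and no validation of allowed characters.
    """
    bare, _, resource = jid.partition("/")
    local, sep, domain = bare.partition("@")
    if not sep:
        # No '@' present: treat the whole thing as a domain-only address.
        local, domain = "", bare
    if not domain:
        raise ValueError("address has no domain part: %r" % jid)
    return local, domain.lower(), resource


def needs_federation(sender, recipient):
    """Return True when the two addresses live on different domains."""
    return parse_jid(sender)[1] != parse_jid(recipient)[1]
```

For example, `needs_federation("alice@gmail.com/laptop", "bob@example.org")` returns `True`, while two addresses on the same domain return `False`.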

Google Talk is by far one of the biggest deployments of an XMPP-based cloud infrastructure anywhere. To help enable this "interoperable" and federated XMPP platform, Google offers a component called Libjingle. The component is described as an open source P2P Interoperability Library and is made available under an open source Berkeley-style license. More specifically, Libjingle is a set of components that lets you interoperate with Google Talk's XMPP peer-to-peer file sharing and voice calling capabilities in a variety of amazing ways.

Unfortunately, the closest real-world example for Libjingle is that of a modern P2P botnet and its use of distributed command-and-control (C&C) systems, which are typically embedded directly into the botnet itself. Similarly, Libjingle can act as a dynamically updating component capable of traversing networks and proxies through built-in negotiation classes. Another interesting feature is its use of P2P and distributed architectures, which lets it avoid any single point of failure. Through Google Talk's "Group Chat", an administrator can be identified solely through secure keys, and all data except the binary itself can be encrypted, making the communication channels extremely secure. Similarly, a shady individual could potentially create several anonymous Gmail accounts as the basis of a Google Talk based C&C. (Yup, kind of scary.)

I'm not going to get into all the finer details of the library, except to say this software is extremely versatile.

There are a number of interesting use cases for Google Talk in the context of cloud-based monitoring as well as command and control. The simplest example is that of a cloud system monitor and scaling agent. Google Talk, like many chat clients, lets a user display a custom status message to other users. The Google Talk server stores lists of recently used status messages, and you can request and modify these values. An XMPP extension enables a client to retrieve and modify these stored message lists, and also provides notifications so that every resource can report an updated status (e.g. up, down, under heavy load, here's my IP, not responding, offline) that all other approved "group members" can see. Whenever any resource changes its status, a message carrying the new values is sent to all other resources, which can adjust accordingly. For example, if an EC2 node goes down, every other node could be notified almost instantly using a gossip protocol, which spreads the message that node ABC is no longer available across a huge number of interconnected application nodes in a very short period of time. So in a sense it's a perfect distributed command and control "cloud".
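As a rough illustration of the two ideas in that paragraph, the Python sketch below builds a plain XMPP presence stanza carrying a free-text status, and simulates how quickly a "node is down" message spreads when every informed node passes it on to a few random peers each round. Both function names are hypothetical, and the stanza is the standard core XMPP `<presence/>` element rather than Google's own stored-status extension. The gossip loop is a toy model that assumes no network loss or latency, so treat it as a sketch, not a measurement.

```python
import random
import xml.etree.ElementTree as ET


def build_presence(status, show=None):
    """Return a core XMPP <presence/> stanza carrying a free-text status."""
    presence = ET.Element("presence")
    if show is not None:
        # Core XMPP defines only these four availability hints.
        if show not in ("away", "chat", "dnd", "xa"):
            raise ValueError("invalid show value: %r" % show)
        ET.SubElement(presence, "show").text = show
    ET.SubElement(presence, "status").text = status
    return ET.tostring(presence, encoding="unicode")


def gossip_rounds(node_count, fanout, seed=0):
    """Simulate push gossip; return how many rounds until every node knows.

    Assumes 1 <= fanout <= node_count and that node 0 notices the failure first.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < node_count:
        rounds += 1
        # Snapshot the set so nodes informed this round start sending next round.
        for _ in list(informed):
            # Each informed node tells `fanout` peers chosen at random.
            informed.update(rng.sample(range(node_count), fanout))
    return rounds
```

Calling `build_presence("under heavy load", "dnd")` produces `<presence><show>dnd</show><status>under heavy load</status></presence>`. Since each informed node can reach at most `fanout` new peers per round, the informed set grows at most by a factor of `fanout + 1` each round, which is why a message can cover thousands of nodes in a handful of rounds.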

For me, the more interesting use case is a highly available cloud-based communications platform. It is a fact that most clouds can and will go down. One of the most highly available infrastructures is, with little doubt, Google's. So even if your cloud goes down, your command and control will remain, allowing you to rapidly adapt.
