There has been an enormous amount of discussion over the past two years about new technologies that are going to set a new standard in the application hosting space. But what is fact and what is fiction, and more importantly, should you care?
The first time I heard about Grid Computing I was still up at school in the late 90s which, not surprisingly, was around the same time I was engrossed in my parallel computing course. All the talk of IPC and modulo arithmetic joined in an unholy matrimony with massive amounts of Mountain Dew and 3am whiteboarding in the labs, the offspring of which was a simple understanding of why and when to use parallel computing.
Fast forward to today and Silicon Valley’s VCs have eliminated all but the trendiest of ideas to back, the recent darling being so-called Cloud Computing. So what is Cloud Computing? Supposedly The Cloud is an ultra-scalable architecture with implicit redundancy, so no one ever has to go through the painful process of upgrading hardware. Is this not the panacea of datacenter logistics: completely automated vertical scalability?
The Promise
Current providers of Cloud provisioning software, such as 3Tera, promise lower cost, ease of maintenance, and simple scalability. Need a new server? Just click a few buttons and you have a new virtual server, possibly pre-configured with software and even your specific settings.
The Failure
While a large and dynamic infrastructure is appealing, there are inherent problems with the current approaches, the most obvious of which is the lack of failure abstraction. Products like VMware can provide failover automatically; the fact that offerings like AppLogic and EC2 do not provide such capabilities out-of-the-box is very telling about the underlying architecture. Many can argue that Xen-based hypervisors can be instrumented to fail over, but the fact that the technology does not, as a matter of its DNA, provide failover is The Failure. To deliver on the promises that the marketing for the various grid services has made, we need not only easy scaling but also reliability; you cannot provide one and not the other.
Commentary
We all know that technology changes rapidly, but vendors and pundits fail when they speak as if change is good for its own sake. Recently we’ve seen many old architectures rehashed: mainframe dumb-terminals have become remote desktops and terminal services, there is talk of replacing Ethernet, because of its limitations, with a protocol that involves a “Token”, and the new SMP craze is disguised as multi-core processing. It’s time for some real progress. Just as virtualization has brought us the Cloud, the Cloud itself may be an intermediate step to something better. At the very least let’s hope that the next step on the ladder is up, not down.



AppLogic absolutely DOES provide failure detection and correction, and has since release 1.0. Data redundancy is included as well, and our disaster recovery suite is making it easy for users to get active-active redundancy with data replication between sites.
Your post revolves around providing servers on-demand via virtualization, but that only scratches the surface.
AppLogic is unique in that it provides a new abstraction layer, the application, allowing users to package existing application code, data and infrastructure into portable, scalable, instantiable entities. Want 20 copies of a 10 server app for QA - it’s one command. Want to back up that 30 server ERP system - it’s one command. Need that 100 server web service moved across the country to a new data center - it’s still one command. No human intervention. No code modification. No rebuilds. No risk.
With all the hype around lately, it’s sometimes tough to cut through the marketing blitz, but there is real technology in the cloud and it is effecting substantial change for users.
Bert Armijo, 3tera
Reply to Bert Armijo
I will have to take issue with your article. (I can’t quite read Bert’s reply - it seems to have been mangled by Firefox.) We are a managed cloud hosting provider that has been using AppLogic to serve our customers for 18 months. AppLogic does indeed have an abstracted failover model: any applications running on servers that fail are automatically restarted on standby application servers. This is identical to VMware’s failover model. The time to accomplish this depends on various system parameters as well as the memory image size of the application.
Where you can run into problems is if you use software which cannot tolerate unexpected downtime (such as I saw today with MySQL on a MyISAM table). Even AppLogic’s data redundancy wasn’t able to solve the fact that the restarted MySQL failed to come up due to data corruption. However, this would be true under VMware as well. What is missing is carefully designed applications that take advantage of the failover capabilities of tools like AppLogic. When we deploy our customers’ code to an AppLogic Grid, we work with them to produce restartable deployments, which turn out to survive failures very successfully.
Reply to Eric Novikoff
@Bert -
Thanks for the insight. My conversations with Vlad, specifically about Layered Tech’s initial implementation (TGL), revolved around customer feedback on major issues with the lack of failover. This may very well have changed; in the article I tried to focus on how important it is to not have failover as a bolt-on application. VMware is guilty of this to varying degrees, as is every Cloud platform I am aware of at this time. SMP APIs for the past few decades have addressed this problem, though it seems Cloud infrastructures have not heeded the lessons. I’m only a single voice, but I would like to see a Cloud platform that at its core is not meant to scale vertically, but horizontally.
@Eric -
Thanks for taking the time to post. I agree there are a number of problems with getting non-grid/cloud applications to utilize the underlying services, but I think that is a litmus test of the grid/cloud platform. Almost any client/server application should be able to work on a Cloud without modification and reap all the benefits the infrastructure provides, including distributed filesystems, which apparently are still an issue.
Reply to karl