Fail Whale

Few companies in the world can match the systems engineering expertise of Web giants like Google, Twitter, and Facebook. This talent advantage, combined with the millions of man-hours spent optimizing, means that these organizations operate some of the largest and most sophisticated datacenters and cloud computing networks in the world.

Today, a few veterans of these elite sysadmin teams are making it possible for enterprises of all sizes to run their datacenters with world-class efficiency. Mesosphere, a one-year-old San Francisco-based startup founded by Twitter and Airbnb veterans, has productized the open source Apache Mesos datacenter resource management software credited with helping make the famed Twitter “Fail Whale” a thing of the past.

Mesosphere today announced that it has closed $10.5 million in new Series A funding led by Andreessen Horowitz* with participation from Data Collective and Fuel Capital. Andreessen Horowitz partner and resident virtualization expert Peter Levine, the former CEO of XenSource, will join the company’s board of directors.

“Growing numbers of companies are building reliable and high-performance datacenters using Mesos software,” Levine said today in a statement. “This is the direction of the new datacenter and the shift will be as transformational as virtualization has been over the past decade.”

Mesos was first created by a team of Berkeley AMPLab researchers and is now maintained as an open source project. But without a team of expert systems engineers, deploying the technology can be a costly and challenging process. Mesosphere looks to commercialize this raw technology by offering complementary products and support.

“The problem is, there simply aren’t that many people out there who have five-plus years of experience working with Mesos and operating cloud infrastructure at this scale – most of those who do work for us,” says Mesosphere CEO Florian Leibert.

Using Mesosphere, enterprises can manage datacenters of any size as if they were a single, large computer. This is similar to the way that PC operating systems efficiently schedule work across multi-core processors for everyday computing tasks. In this way, it seems appropriate to think of Mesosphere as an operating system for cloud datacenters, and of the datacenters themselves as multi-thousand-core laptops. The goal isn’t to save companies money on the cloud infrastructure itself, but rather on the support resources required to operate it.
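To make the "single, large computer" idea concrete, here is a toy Python sketch of a scheduler that treats several machines as one pool and places each task wherever it fits. It is an illustration under loud assumptions, not how Mesos or Mesosphere actually schedules work: the `Machine` class, the `place` function, the node names, and the simple first-fit rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """One server in the pool (hypothetical model, not a Mesos type)."""
    name: str
    free_cpus: float
    free_mem_gb: float
    tasks: list = field(default_factory=list)


def place(task_name, cpus, mem_gb, machines):
    """Put a task on the first machine with enough free CPU and memory."""
    for m in machines:
        if m.free_cpus >= cpus and m.free_mem_gb >= mem_gb:
            m.free_cpus -= cpus
            m.free_mem_gb -= mem_gb
            m.tasks.append(task_name)
            return m.name
    return None  # no single machine has room for this task


# The caller asks the pool for resources and never picks a machine by hand.
cluster = [Machine("node-1", 4, 16), Machine("node-2", 8, 32)]
placements = {
    "web": place("web", 2, 4, cluster),
    "cache": place("cache", 3, 8, cluster),
    "batch": place("batch", 6, 16, cluster),
}
print(placements)
```

In this sketch, whoever submits "web" or "cache" only states what the task needs; deciding which box runs it is the scheduler's job, which is the operating-system analogy in miniature. The "batch" task is left unplaced because no single node has six free cores left.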

“For example, Airbnb has two cloud clusters, one that runs on Mesos and one that does not. The one not running on Mesos requires seven to eight full-time engineers to maintain. The Mesos cluster takes a fraction of one engineer,” Leibert says.

Once Mesosphere is installed, the process of provisioning new machines and installing distributed cloud applications like Hadoop or Cassandra becomes as simple as installing a smartphone app, according to Leibert. “Without Mesosphere, it can take weeks for IT to provision new machines – we can make it almost instant,” he says.

Mesosphere can be deployed onto any public or private cloud infrastructure, such as Amazon Web Services, Google Cloud Platform, Microsoft Azure, Rackspace, or DigitalOcean. Its early customers, which Leibert calls lighthouse clients, include Netflix, eBay, PayPal, Shopify, OpenTable, HubSpot, and others. Longer-term, the goal is to open the product up to a wider audience.

Mesosphere is currently building out its enterprise sales and services team to target Global 2000 enterprises. More broadly, the company is looking to partner with public cloud platforms like Amazon to make its software available to the long tail of small and mid-sized businesses. Mesosphere’s 23-person team, which consists almost entirely of engineers, is split between San Francisco and Hamburg, Germany.

The timing is right for a solution like this. As recently as a few years ago, most applications weren’t written to run across multiple machines. For example, early versions of Twitter were written as a single monolithic Ruby on Rails application, Leibert says. It was an approach that scaled, until it didn’t, and in 2009 the now-infamous Fail Whale became a routine occurrence.

Now nearly all new software is written to run on distributed cloud networks. The only issue is deploying and maintaining these networks. Mesosphere looks to make sure that doing so doesn’t require Twitter- or Google-like resources. With software now “eating the world,” there’s no shortage of companies that could benefit from a bit of sysadmin leverage.

(*Disclosure: Andreessen Horowitz partners Marc Andreessen, Jeff Jordan, and Chris Dixon are investors in Pando.)

[Image via Yiying Lu]