Messy Data

Ask anyone involved in the big data sector about the most glaring problems preventing companies from extracting more insights from their data, and you’re likely to get a rant about messy and unstructured data. For years, companies have been storing as many bits and bytes as possible relating to every interaction with their potential customers, internal business process, and other potentially revealing operational areas. But the task of turning this raw data into something suitable to feed into the myriad of business intelligence and analytics platforms is an arduous and time-intensive process.

By some estimates, the average data scientist spends as much as 80 percent of his time cleansing data. It’s a problem that will only get worse with time as data sets grow larger and more complex. Enter Trifacta, a data transformation platform that automates much of this essential but otherwise inefficient and resource-intensive prep work and allows non-technical workers to extract value from big data.

Today the company announced a $12 million Series B round of financing led by Greylock Partners*, with participation from existing investor Accel Partners*. Trifacta has now raised a total of $16.3 million.

“We view them as democratizing access to big data,” says Accel partner Ping Li. “Messy data is a problem that everyone in the industry can relate to, but Trifacta is the first to offer a solution that successfully marries backend data technology with an intuitive front user interface. I liken it to the iPhone, which wasn’t the first smartphone, but it was the first to unlock the potential of the form factor.”

Trifacta was founded by a team of academics, including UC Berkeley and University of Washington computer science professors Joseph Hellerstein and Jeffrey Heer, along with Stanford PhD Sean Kandel. The trio launched the company in 2012 and entered the market with a beta product earlier this year. Hellerstein anticipates launching the commercial product in Q1 2014.

“Data transformation is everybody’s problem, but nobody’s job,” says Hellerstein, Trifacta’s CEO. “The pace of business is different today, which means it’s no longer acceptable to have multi-month cycles from collecting data to analyzing it. Things need to happen in near real-time.”

The key to Trifacta, beyond its intelligent backend engineering, is the introduction of a simple and elegant user interface. “Much like the graphical user interface changed the way that people work with computers, Trifacta is changing the way that people work with data,” Hellerstein adds. “It’s really a technology plus human interaction problem.”

Given the scale of the problem, there’s obviously a large opportunity awaiting anyone who can solve it. The question becomes whether such a solution belongs in a standalone company or whether it’s a feature that belongs within the existing platform offerings. Trifacta and its investors are betting big that it’s the former.

“Platforms hate this part of the value chain because it only indirectly makes their product better and the consumer often takes it for granted,” Li says. “The platforms have proven willing to partner.”

It’s not just company insiders who speak highly of Trifacta’s potential impact on the big data sector. The company has worked closely with a number of the top data analytics platforms including Tableau and Cloudera, and has received strong endorsements from each. For example, Cloudera co-founder and CSO Mike Olson says:

“Unlocking the value of multi-structured data demands intelligent application-level interfaces engineered for simplicity and scalability. Users must have good tools to ingest and explore that data. Trifacta’s new interaction technology allows analysts to transform data at scale, efficiently, making [our Enterprise Data Hub] an even better home for data that matters to the business.”

With the technology foundation built and a newly raised war chest, the big challenge awaiting Trifacta is its go-to-market execution. It’s one thing to hire a bunch of sales guys, but effectively selling SaaS licenses and then servicing those accounts is a costly and challenging endeavor that could not be more divorced from engineering and product development.

Trifacta has not announced pricing yet, but given the time and cost savings the platform can drive, Hellerstein doesn’t seem too concerned about demonstrating value.

“One customer told us that they were able to complete a project they expected to take six weeks in just a single day – that’s a quantitative difference,” he says. “It’s not just that we make the process faster, but we’re actually enabling companies to take on different projects that they otherwise wouldn’t even contemplate.”

Structuring data may not be the sexiest of problems, but it’s proven to be a nearly universal one in the big data sector. Both Greylock and Accel have seen this problem emerge in their own portfolio companies, much like Trifacta’s founders did in their academic research.

“One of the reasons we were so excited to invest early on is because to Joe, Jeff, and Sean, data transformation is sexy,” Li says.

If the young company proves anywhere near as effective at building a sales organization as it has been thus far at product development, this could be the next giant to come out of the big data sector. In Silicon Valley, building massive companies is the sexiest thing of all.

[*Disclosure: Accel and Greylock are investors in PandoDaily.]

[Image via Blude, Flickr]

  1. Big Data is not just big—it’s messy, complicated and coming at most businesses faster than they can rationalize, much less analyze. As a result, data professionals spend far more time wrangling and cleansing data than analyzing it. And business professionals are left on the outside looking in.

    Trifacta provides a breakthrough user experience that makes data experts far more productive and allows business analysts to work directly with Big Data. This means faster time to analysis. And perhaps more significantly, it unlocks the potential of a wide variety of data that was previously left aside as being too hard to use. The result: better, faster, more informed business decisions.