There’s nothing more uninspiring than a successful person telling a room of fans that their success can’t be replicated. And yet, that’s really all they can say — this worked for me, but it probably won’t work for you. All I can offer is the lessons I’ve learned.
Today Nate Silver of FiveThirtyEight delivered a somewhat uninspiring, but realistic, perspective on big data to a crowded room at SXSW. Speaking at around 100 miles an hour, Silver told startups to go for the low-hanging fruit when it comes to ideas around big data. Find an area where competition is low. “Look for fields that have not been thought about in [an] analytic way before and where you have data available,” he said.
That’s why he has been so successful predicting baseball and political outcomes. In those areas, the competition wasn’t great, he said. In most other fields, though, the “water level” is really high. “All profit or competitive advantage will come from marginal gains,” he added.
Startups should be more skeptical about data, balancing their curiosity with skepticism. “I see a lot of curiosity in the room, which is great, but we should know it’s hard work to take this big data and turn it into progress,” he said.
He pointed to the last time we had a massive influx of data, with the invention of the printing press in 1450. It took 300 years for technology related to the printing press to produce tangible dividends for society.
That influx of data has happened again with the Web: something like 90 percent of the data in the universe was created in the last 10 years. (75 percent of that is remixes of the Harlem Shake, he joked.) Making sense of all that data has entrepreneurs and investors salivating, and it’s turned big data into one of those buzzwords like “cloud” or “gamification” or “enterprise” that startups tack onto their boilerplates in the hopes of riding a hot trend.
Despite so many people working on the data problem, Silver believes the signal-to-noise ratio is actually getting worse. Humans are trained to make decisions quickly and recognize patterns. When we’re presented with so much information, we can mistake random correlations for real signals. And that’s when we start telling stories around data, which is what gets us in trouble. “Stories are the best way to communicate,” he said. “But you have to make sure the stories you tell are representative of a bigger picture and that they testify to the truth.”
“Why am I being skeptical to an audience full of tech geeks?” he asked. “Progress in society is tough, and it’s tough to make a living doing it. When you’re working toward any type of problem, you have diminishing returns as you add more features.”
[Image via GregPC]