By Jim Kaskade
Big data without the right analytic application is just that -- a big pile of data. Ideally, such an application should be foolproof: one that you can deploy organization-wide to leapfrog the competition, win the hearts of customers, and drive revenue by drawing well-informed, real-time conclusions.
In reality, the path to application development is fraught with problems that make it very difficult for developers to deliver analytics that fit the bill. If companies are experiencing disillusionment with big data, it may be due in part to the processes and infrastructure most companies employ, which make it impossible for them to deliver the right app at the right time -- almost as if designed to work against them!
Today, businesses must be able to analyze problems and opportunities and act in real time. From the time you begin to implement the needed infrastructure, it can take up to two years to derive insight from big data applications. In today's business climate, that might as well be 2,000 years: if you can't address the problem in 30 days, you're too late -- your analytic application is already outdated by the time it's put to use. Enterprises fail to produce meaningful impact with analytic applications primarily because of slow time-to-market and overly broad project scope -- two direct outgrowths of the antiquated techniques they employ:
Rigid Architecture
Data is coming from every place imaginable at breakneck speed; data architectures, however, are traditionally rigid, locked into fixed data models and analytics processes. Businesses need an architecture capable of integrating structured and unstructured data, as well as external and internal data. To get there, organizations need to embrace open standards, trusting the technologies that have already been proven to work at scale at Google, Yahoo!, Twitter, LinkedIn, and Facebook. These companies have handed us scalable, open source big data technology capable of executing on top of cloud resources. If we want to create foolproof applications, we'd be very foolish not to use them.
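To make the idea of blending structured and unstructured sources concrete, here is a minimal sketch in plain Python -- the customer records, ticket text, and keywords are all invented for illustration -- that joins unstructured support-ticket text back onto structured customer records:

```python
from collections import Counter

# Hypothetical structured records (e.g., from an internal CRM export)
customers = [
    {"id": 1, "name": "Acme Corp", "segment": "enterprise"},
    {"id": 2, "name": "Globex", "segment": "smb"},
]

# Hypothetical unstructured text (e.g., support tickets or tweets),
# keyed by customer id
tickets = [
    (1, "Dashboard is slow and the export keeps failing"),
    (1, "Export failing again after the latest update"),
    (2, "Love the new dashboard, great work"),
]

def keyword_counts(texts, keywords):
    """Count keyword mentions across a list of free-text strings."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for kw in keywords:
            counts[kw] += lowered.count(kw)
    return counts

# Join the unstructured signal onto the structured records
KEYWORDS = ["slow", "failing", "dashboard"]
for customer in customers:
    texts = [body for cid, body in tickets if cid == customer["id"]]
    customer["signals"] = keyword_counts(texts, KEYWORDS)

print(customers[0]["signals"])
```

In production this kind of join would run on a distributed engine over far larger streams, but the shape of the problem -- structured keys, unstructured payloads, a simple aggregation -- is the same.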
"Boil the Ocean" Scope
Equally important, most departments have the process backwards -- big data starts with the application, not the infrastructure. Rather than building big data sandboxes, stacks, and Hadoop clusters without a purpose, start with a specific business problem. Once you've pinpointed and defined what you want to accomplish, you can narrow the scope of your technology choices. What we see too often is companies boiling the ocean, when all they really need is a single hard-boiled egg. Of course, what they find is that they're never able to get their project off the ground. Their egg goes rotten waiting for the ocean to boil. If you start with a narrow thesis and apply it through data analytics, you'll be much more likely to solve the problem. And once you've achieved success on a single small project, you'll find it much easier to get the funds you need to tackle additional problems and expand your big data capabilities.
Create an Analytic Application Factory
Cars were around long before Ford, but his mass-production strategy proved to be what really kick-started the industry. The same holds true for data-driven apps. Rather than building them from the ground up each time, developers need to be able to leverage blueprints that get their applications up and running quickly, just like pumping Model Ts out of a factory.
While each application will have a slightly different purpose, there are commonalities among industry-specific applications that can be leveraged to create blueprints. If you can templatize the data models, pipelines, connectors, analytics, and the structured and unstructured data sources that fit best with the specific needs of the organization, you will greatly streamline the process. Once again, I refer back to the open source, webscale infrastructure that has been freely offered up by Google and other companies -- by leveraging what's already out there and proven to work, you can skip some of the most time-consuming steps. Instead of starting from scratch each time, you simply iterate on what you've already got -- the bulk of the legwork has been done for you.
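As a rough sketch of what such a blueprint might look like in code -- the stage names and the sample clickstream use case are invented for illustration -- a pipeline can be templatized as pluggable source, transform, and sink stages:

```python
from typing import Any, Callable, Iterable

class PipelineBlueprint:
    """A reusable analytics-pipeline template: swap in a new source,
    transforms, or sink to stamp out another application."""

    def __init__(self,
                 source: Callable[[], Iterable[Any]],
                 transforms: list[Callable[[Any], Any]],
                 sink: Callable[[Any], None]):
        self.source = source
        self.transforms = transforms
        self.sink = sink

    def run(self) -> None:
        # Pull each record from the source, push it through every
        # transform in order, and hand the result to the sink.
        for record in self.source():
            for transform in self.transforms:
                record = transform(record)
            self.sink(record)

# Instantiating the blueprint for a hypothetical clickstream use case:
results = []
pipeline = PipelineBlueprint(
    source=lambda: ["/Home", "/Pricing", "/Pricing"],   # stand-in data source
    transforms=[str.lower, lambda path: ("page_view", path)],
    sink=results.append,                                # stand-in data store
)
pipeline.run()
print(results)
```

The point of the pattern is that the `run` loop never changes between applications; only the three pluggable stages do, which is exactly the factory-style reuse described above.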
What should a modern application factory look like? Let's quickly compare how things have been done with how they need to be done to empower developers to build foolproof applications that can be deployed across the organization and produce measurable value:
- While the old guard view leverages small samples of data and employs complex algorithms, the modern application factory leverages all the pertinent data, and simple algorithms. How do we expand the number of data streams we are able to use? Read on.
- Traditional reliance on legacy platform technology makes pretty much everything more expensive and slower. Perhaps more importantly, it limits your data sources. The new-school approach, relying on webscale, open source technology, vastly expands the scope of data you are able to use. Old: deploy within the safety of your own IT... New: outsource the platform 100%.
- You know you’re operating old school when you’re focusing on speeds and feeds -- this is a sign that you’re mired in infrastructure. On the other hand, if you’re focused on the specific questions you want answered, you’re on the right track.
- Furthermore, if you're doing a huge implementation, trying to bring the benefits of big data to the entire organization all at once, that's a sign that your thinking is stuck in the past. Rather, start with one use case and then expand.
- Old School: Keep IT, analytics, and application resources separate… New School: combine them into a single organization.
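The "all the pertinent data, simple algorithms" point above can be illustrated with a sketch -- the event stream here is invented -- of a one-pass frequency count over every record, rather than a sophisticated model fit to a small sample:

```python
from collections import Counter

# Hypothetical full event stream; in practice this would be read
# incrementally from a webscale store rather than held in memory.
events = ["search", "view", "buy", "view", "search", "view"]

# The "simple algorithm": a single pass that counts everything it sees.
counts = Counter(events)
total = sum(counts.values())

for event, n in counts.most_common():
    print(f"{event}: {n} ({n / total:.0%})")
```

Nothing here is clever, and that is the point: when you can afford to touch all the data, a trivial aggregation often answers the business question that a complex model on a sample would only estimate.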