The Many Faces of Big
Our friend Jim Ericson at Information Management has some thoughts on a recent DM Radio segment with James Kobielus of Forrester Research and Neil McGovern from Sybase. The topic of the program was real-time data warehousing. Jim notes that one of the key buzzwords to emerge in the discussion was Big Data, and that some pretty interesting stuff starts to happen when big data meets CEP (complex event processing):
McGovern was talking about complex event processing or CEP (another pregnant term) and the idea of bringing the data to the query, rather than the other way around, which is something we’re seeing more of now with in-memory analytic processing and the advent of the petabyte-scale data warehouse.
That’s what CEP tends to be about, a kind of forensic analysis (like a series of corpses to pick over moving by on a conveyor belt) of something that’s actually just an invisible blur. McGovern says “one of the faster feeds” he knows about in financial services uses an engine that monitors one million transactions per second.
Ahem. I stopped to confirm he’d said “one million transactions per second.”
Not that long ago, the only attribute of data bigness that got much attention was the size of the data itself. And it’s still a huge consideration, if you’ll pardon the expression. Managing exponential data growth is important; we do live in an age of petabyte-sized data warehouses (and of bringing the data to the query). But increasingly we are realizing that it isn’t just about size: a number of different factors contribute to the bigness of Big Data.
To name just a few…
- The number of users accessing the data
- The volume of queries / analyses that data is subject to
- The complexity of the queries / analyses performed
- The speed with which results must be returned
Even a modest amount of data becomes Big Data as such requirements become more and more demanding. In the end, a Big Data solution has to be one that accommodates more than one factor at a time. As organizations experience growing pains with data “bigness” emerging along multiple, often unexpected dimensions, the need for an analytics environment that scales flexibly, and in line with what the business demands, becomes increasingly clear.