Clive Humby coined the phrase “data is the new oil” in 2006. While this was already true for companies like Google and Facebook, it was the promotion of Hadoop to an Apache top-level project in 2008 that made the technology accessible to a broader public and helped turn his vision into a global reality.
In the years that followed, companies began to collect as much data as possible from a variety of sources, hoping that it would turn into gold someday if only the right algorithm were applied to it.
Some figured out how to transform raw data into business insights; others are still struggling with that question while their mostly uncatalogued data is getting (c)older and less valuable.
The data of the companies that did figure it out is aging as well, but they still know how to turn it into action and keep up with the market. Yet while they develop a false sense of security, the next evolutionary stage is just around the corner.
Getting ahead of the market, rather than merely keeping up with it, requires immediate insights from data generated only moments ago. That is why we are currently seeing a shift from data at rest (batch processing) to data in motion (stream processing).
As in the oil business, owning the raw material is not enough; success requires mastering the refinement process as well.
In contrast to the early stages of big data processing, this phase requires not only a new technology but a paradigm shift as well.
Batch processing was bound to stable datasets, could be performed with a widely known language rooted in set theory (SQL), and provided reproducible results. Working with data in motion is more complex, because time, context and uncertainty are significant characteristics.
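To make the difference tangible, here is a minimal sketch in plain Python. It is not tied to any particular streaming engine, and everything in it is an assumption made for illustration: the 10-second tumbling window, the sample events, and the helper names batch_counts and stream_counts. It counts events per window twice, once over the complete dataset (data at rest) and once event by event in arrival order (data in motion), where a simple watermark decides when a window is considered final.

```python
from collections import defaultdict

WINDOW = 10  # tumbling window size in seconds (assumed for this sketch)


def window_start(event_time):
    """Return the start of the tumbling window an event time falls into."""
    return event_time - event_time % WINDOW


def batch_counts(events):
    """Data at rest: the whole dataset is known, so the result is reproducible."""
    counts = defaultdict(int)
    for event_time, _payload in events:
        counts[window_start(event_time)] += 1
    return dict(counts)


def stream_counts(events, allowed_lateness=0):
    """Data in motion: events are seen one by one, possibly out of order.

    The watermark is the highest event time seen so far minus the allowed
    lateness. A window is emitted once the watermark reaches its end; events
    that arrive for an already finalized window are dropped.
    """
    open_windows = defaultdict(int)
    emitted = {}
    dropped = []
    watermark = float("-inf")
    for event_time, payload in events:
        start = window_start(event_time)
        if start + WINDOW <= watermark:
            dropped.append((event_time, payload))  # too late, window is final
            continue
        open_windows[start] += 1
        watermark = max(watermark, event_time - allowed_lateness)
        for done in sorted(w for w in open_windows if w + WINDOW <= watermark):
            emitted[done] = open_windows.pop(done)
    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        emitted[start] = open_windows[start]
    return emitted, dropped


# Hypothetical events as (event_time, payload) in ARRIVAL order.
# The event with time 7 arrives after the event with time 12.
arrivals = [(1, "a"), (4, "b"), (12, "c"), (7, "d"), (15, "e"), (23, "f")]

print(batch_counts(arrivals))                       # {0: 3, 10: 2, 20: 1}
print(stream_counts(arrivals))                      # ({0: 2, 10: 2, 20: 1}, [(7, 'd')])
print(stream_counts(arrivals, allowed_lateness=5))  # ({0: 3, 10: 2, 20: 1}, [])
```

With no allowed lateness the late event is lost and the first window undercounts; waiting five seconds longer recovers the batch result at the price of delayed output. This trade-off between completeness and latency is where time and uncertainty enter the picture, and a batch query over a stable dataset never has to make it.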
Thus, evolving a big data architecture by adding the concepts of data in motion must go hand in hand with a significant shift in how data is understood and worked with.
Although both aspects imply changes, the latter is the more important one: technology merely provides the foundation, but it is the people who exploit it who turn data into insights and finally into revenue.
Like any evolutionary step, this one implies a cultural change as well.
The mission of this project or blog is to shed some light on fast data architectures and to provide a practical guide on how to turn them into a success, from a technological as well as a cultural perspective.