NoHadoop means "Not Only Hadoop." Why?
According to the 2014 Big Data & Advanced Analytics Survey conducted by the
market research firm Evans Data, only 16% of the more than 400 developers
surveyed worldwide said that Hadoop batch processing was satisfactory in all
of their use cases. In addition, 71% of developers reported needing real-time
complex event processing in more than half of their applications, and 27%
said they use it all the time.
Hadoop has evolved from just MapReduce and HDFS in the very beginning to a
whole set of technologies, including Hive, HBase, Sqoop, Flume, Pig, Mahout,
etc. Hadoop, however, was originally designed for batch processing. It is
based on the map and reduce programming model, which is a stretch for
real-time transactions. Various efforts have been made to enhance Hadoop. For
example, YARN was designed to decouple the resource management in the
underlying... (more)
The adoption of Hadoop has been increasing, and more and more organizations
are using it for various solutions. Some companies replace their existing
data stores with Hive and HBase. Some firms make use of Mahout for machine
learning. Others build new applications on the Hadoop platform from the
ground up. Now the critical question is where Hadoop can and should be
applied.
Hadooplicability is a measure of Hadoop applicability. It helps users find
the areas where Hadoop can best be leveraged. While Hadoopability, as defined
in an earlier post on this blog,... (more)
A capability model is a structure that represents the core abilities and
competencies of an entity (a department, organization, person, system, or
technology) to achieve its objectives, especially in relation to its overall
mission and functions.
The Big Data Capability Model (BDCM) is defined as the set of key
functionalities for dealing with Big Data problems and challenges. It
describes the major features, behaviors, practices, and processes in an
organization that can reliably and sustainably produce the outcomes required
for Big Data demands. BDCM consists of the following elements:
MapReduce is a programming model for processing massive amounts of data in
parallel, and the model was first implemented by Google. MapReduce is
typically used to run distributed jobs on clusters of computers.
This model inherits the concepts of the map and reduce functions commonly
used in functional programming, even though their purpose in the MapReduce
framework is significantly different from their original forms.
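The map and reduce phases described above can be sketched on a single machine in plain Python. This is an illustrative toy in the functional-programming spirit of the model, not Hadoop's actual API; in a real cluster the same phases run distributed across many machines:

```python
from functools import reduce
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in a document.
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by their key.
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: fold the grouped values down to a single count per word.
    return key, reduce(lambda a, b: a + b, values)

documents = ["big data big cluster", "big jobs"]
mapped = list(chain.from_iterable(map_phase(d) for d in documents))
shuffled = shuffle_phase(mapped)
counts = dict(reduce_phase(k, v) for k, v in shuffled.items())
print(counts)  # {'big': 3, 'data': 1, 'cluster': 1, 'jobs': 1}
```

The shuffle step between the two phases is what lets many independent reducers each work on a disjoint set of keys in parallel, which is the source of the model's scalability.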
A well-known implementation is the open source Apache Hadoop. Amazon offers a
cloud-based solution called Elastic MapReduce (EMR), which utilizes a hosted
Hadoop fra... (more)
As we approach the end of 2013, we can see significant growth in big data
this year in the rearview mirror. Looking forward, what is the forecast for
the upcoming new year? Where are technologies headed in 2014?
There are many interesting movements and activities on the technology
frontier, and it is exciting that more open source innovations are steadily
changing the paradigm. The following are predicted to be the Top 5 hot areas
next year: Connection, Hybrid, Analytics, SDX, and Mobile (CHASM).
Connection: The Internet of everything enables people, devices and e... (more)