Big data analytics is an emerging topic, and demand for it in the market is
expected to grow over the next few months. In the shortest terms, big data
analytics is the process of examining large amounts of data to uncover hidden
patterns, correlations and other insights. With today’s technology, it’s
possible to analyze your data and get answers from it almost immediately, an
effort that’s slower and less efficient with more traditional business
intelligence solutions.
Regardless
of how one defines it, the phenomenon of Big Data is ever more present, ever more
pervasive, and ever more important. There is enormous value potential in Big
Data: innovative insights, improved understanding of problems, and countless
opportunities to predict—and even to shape—the future. Data Science is the
principal means to discover and tap that potential. Data Science provides ways
to deal with and benefit from Big Data: to see patterns, to discover
relationships, and to make sense of stunningly varied images and information.
Not everyone
has studied statistical analysis at a deep level. People with advanced degrees
in applied mathematics are not a commodity. Relatively few organizations have
committed resources to large collections of data gathered primarily for the
purpose of exploratory analysis. And yet, while applying the practices of Data
Science to Big Data is a valuable differentiating strategy at present, it will
be a standard core competency in the not so distant future. How does an
organization operationalize quickly to take advantage of this trend? That is
exactly what we should discuss. India Training Services has been listening to
the industry and organizations, observing the multi-faceted transformation of
the technology landscape, and doing direct research in order to create
curriculum and content to help individuals and organizations transform
themselves. For the domain of Data Science and Big Data Analytics, our
educational strategy balances three things:
- People, especially in the context of data science teams
- Processes, such as the analytic lifecycle approach presented in this book
- Tools and technologies, in this case with the emphasis on proven analytic tools
The concept
of big data has been around for years; most organizations now understand that
if they capture all the data that streams into their businesses, they can apply
analytics and get significant value from it. But even in the 1950s, decades
before anyone uttered the term “big data,” businesses were using basic
analytics (essentially numbers in a spreadsheet that were manually examined) to
uncover insights and trends.
The new
benefits that big data analytics brings to the table, however, are speed and
efficiency. Whereas a few years ago a business would have gathered information,
run analytics and unearthed information that could be used for future
decisions, today that business can identify insights for immediate decisions.
The ability to work faster – and stay agile – gives organizations a competitive
edge they didn’t have before.
As analysts, let’s start with a definition of big data: Big Data are
high-volume, high-velocity, and/or high-variety information assets that
require new forms of processing to enable enhanced decision making, insight
discovery and process optimization.
So we are discussing:
- Volume: Size of data (how big it is)
- Velocity: How fast data is being generated
- Variety: Variation of data types to include source, format, and structure
- Change: How rapidly the data itself changes (I have added this one)
There is a
lot of data, it is coming into the system rapidly, and it comes from many
different sources in many different formats.
The definition may seem vague given that it is describing a technical
item, but to accurately capture the scope of Big Data the definition itself
must be “big.”
IT companies
are investing billions of dollars into research and development for Big Data,
Business Intelligence (BI), data mining, and analytic processing technologies.
This fact underscores the importance of accessing and making sense of Big Data
in a fast, agile manner. Big Data is important; those who can harness Big Data
will have the edge in critical decision making. Companies utilizing advanced
analytics platforms to gain real value from Big Data will grow faster than
their competitors and seize new opportunities.
Changing scenario: explosive data growth by itself, however, does not
accurately describe how data is changing; the format and structure of data are
changing too. Rather than being neatly formatted, cleaned, and normalized data
in a corporate database, the data is coming in as raw, unstructured text via
tweets sent from smartphones, spatial data from tracking devices, Radio
Frequency Identification (RFID) devices, and audio and image files uploaded via
smart devices.
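To make the point about variety concrete, here is a minimal Python sketch of how inputs in different shapes might be brought into one common form before analysis. It is only an illustration: the `normalize` function, the `fmt` tags, and all the sample values are made up for this post, and a real pipeline would need far more careful parsing and error handling.

```python
import csv
import io
import json


def normalize(raw, fmt):
    """Turn one raw input into a dict with a common shape.

    `fmt` is a hypothetical tag: "json", "csv", or anything else for free text.
    """
    if fmt == "json":
        # Semi-structured: keep whatever keys the document carries.
        return {"kind": "semi-structured", "payload": json.loads(raw)}
    if fmt == "csv":
        # Structured: a single delimited row, as from a database export.
        row = next(csv.reader(io.StringIO(raw)))
        return {"kind": "structured", "payload": row}
    # Unstructured: free text such as a tweet; keep it whole.
    return {"kind": "unstructured", "payload": raw.strip()}


# Made-up sample inputs: a tracking-device reading, a database row, a tweet.
incoming = [
    ('{"tag_id": "A17", "lat": 18.52, "lon": 73.85}', "json"),
    ("1001,Pune,2500", "csv"),
    ("  Loving the new phone!  ", "text"),
]

records = [normalize(raw, fmt) for raw, fmt in incoming]
kinds = [record["kind"] for record in records]
print(kinds)
```

Each record ends up with the same two keys, so later analysis steps can treat them uniformly while still knowing what kind of data they came from.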
A mission-critical example: NASA reportedly has accumulated so much data from
space probes, generating such a data backlog, that scientists are having
difficulty processing and analyzing the data before the storage media it
resides on physically degrade.
Traditional
BI tools that rely exclusively on well-defined data warehouses are no longer
sufficient. A well-established RDBMS does not effectively manage large datasets
containing unstructured and semi-structured formats. To support Big Data,
modern analytic processing tools must:
- Shift away from traditional, rearward-looking BI tools and platforms to more forward-thinking analytic platforms
- Support a data environment that is less focused on integrating with only traditional, corporate data warehouses and more focused on easy integration with external sources
- Support a mix of structured, semi-structured, and unstructured data without complex, time-consuming IT engineering efforts
- Process data quickly and efficiently to return answers before the business opportunity is lost
- Present the business user with an interface that doesn’t require extensive IT knowledge to operate
Fortunately,
IT vendors and the IT open source community are stepping up to the challenge of
Big Data and have created tools that meet these requirements. Popular software
tools include:
- Hadoop: Open-source software from the Apache Software Foundation to store and process large nonrelational data sets via a large, scalable distributed model. Commercialized Hadoop distributions are also available.
- NoSQL: A class of database systems that are optimized to process large unstructured and semi-structured data sets. Commercialized NoSQL distributions are available.
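Hadoop's classic processing model is MapReduce: a map step turns each input record into key/value pairs, the framework groups the pairs by key, and a reduce step combines each group. The sketch below imitates that flow in plain Python on a single machine, counting words. It does not use any Hadoop API; the function names and the sample lines are invented for illustration only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Made-up sample input; a real job would read these from distributed storage.
sample_records = ["big data is big", "data is everywhere"]
counts = word_count(sample_records)
print(counts)
```

In a real cluster the map and reduce steps run in parallel on many machines, which is what lets the same simple pattern scale to very large data sets.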
The impact
of cloud computing on Big Data is huge. Data sources can be from public,
private, or community clouds. For example, customer demographic data can come
from a public cloud, but complex scientific collection information or
industry-sensitive data would be from community clouds. Any Big Data Analytic
platform should be able to access any cloud platform and be able to publish
results to any environment.
Unlocking
the value in data is the key to providing value to the business. Too often IT
infrastructure folks focus on data capacity or throughput speed. Business
Intelligence vendors extol the benefits of executive-only dashboards and
visually stunning graphical reports. While both perspectives have some merit,
they only play a limited role in the overall mission of bringing real value to
those in the company who need it.
Value is
added by using an approach and platform to bring Big Data into the hands of
those who need it in a fast, agile manner to answer the right business
questions at the right time. Knowing what data is needed to answer questions
and where to find it is critical; having the analytic tools to capitalize on
that knowledge is even more critical. It is through those platforms that real
value is realized from Big Data.
In the Big Data world, technology alone doesn’t generate real value from Big
Data. Data analysts, empowered with the right analytic technology platform,
humanize Big Data, which is how companies realize value. Analytic platforms and
tools make extracting value from Big Data possible. Important benefits to
businesses that the analytics platform should provide:
- Improving the self-sufficiency of decision makers to run and share analytic applications with other data users.
- Enabling data analysts who understand the business to develop good analytic applications that are shared for everyone’s benefit.
- Injecting Big Data into strategic decisions without waiting months for an IT infrastructure and data project. The tool should put the data into the hands of decision makers so that businesses can identify and capitalize on opportunities.
- Delivering the power of predictive analytics to everyone, not just a few executive decision makers far removed from operations. Ensuring that the right data is readily available to all authorized parties leads to making the best possible decisions.
Big Data is, by nature, large data, usually from multiple sources. Some data
will come from internal sources, but increasingly data is coming from outside
sources.
Let’s start understanding the tools and techniques available. Share the ones you
use, and your experiences with them, at ravindrapande@gmail.com so that we can
make this blog a live and useful reference for the next chapter. Thanks a lot
for writing to me about my last blog post; I appreciated the feedback and
applied the changes accordingly. Feel free to visit
http://www.indiatrainingservices.in/ as well for suitable training.