Tuesday, May 30, 2017

Big data analytics



Big data analytics is an emerging topic and a pressing market need. In the simplest terms, big data analytics is the process of examining large amounts of data to uncover hidden patterns, correlations and other insights. With today’s technology, it’s possible to analyze your data and get answers from it almost immediately – an effort that’s slower and less efficient with more traditional business intelligence solutions.

Regardless of how one defines it, the phenomenon of Big Data is ever more present, ever more pervasive, and ever more important. There is enormous value potential in Big Data: innovative insights, improved understanding of problems, and countless opportunities to predict—and even to shape—the future. Data Science is the principal means to discover and tap that potential. Data Science provides ways to deal with and benefit from Big Data: to see patterns, to discover relationships, and to make sense of stunningly varied images and information.

Not everyone has studied statistical analysis at a deep level. People with advanced degrees in applied mathematics are not a commodity. Relatively few organizations have committed resources to large collections of data gathered primarily for the purpose of exploratory analysis. And yet, while applying the practices of Data Science to Big Data is a valuable differentiating strategy at present, it will be a standard core competency in the not so distant future. How does an organization operationalize quickly to take advantage of this trend? That is exactly what we should discuss. India Training Services has been listening to the industry and organizations, observing the multi-faceted transformation of the technology landscape, and doing direct research in order to create curriculum and content to help individuals and organizations transform themselves. For the domain of Data Science and Big Data Analytics, our educational strategy balances three things:
  • People, especially in the context of data science teams
  • Processes, such as the analytic lifecycle approach presented in this blog
  • Tools and technologies, in this case with an emphasis on proven analytic tools

The concept of big data has been around for years; most organizations now understand that if they capture all the data that streams into their businesses, they can apply analytics and get significant value from it. But even in the 1950s, decades before anyone uttered the term “big data,” businesses were using basic analytics (essentially numbers in a spreadsheet that were manually examined) to uncover insights and trends.
The new benefits that big data analytics brings to the table, however, are speed and efficiency. Whereas a few years ago a business would have gathered information, run analytics and unearthed information that could be used for future decisions, today that business can identify insights for immediate decisions. The ability to work faster – and stay agile – gives organizations a competitive edge they didn’t have before.
As analysts, let’s start with a definition of big data: Big Data consists of high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.
So we are discussing:

  • Volume: Size of data (how big it is)
  • Velocity: How fast data is being generated
  • Variety: Variation of data types to include source, format, and structure
  • Variability: How rapidly the data and its structure keep changing (a fourth V I have added to the usual three)

There is a lot of data, it is coming into the system rapidly, and it comes from many different sources in many different formats.  The definition may seem vague given that it is describing a technical item, but to accurately capture the scope of Big Data the definition itself must be “big.”
IT companies are investing billions of dollars into research and development for Big Data, Business Intelligence (BI), data mining, and analytic processing technologies. This fact underscores the importance of accessing and making sense of Big Data in a fast, agile manner. Big Data is important; those who can harness Big Data will have the edge in critical decision making. Companies utilizing advanced analytics platforms to gain real value from Big Data will grow faster than their competitors and seize new opportunities.

The scenario is changing: explosive data growth by itself does not fully describe how data is changing; the format and structure of data are changing as well. Rather than being neatly formatted, cleaned, and normalized data in a corporate database, data now arrives as raw, unstructured text via Twitter tweets on smartphones, spatial data from tracking devices, Radio Frequency Identification (RFID) readings, and audio and image files uploaded from smart devices.
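To make that shift concrete, here is a minimal Python sketch (the tweet-like fields and the flatten helper are my own illustrative assumptions, not any particular product’s API) of turning one raw, semi-structured record into a flat row an analytics tool can consume:

import json

# A raw, semi-structured record, e.g. a tweet-like JSON payload (illustrative only)
raw = '{"user": {"id": 42, "location": "Pune"}, "text": "loving the monsoon", "geo": null}'

def flatten(payload: str) -> dict:
    """Turn a nested, loosely structured record into a flat row for analysis."""
    doc = json.loads(payload)
    return {
        "user_id": doc.get("user", {}).get("id"),
        "location": doc.get("user", {}).get("location"),
        "text": doc.get("text", ""),
        "has_geo": doc.get("geo") is not None,
    }

print(flatten(raw))   # {'user_id': 42, 'location': 'Pune', 'text': 'loving the monsoon', 'has_geo': False}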
As a mission-critical example, NASA has reportedly accumulated so much data from space probes, generating such a backlog, that scientists have difficulty processing and analyzing the data before the storage media it resides on physically degrades.
Traditional BI tools that rely exclusively on well-defined data warehouses are no longer sufficient. A well-established RDBMS does not effectively manage large datasets containing unstructured and semi-structured formats. To support Big Data, modern analytic processing tools must:
  • Shift away from traditional, rearward-looking BI tools and platforms to more forward-thinking analytic platforms
  • Support a data environment that is less focused on integrating with only traditional, corporate data warehouses and more focused on easy integration with external sources
  • Support a mix of structured, semi-structured, and unstructured data without complex, time-consuming IT engineering efforts
  • Process data quickly and efficiently to return answers before the business opportunity is lost
  • Present the business user with an interface that doesn’t require extensive IT knowledge to operate

Fortunately, IT vendors and the open-source community are stepping up to the challenge of Big Data and have created tools that meet these requirements. Popular software tools include:
  • Hadoop: Open-source software from the Apache Software Foundation to store and process large non-relational data sets via a large, scalable distributed model. Commercialized Hadoop distributions are also available.
  • NoSQL: A class of database systems optimized to process large unstructured and semi-structured data sets. Commercialized NoSQL distributions are also available.
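To give a flavour of how such platforms are driven, here is a minimal PySpark sketch; the file name access_logs.csv and its source column are placeholders I have assumed, and it presumes a local Spark installation:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a local Spark session ("BigDataSketch" is an arbitrary app name)
spark = SparkSession.builder.appName("BigDataSketch").getOrCreate()

# "access_logs.csv" and its "source" column are hypothetical inputs
df = spark.read.csv("access_logs.csv", header=True)

# Group and count; Spark distributes this work across however many nodes are available
counts = df.groupBy("source").count().orderBy(F.desc("count"))
counts.show(10)

spark.stop()

The same group-and-count pattern scales from a laptop to a full Hadoop/Spark cluster without code changes, which is much of the appeal of these platforms.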
The impact of cloud computing on Big Data is huge. Data sources can be from public, private, or community clouds. For example, customer demographic data can come from a public cloud, but complex scientific collection information or industry-sensitive data might come from community clouds. Any Big Data analytic platform should be able to access any cloud platform and be able to publish results to any environment.
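As a small sketch of pulling such data from a public cloud store, here is the idea using Amazon S3 via boto3; the bucket and object key are placeholders, and it assumes cloud credentials are already configured:

import boto3

# Assumes AWS credentials are configured in the environment
s3 = boto3.client("s3")

# Hypothetical bucket and key holding a customer demographics extract
obj = s3.get_object(Bucket="demo-analytics-bucket", Key="demographics/2017-05.csv")
text = obj["Body"].read().decode("utf-8")

# Hand the first few rows to whatever analytics layer you prefer
print(text.splitlines()[:5])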
Unlocking the value in data is the key to providing value to the business. Too often IT infrastructure folks focus on data capacity or throughput speed. Business Intelligence vendors extol the benefits of executive-only dashboards and visually stunning graphical reports. While both perspectives have some merit, they only play a limited role in the overall mission of bringing real value to those in the company who need it.
Value is added by using an approach and platform to bring Big Data into the hands of those who need it in a fast, agile manner to answer the right business questions at the right time. Knowing what data is needed to answer questions and where to find it is critical; having the analytic tools to capitalize on that knowledge is even more critical. It is through those platforms that real value is realized from Big Data.
In the Big Data world, technology alone doesn’t generate real value. Data analysts, empowered with the right analytic technology platform, humanize Big Data, which is how companies realize value. Analytic platforms and tools make extracting value from Big Data possible. Important benefits that an analytics platform should provide to businesses include:

  • Improving the self-sufficiency of decision makers to run and share analytic applications with other data users.
  • Data analysts who understand the business should develop good analytic applications that are shared for everyone’s benefit
  • Injecting Big Data into strategic decisions without waiting months for an IT infrastructure and data project. The tool should put the data directly into the hands of decision makers so that businesses can identify and capitalize on opportunities.
  • Delivering the power of predictive analytics to everyone, not just a few executive decision makers far removed from operations. Ensuring that the right data is readily available to all authorized parties leads to making the best possible decisions


The nature of Big Data is large data, usually from multiple sources. Some data will come from internal sources, but increasingly data is coming from outside sources.

Let’s start exploring the tools and techniques available; try them out and share your experiences at ravindrapande@gmail.com so that we can keep this blog alive and useful as a reference for the next chapter. Thanks a lot for writing to me about my last blog; I have appreciated the feedback and applied the changes accordingly. Feel free to visit http://www.indiatrainingservices.in/ as well for suitable training.

Monday, May 15, 2017

Machine-to-machine communications



Machine-to-Machine (M2M) communication is the next-generation telemetry, used for automatic transmission of data gathered from remote sensors to a central unit for analysis, either by human beings or software agents. Unlike traditional Human-to-Human (H2H) communication, the human is not the typical initiator of the communication process; the human is merely the recipient and possibly the respondent for the output. In contrast to conventional telemetry, M2M encompasses a broad spectrum of applications rather than being relegated to highly esoteric applications such as aerospace, water treatment and natural gas pipeline monitoring. Furthermore, M2M communications systems are composed of a myriad of machines that are connected to the Internet using public fixed and/or wireless communications infrastructure. The latest commercial forecasts are for fifty billion machines connected to the Internet worldwide by the end of the decade.

A machine-to-machine (M2M) communications eco-system is a large-scale network with diverse applications and a massive number of interconnected heterogeneous machines (e.g., sensors, vending machines and vehicles). Cellular wireless technologies will be a potential candidate for providing the last-mile M2M connectivity. Thus, the Third-Generation Partnership Project (3GPP) and IEEE 802.16p have both specified an overall cellular M2M reference architecture. The European Telecommunications Standards Institute (ETSI), in contrast, has defined a service-oriented M2M architecture. This article reviews and compares the three architectures. It turns out that the 3GPP and 802.16p M2M architectures, which are functionally equivalent, complement the ETSI one. Therefore, we propose combining the ETSI and 3GPP architectures, yielding a cellular-centric M2M service architecture. Our proposed architecture advocates the use of M2M relay nodes as data concentrators.

The M2M relay implements a tunnel-based aggregation scheme which coalesces data from several machines destined for the same tunnel exit-point. The aggregation scheme is also employed at the M2M gateway and the cellular base station. Numerical results show a significant reduction in protocol overheads compared to not using aggregation, at the expense of added packet delay. However, the delay decreases rapidly with increasing machine density.
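A toy Python sketch of the concentrator idea follows; the batch size and message format are my own assumptions for illustration, not anything from the 3GPP or ETSI specifications:

import json

class M2MRelay:
    """Toy data concentrator: buffer small machine messages and forward one
    aggregated payload per tunnel exit-point, trading a little delay for
    far fewer protocol headers on the air interface."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.buffers = {}                     # exit_point -> pending messages

    def ingest(self, exit_point, message):
        batch = self.buffers.setdefault(exit_point, [])
        batch.append(message)
        if len(batch) >= self.batch_size:     # enough queued: aggregate now
            return self.flush(exit_point)
        return None                           # otherwise the message waits (delay)

    def flush(self, exit_point):
        # One uplink payload instead of batch_size separate transmissions
        return json.dumps(self.buffers.pop(exit_point, []))

relay = M2MRelay(batch_size=3)
result = None
for i in range(3):
    result = relay.ingest("gateway-A", {"sensor": i, "temp": 20 + i})
print(result)   # a single JSON array carrying all three readings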

Let’s discuss these one by one, starting with the underlying communications.
Machine-to-machine (M2M) communication allows machines and devices to pass along small amounts of information to other machines. This includes communication to and from smoke detectors, door locks, alarms, water meters, agricultural sensors, smart buildings, smart lighting, environmental sensors, and more. Every IoT application has a different set of constraints in terms of wireless range and energy consumption it needs to achieve. Therefore, M2M network architecture is about properly utilizing radio resources. Each network listed below utilizes a different method for handling these resources. Cellular, for instance, is the only type of ubiquitous M2M network that uses its own licensed frequency space. The rest typically coexist using free, unlicensed frequencies. Due to regulatory constraints, companies are not allowed to design their networks to have an unfair advantage over other networks, so the question for these companies when creating network architecture is how to utilize the unlicensed spectrum efficiently.
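To get a feel for how small these M2M payloads are, here is a minimal sketch using the paho-mqtt Python client (MQTT is just one common transport; the broker address, topic and payload fields are placeholders):

import json
import paho.mqtt.client as mqtt

client = mqtt.Client()                         # paho-mqtt 1.x style constructor
client.connect("broker.example.com", 1883)     # placeholder broker address

# A typical M2M report is only a few dozen bytes
payload = json.dumps({"device": "smoke-01", "alarm": False, "battery": 0.87})
client.publish("building/3/floor/2/smoke", payload, qos=1)

client.disconnect()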

Below, we’ll walk through the benefits and considerations of a few M2M network architectures currently in use. As you can see, there are many IoT networks available. Each of them is trying a unique approach to solve a standard engineering problem: how to trade off cost, performance, and complexity. Every engineer knows you can’t have the best of all of those things—but you can create a network that will cater to specific applications. We’re eager to see how these network architectures improve, evolve, and grow in the coming years. 

Cellular communication (communication carried over the cellular network) has dominated the M2M space for a long time. The primary benefit of cellular is its ubiquitous coverage, but its major disadvantages are short battery life, high-cost end points, and high recurring fees. Any battery-powered application will have a hard time using a cell modem. Cellular networks are constantly changing, as well. For example, when M2M started, most of the cellular world was using GSM-based technology (which is now being phased out). GSM has mostly been replaced by 3G and LTE, and there is talk that those technologies for M2M applications will eventually be phased out and replaced by LTE-M. So, companies who deployed cellular modems should be aware that their hardware may not be supported in coming years.

A great way to understand this is to visit AT&T’s starter kit program for IoT enthusiasts with industrial applications, which started rolling out in late 2016: https://starterkit.att.com/
 
WiFi has become a more prevalent M2M option in the last five years. This is due in part to new WiFi chip manufacturers who are now targeting the space by making lower cost, lower power chip sets with a very simple interface. With these new chips, you don’t need a computer and a WiFi driver; you can use a universal asynchronous receiver/transmitter (UART) instead. But while cellular coverage is ubiquitous, WiFi coverage is not, which is one of WiFi’s main downfalls in the M2M market. For example, if you’re building a keycard door lock for every apartment in a New York high-rise and using WiFi, provisioning is going to be a nightmare.
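A minimal pyserial sketch of that UART idea is below; the port name, baud rate and the AT-style command are assumptions about a hypothetical WiFi module, not a specific chipset’s command set:

import serial

# Open the UART the WiFi module is wired to (port and baud rate are placeholders)
with serial.Serial("/dev/ttyUSB0", 115200, timeout=2) as port:
    port.write(b"AT+SEND=23.5\r\n")   # hypothetical AT-style command pushing one reading
    reply = port.readline()           # module's response, if any
    print(reply)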

Bluetooth: the option that has become available in the last four years is Bluetooth Low Energy (BLE), also called Bluetooth 4.0 or Bluetooth Smart. BLE uses considerably less power than traditional Bluetooth, but like its predecessor, it is fairly limited in range and packet size. BLE is meant to relay only very small bits of information to the Internet through a phone or computer. That makes BLE ideal for applications like heart rate monitors or fitness trackers, but not for anything that needs more power or a wider range.
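As a sketch of how a phone or computer might collect those small readings over BLE, here is the idea using the cross-platform bleak Python library; the device address is a placeholder, and the UUID is the standard Heart Rate Measurement characteristic:

import asyncio
from bleak import BleakClient

HR_MEASUREMENT = "00002a37-0000-1000-8000-00805f9b34fb"  # standard Heart Rate Measurement UUID
ADDRESS = "AA:BB:CC:DD:EE:FF"                            # placeholder device address

def on_heart_rate(_, data: bytearray):
    # When the flags byte indicates an 8-bit value, byte 1 carries the heart rate
    print("heart rate:", data[1])

async def main():
    async with BleakClient(ADDRESS) as client:
        await client.start_notify(HR_MEASUREMENT, on_heart_rate)
        await asyncio.sleep(10)                          # listen for a few notifications
        await client.stop_notify(HR_MEASUREMENT)

asyncio.run(main())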

ZigBee is a mesh network protocol that is trying to solve the issue of range. While it offers considerably better range than something like BLE, there are range constraints and downfalls that come with the mesh network. For example, some of the nodes in a mesh network are there just to relay information, which causes a constant (and somewhat unnecessary) power draw. This makes ZigBee a bad candidate for battery-powered devices but good for something like electric grid monitoring, which has an unlimited power source. In short, ZigBee continues to be adopted by some niche markets, but it won’t meet the needs of everyone in the M2M space.

The low power, wide-area network (LPWAN) space has recently become more saturated, and right now the leader in the group is SIGFOX. This M2M network sends small, slow bursts of data, which makes it ideal for things like alarm systems or simple meters. Due to its asymmetric link budget, the network only allows limited bi-directionality, so it isn’t able to send data back from the gateway to nodes at the fringes of the network. (This is a problem other LPWAN players are looking to solve.)

LoRaWAN is the M2M protocol created by the LoRa Alliance to create an ecosystem of M2M applications all using the LoRa physical layer. Like SIGFOX, LoRaWAN is an uplink-focused network and thus works well for sensor-based devices. This is partially due to regulations in Europe, which hold every device (including the gateway) to a 1% duty cycle. Because of the regulatory differences here in the U.S., a big segment of the market can be addressed by designing a protocol that allows more “command and control”-based applications. And that’s where we at Link Labs have tried to put our focus.
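A quick back-of-the-envelope sketch of what a 1% duty cycle implies for a node (the per-message airtime is an assumed figure, purely for illustration):

# A 1% duty cycle means a device may occupy the channel at most 36 seconds per hour
DUTY_CYCLE = 0.01
AIRTIME_PER_MSG_S = 1.5          # assumed on-air time of one uplink, for illustration

allowed_airtime = 3600 * DUTY_CYCLE
max_msgs_per_hour = allowed_airtime / AIRTIME_PER_MSG_S
print(f"{allowed_airtime:.0f} s of airtime per hour -> about {max_msgs_per_hour:.0f} messages/hour")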

Symphony Link is the IoT network we at Link Labs developed in an effort to solve some of the challenges presented by other M2M architectures. For instance, a single Symphony gateway can be used to talk to 10,000 nodes, and thus cover an entire building. Symphony also targets battery life; a node on our network that sends a message every 10 minutes could feasibly last between eight and ten years depending on the application.
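For a rough sense of where such battery-life figures come from, here is a back-of-the-envelope estimate; every current-draw and capacity number below is my own illustrative assumption, not a Symphony Link specification:

# Rough battery-life estimate for a node sending one message every 10 minutes.
# Every number here is an illustrative assumption, not a vendor specification.
BATTERY_MAH  = 8500              # e.g. a D-size lithium thionyl chloride cell
SLEEP_UA     = 5                 # sleep current in microamps
TX_MA        = 35                # radio current while transmitting, in milliamps
TX_TIME_S    = 1.5               # seconds of radio activity per message
MSGS_PER_DAY = 24 * 6            # one message every 10 minutes

# Average current = sleep floor + duty-cycled transmit current (in mA)
avg_ma = SLEEP_UA / 1000.0 + TX_MA * (TX_TIME_S * MSGS_PER_DAY) / 86400.0
years = BATTERY_MAH / avg_ma / 24.0 / 365.0
print(f"Average draw: {avg_ma:.3f} mA -> roughly {years:.1f} years")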

This is just the first part on IoT communication; many more are to come as the market evolves. Feel free to contact me at ravindrapande@gmail.com. I would like to know if I am missing some important angle in this technology, and your views on my writing as well.