Welcome to the CIO Wiki. The IT Management Glossary. 

Big Data

Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. Big data may be as important to business, and to society, as the Internet has become. Why? More data may lead to more accurate analyses. More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiency, cost reductions and reduced risk. [1]

Data volumes are growing and the pace of that growth is accelerating. Sensor data, log files, social media and other sources have emerged, bringing a volume, velocity, and variety of data that far outstrips traditional data warehousing approaches. Forward-looking organizations are harnessing these new sources in creative ways to achieve unprecedented value and competitive advantage. It’s not as simple as putting all of this data in one place. The real business value of these “big data” sources is always unlocked through specific use cases and applications. Those applications can vary widely across departments and industries. While there are interesting technical challenges associated with integrating and managing all of this data, organizations should first take the time to identify and crystallize the right use case or use cases for their own business needs. This is a critical first step to understand the key business insights they stand to gain and the improved results they can achieve with those insights.

Big data analytics is the use of advanced analytic techniques against very large, diverse data sets that include different types such as structured/unstructured and streaming/batch, and different sizes from terabytes to zettabytes. Big data is a term applied to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency. Such data has one or more of the following characteristics: high volume, high velocity, or high variety. Big data comes from sensors, devices, video/audio, networks, log files, transactional applications, web, and social media, much of it generated in real time and at a very large scale. Analyzing big data allows analysts, researchers, and business users to make better and faster decisions using data that was previously inaccessible or unusable. Using advanced analytics techniques such as text analytics, machine learning, predictive analytics, data mining, statistics, and natural language processing, businesses can analyze previously untapped data sources, independently or together with their existing enterprise data, to gain new insights that result in significantly better and faster decisions. [2]

Defining Big Data
Big data typically refers to the following types of data:
• Traditional enterprise data – includes customer information from CRM systems, transactional ERP data, web store transactions, general ledger data.
• Machine-generated/sensor data – includes Call Detail Records (“CDR”), weblogs, smart meters, manufacturing sensors, equipment logs (often referred to as digital exhaust), trading systems data.
• Social data – includes customer feedback streams, micro-blogging sites like Twitter, social media platforms like Facebook. [3]

Five High Value Uses for Big Data
IBM has conducted surveys, studied analysts’ findings, talked with more than 300 customers and prospects and implemented hundreds of big data solutions. As a result, it has identified the top five high value use cases, which could form first steps into big data, as follows:
1. Big data exploration: find, visualize and understand big data to improve decision making
2. 360-degree view of the customer: enhance the existing customer view by incorporating internal and external information sources
3. Security/intelligence extension: reduce risk, detect fraud and monitor security in real time
4. Operations analysis: analyze a variety of machine data for better business results and operational efficiency
5. Data warehouse augmentation: integrate big and traditional data warehouse capabilities to gain new business insights while optimizing the existing warehouse infrastructure [4]

Why is Big Data Important?
Big data analytics helps organizations harness their data and use it to identify new opportunities. That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier customers. In his report Big Data in Big Companies, IIA Director of Research Tom Davenport interviewed more than 50 businesses to understand how they used big data. He found they got value in the following ways:
• Cost reduction. Big data technologies such as Hadoop and cloud-based analytics bring significant cost advantages when it comes to storing large amounts of data – plus they can identify more efficient ways of doing business.
• Faster, better decision making. With the speed of Hadoop and in-memory analytics, combined with the ability to analyze new sources of data, businesses are able to analyze information immediately – and make decisions based on what they’ve learned.
• New products and services. With the ability to gauge customer needs and satisfaction through analytics comes the power to give customers what they want. Davenport points out that with big data analytics, more companies are creating new products to meet customers’ needs. [5]

[Image. Source: SAS]

Practical Big Data Benefits
  • Dialogue with consumers: Today’s consumers are a tough nut to crack. They look around a lot before they buy, talk to their entire social network about their purchases, demand to be treated as unique and want to be sincerely thanked for buying your products. Big Data allows you to profile these increasingly vocal and fickle little ‘tyrants’ in a far-reaching manner so that you can engage in an almost one-on-one, real-time conversation with them. This is not actually a luxury. If you don’t treat them the way they want to be treated, they will leave you in the blink of an eye.
    Just a small example: when a customer enters a bank, Big Data tools allow the clerk to check the customer’s profile in real time and learn which relevant products or services to recommend. Big Data will also have a key role to play in uniting the digital and physical shopping spheres: a retailer could suggest an offer on a mobile carrier on the basis of a consumer indicating a certain need on social media.
  • Re-develop your products: Big Data can also help you understand how others perceive your products so that you can adapt them, or your marketing, if need be. Analysis of unstructured social media text allows you to uncover the sentiments of your customers and even segment those in different geographical locations or among different demographic groups.
    On top of that, Big Data lets you test thousands of different variations of computer-aided designs in the blink of an eye so that you can check how minor changes in, for instance, material affect costs, lead times and performance. You can then raise the efficiency of the production process accordingly.
  • Perform risk analysis: Success depends not only on how you run your company. Social and economic factors are crucial for your accomplishments as well. Predictive analytics, fueled by Big Data, allows you to scan and analyze newspaper reports or social media feeds so that you stay permanently up to speed on the latest developments in your industry and its environment. Detailed health checks on your suppliers and customers are another goodie that comes with Big Data. They allow you to take action when one of them is at risk of defaulting.
  • Keeping your data safe: You can map the entire data landscape across your company with Big Data tools, thus allowing you to analyze the threats that you face internally. You will be able to detect potentially sensitive information that is not protected in an appropriate manner and make sure it is stored according to regulatory requirements. With real-time Big Data analytics you can, for example, flag any situation where 16-digit numbers (potentially credit card data) are stored or emailed out and investigate accordingly.
  • Create new revenue streams: The insights that you gain from analyzing your market and its consumers with Big Data are not just valuable to you. You could sell them as non-personalized trend data to large industry players operating in the same segment as you and create a whole new revenue stream. One of the more impressive examples comes from Shazam, the song identification application. It helps record labels find out where music sub-cultures are arising by monitoring the use of its service, including the location data that mobile devices so conveniently provide. The record labels can then find and sign up promising new artists or remarket their existing ones accordingly.
  • Customize your website in real time: Big Data analytics allows you to personalize the content or look and feel of your website in real time to suit each consumer entering your website, depending on, for instance, their sex, nationality or how they arrived at your site. The best-known example is probably offering tailored recommendations: Amazon’s use of real-time, item-based collaborative filtering (IBCF) to fuel its ‘Frequently bought together’ and ‘Customers who bought this item also bought’ features, or LinkedIn suggesting ‘People you may know’ or ‘Companies you may want to follow’. And the approach works: Amazon generates about 20% more revenue via this method.
  • Reducing maintenance costs: Traditionally, factories estimate that a certain type of equipment is likely to wear out after so many years. Consequently, they replace every piece of that technology within that many years, even devices that have much more useful life left in them. Big Data tools do away with such impractical and costly averages. The massive amounts of data that they access and use, together with their unequalled speed, make it possible to spot failing grid devices and predict when they will give out. The result: a much more cost-effective replacement strategy for the utility and less downtime, as faulty devices are tracked a lot faster.
  • Offering tailored healthcare: We are living in a hyper-personalized world, but healthcare seems to be one of the last sectors still using generalized approaches. When someone is diagnosed with cancer they usually undergo one therapy, and if that doesn’t work, the doctors try another, etc. But what if a cancer patient could receive medication that is tailored to his individual genes? This would result in a better outcome, less cost, less frustration and less fear. With human genome mapping and Big Data tools, it will soon be commonplace for everyone to have their genes mapped as part of their medical record. This brings medicine closer than ever to finding the genetic determinants that cause a disease and developing drugs expressly tailored to treat those causes — in other words, personalized medicine.
  • Offering enterprise-wide insights: Previously, if business users needed to analyze large amounts of varied data, they had to ask their IT colleagues for help as they themselves lacked the technical skills for doing so. Often, by the time they received the requested information, it was no longer useful or even correct. With Big Data tools, the technical teams can do the groundwork and then build repeatability into algorithms for faster searches. In other words, they can develop systems and install interactive and dynamic visualization tools that allow business users to analyze, view and benefit from the data.
  • Making our cities smarter: To deal with the consequences of their fast expansion, an increasing number of cities are leveraging Big Data tools for the benefit of their citizens and the environment. The city of Oslo in Norway, for instance, reduced street lighting energy consumption by 62% with a smart solution. Since the Memphis Police Department started using predictive software in 2006, it has been able to reduce serious crime by 30%. The city of Portland, Oregon, used technology to optimize the timing of its traffic signals and was able to eliminate more than 157,000 metric tonnes of CO2 emissions in just six years. [6]
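
The credit card example under "Keeping your data safe" can be made concrete with a small sketch. It is only an illustration, not how any particular Big Data product works: the function names and the sample text are hypothetical. It looks for 16-digit runs (optionally grouped in fours) and keeps only those that pass the Luhn checksum, which real card numbers satisfy, to cut down on false alarms from order numbers and similar identifiers.

```python
import re
from typing import List

# Assumption: a candidate is exactly 16 digits, optionally split into
# groups of four by single spaces or hyphens, and not part of a longer number.
CARD_CANDIDATE = re.compile(r"(?<!\d)(?:\d{4}[ -]?){3}\d{4}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_like_numbers(text: str) -> List[str]:
    """Return 16-digit numbers in text that pass the Luhn check (hypothetical helper)."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    sample = "Please charge 4111 1111 1111 1111 for order 1234567812345678."
    print(find_card_like_numbers(sample))
```

A hit from a scan like this is only a prompt for investigation, as the text says. Card numbers of other lengths exist, and a production scanner would run such a check inside whatever streaming or batch platform the organization already uses.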

Big Data Challenges
One of the reasons big data is so underutilized is that big data and big data technologies also present many challenges. One survey found that 55% of big data projects are never completed. This finding was echoed in a second survey, which found that the majority of on-premises big data projects aren’t successful.
  • Scalability: With big data, it’s crucial to be able to scale up and down on demand. Many organizations fail to take into account how quickly a big data project can grow and evolve. Constantly pausing a project to add resources cuts into time for data analysis. Big data workloads also tend to be bursty, making it difficult to predict where resources should be allocated. The extent of this big data challenge varies by solution. A solution in the cloud will scale much more easily and quickly than an on-premises solution.
  • Lack of Talent: Businesses are feeling the data talent shortage. Not only is there a shortage of data scientists, but successfully implementing a big data project also requires a sophisticated team of developers, data scientists and analysts who have enough domain knowledge to identify valuable insights. Many big data vendors seek to overcome this big data challenge by providing their own educational resources or by providing the bulk of the management.
  • Hadoop is Hard: While Hadoop and the surrounding ecosystem of tools are lauded for their ability to handle massive volumes of structured and unstructured data, the software isn’t easy to manage or use. Since the technology is relatively new, many data professionals aren’t familiar with how to manage Hadoop. Add to that the fact that Hadoop frequently requires extensive internal resources to maintain, and many companies are left devoting most of their resources to the technology rather than to the actual big data problem they are trying to solve. In the survey mentioned above, 73% of respondents claimed understanding the big data platform was the most significant challenge of a big data project.
  • Actionable Insights: Having more data doesn’t necessarily lead to actionable insights. A key challenge for data science teams is to identify a clear business objective and the appropriate data sources to collect and analyze to meet that objective. The challenge doesn’t stop there, however. Once key patterns have been identified, businesses must be prepared to act and make necessary changes in order to derive business value from them.
  • Data quality: Data quality is not a new concern, but the ability to store every piece of data a business produces in its original form compounds the problem. Dirty data costs companies in the United States $600 billion every year. Common causes of dirty data that must be addressed include user input errors, duplicate data and incorrect data linking. In addition to meticulous maintenance and cleaning, big data algorithms can themselves be used to help clean data.
  • Security: Keeping that vast lake of data secure is another big data challenge. Specific challenges include:
    1. User authentication for every team and team member accessing the data.
    2. Restricting access based on a user’s need.
    3. Recording data access histories and meeting other compliance regulations.
    4. Proper use of encryption on data in-transit and at rest.
  • Cost Management: It’s difficult to project the cost of a big data project, and given how quickly such projects scale, they can quickly eat up resources. The challenge lies in taking into account all costs of the project, from acquiring new hardware, to paying a cloud provider, to hiring additional personnel. Businesses pursuing on-premises projects must remember the cost of training, maintenance and expansion. Businesses running big data projects in the cloud must carefully evaluate the service-level agreement with the provider to determine how usage will be billed and whether there will be any additional fees. [7]
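
The data quality challenge above names user input errors and duplicate data as common causes of dirty data. As a rough sketch of the simplest kind of cleaning, the snippet below normalizes inconsistent input (case, stray whitespace) and then drops records whose normalized values collide. The record layout with `name` and `email` fields, and the function names, are assumptions made for illustration. Real pipelines typically need fuzzier matching and would run at scale on a distributed platform.

```python
from typing import Dict, List, Tuple


def normalize(record: Dict[str, str]) -> Tuple[str, str]:
    """Build a comparison key that ignores case and extra whitespace.

    Assumes each record has hypothetical 'name' and 'email' fields.
    """
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


def deduplicate(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record seen for each normalized key, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


if __name__ == "__main__":
    customers = [
        {"name": "Ada  Lovelace", "email": "ADA@example.com "},
        {"name": "ada lovelace", "email": "ada@example.com"},
        {"name": "Alan Turing", "email": "alan@example.com"},
    ]
    for row in deduplicate(customers):
        print(row)
```

Keeping the first occurrence is an arbitrary choice here. Which copy of a duplicate should win (the newest, the most complete) is a business decision rather than a technical one.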


See Also

Hadoop


References

1. What is Big Data on SAS Institute
2. What is Big Data Analytics
3. Big Data for the Enterprise
4. Smarter security intelligence: Leverage big data analytics to improve enterprise security
5. Why is big data analytics important?
6. Ten Practical Big Data Benefits
7. Big Data Challenges and Opportunity



Modified on 2016/08/11 19:49 by Sourabh Hajela  