Machine Learning Everywhere – the new normal for competitive advantage

Learn how to put the power of data science and machine learning to work for your business.

Research from Gartner

Five Ways Data Science and Machine Learning Deliver Business Impacts

Data science and machine learning can have a profound impact on a business, and are becoming critical for differentiation and sometimes survival. Being able to quickly classify that impact into one of the five categories this research presents will help data and analytics leaders further drive results.

Impacts

  • Data and analytics leaders should understand the powerful and tangible business benefits that data science projects can deliver prior to rounding up and engaging with their analytical teams, to have a greater chance of success.

Recommendations

Data and analytics leaders responsible for analytics and BI strategies should:

  • Ensure that senior data scientists are part of innovation projects – only then can you be sure not to miss out on innovations that could be framed as data science projects.
  • Use your data science team to support production teams for continuously improving enterprisewide model management and performance monitoring.
  • Create a portfolio of analytical scenarios and use cases, including those that your organization is already executing or planning, to better rationalize funding decisions for data science projects.
  • Implement and maintain a continuous and intense dialogue between the data science team, the business functions (lines of business) and the executives charting the corporate strategy.

Strategic Planning Assumption

By 2020, more than 40% of data science tasks will be automated, resulting in increased productivity and broader usage by citizen data scientists.

Analysis

Because data is already everywhere and consistently growing in volume and complexity, data science problems are becoming increasingly prevalent. Some organizations face a large number of use cases to which data science could be applied. To better cope with the sheer mass of projects, some leading organizations are starting data science teams whose general mission is to become a shared resource across the organization.

Analytics themes still outrank other popular technology topics among executives' priorities. Organizations are actually funding that curiosity by increasing their investment in analytics from $31 billion in 2013 to an estimated $114 billion in 2018.1 The same study also confirmed that 60% of executives in 2016 believed that analytics would disrupt their industry within the next three years – and it has. According to McKinsey Global Institute, such disruption, fueled by analytics, could actually generate much bigger benefits and savings. In the U.S. for example, where healthcare spending is 18% of GDP, savings could amount to $600 annually per person, or 1% to 2% of GDP. In transportation, thanks to data-driven thinking and innovation initiatives (including high-scale real-time analytics), improvements to inefficient supply-demand matching could potentially create between $850 billion and $2.5 trillion in economic impact.2

Organizations with data science expertise that leverage such radical societal changes and disruptions can expect significant returns at their own level. Figure 1 summarizes the levels of disruption (and types of business value) that data science teams bring about.

Figure 1. Mission Statement of Data Science Teams

LOB = line of business; ROI = return on investment; SWAT = special weapons and tactics
Source: Gartner (October 2017)

At the macrolevel, data and analytics leaders can use data science projects to deliver the following high-level business impacts, which we discuss throughout the note in more detail:

  1. Innovation – Foster new thinking and business disruptions based on data science.
  2. Exploration – Explore unknown transformative patterns in data.
  3. Prototyping – Challenge the status quo with radical new solutions.
  4. Refinement – Continuously improve existing in-production solutions.
  5. Firefighting – Identify the drivers of certain upcoming situations.

At the microlevel, of course, data science projects and teams can contribute in many more ways:

  • Coaching citizen data scientists and validating their work.
  • Educating the entire organization to become more analytics-driven and moderating the data discussion.
  • Fostering networking, expertise sharing and innovation thinking across an organization.
  • Cataloging internal and relevant external data sources to improve decision making and precision in predictions and recommendations.
  • Suggesting data development initiatives for acquiring more data, whether internal or external, or through customer communities.
  • Evaluating analytical tools and service providers or recommending better integration of analytical assets in production environments.
  • Creating a portfolio of analytical scenarios and use cases, including those your organization is already executing or planning, to better rationalize funding decisions for data science projects.
  • Studying what your industry peers and adjacent industries are doing, and including these activities in your portfolio of analytical scenarios.

Figure 2 summarizes the top recommendation per area of impact, which we discuss in more detail below.

Figure 2. Impacts and Top Recommendations for Data and Analytics Leaders

BU = business unit
Source: Gartner (October 2017)

Impacts and Recommendations

Data and analytics leaders should understand the powerful and tangible business benefits that data science projects can deliver prior to rounding up and engaging with their analytical teams, to have a greater chance of success

1. Innovation – Foster New Thinking Based on Data Science

Without data scientists and their knowledge, many issues surrounding the digital business age will remain unresolved – possibly even untouched. Data scientists frame complex business problems as machine-learning or operations research problems. Data scientists know which new information sources should be collected or acquired from external sources, to solve old and pivotal business issues in radically new ways. Some of those iconoclastic ideas can find their way to the most unexpected places; think of Moneyball, the 2003 book and 2011 movie, where sabermetrics was popularized to completely question the old method of evaluating the performance of individuals and teams in baseball.3

There are many more examples of disruptive projects and new business moments (see Note 2 for a business moments definition) that are made possible through data science.

Case Examples: Innovation

  • In the mid-1990s, Amazon started one of the earliest recommendation services ("here are four other items that customers buying this product also bought"). This became one of the most prominent and lucrative data science projects in history. Rumor has it that 15% to 20% of Amazon's retail business is due to this simple product recommendation. In fact, it became a desirable feature, with customers wanting to explore related items for any given product.4
  • UPS On-Road Integrated Optimization and Navigation (ORION) revamped route optimization using many new data sources. It has enabled UPS to significantly improve its routing schedules, saving hundreds of millions of dollars per year while improving customer service.5
  • IBM Watson's Jeopardy-winning natural-language system was based on crowdsourced data and cutting-edge assembly of different machine-learning and natural-language approaches.6
  • While using machine learning to predict and reduce engine repair costs (saving $63 million in two years), a U.S.-based aircraft engine manufacturer realized that it could better estimate fuel consumption and engine usage time by uploading in real-time thousands of sensor data points to the cloud. It therefore switched its entire business model from selling engines to plane manufacturers to "leasing hours of flight" while guaranteeing fuel consumption levels – a major expense for airlines. This was a revolutionary business move.7

Companies also use data and the corresponding analytics in novel ways. For example, Progressive was one of the first insurers to create an insurance product that used GPS-based location intelligence to keep it better informed about the actual risks against which it is insuring.

Many online companies have been masters of data-driven innovation. The likes of Amazon, Google, Airbnb, Uber and Facebook constantly introduce new systems to collect better information. This enables them to create better or new services.

Recommendations for data and analytics leaders:

  • Use your data science team to frame complex business problems not yet sufficiently understood as data science problems.
  • Find inspiration for data-driven innovation from three sources:
    • Internal curiosity – You are your most important source of inspiration. Constantly think about your own business model, industry (or other industries) and understanding of new types of customer or equipment interaction points – that is, keep dreaming up those new business moments through "what if" scenarios.
    • Technology screening – Learn what you can from successful case studies from your own industry or other industries. But be cautious: Many publicly available case studies may not fully reflect exactly what happened, so consider reaching out to the actual implementer.
    • Induction from data – Examine how data expeditions can support your thinking process, and how they can uncover novel and insightful patterns that teach you more about unsuspected underlying business mechanics.

2. Exploration – Explore Unknown Patterns in Data

Data scientists must engage with big data expeditions, especially when there is no clear objective other than to explore the data for insights and tidbits. Such expeditions are a form of inductive thinking or inductive reasoning (see Note 3) – an example of "letting the data speak." The process can be tactical and ad hoc. Alternatively, it can be part of a more systematic practice in which you give the data science team or lab (see Note 4) a data dump for diving into and exploring. The lab then looks for anomalies, seeking something new. It then drills deeper into the shape of the data using more-advanced techniques, which might include cluster and factor analysis, anomaly detection, regression, decision trees, Monte Carlo simulation and link analysis.
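
As an illustration of the first exploratory pass described above (looking for anomalies in a data dump), the following is a minimal Python sketch, not a method prescribed by this research. It flags outliers with a modified z-score based on the median and the median absolute deviation. The function name, the sample readings and the 3.5 cutoff are illustrative assumptions.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    The 3.5 cutoff is a common rule of thumb, used here as an assumption.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical sensor readings with one obvious outlier at index 5.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.4, 10.2, 10.1]
print(mad_anomalies(readings))
```

A flagged index is only a starting point: the team would then drill into that record with the more-advanced techniques listed above.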

Case Example: Exploration

  • While providing ship classification services, a Japanese maritime service provider realized that it was gathering very valuable asset management data. By leveraging advanced analytics methods and exploring its asset management data mother lode, the company discovered that it could help ship operators reduce equipment failures and lifetime maintenance costs by 10% as well as reduce anomaly detection costs by 90%. The provider increased its market share by 20% by offering that unexpected and valuable service.8

The objective of data exploration is always to discover which events are drivers – or inhibitors – of other events, or of good or bad outcomes (such as reducing equipment failure and increasing customer satisfaction). It could also lead to gaining an understanding of events that could be new customer touchpoints or engagement points. Such information could be used to foster data-based innovation.

However, these kinds of projects can be a bit like fishing expeditions. The available data may give hints about what you may gain from the process or give you a better understanding of underlying business mechanics. They could also help you uncover very valuable data assets seen to that point as merely data side effects (like in the Japanese maritime service provider case). Finally, those projects could validate that the data is clean or point to additional data sources to enhance internal sources.

Recommendations for data and analytics leaders:

  • Use your data science team to spot anomalies in data so that you anticipate problems rather than react after a crisis happens, much as police carry out regular patrols or people go for routine medical checks.
  • Ask your data science team to take another look at the data when new information sources become available or when you gain new understanding.
  • Organize internal or external "analytics competitions" such as hackathons to promote innovative thinking and uncover analytics talents.

3. Prototyping – Challenge the Status Quo With Radical New Solutions

Data science, and especially machine learning, excels at solving complex, data-rich business problems where traditional approaches, such as human judgment and exact solutions, are increasingly showing their limits because of escalating problem complexity and the ever-expanding volume of available data. Data science methods have often been proven to deliver superior results when the space of critical variables is high-dimensional and very noisy.

Data science teams could tackle hundreds of new business problems. Companies are already using data science teams for tasks such as:

  • Improving product categorization. Many large online retailers have realized that their product categorization may have errors or not align to the way customers think. Data science teams are seeking to improve this by using all available features, including look, shape and purpose codes (such as European Article Numbering and North American Industry Classification System codes), product text descriptions and user-generated tags.
  • Predicting passenger no-shows more accurately. More accurate predictions enable airlines to overbook their planes more safely. This minimizes potential lost revenue from empty seats as well as the risk of passengers arriving to find no seat available for them.
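
The overbooking trade-off in the last example can be made concrete with a little arithmetic. The Python sketch below is illustrative only: it assumes every ticketed passenger shows up independently with the same probability, which is a simplification a real no-show model would refine per passenger. The function names and the sample figures are hypothetical.

```python
from math import comb


def show_up_pmf(tickets, p_show):
    """Binomial probability that exactly k of the ticketed passengers show up."""
    return [
        comb(tickets, k) * p_show**k * (1 - p_show) ** (tickets - k)
        for k in range(tickets + 1)
    ]


def overbooking_risk(tickets, capacity, p_show):
    """Return (P(at least one bumped), expected bumped, expected empty seats)."""
    pmf = show_up_pmf(tickets, p_show)
    p_bump = sum(pmf[capacity + 1:])
    expected_bumped = sum(
        (k - capacity) * pmf[k] for k in range(capacity + 1, tickets + 1)
    )
    expected_empty = sum(
        (capacity - k) * pmf[k] for k in range(min(capacity, tickets) + 1)
    )
    return p_bump, expected_bumped, expected_empty


# Hypothetical flight: 100 seats and an assumed 92% show-up rate.
for sold in (100, 104, 108):
    p_bump, bumped, empty = overbooking_risk(sold, 100, 0.92)
    print(sold, round(p_bump, 3), round(bumped, 2), round(empty, 2))
```

Selling more tickets lowers the expected number of empty seats but raises the chance of turning a passenger away. A more accurate show-up estimate lets the airline pick the ticket count with less guesswork.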

Case Examples: Prototyping

  • In April 2015 a competition on Kaggle concerned the early detection of diabetic retinopathy.9 The competition challenged data scientists to design a model that would result in an enhanced automated detection system for this disease.
  • Facing a rising crime rate, a U.S.-based police department needed an efficient and cost-effective way to analyze crime data, assess public safety risks and make intelligent decisions about personnel deployment. It used predictive analytics to discover hidden relationships in the data and automatically generate crime forecasts. By optimizing the deployment of police forces, homicides in the city fell by 35% and robberies by 20% year over year. The solution ROI was estimated at 863%.10

Recommendations for data and analytics leaders:

  • Assess whether it would be best to design a radical new solution or to buy or outsource one. It is often better for the business to have a good solution now than a great solution in a year.
  • Be cautious when your data science team uses particular data for the first time. Some data was never intended for serious advanced analytics, so scrutinizing data lineage (including its legal validity) and making the data make sense are paramount.
  • Leverage automated machine-learning capabilities, which involve a metasearch that tweaks a set of acceptable solutions to increase lift, classification or estimation accuracy.
  • Involve line-of-business units and internal business partners as early as possible to determine the appropriate key performance indicators of those new solutions. Close the decision management loop by constantly monitoring (and adjusting the solutions to) their business results.
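
The "metasearch" behind automated machine learning mentioned above can be pictured as a loop that scores candidate model configurations on held-out data and keeps the best one. The Python sketch below is a deliberately small illustration under stated assumptions, not how any particular product works: the data is synthetic, the model is a one-dimensional k-nearest-neighbor classifier, and all function names are hypothetical.

```python
import random


def make_data(n, rng):
    """Synthetic 1-D data: label is 1 when x > 0.5, with 10% of labels flipped."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label  # inject label noise
        rows.append((x, label))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def holdout_accuracy(train, holdout, k):
    """Share of held-out rows that the k-NN model classifies correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in holdout)
    return hits / len(holdout)


def search_best_k(train, holdout, candidates):
    """Score every candidate k on held-out data; ties go to the smaller k."""
    scored = [(holdout_accuracy(train, holdout, k), k) for k in candidates]
    best_acc, best_k = max(scored, key=lambda s: (s[0], -s[1]))
    return best_k, best_acc


rng = random.Random(0)  # fixed seed so the run is repeatable
train, holdout = make_data(200, rng), make_data(100, rng)
print(search_best_k(train, holdout, [1, 3, 5, 9, 15]))
```

Real automated machine-learning tools search far larger spaces (algorithms, features and hyperparameters), but the principle of scoring candidates against a business-relevant metric is the same. That is why agreeing on the key performance indicators early matters.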

4. Refinement – Continuously Improve Existing In-Production Solutions

Most data scientists work in the production part of their business. In such areas, established models are already "in production." For example:

  • Banks, retailers, telcos and insurance companies are constantly refining their existing customer segmentation, in order to gain a better understanding of customer profitability, behavior and engagement optimization.
  • Retailers keep recalibrating propensity-to-buy models. Online retailers specifically are constantly improving and updating price elasticity prediction, in order to optimize their dynamic pricing.
  • Financial services providers are continuously working to improve their risk models – the more accurate their assessment of risk, the better their chances of profitability.

Case Examples: Refinement

  • As a race is taking place, an F1 racing team streams data to the cloud and shares it with the pit crew teams, who are equipped with mobile technology. The data is analyzed in real time by researchers at the engine manufacturer's R&D facility in Japan and the F1 team in the U.K. Transmitting this analysis using streaming technology as the race is taking place allows for adjustments to basic metrics such as temperature, pressure and power levels, which help improve the vehicle's performance.11
  • By embedding analytics (through fraud detection models) in its claims handling workflow, a U.S. insurance company cut referral times from 14 days to less than 24 hours on special investigation claims. It also identified and addressed subrogation claims more than twice as fast, in 10 days rather than 26, and losses passed for subrogation rose from a range of 15% to 16% to a range of 19% to 20%. The company saw a payback for its solution in three months, for an estimated ROI of 403%.12

As in all these use cases, organizations must constantly improve their data science practices as new data becomes available, as new products are created, and as consumers or ecosystem partners share data on the usage of these products. Other improvements are induced by changes in customer behavior – not only day by day or through different seasons, but also year after year, through competition, the zeitgeist and an ever-changing marketplace. Data science teams must also adjust to fast and constant changes around customer touchpoints, with new devices and wearables regularly released by equipment manufacturers and quickly adopted by consumers. Finally, new customer contextualization strategies can lead to better results, and require many existing models and data source inputs to be adjusted.

Recommendations for data and analytics leaders:

  • Make sure that the data science team stays close to the business units and keeps sharing its experience and ideas. In turn, ensure that business units keep the data science team aware of changing market and business conditions.
  • Use your data science team to support production teams in creating and improving enterprisewide model management and performance monitoring.
  • Use your data science team to help production teams create a more homogeneous and cutting-edge compute architecture in terms of hardware, cloud and software stack.
  • Ensure that your data science and production teams jointly explore the external data landscape and deploy cutting-edge algorithms (for example, ensemble techniques).

5. Firefighting – Identify the Drivers of Certain Upcoming Situations

It is not always possible (almost by definition) to avoid a crisis; its causes might be unpredictable or driven by a priori uncorrelated events. This situation is a variation of the exploration category. Many analytics projects are triggered by crises whose symptoms are usually well-identified, such as:

  • Customer complaints suddenly rising
  • Customer retention falling dramatically
  • Quality defects dramatically increasing
  • Profitability dropping precipitously

This means that the data science team has to identify "only" the cause, which narrows the datasets it must scrutinize.

Everything else in this use scenario is very similar to the work the data science lab does in "exploration" mode – that is, the lab does not know at the outset whether it can identify the cause of the problem. If the events are totally uncorrelated or rarely occurring issues, the lab may never be able to identify the cause.

Basic data discovery/self-service BI can often help. However, a deeper dive by a data science team can extract more from the data about what is really happening. For example:

  • Manufacturers worldwide are looking into the causes of quality fluctuations by combining "what if" analysis with sensitivity analysis or inversion of predictive models. Given the increasing complexity and change cadence of devices manufactured, prior data might not be readily available.
  • Technical support operations are trying to understand the drivers of maintenance costs. It is known that certain customer segments are more difficult to deal with than others. Factoring these risks into pricing can be crucial and is a well-established practice in the insurance industry – especially considering the dramatic changes and uncertainties brought by increasingly unpredictable weather patterns.
  • Online retailers are investigating the reasons why customers return purchased goods when prices are lower than the competition, delivery times are very competitive and the goods are of irreproachable quality.
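
To show what combining "what if" analysis with sensitivity analysis can look like in the manufacturing example above, here is a minimal Python sketch. It nudges one input at a time on a predictive model and ranks the inputs by how much the prediction moves. The `defect_rate` function is a made-up stand-in for a fitted model, and every name and coefficient in it is a hypothetical assumption.

```python
def defect_rate(temperature_c, pressure_bar, line_speed):
    """Hypothetical stand-in for a fitted quality model (not from the source)."""
    return (
        0.02
        + 0.0004 * (temperature_c - 180) ** 2
        + 0.01 * abs(pressure_bar - 5)
        + 0.001 * line_speed
    )


def one_at_a_time_sensitivity(model, baseline, rel_step=0.05):
    """Increase each input by rel_step in turn and rank inputs by effect size."""
    base = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        scenario = dict(baseline, **{name: value * (1 + rel_step)})
        effects[name] = model(**scenario) - base  # the "what if" delta
    return sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical current operating point of the production line.
baseline = {"temperature_c": 185.0, "pressure_bar": 5.5, "line_speed": 12.0}
for name, delta in one_at_a_time_sensitivity(defect_rate, baseline):
    print(f"{name}: {delta:+.4f}")
```

The ranking points to the inputs worth investigating first. Because each input is varied alone, the method ignores interactions between inputs, so it is a first screen rather than a full diagnosis.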

Recommendations for data and analytics leaders:

  • Apply the Occam's Razor principle: Data scientists should establish trust by applying the simplest methods that still deliver the key insight.13
  • Leverage firefighting projects to expand the data science team's corporate network whenever possible.
  • Build versatile skills and domain knowledge within the data science team, allowing members to be ultrareactive when their services are required.

From tactical and immediate impacts to strategic transformations and even disruptive ideas, data science and machine-learning projects can exert a profound influence on an organization. Impressive business impacts have been documented across industries showing that these technologies are becoming critical factors of differentiation and sometimes survival. Being able to quickly identify and categorize that impact can further improve on those already outstanding results and contributions.

Acronym Key and Glossary Terms

BI business intelligence

Evidence

1 "Cracking the Data Conundrum: How Successful Companies Make Big Data Operational," Capgemini Consulting, 14 January 2015.

2 "The Age of Analytics: Competing in a Data-Driven World," McKinsey Global Institute, December 2016.

3 M. Lewis, "Moneyball: The Art of Winning an Unfair Game," W. W. Norton & Co., 2003.

4 "How Retailers Can Keep Up With Consumers," McKinsey Quarterly, October 2013.

5 "ORION Backgrounder," UPS.

6 "IBM Puts Its Faith in Watson," E-Commerce Times, 20 January 2014.

7 "Pratt & Whitney Taps IBM to Capture Value of Big Data to Improve Aircraft Engine Performance," IBM, 17 July 2014.

8 "ClassNK, IHIMU, DU and IBM Develop Ship Maintenance Software," The Maritime Executive, 12 November 2012; "ClassNK Develops Ship Maintenance Software With IHIMU, DU and IBM," gCaptain, 12 November 2012.

9 The Diabetic Retinopathy Detection competition on Kaggle started on 17 February 2015 and was scheduled to finish on 27 July 2015. The California Healthcare Foundation sponsored it with a reward of $100,000.

10 "Memphis Police Department Reduces Crime Rates With IBM Predictive Analytics Software," IBM, 21 July 2010.

11 "Honda, Watson IoT and Formula 1," IBM, 22 November 2016.

12 "Infinity Property and Casualty Builds a Smarter System for Fraud," InformationWeek, 30 November 2011.

13 "Why You're Not Getting Value From Your Data Science," Harvard Business Review, 7 December 2016.

Source: Gartner Research Note G00343858, Erick Brethenoux, Alexander Linden, 19 October 2017

Note 1
Data Science

Data science is the discipline of extracting nontrivial knowledge from all kinds of data, to improve decision making. It involves a variety of steps, ranging from business understanding and data preparation to building and deploying analytic models. It is, to some extent, a replacement term for data mining, but is also much more: data science is the unification of several quantitative disciplines (statistics, machine learning, operations research, computational linguistics and others). For the first time, people trained in these different disciplines are all willing to unite behind the banner of data science – a very profound development.

During the past year, this notion of data science has become more widely used, and many more academic institutions now offer data science courses and degrees. In addition, organizations hiring data scientists and building data science teams and labs are on the rise. Gartner expects that within a few years, the term "data science" will gain widespread recognition as an umbrella term for many forms of sophisticated analytics.

Organizations that want to increase the maturity of their analytics and extend their portfolio of analytics capabilities need to improve their data science skills. They need to leverage new data sources and demonstrate business value using predictive and prescriptive (and often diagnostic) capabilities. However, organizations must recognize that data scientists are in very short supply – recruiting them internally may be difficult, but not impossible. They must also leverage their "citizen data scientists" in their lines of business to increase the reach and impact of analytics.

Data science drives a vast array of use cases across all industries – for example, customer relationship management, supply chain management, optimization and automation of diverse production processes, drug research, quality and risk management, smart cities, smart systems, and many more.

Note 2
Business Moments

Gartner defines a business moment as a transient opportunity that is exploited dynamically. It is very short in duration – perhaps only seconds, depending on the nature of the opportunity. This catalyst sets in motion a series of events involving people, businesses and "things" that span multiple industries and multiple ecosystems.

Note 3
Inductive Reasoning

Inductive reasoning aims to create broader generalizations from observations. Even though the facts that produce the generalization can be true, the generalization itself might not always be accurate. For example, if it has been sunny each time you have visited Dusseldorf in Germany, you might conclude – falsely – that it is always sunny in Dusseldorf.

Note 4
Data Science Labs/Teams

A data science lab is a team disconnected from – but close to – the BI competency center. Its individual members usually have different skills. For example, these might be in:

  • Advanced statistics
  • Business process engineering
  • Programming of distributed processing
  • Information architecture
  • Management

A data science team becomes a "lab" when you provide it with resources, such as server and storage sandboxes or relief from daily workload. It often has a ratio of solutions to "dead-end" efforts in the region of 1:10.