Gartner Research

An Introduction to and Evaluation of Apache Spark for Big Data Architectures

Published: 20 August 2018

ID: G00360975

Analyst(s): Sanjeev Mohan

Summary

Apache Spark is an open-source unified analytics engine that reduces the time between data acquisition and the delivery of business insights. Technical professionals can use its common APIs to build batch and streaming pipelines, data transformations, machine learning models and analytical reports.

Table Of Contents

Analysis

  • Background
    • What Is Spark?
    • Is Spark Going to Displace Hadoop?
  • Spark Architecture
    • Spark Components
    • Summary
  • Spark Use Cases
    • Data Ingest
    • Data Transform
    • Machine Learning
    • Advanced Analytics
  • Strengths
  • Weaknesses

Guidance

The Details

Gartner Recommended Reading

©2019 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without Gartner’s prior written permission. It consists of the opinions of Gartner’s research organization, which should not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or influence from any third party. For further information, see Guiding Principles on Independence and Objectivity.
