On 26 February 2013, Intel announced that it had made available the Intel Distribution for Apache Hadoop, which includes Intel Manager for Apache Hadoop Software. Intel stated that more than 20 hardware, software and services partners would support the launch of this open platform, including SAP, SAS and Tableau, as well as others in the areas of engineering, co-marketing and sales.
The Intel distribution and Microsoft's and EMC's recent Hadoop-related announcements signal a new phase in Hadoop’s maturation. Most major players in this market have their own distributions or partner with pure-play providers, but Gartner believes that the market could consolidate further as hardware-based performance improvements influence organizations' purchase decisions.
Intel is no stranger to Apache Hadoop, having supplied a distribution to customers in China since mid-2012. To differentiate its Hadoop offering, Intel is using deep hardware-level enhancements and proprietary Intel setup and management tools to address the performance, security and management challenges associated with Hadoop clusters. The company aims to exploit its hardware prowess and its microprocessor architecture to enable high-performance, enterprise-class features for the Hadoop stack, such as networking and I/O optimizations on Xeon processors and hardware-based encryption using AES New Instructions (AES-NI). Intel's Hadoop distribution will also leverage:
- Its innovations within the open API model of Apache Hadoop, and its contributions to open-source projects such as HiBench and HiTune, which help benchmark and tune Hadoop clusters.
- Its recent acquisition of Whamcloud, which commercialized the Lustre file system. Lustre is popular in supercomputing environments, offering significant I/O improvements over the Hadoop Distributed File System (HDFS) as well as a standard POSIX interface.
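A key point about the AES-NI encryption mentioned above is that it is transparent to the Java code that makes up the Hadoop stack: on AES-NI-capable Xeon processors, recent JVMs accelerate standard `javax.crypto` AES operations via compiler intrinsics, with no application changes. The following minimal sketch (illustrative only; class name and sample data are hypothetical, not from Intel's distribution) shows the kind of ordinary AES/CTR round trip that would be hardware-accelerated in such an environment:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class AesNiDemo {

    // Encrypts and decrypts a sample buffer with AES in CTR mode and
    // reports whether the round trip preserved the data. On AES-NI
    // hardware, a modern JVM accelerates these calls transparently.
    static boolean roundTrip() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128); // AES-NI accelerates 128/192/256-bit keys alike
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[16]; // CTR mode uses a 16-byte counter block
        new SecureRandom().nextBytes(iv);

        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] plaintext =
            "sample Hadoop block data".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = enc.doFinal(plaintext);

        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return Arrays.equals(plaintext, dec.doFinal(ciphertext));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip()); // prints "true"
    }
}
```

Because the acceleration happens below the JCA API, a distribution vendor can deliver hardware-assisted encryption of Hadoop data without altering application code.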
For Intel, the Hadoop distribution offers both new opportunities and a way to influence the playing field by:
- Stemming potential moves away from its silicon and toward competitors such as AMD and ARM for underpinning big data infrastructures.
- Using OEM relationships with server and big data appliance vendors such as Cisco, Dell, EMC, HP, IBM and Oracle to influence software engineering optimization for x86-based Apache Hadoop scale-out clusters and commodity hardware.
Intel will face significant entry barriers:
- Its team for delivering professional and implementation services is relatively unproven compared with established pure-play Hadoop distribution vendors (such as Cloudera, Hortonworks and MapR) and vendors that focus on large enterprises.
- It will need time and resources to build out its partner ecosystem.
- It will compete with similar enterprise-targeted efforts, such as EMC's Pivotal HD distribution, which adds full ANSI SQL support to a Hadoop stack, and Microsoft's Windows-based HDInsight distribution, launched with Hortonworks.
In the future, Gartner expects Intel to target the growing group of service providers offering Hadoop as a service in the cloud, currently led by Amazon's Elastic MapReduce. We expect other public and private cloud providers to offer this service in 2013 and beyond.
Carefully evaluate the integration and support Intel offers for the hardware and software platforms in your big data environment, as well as its capabilities for turnkey big data projects that will require extensive integration, consulting and training services.
Ensure that Intel's distribution includes and supports the Apache projects necessary for your use case.
Outline support and maintenance service-level agreements for production deployments, and stipulate contractually that Intel must meet them.
"How to Choose the Right Apache Hadoop Distribution"
— Vendors offer Apache Hadoop distributions with preintegrated projects, but different vendors offer different combinations, at differing release levels.
By Merv Adrian
"Six Best Practices for Apache Hadoop Pilot Projects"
— Identifying the right use cases, choosing the appropriate distribution or projects to use, bridging the acute skills shortage and dealing with data challenges can limit an organization's ability to reap the benefits of Hadoop.
By Arun Chandrasekaran and Merv Adrian