PC Hardware Reliability Improvements Lead to Longer Useful Life and Shorter Warranties
Improved PC hardware reliability suggests new best practices in PC hardware refresh rates and warranty coverage. This information is important to IT leaders who are trying to manage PC life cycles and warranty costs, or determine if a particular string of failures is above normal.
- All annual failure rates (AFRs) have improved significantly since Gartner's last benchmarking exercise in 2006.
- Desktop PC AFRs are running at low single-digit rates, averaging between 3% and 4% per year for the first four years of system life.
- Notebook AFRs are averaging 8% to 12% in the first year of use, with a slight increase every year afterward. There is a lot of variation depending on form factor, geography, type of user and usage pattern.
- For fully stationary PC hardware devices using solid-state drives (SSDs), the AFRs appear to be similar to devices using hard-disk drives (HDDs). In notebooks, the lack of mechanical parts in SSDs seems to provide a reliability advantage for users operating the notebooks when traveling or moving around.
- Consider buying only a single-year warranty (or the minimum allowed under local law) for desktop PCs with a time and materials maintenance plan for the rest of the system's life. Continue to match notebook warranty coverage to the expected useful life of the system.
- Consider extending the life of notebooks to four years, but recognize the types of users or usage patterns where the move may not be appropriate.
- Continue to specify SSD for performance gains over HDD, but do not expect significant reliability gains for any stationary or rarely moved systems. Use SSDs for notebooks that users will take with them when traveling or will operate while they are in motion (e.g., walking).
PC hardware reliability has improved significantly during the past few years, although there is a great deal of variability due to user types and usage patterns, especially in notebooks. Understanding the underlying reliability of different platforms and the factors that may cause the performance variability can help IT leaders make more informed decisions about replacement strategies and warranty coverage. Understanding the normal failure rates for various PC hardware platforms can help IT leaders determine whether a string of failures falls within expected limits or requires escalation to the hardware provider.
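One way to judge whether a string of failures falls within expected limits is to compare the observed count with what the benchmark AFR would predict. The sketch below is an illustration only, not part of the benchmark method. It assumes that failures are independent and follow a binomial distribution, and the fleet size, AFR and observed count are hypothetical values.

```python
from math import exp, lgamma, log


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k failures among n systems), computed in log space."""
    log_pmf = (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log(1.0 - p)
    )
    return exp(log_pmf)


def prob_at_least(failures: int, fleet_size: int, afr: float) -> float:
    """P(this many failures or more in one year), assuming independent failures."""
    p_fewer = sum(binomial_pmf(k, fleet_size, afr) for k in range(failures))
    return max(0.0, 1.0 - p_fewer)


# Hypothetical fleet: 500 desktop PCs at the 4% upper end of the desktop AFR range.
fleet_size, afr, observed = 500, 0.04, 32
print(f"expected failures per year: {fleet_size * afr:.0f}")
print(f"chance of {observed} or more: {prob_at_least(observed, fleet_size, afr):.4f}")
```

A very small probability suggests the failure count is unlikely to be normal variation and may be worth raising with the hardware provider. Failures that share a cause (for example, a bad component batch) violate the independence assumption, so the result is a screening aid rather than proof.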
Trying to find objective, independent PC failure rate data can be frustrating, unrewarding and often misleading. There is simply no publicly available source of PC or notebook hardware reliability figures, because PC OEMs consider the information proprietary and will not disclose it.
Many PC OEMs, component manufacturers and warranty repair providers have been willing to provide reliability information to Gartner specifically for benchmarking purposes. We have cross-checked this input against the feedback from Gartner clients, many of whom manage installed bases of 50,000 or more units and are willing to share their reliability experiences. While the information from all of these sources remains anecdotal, there is enough input gathered during a long-enough period to aggregate into a consistent industrywide estimate for business-class desktop PCs and notebooks.1
We define hardware failure as any repair incident that requires the replacement of a part. The part that is swapped out can be as trivial as a latch or as important as a motherboard (see Note 1). While there are many ways to measure failures, we look at AFRs for enterprise-class desktop PCs and notebooks. There is not yet enough history for enterprise-class tablets to draw conclusions about reliability.
In 2013, there have been significant improvements in AFRs across all platforms, compared with 2006 and 2004, which represented the nadir of PC hardware reliability (see Table 1 and Table 2). Not only are the AFRs lower now, but they are staying relatively flat compared with the historical trend of rising AFRs every year of a particular model's life.
[Table 1 and Table 2: annual failure rates, including columns for systems purchased in 2006 and systems purchased in 2004. Source: Gartner (November 2013)]
Desktop PCs are averaging 3% to 4% AFR throughout their life.
Notebook AFRs remain higher than for desktop PCs (see Note 2) with a slight increase year over year. Ultrathin notebooks have slightly higher failure rates than standard notebooks, since the thinner form factor makes it more difficult to dissipate heat and build in structural rigidity. Even factoring in the introduction of ultrathin notebooks, there is significant improvement over 2006 and 2004. In 2004, nearly one in three notebooks was likely to fail in its third year of life. In 2013, the likelihood of failure in the third year is between 1 in 8 and 1 in 10, depending on the type of notebook. Some of the improvement comes from better casing materials, such as aluminum, magnesium and high-strength carbon fiber. Hard drive failures have been reduced by using better suspension mounts for HDDs and accelerometers that park the head and prevent a disk crash if a fall is detected.
In 2006, motherboards and HDDs were tied as the components with the highest failure rate. In 2013, motherboards are the single component most likely to fail. Storage is second, but at a considerably lower rate (due to accelerometers that park the heads during a drop and improved suspension mounts). Power supplies, keyboards/mice/touchpads and LCDs (touch and electronic packaging) were the other components listed among the top five most likely to fail, but in no consistent ranking across platforms or OEMs. Notably missing was case damage (cracking, corner breaks, hinge failures) due to improved materials, such as aluminum, magnesium and high-strength carbon fiber.
There is still some question about the relative reliability of HDD versus SSD. When the systems containing the drives are fully stationary (for example, in data centers), the AFRs appear to be similar. While SSDs do not have the mechanical problems of rotating magnetic HDDs, they are still prone to firmware problems, and the flash management abilities can vary by supplier.
SSDs from the leading vendors (usually the ones used by enterprise notebook OEMs) appear to have AFRs from 0.6% to 1.2%, but the average across all suppliers and channels is closer to 1.5% to 2% (see "Market Trends: Evolving HDD and SSD Storage Landscapes"). HDD AFRs can range from 1.5% to 4%, depending on the HDD manufacturer and the amount of movement it is subjected to.
In notebooks, the lack of mechanical parts in SSDs seems to provide a reliability advantage only for notebooks that travel or move while in operation. This explains why the reliability of standard notebooks with SSDs and those with HDDs is virtually identical, while there is a difference in ultrathin notebooks. Ultrathin notebooks are preferred by many traveling workers and are more likely to be used outside a traditional office setting than standard notebooks, which may travel less and spend more time being docked.
In the systems where there is a difference between SSD and HDD AFRs, the variation appears to increase as the HDDs age and have more mechanical problems.
A close look at notebooks versus desktop PC AFRs by geography and usage shows:
- There is a greater level of variability in notebook AFRs than in desktop PCs when assessing the impact of geography and user/usage patterns.
- Day extender notebook AFRs tend to be several points below the average, while traveling worker notebook AFRs tend to be higher.
- A breakdown by region shows that:
- EU countries tend to have slightly lower AFRs than North America.
- Latin American AFRs are slightly higher compared with North America.
- Asia/Pacific AFRs are also higher than North America, which is the reverse of 2006 benchmark findings.
- K-12 and lower-form students are unanimously cited as the most destructive notebook users — even more than soldiers in active combat zones, traveling workers or an outbound sales force.
Consumer desktop PC and notebook reliability cannot be inferred from these benchmarks. Consumer products often use lower-grade components and casing materials to reduce costs.
Consider buying only a single year of warranty (or the minimum allowed under local law) for desktop PCs with a time and materials maintenance plan for the rest of the system's life. Continue to match notebook warranty coverage to the expected useful life of the system. Among the benefits are:
- While a three-year warranty is standard, reducing the warranty period by two years could save $50 or more, depending on the purchase volumes. With a 3% failure rate, the cost of time and materials maintenance from a PC hardware OEM or a third-party maintenance organization is likely to average out to less than $10 per system per year.
- To validate the repair versus warranty costs in your environment, calculate the annual cost of maintenance after the first year. First, have the warranty repair provider supply a list of repairs during the previous year and then ask for price quotes on doing the same repairs on a time and materials basis. Be sure to get the time and materials repair costs for key components, such as the motherboard and HDD. Multiply the price quotes by the failure rates for comparison. Switch to a time and materials strategy if the savings are sufficient. With desktop PCs, the AFRs are stable over time and predictable, making such a strategy viable (see Note 3 for an example).
- Having the warranty on desktop PCs for the first year covers DOA systems and any early failures. With the reliability improvements seen over the past few years, early failures may occur less frequently, but they still exist, especially for recently introduced models. For desktop PCs, if the cost of an out-of-warranty repair is 50% or more of the replacement cost, then a new PC should be swapped in.
- Having 1% to 2% additional systems on hand for loaners during repair turnaround allows for longer SLAs and lower costs.
- The exception to the SLA differentiation would be for a widely distributed workforce (satellite offices or large work-at-home population) where it would be difficult to get consistent maintenance coverage or to stock spares. Packing and shipping for depot service is time-consuming and expensive for desktop PCs. Warranty coverage can be contracted from the OEM, reseller or third-party maintenance organization.
- Keep notebooks under warranty, since the failure rate is three to four times higher than for desktop PCs under the best of circumstances. Furthermore, because there is a wide range of variability in reliability across geographies and user types/usage patterns, there is a greater possibility for unpredictable failure patterns. New form factors, such as ultrathin notebooks or tablets, could impact AFRs in unexpected ways.
- A warranty is a way of managing risk. The risk of failure is low and predictable for desktop PCs. For notebooks, the AFRs have dropped significantly during the past six years, but there is still a wide range of variability in the AFRs, which adds a level of unpredictability or risk of being unprepared to meet the repair demands.
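Two of the rules of thumb in the list above, the 50% repair-or-replace threshold and the 1% to 2% loaner pool, can be expressed as a short sketch. The function names and the sample figures are hypothetical, and rounding the loaner pool up to whole systems is an assumption.

```python
REPLACE_THRESHOLD_PERCENT = 50  # from the text: repair at 50% or more of replacement cost


def should_replace(repair_quote: float, replacement_cost: float) -> bool:
    """True when an out-of-warranty desktop repair costs 50% or more of a new PC."""
    return repair_quote * 100 >= REPLACE_THRESHOLD_PERCENT * replacement_cost


def loaner_pool_range(installed_base: int) -> tuple[int, int]:
    """Spare systems to keep on hand: 1% to 2% of the base, rounded up."""
    low = -(-installed_base // 100)       # ceiling of 1%
    high = -(-installed_base * 2 // 100)  # ceiling of 2%
    return low, high


# Hypothetical figures for illustration only.
print(should_replace(repair_quote=600, replacement_cost=1000))
print(loaner_pool_range(1000))
```

Integer arithmetic is used for the percentages so that the rounding does not depend on floating-point behavior.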
Consider extending the life of notebooks to four years, but recognize the types of users or usage patterns where the move may not be appropriate. For example:
- Desktop PC AFRs have been in the single-digit range for more than a decade. In general, it is not unusual for desktop PCs to be operational for as much as six years. As a result, reliability has not been a factor in the replacement strategy. Instead, the ability to run the current OS, applications and peripherals is the major decision criterion (see "PC Hardware Replacement Strategies: Desktop PCs, Thin Clients and Zero Clients").
- Notebook replacement cycles once were strongly influenced by reliability issues. The high risk of failure, especially while a worker was traveling, made it a best practice to replace notebooks every three years (see "PC Hardware Replacement Strategies: Notebooks, Ultrabooks and Media Tablets").
- Although notebook reliability in 2013 is still lower than that of desktop PCs, the significant reduction of notebook AFRs since the last benchmark makes a four-year life cycle viable in many cases. As with desktop PCs, one of the key decision criteria remains the ability to run the current software load. Notebook replacement strategies must also take user types and usage styles into account. For example:
- Day extenders tend to keep their notebooks docked on their desks most of the time, usually only moving the system to take it home one or two nights a week or over the weekends. Less wear and tear usually translates into lower AFRs, making day extender notebooks excellent candidates for four-year life cycles.
- Traveling workers have a totally different usage style. They tend to carry and use their systems in multiple locations, sometimes in moving vehicles, resulting in higher AFRs. Just the act of taking the notebook in and out of a carrying case increases the risk of drops. Depending on the criticality of the traveling worker's role and the time it would take to replace a failing notebook, a three-year replacement strategy may still be optimal.
- Another factor pointing toward a three-year life cycle for traveling workers is the demand for portability. If a notebook will be carried all day, then size, weight and battery life are key considerations. Given the rapid year-over-year industry improvements in these factors, keeping a three-year replacement strategy for traveling workers would ensure timely access to more portable systems.
- Note that all users, regardless of their roles, will complain about older notebooks. With the rapid updates in consumer notebooks, users will be constantly exposed to the latest and the greatest, making their particular enterprise notebook seem totally out of fashion. The unintended consequence of extending notebook life cycles could be increased use of unsanctioned personal notebooks and/or bring your own (BYO) tablets.
Continue to consider SSD for performance gains, but do not expect significant reliability gains for any stationary or rarely moved systems. Use SSDs for notebooks that are expected to be used for travel or that operate while the user is in motion. Consider:
- For traveling workers, SSDs can lead to 1% to 3% fewer overall failures per year under some circumstances, especially with ultrathin notebooks. While a warranty will cover the repair expense of HDDs that fail, eliminating even a few percentage points of overall failures with SSDs for traveling workers can have a significant impact on reducing downtime.
- Turnaround time on repairing a failed drive for a traveling worker or salesperson on the road can be as much as five days. The downtime, in turn, could lead to lost sales, reduced productivity or failure to provide critical customer deliverables.
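The downtime effect described above can be roughed out with simple arithmetic. The sketch below is an illustration under stated assumptions: it reads the 1% to 3% reduction as percentage points of AFR, uses the five-day turnaround (which the text gives as a maximum) for every avoided failure, and applies them to a hypothetical fleet of 1,000 traveling-worker notebooks. The result is therefore an upper bound.

```python
def downtime_days_avoided(fleet_size: int, afr_reduction: float,
                          turnaround_days: float = 5.0) -> float:
    """Upper-bound downtime days avoided per year across the fleet."""
    failures_avoided = fleet_size * afr_reduction
    return failures_avoided * turnaround_days


FLEET_SIZE = 1000  # hypothetical number of traveling-worker notebooks

for reduction in (0.01, 0.03):  # 1 and 3 percentage points fewer failures
    days = downtime_days_avoided(FLEET_SIZE, reduction)
    print(f"{reduction:.0%} fewer failures: up to {days:.0f} downtime days avoided")
```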
1 Additional information on SSD versus HDD reliability comes from interviews with storage distributors, marketing managers, product planners and procurement managers, as well as sales and corporate executives.
We define a desktop PC or notebook hardware failure as any repair incident that requires a replacement part, whether the part is a motherboard, a drive or a simple latch. Some hardware failure terms are:
- AFR — Percentage of systems within an installed base that require a hardware component replacement over a 12-month period
- Dead on arrival (DOA) — Out-of-box hardware failures
- Early failures — Hardware failures within the first 90 days of product use
- Mean time between failures (MTBF) — A component-level tracking statistic, rather than an overall notebook or desktop system reliability measure
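The definitions above can be applied to a repair log with a short calculation. This is a minimal sketch rather than the method used for the benchmark. The `Repair` record and its field names are hypothetical, treating zero days in service as DOA is an assumption, and each system is counted once even if it was repaired several times, following the wording of the AFR definition.

```python
from dataclasses import dataclass


@dataclass
class Repair:
    system_id: str        # hypothetical asset identifier
    days_in_service: int  # assumption: 0 means the system failed out of the box
    part: str             # the component that was replaced


def annual_failure_rate(repairs: list[Repair], installed_base: int) -> float:
    """Percentage of systems needing a part replacement in a 12-month period."""
    failed_systems = {r.system_id for r in repairs}
    return 100.0 * len(failed_systems) / installed_base


def classify(repair: Repair) -> str:
    """Label a repair using the DOA and 90-day early-failure definitions."""
    if repair.days_in_service == 0:
        return "DOA"
    if repair.days_in_service <= 90:
        return "early failure"
    return "later failure"


# Hypothetical 12-month log for an installed base of 50 systems.
log = [
    Repair("PC-001", 0, "motherboard"),
    Repair("PC-002", 45, "hard drive"),
    Repair("PC-002", 200, "latch"),
]
print(f"AFR: {annual_failure_rate(log, installed_base=50):.1f}%")
print([classify(r) for r in log])
```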
Although we usually discuss annual failure rates, there is often an early shakeout period with high failure rates that drop back to lower levels after 60 to 90 days. This can be especially true for recently introduced models. Some PC hardware vendors exclude the DOA and mortality rates when reporting an AFR. However, all the failures should be included to get the best picture of overall reliability.
AFRs for notebooks are considerably higher than for desktop PCs. The smaller size and thickness of notebooks reduce the amount of structural rigidity that can be designed into them, making them more vulnerable to breakage through dropping. Nearly all the components that can spread out over a large desktop system motherboard are compressed into a smaller space, creating higher heat buildup that can damage sensitive electronic parts. The act of docking and undocking a notebook adds mechanical stress to the docking connector, the casing and, sometimes, the motherboard. Lastly, users transport their notebooks by hand, car, bus, train or airplane to locations that range from comfortable home offices to delivery vehicles to war zones. Not only are notebooks more likely to be dropped, but they can be exposed to heat, water, vibration and dust, all of which can lead to breakage or component failures.
AFRs for ultrathin notebooks are generally higher than for standard notebooks, because the z-axis (i.e., the thickness) is even more highly compressed than for standard notebooks. There is even less room for building in structural rigidity and for heat dissipation. Ultrathin notebooks are more likely to be used for travel and be subjected to more extreme use cases.
In a population of 1,000 desktop PCs for a given year, it would not be unusual to have failures in the range of 1% (10) motherboards with replacement cost for each averaging $600, 0.5% (five) hard drives with replacement cost for each averaging $300 and 1.5% (15) miscellaneous other things that might include mouse, keyboard or power supply with replacement costs for each averaging less than $100. The total cost of repairs under time and materials maintenance across 1,000 systems for one year would be $9,000 or $9 per system. The results will vary with different populations.
The next step is to compare the cost of time and materials repairs with the cost of warranty for the year. Annual warranty costs (for depot service) could range from $25 to $50 per system.
Under such a scenario, it would be worth considering time and materials maintenance from the second year onward. It is still worth having warranty coverage for the first year for desktop PCs, because there may be a certain number of DOA systems and early failures in the first year.
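The arithmetic in this example can be reproduced as follows. The sketch uses only the figures given above. The one assumption is that miscellaneous parts are priced at the $100 upper bound, since the text says they average less than $100.

```python
# (component, failures per 1,000 desktop PCs, average replacement cost in USD)
NOTE_3_FAILURES = [
    ("motherboard", 10, 600),
    ("hard drive", 5, 300),
    ("miscellaneous", 15, 100),  # text says "less than $100"; 100 is an upper bound
]
POPULATION = 1000
WARRANTY_PER_SYSTEM = (25, 50)  # annual depot warranty cost range in USD


def tm_cost_per_system(failures, population):
    """Average annual time and materials cost per system."""
    total = sum(count * cost for _, count, cost in failures)
    return total / population


tm = tm_cost_per_system(NOTE_3_FAILURES, POPULATION)
savings_low = WARRANTY_PER_SYSTEM[0] - tm
savings_high = WARRANTY_PER_SYSTEM[1] - tm
print(f"time and materials: ${tm:.2f} per system per year")
print(f"savings versus warranty: ${savings_low:.2f} to ${savings_high:.2f} per system per year")
```

Substituting the repair list and price quotes from your own environment for `NOTE_3_FAILURES` gives the comparison for your installed base.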