COVID-19 presents a potential threat and opportunity for cloud providers. Reactions will vary from testing their services for outages to offering services at a discount to support remote work. Technology general managers must prepare and test their offerings to thrive during and after a pandemic.
Most corporate networks are not prepared for the onslaught of remote work being driven by COVID-19 (aka coronavirus disease). While companies have spent money to build out their existing capacity to handle remote work, few have come close to provisioning capacity many times their established norms.
Customers’ concerns about performance problems that are building due to the increased load of remote workers are not initially focused on the hyperscale cloud providers. Instead, customers are noting slowdowns in connections and bandwidth limitations in their normal workplace and collaboration applications.
Many in the workforce do not have a position that is conducive to remote working (e.g., cafeteria workers and janitorial services). Consequently, the microeconomies associated with office complexes are being impacted as remote workers stay away.
Technology general managers responsible for building a product portfolio strategy must:
Ease customers’ concerns by demonstrating a strong ability to handle spikes to both their VPNs and cloud-supported applications that are brought on by rapid increases in remote workers.
Build customer confidence by both stress-testing cloud data centers, networks and services and releasing the results of such testing to customers.
Prevent both financial and work hardships during the COVID-19 outbreak period by engaging in customer and employee philanthropy to act as a stopgap.
The global spread of COVID-19, the novel coronavirus, has driven organizations to rethink the way they work and operate. Many employees have been told to work from home and to use videoconferencing and online collaboration services. Not only does this stress the limits of back-end supporting services, which in many cases are cloud services, but it significantly increases the volume of traffic in the networks connecting users to their services.
The COVID-19 pandemic has left many organizations unsure whether their business continuity strategy is sufficiently robust. Some hard questions related to the use of cloud services in situations like this are being asked, including:
Is the public cloud model sufficiently scalable and resilient to handle unforeseen spikes in demand?
Do public cloud providers maintain excess capacity to rapidly deploy new services when needed?
Are supporting infrastructure requirements sufficiently robust to ensure continued access to public cloud services?
Can a public cloud service continue to deliver services when support personnel are impacted by illness?
Is the telecommunications and networking infrastructure prepared to handle the increase in traffic volume as organizations leverage the internet to access services?
Are there any concerns around security (physical, perimeter, customer data and other) of cloud data centers as the workforce doing those functions are asked to work from home?
Addressing these questions will require an understanding of the threats, opportunities and events that will occur as a result of the COVID-19 pandemic. Table 1 provides a brief outline of these three items that will be discussed in more detail later.
To assist their clients in this difficult time, technology general managers must be prepared to deliver cloud services in line with their published SLAs and demonstrate an uninterrupted customer experience. Only providers that have built a robust and redundant architecture, along with disaster recovery plans, procedures and policies to respond to such a pandemic, will be able to manage the increased load.
The COVID-19 pandemic presents a number of threats and opportunities that must be evaluated. There are also specific events that technology general managers must be aware of. Below is a detailed explanation of the threats, opportunities and events that will occur as a result of the COVID-19 pandemic.
Cloud service providers will face significant challenges to their service due to increased demand. Specifically, service providers should be aware of the following:
Remote workers will demand greater capacity from networks, storage and services during the crisis because they will work remotely either by choice or by mandate.
Digital events to avoid in-person transmission of the virus (such as videoconferences in place of live meetings) will stress the capabilities of services.
Independent software vendors (ISVs) and content providers like Netflix will face an increase in demand for streaming as people confined to their home will increase their use of digital media and streaming services. This will also create additional demand for bandwidth on the internet as well as on cloud providers’ networks.
Operations teams supporting the cloud service offerings will be challenged to maintain service availability and performance while working remotely or with reduced staff.
The supply chain will be impacted as components powering cloud data center resources, such as chips in servers or the servers themselves, fall into short supply because manufacturing facilities are based in China and other impacted areas.
Well-architected and well-run cloud services are designed to handle unexpected spikes in demand. Pandemics such as COVID-19 will challenge providers to demonstrate their preparedness and weed out the cloud service offerings that are not prepared to handle the unexpected spikes in demand. Stress testing must have occurred in advance of the event.
Other nontechnology challenges could include:
Cloud service providers’ inability to continue their go-to-market efforts, given the cancellation of their industry customer events.
Payment collection in a contactless manner to avoid the spread of the disease.
Cloud computing is a model that is inherently designed to satisfy fluctuating demand. If implemented correctly, cloud services should be well positioned to support rising requests, such as those exhibited during this COVID-19 crisis. Elasticity and hyperscale capabilities stand ready to answer the call for as long or as short a time as needed, assuming the cloud providers have reserved enough capacity to adjust for the increase. The reality, however, is that it is unlikely that they have done so, given the unprecedented nature of this pandemic. How do they then address the problem in real time or near real time? What kind of capital expenditure (capex) should they be prepared for to address this issue? And what should be the preferred mode of acquiring these assets (lease versus buy), keeping in mind that this is capacity they may not need for a long time once the pandemic subsides?
Cloud providers can demonstrate the strength and adaptability of their respective cloud services by:
Providing cloud-based collaboration and conferencing capabilities at a discount or for free (at least during the early stages of the crisis). For example, Cisco Webex, Google, Microsoft, Slack and Zoom are offering many of their software capabilities for free to assist organizations with remote workers.
Sharing cultural tips to provide a template for how remote work at massive scale must be done.
Demonstrating the power of new technologies as people seek ways to create more rich digital experiences for collaboration and meetups. For example, virtual reality (VR) capabilities like those available through the Oculus Quest make virtual meetings feel more real.
Segmenting and prioritizing workloads, especially those that produce significant networking traffic, to balance the immediate needs of stay-at-home workers with processes that can execute at a lower priority.
Allocating additional networking bandwidth, including private, inter-data-center connections, to help increase available networking bandwidth for cloud service consumers.
Forming more partnerships with telecom providers to ramp up their telco cloud offerings. This includes the use of high-bandwidth wireless technologies such as 5G to support applications and workloads such as remote patient monitoring (real-time doctor interaction) and remote healthcare procedures (using immersive video collaboration).
Collaborating with researchers and pharmaceutical firms to bring a vaccine quicker to the market by speeding inventions, trials and outcome-based research.
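The workload segmentation and prioritization item in the list above can be illustrated with a minimal sketch. It assumes a simple strict-priority scheme in which a shared link grants bandwidth to the most urgent workloads first. The `Workload` class, the `allocate` function, the priority values and the bandwidth figures are all hypothetical and serve only to show the idea; real providers are likely to use far more sophisticated traffic engineering.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int        # hypothetical scale: lower number = more urgent
    demand_mbps: float   # bandwidth the workload would like to use


def allocate(workloads, link_capacity_mbps):
    """Grant bandwidth in strict priority order until the link is full."""
    remaining = link_capacity_mbps
    grants = {}
    for w in sorted(workloads, key=lambda w: w.priority):
        granted = min(w.demand_mbps, remaining)
        grants[w.name] = granted
        remaining -= granted
    return grants


# Illustrative figures only: interactive traffic for stay-at-home workers
# is served before a deferrable batch job.
workloads = [
    Workload("nightly-backup", priority=2, demand_mbps=500),
    Workload("videoconferencing", priority=0, demand_mbps=600),
    Workload("vpn-sessions", priority=1, demand_mbps=300),
]
grants = allocate(workloads, link_capacity_mbps=1000)
print(grants)
```

In this sketch the batch backup receives only what is left after the interactive workloads are satisfied, which is the trade-off the list item describes.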
The swell of stay-at-home workers gives cloud providers an opportunity to effect a sea change in which digital work, done on their cloud, becomes the norm rather than the exception.
Every organization will be impacted by COVID-19.
Cloud providers should monitor the following events and situations:
Where possible, organizations will ask their employees to work from home, creating additional demand for collaboration and videoconferencing services.
Private networks are more stressed than the public internet. The public internet has extensive redundancy to absorb load increases, but quality will be an issue because it lacks the advanced traffic engineering and capacity management capabilities that private networks often have.
The impact of COVID-19 on financial markets has already been substantial. A looming recession may reduce future IT spending.
Spot pricing may turn to surge pricing. If cloud providers are oversubscribed, then the prices may favor those who can afford it.
The impact of COVID-19 on the healthcare industry will be significant as it struggles to provide care to patients in emergency rooms. Care delivery will require real-time patient data on healthcare professionals' medical devices, creating an increased revenue opportunity for connected devices, cloud, edge and Internet of Things (IoT)-based cloud solutions.
Below are the actions cloud provider technology general managers should take in response to the COVID-19 pandemic.
Most corporate networks are not prepared for the onslaught of remote work being driven by COVID-19. Corporations and governments alike do not yet have a consistent sense of this as a technology crisis. In Gartner’s Research Circle polls, only 8% of respondents say that business operations are being severely restricted by the impact of remote work.
However, more reports of slowdowns or outright inability to handle increased loads are surfacing every day. Reasons for this are varied. More remote workers mean more load on VPNs and the servers that handle their connections. While companies have spent money to build out their existing capacity to handle remote work, few have come close to provisioning capacity many times their established norms. VPN connections and traffic are critical junctures where this issue of capacity surfaces. Since VPNs are dependent on server connections across both the internet and private networks, any reduction in service quality will appear to end users as slow connection requests, limited bandwidth availability and slower software loads.
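The gap between provisioned VPN capacity and a multiple of the established norm can be made concrete with simple arithmetic. The sketch below is illustrative only: the `vpn_headroom` function and every figure in it are assumptions chosen for the example, not data from the surveys cited in this research.

```python
import math


def vpn_headroom(concurrent_capacity, baseline_users, surge_multiple):
    """Compare provisioned VPN session capacity with a surge in remote users."""
    surge_users = math.ceil(baseline_users * surge_multiple)
    shortfall = max(0, surge_users - concurrent_capacity)
    utilization = round(surge_users / concurrent_capacity, 2)
    return {
        "surge_users": surge_users,
        "shortfall": shortfall,
        "utilization": utilization,
    }


# Hypothetical company: VPN sized for twice its usual 1,000 remote users,
# then hit with five times the usual remote workforce.
result = vpn_headroom(concurrent_capacity=2000, baseline_users=1000, surge_multiple=5)
print(result)
```

Under these assumed numbers, a VPN provisioned with 100% headroom over the norm still leaves 3,000 users unable to connect, which is the kind of shortfall the paragraph above describes.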
Cloud providers host VPN servers and private network connections as well as internet services for companies with workforces both large and small. It is critically important that these providers help customers understand their capacity limits and expand them at a measured cost. Cloud providers must engineer their networks to handle spikes without having to throw bandwidth at the problem, which can often be an expensive proposition. Deploying advanced traffic engineering technologies will be crucial.
Customers’ concerns about performance problems that are building due to the increased load of remote workers are not initially focused on the hyperscale cloud providers. Instead, customers are noting slowdowns in connections and bandwidth limitations in their normal workplace and collaboration applications. The reasons for these reduced performance experiences are generally due to network congestion, either within the corporate VPNs or in the remote internet connections used by remote workers.
The impact of COVID-19 extends in a personal way to individuals. Many employees will find ways to continue to be productive during the outbreak by working from remote locations. However, some positions may not be conducive to remote working, or they may be a role of a support nature, such as cafeteria workers, shuttle drivers, janitorial workers and other types of support staff. Additionally, the microeconomies associated with office complexes will be impacted as remote workers stay away. This includes many services such as restaurants, cafes, bars and other services near corporate office complexes.
Providers have an opportunity to ease the burden on those impacted by the change in work patterns by offsetting the resulting losses in wages or revenue. This not only helps those suffering disproportionately during the crisis, but also helps preserve important services and support staff members for the time when work can return to a more normal pace.
Providers can offer financial relief to their customers, especially small and midsize businesses, as well as technology startups that will be revenue-challenged during these unprecedented times.
We examine here the response of three hyperscale cloud providers to gain a sense of how they are reacting in the time of crisis. Through their actions, cloud will, or will not, be seen as a reliable mechanism for end users surviving and thriving in such unplanned situations.
Early in the U.S. growth phase of COVID-19, Microsoft had confidence its data centers were operating normally. The company, like the other cloud providers, is monitoring the data centers and networks, but it is also working to protect its employees, both salaried and hourly. Microsoft announced it will continue to pay its vendor hourly service providers, approximately 4,500 workers, their regular pay during the period in which Microsoft’s regular employees will be working from home.
On the cloud side, Microsoft has been actively working to minimize impact on Azure services, and other applications that rely on Azure as a foundation, such as Office 365. The company has long-established processes for dealing with spikes in work brought about at unexpected times. Despite this, a combination of potential stress issues can affect services that support solutions like Microsoft Teams and Exchange. However, the impact often comes from challenges to server connections, VPN traffic or network bandwidth on the customer side rather than within the Microsoft cloud.
Microsoft stress-tests its systems as a matter of course. Several large-scale plans by corporations (including Microsoft), schools and even governments to encourage remote work in response to the COVID-19 health emergency will demand more testing. Microsoft relies on existing processes it feels are already designed to address the increased cloud infrastructure and network demand that will come with the spike in remote work. This includes providing comprehensive service continuity plans for increased usage, remotely managing services and leveraging a geographically diverse engineering workforce to support those services.
Microsoft is actively working to assess and minimize the impact on Azure infrastructure capacity through established and tested processes. This is in line with what other hyperscale cloud providers state, and Microsoft is providing service updates to its Azure blog. The areas of interest include ensuring that existing customers have breathing room for workloads that need to expand and contract. Monitoring of cloud services is ongoing, and anticipation of need will spur action to stay ahead of the demand curve.
Like other providers, Amazon has asked its employees to work remotely, which significantly changes the dynamic of the office complex. Amazon has created a series of grants to offset the economic impacts of COVID-19 on hourly workers and the small vendors that operate near Amazon’s offices:
$25 million relief fund to support independent delivery service partners and drivers and for employees or contractors who face financial hardships
Extra time off with pay for full-time and part-time employees diagnosed with COVID-19 or placed into quarantine
$5 million grant to help businesses impacted by the virus
Amazon Web Services (AWS) recognizes that customers need assurance right now. AWS has a long track record of handling events with high, and sometimes unexpected, variability. As the supporting infrastructure for Amazon’s e-commerce business, AWS experiences all the seasonal variability associated with retail events. AWS also provides support for high-traffic events, such as the Super Bowl, as well as many online gaming and video streaming services. These events have given AWS the opportunity to load-balance, pressure-test and configure network routes so that connections do not have to take the long way around the world to be established.
AWS also employs a sophisticated and automated supply chain to ensure compute and storage resources are readily available to respond to changes in demand. Additionally, AWS maintains excess capacity, allowing the services to provide true elasticity and meet scale-up requirements without deploying additional resources.
AWS manages network capacity and proactively scales network access and capacity to get a buffer beyond what they normally have. This requires the deployment of latent networking bandwidth in addition to the dynamic increase in capacity.
During times of duress, AWS aggressively increases infrastructure testing cycles across all AWS regions. AWS also raises usage limits to ensure customers are not impacted when the new demands exceed previously set thresholds. Telecom providers are helping by removing data caps and throttling, thus releasing additional bandwidth to help with the increased demand and volume.
When organizations operate in crisis mode, unpredictable behaviors result. AWS has seen an increase in demand for services, with no discernible trends related to specific services. Rather, users adjust instance types to ensure better performance for application workloads.
Google has long prepared for an elastic need for its services to handle unpredictable load variances. Google has taken precautions to increase monitoring of service quality, capacity and reliability in the face of COVID-19. The company has a disaster recovery training program in which testing is performed on an ongoing basis. In addition, Google has created an internal working group tasked with planning for, and mitigating against, specific business impact resulting from the outbreak or the global response.
Google has a 10-year history of preparation for unexpected stress events. This preparation, it believes, has covered technological disaster preparedness scenarios, as well as scenarios for unavailability in local or regional offices and with personnel. Google does not allow random entry into its regional data centers and support centers for the very reason that it wishes to reduce external factors. Infrastructure reliability and availability, customer support, and customer experience are key aspects of testing Google’s preparedness protocols.
On the humanitarian front, Google was one of the first to announce reduced cost access to advanced collaboration services. This is to help companies deal with the sudden influx of users needing to collaborate online. Simple but unexpected issues like hitting limits on the number of actual people who can be logged in to a service can become a major point of pain. Google has rolled out free access to its “advanced Hangouts Meet videoconferencing capabilities to all G Suite and G Suite for Education customers globally including:
Larger meetings, for up to 250 participants per call
Live streaming for up to 100,000 viewers within a domain
The ability to record meetings and save them to Google Drive”
This kind of largess has now become more the norm than the exception for hyperscale cloud providers. The benefit, which may be seen later, is that some customers will not want to fall back to lower levels of support after the crisis.
Google has also supported a stronger ability for people to quickly find information about COVID-19. It has worked on trust and reducing misinformation through the Google Trust and Safety team.
Gartner webinar survey. A total of 1,500 participants were surveyed during a Gartner webinar presented on 6 March. Multiple industries, countries and regions around the world were represented.
Gartner Research Circle Survey, 15 March 2020.