How Google Workspace admins can prepare for the next Google Drive service disruption

Google reported a four-hour disruption of service for Google Drive customers on Tuesday, January 17, 2017. If you are an admin for any of the impacted Drive customers, you may have heard from your users that they could log in but could not access their data. Hopefully, none of your executives needed to access active contracts related to a time-sensitive legal review process, or any other vital work in progress.

This is not as rare a situation as you might think. In 2016 alone, according to Down Detector, thousands of people suffered service disruptions in Google apps that resulted in lost productivity. The disruptions lasted anywhere from a few minutes to a few hours or days. Because users could still log in to Google while individual apps were unavailable, it’s clear that a Google Workspace administrator needs a Plan B for access to specific apps when service is disrupted.

In this post, you’ll learn:

  • How a SaaS service model introduces a new paradigm for data responsibility.
  • How to minimize the impact on your organization when specific Google Workspace apps have service interruptions.
  • Steps to take today to build your Plan B, and add resilience and business continuity to your organization.


The potential impact when Google Drive service is interrupted

While Google has not yet released details on what caused the outage, we do know that Drive was unavailable for nearly four hours, affecting more than 10% of users, primarily in North America.

Figure: Google Workspace status dashboard, showing the Google Drive service disruption of January 17, 2017.

During that time, although admins and users could log in, Drive access was spotty or nonexistent for those impacted. Challenges administrators could face from a service disruption include:

  • Urgent service desk tickets cannot be resolved. For example, executives who are collaborating internally and externally via Drive would be able to log in, but wouldn’t be able to see the shared Doc. They may be unpleasantly surprised to find that their admins cannot access that Doc in Drive, either.
  • HR needs will be delayed. Access to employee documents and email is part of onboarding and offboarding, and may make up a significant part of HR service desk requests for Google Workspace admins. If an administrator can’t access Docs and Sheets in Drive, HR can’t respond to urgent requests.
  • Disruptions can lead to legal and audit gaps. Best practices for audit preparation require all SaaS application events to be recorded. Since a service disruption may not be recorded electronically, administrators would need a written log for the internal audit team that includes the name of the person making the log entry, the date and time of the entry, disruption start and end times, any related persons involved, and what actions were taken. This manual work is often neglected, resulting in gaps in the audit trail.
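The fields of that written log can be captured in a small record type so that entries are consistent from one disruption to the next. The following is a minimal sketch in Python, not a prescribed audit format: the class name, field names, and sample values are all hypothetical, and the times shown are illustrative rather than Google’s reported start and end times.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class DisruptionLogEntry:
    """One manual audit-log record for a SaaS disruption (hypothetical schema)."""

    entered_by: str              # name of the person making the log entry
    entered_at: datetime         # date and time of the entry
    disruption_start: datetime   # when the disruption began
    disruption_end: datetime     # when service was restored
    persons_involved: list[str] = field(default_factory=list)
    actions_taken: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject entries whose end time precedes the start time.
        if self.disruption_end < self.disruption_start:
            raise ValueError("disruption_end must not precede disruption_start")

    @property
    def duration(self) -> timedelta:
        """Length of the disruption."""
        return self.disruption_end - self.disruption_start


# Illustrative values only.
entry = DisruptionLogEntry(
    entered_by="A. Admin",
    entered_at=datetime(2017, 1, 17, 15, 0),
    disruption_start=datetime(2017, 1, 17, 10, 0),
    disruption_end=datetime(2017, 1, 17, 14, 0),
    persons_involved=["Service desk lead"],
    actions_taken=["Notified users", "Opened vendor support case"],
)
print(entry.duration)
```

Validating the start and end times when the entry is created is one simple way to catch transcription mistakes before the log reaches the internal audit team.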


The new paradigm of data responsibility

On-premises applications require IT to manage the complete IT environment, from the network through hardware and OS to applications, app data, and users. SaaS enterprises represent a new paradigm – the application vendor manages the application infrastructure and the core application itself, while IT or the application team manages its own instance of the application, including users, customizations, and app data.

That means that in a SaaS enterprise, IT is still responsible for data protection.


On-premises applications require IT to manage everything. SaaS applications leave infrastructure to the vendor but still require data protection.

Even experienced IT professionals may be surprised to learn the limits of vendor-provided SaaS data protection. Essentially, SaaS vendors protect you from data loss originating on their side (infrastructure issues, application issues) but cannot protect you from data loss originating on your side (admin errors, sync errors, user errors, malicious activity).

For example, the Google Drive TOS clearly states you own your data: “We do not claim ownership in any of your content, including any text, data, information, and files that you upload, share, or store in your Drive account.” It is yours to manage, and yours to protect against loss.


Improving business resilience for SaaS enterprises

SaaS applications like Google Workspace provide important benefits for forward-thinking organizations. They can significantly reduce IT infrastructure costs, and simplify the management of applications. But the more value these applications provide, the more an organization is at risk when a specific application is unavailable, or when data is otherwise lost.

To improve business resilience in a SaaS enterprise, SaaS data must be protected and accessible, even during an app-specific disruption. This requires not simply backup, but backup that can enable access even if the SaaS vendor’s specific application is down. It further requires protection from much more common causes of data loss – sync errors, admin errors, or malicious activity. The best options will use “point-in-time” approaches to backup for both data and metadata (sharing settings, folder structures, labels, and similar), not a single snapshot or replication of data only. The replication approach will not enable fast recovery to a last known good point in time.
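To make the “point-in-time” idea concrete, here is a minimal Python sketch of choosing a restore point: given the timestamps of the available backups, it picks the most recent one taken at or before the last known good moment. The function name and the backup schedule are hypothetical, and a real backup product would expose this through its own interface. With a single snapshot or a live replica there is only one candidate, and it may already contain the damage.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional


def latest_backup_at_or_before(
    backup_times: list[datetime], target: datetime
) -> Optional[datetime]:
    """Return the most recent backup taken no later than `target`, or None."""
    ordered = sorted(backup_times)
    # bisect_right gives the count of backups with timestamp <= target.
    index = bisect_right(ordered, target)
    return ordered[index - 1] if index > 0 else None


# Hypothetical schedule: one backup per day at 02:00.
backups = [datetime(2017, 1, day, 2, 0) for day in (14, 15, 16, 17)]

# Suppose a sync error damaged files at 09:30 on January 16.
last_known_good = datetime(2017, 1, 16, 9, 30)

restore_point = latest_backup_at_or_before(backups, last_known_good)
print(restore_point)
```

If no backup predates the target time, the function returns None, which is the case a single-snapshot approach runs into whenever the snapshot was taken after the data was damaged.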


Steps to take now to ensure a resilient Plan B for SaaS data protection

It’s “when,” not “if,” you’ll experience SaaS data loss or temporary loss of access to data in specific apps. Review and test your processes in the following areas, to be sure your organization is resilient and ready to recover quickly.

  • Communications plan. As discussed in a white paper by Wipro, SaaS enterprises, in particular, need to develop a strong change management discipline. Part of change management means communicating clearly, early, and often with your users.
    • When a vendor’s app is down, be prepared to communicate to the organization what has happened, where users can go for more information, and what IT can do to help application users.
  • Backup and restore plan. Make sure your plan is both documented and thoroughly tested.
    • Test for fast, accurate restores related to the most common use cases – sync errors, admin errors, or malicious activity.
    • Test edge cases, such as whether you can access your data when you can log in but the SaaS application is not accessible.
    • Test how long it takes to restore non-trivial amounts of data, along with customizations and metadata, so that you can project recovery times for incidents in which large volumes of data are lost.
  • Vendor SLA and make-good. Based on SLAs, your organization may be entitled to make-goods after a SaaS application disruption. Plan to document the impact to your organization in order to negotiate as robust a make-good as possible.
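For the restore-timing tests above, a measured test restore can be turned into a rough projection for a larger loss. The Python sketch below assumes restore time scales linearly with data volume, which is a simplifying assumption: API rate limits and large numbers of small files can make real restores slower. The function name and the figures are hypothetical.

```python
def project_restore_hours(
    test_gb: float, test_minutes: float, target_gb: float
) -> float:
    """Linearly extrapolate restore time (in hours) from a measured test restore."""
    if test_gb <= 0 or test_minutes <= 0:
        raise ValueError("test_gb and test_minutes must be positive")
    gb_per_minute = test_gb / test_minutes
    return target_gb / gb_per_minute / 60


# Hypothetical measurement: a 50 GB test restore took 40 minutes.
# Projection for losing 1,500 GB of data:
projected = project_restore_hours(test_gb=50, test_minutes=40, target_gb=1500)
print(f"Projected restore time: {projected:.1f} hours")
```

Treat the result as a lower bound for planning purposes, and repeat the measurement with larger test sets to see whether throughput holds up.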

If, after testing, you determine that your backup and restore plan needs improvement, evaluate a SaaS data protection solution like Spanning Backup for Google Workspace or Spanning Backup for Office 365. These solutions offer point-in-time backup and easy, accurate restore, and are designed to meet the rapid recovery needs of the SaaS enterprise. Spanning makes it simple to add the resilience and business continuity you need for SaaS collaboration platforms like Google Workspace and Office 365.

Learn more here about Spanning Backup for Google Workspace.