The Basics of Zerto Replication
Replicating with Zerto requires a Zerto Virtual Manager (ZVM) and Virtual Replication Appliances (VRAs), both deployed at the customer’s production site. The ZVM is the brains of the Zerto deployment and can be installed on Windows Server 2008 R2 or later. From the ZVM, you connect to your vCenter, install VRAs and create Virtual Protection Groups (VPGs). After you install Zerto, the ZVM pushes the VRAs out to your hosts. A VRA must be deployed on every host in the production environment that runs a VM you want to protect. The VRAs act as the proxy for replication: they capture the write I/O passing from each protected VM to its host and replicate all changes to the iland cloud. Each VRA requires an IP address and must sit on a network that can communicate with the ZVM.
Once the ZVM and VRAs have been installed and configured, the last step is to create VPGs. A VPG is a group of VMs that replicate and fail over together. For example, you might put your domain controllers in one VPG and your web servers in another. During an outage, you could fail over the domain controller VPG first to ensure the domain is online, then fail over the web server VPG shortly after.
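The grouping and ordering described above can be written down as a simple planning aid before you build anything in Zerto. This is only a sketch: the `VPG` class, the `failover_order` field and the VM names are hypothetical and are not part of any Zerto API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VPG:
    """A planning record for one Virtual Protection Group (hypothetical model)."""
    name: str
    vms: List[str] = field(default_factory=list)
    failover_order: int = 0  # assumption: lower numbers fail over first


def failover_plan(vpgs: List[VPG]) -> List[str]:
    """Return VPG names in the order they should be failed over."""
    return [vpg.name for vpg in sorted(vpgs, key=lambda vpg: vpg.failover_order)]


# Example VM names are made up for illustration.
vpgs = [
    VPG("Web Servers", ["web01", "web02"], failover_order=2),
    VPG("Domain Controllers", ["dc01", "dc02"], failover_order=1),
]

print(failover_plan(vpgs))
```

Writing the order down this way makes dependencies explicit: anything that needs the domain to be online gets a higher number than the domain controller VPG.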
Replication runs from your production site to iland over a replication VPN tunnel. iland helps you create this VPN and resolves any issues that arise over time. Your iland environment has two firewalls: one dedicated to the replication VPN and one for your failover environment. These DR firewalls give you full control of your environment during test and live failover situations. They also let you fully test your recovery environment with a Zerto Test Failover while replication continues throughout the testing period.
Understanding RPO and RTO Requirements
Zerto replicates data in real time from your production environment to iland. Changes made on your production servers are automatically replicated to the recovery site, which enables a low Recovery Point Objective (RPO). The RPO is the amount of data, measured in time, that your business can tolerate losing in a DR scenario. For instance, suppose your production data center went fully offline because of a tornado or hurricane. If you could recover only to a point in time 5 minutes before the outage, 2 hours before or even a day before, would your business still be able to operate? What would be the customer impact and the potential business losses? The goal should be to minimize data loss and downtime as much as possible for your critical systems. With real-time replication in Zerto, you should expect minimal RPOs, averaging around 15 to 20 seconds. Other factors also affect the RPO you see in Zerto, such as bandwidth and the rate of data change. More on that later.
Another aspect to consider when researching a DR service is the Recovery Time Objective (RTO). The RTO is the maximum time you can allow for the actual recovery of your production environment while still ensuring business continuity. Where your RPO might be to always be able to recover to a point no more than 5 minutes old, your RTO might be to complete that recovery within 30 minutes or an hour. A failover in Zerto can be initiated in just a few clicks, which lets you quickly fail over the servers needed to restore business functionality. Once the failover starts, your servers are automatically imported and powered on at the iland Secure Cloud recovery site. There is no need to revert from snapshots or wait on any extra processes for recovery: it is a simple import and power on. With the iland Secure Cloud console, you can initiate a failover even if all of your production servers, including your Zerto infrastructure, are offline. Other factors also affect recovery time, such as the number of servers and any boot delays you require.
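To make the difference between the two objectives concrete, here is a minimal sketch that scores a single recovery against an RPO and an RTO. The function name, the timestamps and the targets (a 5-minute RPO and a 30-minute RTO, taken from the examples above) are illustrative assumptions, not measurements from a real failover.

```python
from datetime import datetime, timedelta


def check_recovery(outage_at, last_checkpoint_at, restored_at, rpo, rto):
    """Compare one recovery against its RPO and RTO targets."""
    data_loss = outage_at - last_checkpoint_at  # how much data (in time) was lost
    downtime = restored_at - outage_at          # how long the recovery took
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Illustrative timeline: the last replicated checkpoint is 20 seconds old
# when the outage hits, and services are back 25 minutes later.
outage = datetime(2024, 1, 1, 12, 0, 0)
result = check_recovery(
    outage_at=outage,
    last_checkpoint_at=outage - timedelta(seconds=20),
    restored_at=outage + timedelta(minutes=25),
    rpo=timedelta(minutes=5),
    rto=timedelta(minutes=30),
)

print(result["rpo_met"], result["rto_met"])
```

The RPO looks backward from the outage to the last recoverable point, while the RTO looks forward from the outage to the moment services are running again.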
Bandwidth and Data Change Considerations
Zerto’s real-time replication and low RPOs do come with some challenges for companies with limited bandwidth, high rates of data change, or both. Because changes are replicated as they happen, large processes that generate a lot of change can saturate the network. Large SQL queries or an application backup, for example, may write a large amount of data on your servers. If more change occurs at a point in time than the bandwidth can carry, the link saturates, replication falls behind and the RPOs for your VPGs grow. The same can happen if you are trying to protect a lot of data over limited bandwidth. For instance, if you have 30 VMs totaling 10 TB but only a 50 Mbps connection, you may see issues. Even without one big workload that generates a lot of change, the available bandwidth may not be able to handle the combined daily rate of change across all servers. You must also consider that the bandwidth Zerto consumes reduces the bandwidth available to your production network. If your production servers or end users share a network with replication, the two will contend for bandwidth, which again hurts replication performance.
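A rough back-of-the-envelope calculation shows why the 10 TB over 50 Mbps example is a concern. The sketch below is my own arithmetic, not a Zerto sizing tool: it assumes decimal units, a link fully dedicated to replication, and no compression or protocol overhead, so real numbers will differ. The function names and the 100 GB backup burst are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def link_bytes_per_second(link_mbps: float) -> float:
    """Convert a link speed in megabits per second to bytes per second."""
    return link_mbps * 1e6 / 8


def initial_sync_days(data_tb: float, link_mbps: float) -> float:
    """Days needed to copy data_tb terabytes over the link once."""
    return data_tb * 1e12 / link_bytes_per_second(link_mbps) / SECONDS_PER_DAY


def max_daily_change_gb(link_mbps: float) -> float:
    """Largest daily change (GB) the link can carry if it runs flat out all day."""
    return link_bytes_per_second(link_mbps) * SECONDS_PER_DAY / 1e9


def backlog_drain_seconds(burst_gb: float, burst_minutes: float, link_mbps: float) -> float:
    """Seconds needed to clear the backlog left by a write burst.

    Assumes the link sends at full speed during the burst and that no new
    changes arrive while the backlog drains.
    """
    rate = link_bytes_per_second(link_mbps)
    written = burst_gb * 1e9
    sent_during_burst = min(written, rate * burst_minutes * 60)
    return (written - sent_during_burst) / rate


print(f"Initial sync of 10 TB at 50 Mbps: {initial_sync_days(10, 50):.1f} days")
print(f"Sustainable daily change at 50 Mbps: {max_daily_change_gb(50):.0f} GB")
print(f"Drain time after a 100 GB burst written in 60 minutes: "
      f"{backlog_drain_seconds(100, 60, 50) / 3600:.1f} hours")
```

Under these assumptions, the initial copy alone takes about 18.5 days, and the link can sustain at most about 540 GB of change per day, or roughly 5% of the protected 10 TB. A 100 GB burst written in an hour would leave a backlog that takes over three hours to clear, and RPOs stay elevated until it does.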
iland addresses these bandwidth and data change issues during the DR solution design process, along with defining the required RTOs and RPOs for different VPGs. In my next blog post, I’ll explore best practices for Zerto installation and configuration, so stay tuned.