Data Deduplication with Cloud VM Sets

Data deduplication (dedupe) eliminates redundant copies of identical data blocks by replacing those data blocks with a pointer to a single "master copy". Dedupe is especially effective across a large number of VMs running the same operating system because most of the OS data blocks are identical on every VM.

When you encrypt those VMs, however, the data blocks are no longer the same because each virtual disk has a different encryption key that changes the content of the encrypted data blocks.

To support dedupe, KeyControl provides the Single Encryption Key (SEK) option for a Cloud VM Set. When enabled, KeyControl encrypts all VMs in the Cloud VM Set with the same encryption key. This means that, if dedupe found identical data blocks across the VMs before they were encrypted, dedupe will find the same identical data blocks after they are encrypted.
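The effect of per-disk keys versus a single shared key can be illustrated with a small sketch. It is only a model: toy_encrypt_block is a hypothetical stand-in that is not real encryption and not the cipher KeyControl uses. It assumes only that encryption is deterministic for a given key, block offset, and plaintext. The block size, key values, and disk contents are also made up for illustration.

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # toy block size in bytes (assumption, for illustration only)


def toy_encrypt_block(key: bytes, offset: int, block: bytes) -> bytes:
    """Toy deterministic transform: XOR the block with a keystream from (key, offset).

    NOT real encryption. It only models the property that the same key,
    offset, and plaintext always produce the same ciphertext.
    """
    stream = hmac.new(key, offset.to_bytes(8, "big"), hashlib.sha256).digest()
    return bytes(p ^ s for p, s in zip(block, stream))


def split_blocks(disk: bytes) -> list:
    """Split a disk image into fixed-size blocks."""
    return [disk[off:off + BLOCK_SIZE] for off in range(0, len(disk), BLOCK_SIZE)]


def encrypt_disk(key: bytes, disk: bytes) -> list:
    """Encrypt every block of a disk with the given key."""
    return [
        toy_encrypt_block(key, off, disk[off:off + BLOCK_SIZE])
        for off in range(0, len(disk), BLOCK_SIZE)
    ]


def unique_blocks(disks: list) -> int:
    """Number of distinct blocks a dedupe engine would have to store."""
    return len({block for disk in disks for block in disk})


# Three VMs share the same 4-block "OS image" and each has one block of its own.
os_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
vm_disks = [os_image + tag * BLOCK_SIZE for tag in (b"A", b"B", b"C")]

plain_unique = unique_blocks([split_blocks(d) for d in vm_disks])

# One key per disk: identical OS blocks no longer match across VMs.
per_vm_keys = [b"key-a", b"key-b", b"key-c"]
per_key_unique = unique_blocks(
    [encrypt_disk(k, d) for k, d in zip(per_vm_keys, vm_disks)]
)

# One shared key (the SEK idea): identical OS blocks match again.
shared_unique = unique_blocks([encrypt_disk(b"shared-sek", d) for d in vm_disks])

print(plain_unique, per_key_unique, shared_unique)
```

In this model the shared-key case stores the same number of unique blocks as the unencrypted case, while the per-disk-key case stores every block separately.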

You can enable the SEK option when you create the Cloud VM Set, and you can change that option up until you register a VM with the Cloud VM Set. After one or more VMs are registered with the set, you cannot change the SEK option for that set.

Considerations

  • Using the same key for all VMs in the Cloud VM Set makes the VMs less secure because an attacker who gains access to the SEK key on one VM can use that key to access all VMs in the Cloud VM Set.
  • The SEK key expiration date and expiration option are global values that you set when you generate a new SEK key. If you change those options for any disk on any VM in the Cloud VM Set, KeyControl applies the changes to all disks on all VMs in the Cloud VM Set.
  • If you generate a new SEK key, you must rekey the encrypted disks in the VMs manually or use the Auto Rekey feature. KeyControl does not automatically schedule a rekey task for the VMs when you generate a new SEK key. In addition, the benefits of dedupe will be reduced until all encrypted disks in the VMs in the Cloud VM Set have been rekeyed.
  • If you are using the Auto Rekey feature for the Cloud VM Set, a Cloud Admin should manually generate a new SEK key before an automatic rekey operation starts. Auto Rekey rekeys the disks using the latest version of the SEK key available. It does not automatically generate a new SEK key before it begins the rekey operation. For more information, see Generating a New SEK Key.
  • Before the Policy Agent rekeys a disk, it checks that a new version of the SEK key is available. If there is not, it raises an alert stating that there is no new SEK key to use, and it does not rekey the disk.
  • HyTrust recommends that you always upgrade your Policy Agents when you upgrade KeyControl. However, if a VM in the Cloud VM Set is using a version of the Policy Agent older than 4.2, the following caveats apply:

    • The older Policy Agent will always rekey a disk when requested, even if there is no new version of the SEK key. While this does not harm anything, it is a waste of time and resources. It will also adversely impact deduplication while the rekey is in progress.
    • If you add or encrypt a Linux disk, or if you add a Windows disk, the older Policy Agent will use the first version of the SEK key. If new SEK keys have been generated in the meantime, the Cloud Admin must rekey the disk after it has been added or encrypted in order for that disk to use the current version of the SEK key. (When you first encrypt a Windows disk, the older Policy Agent correctly uses the current version of the SEK key.)
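
The rekey behavior described in the considerations above can be summarized as a small decision function. This is a minimal sketch, not the Policy Agent's actual code: the Disk class, the plan_rekey function, the integer SEK version numbers, and the returned strings are all hypothetical names introduced here for illustration. Only the 4.2 version boundary and the rekey-or-alert behavior come from the text.

```python
from dataclasses import dataclass

# Per the text, Policy Agents older than 4.2 do not check for a new SEK key.
MIN_AGENT_WITH_SEK_CHECK = (4, 2)


@dataclass
class Disk:
    name: str
    sek_version: int  # hypothetical: SEK key version the disk is encrypted with


def plan_rekey(disk: Disk, latest_sek_version: int, agent_version: tuple) -> str:
    """Return "rekey" or "alert" for a rekey request on one disk.

    agent_version is a (major, minor) tuple, for example (4, 2).
    """
    if agent_version < MIN_AGENT_WITH_SEK_CHECK:
        # Older agents always rekey, even when no new SEK key exists.
        return "rekey"
    if latest_sek_version > disk.sek_version:
        # A newer SEK key version is available, so the rekey proceeds.
        return "rekey"
    # No new SEK key: raise an alert and leave the disk as it is.
    return "alert"


disk = Disk(name="data1", sek_version=3)
print(plan_rekey(disk, latest_sek_version=3, agent_version=(4, 2)))
print(plan_rekey(disk, latest_sek_version=4, agent_version=(4, 2)))
print(plan_rekey(disk, latest_sek_version=3, agent_version=(4, 1)))
```

The sketch also shows why a Cloud Admin generates a new SEK key before a scheduled Auto Rekey: without a newer key version, a current Policy Agent raises an alert and does not rekey the disk.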