Sunday, May 25, 2014

EMC VNXe 5100 and Checkpoint Folders on VMFS volumes


I came across an interesting situation in which a virtual machine stopped responding when the external backup application tried to take a snapshot and the storage ran out of disk space.

Infrastructure details

ESXi: 5.x

VM: Windows 2Kx OS

Storage: EMC VNXe 5100

The VM has 4 virtual disks (VMDKs), each stored on its own VMFS datastore.

Together, the 4 VMDKs consume about 4 TB.

The datastore summary shows 18.7 TB consumed out of 28.25 TB provisioned. This VM is the only one using those 4 LUNs, one per VMDK.

So the question is: where did the additional 14 TB or so go, beyond the 4 TB used by the VM?

Here is the output of du -h, first for the VM folders and then for the datastore under /vmfs/volumes:

/vmfs/volumes/50cc5dd8-555143cbd-44d2-ac162d783818/SVRAP01 # du -h
332.6G  .
/vmfs/volumes/50cc5dd8-555143cbd-87d5-ac162d783818/SVRAP01 # du -h
3.7T    .

du -h
8.0K    ./.ckpt_group.vmware_21_sg_443.fs.13/lost+found
272.0K  ./.ckpt_group.vmware_21_sg_443.fs.13/.etc
3.7T    ./.ckpt_group.vmware_21_sg_443.fs.13/SVRAP01
8.0K    ./.ckpt_group.vmware_21_sg_443.fs.13/.vSphere-HA
3.7T    ./.ckpt_group.vmware_21_sg_443.fs.13
8.0K    ./.ckpt_group.vmware_21_sg_441.fs.13/lost+found
272.0K  ./.ckpt_group.vmware_21_sg_441.fs.13/.etc
3.7T    ./.ckpt_group.vmware_21_sg_441.fs.13/SVRAP01
8.0K    ./.ckpt_group.vmware_21_sg_441.fs.13/.vSphere-HA
3.7T    ./.ckpt_group.vmware_21_sg_441.fs.13
8.0K    ./.ckpt_root_rep_ckpt_51_832916_2/lost+found
272.0K  ./.ckpt_root_rep_ckpt_51_832916_2/.etc
3.7T    ./.ckpt_root_rep_ckpt_51_832916_2/SVRAP01
8.0K    ./.ckpt_root_rep_ckpt_51_832916_2/.vSphere-HA
3.7T    ./.ckpt_root_rep_ckpt_51_832916_2
8.0K    ./.ckpt_root_rep_ckpt_51_832916_1/lost+found
272.0K  ./.ckpt_root_rep_ckpt_51_832916_1/.etc
3.7T    ./.ckpt_root_rep_ckpt_51_832916_1/SVRAP01
8.0K    ./.ckpt_root_rep_ckpt_51_832916_1/.vSphere-HA
3.7T    ./.ckpt_root_rep_ckpt_51_832916_1
8.0K    ./lost+found
272.0K  ./.etc
3.7T    ./SVRAP01
8.0K    ./.vSphere-HA
18.7T   .

As you can see, the VM itself uses approximately 4 TB, while usage at the datastore level is 18.7 TB. The gap corresponds to the four .ckpt_ directories, each of which is reported at 3.7T.
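To make that breakdown easier to repeat on other datastores, here is a minimal Python sketch that totals pasted du -h output per top-level directory and separates checkpoint directories from everything else. The function names and the ".ckpt_" prefix test are my own assumptions for illustration, based only on the directory names in the listing above; this is not an EMC or VMware tool.

```python
import re

# Size suffixes as printed by `du -h` (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a du -h size such as '3.7T' or '272.0K' to bytes."""
    match = re.fullmatch(r"([\d.]+)([KMGT]?)", text)
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS.get(unit, 1)


def top_level_usage(du_output):
    """Return {directory name: bytes} for entries one level below '.'."""
    usage = {}
    for line in du_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        size, path = parts[0], parts[1].strip()
        # Keep './SVRAP01' but skip '.' and deeper paths such as './x/y'.
        if path.startswith("./") and "/" not in path[2:]:
            usage[path[2:]] = parse_size(size)
    return usage


def split_checkpoints(usage, prefix=".ckpt_"):
    """Return (checkpoint bytes, other bytes); the prefix is an assumption."""
    ckpt = sum(size for name, size in usage.items() if name.startswith(prefix))
    other = sum(size for name, size in usage.items() if not name.startswith(prefix))
    return ckpt, other


if __name__ == "__main__":
    # Hypothetical file holding du -h output pasted from the ESXi shell.
    with open("du_output.txt") as handle:
        ckpt, other = split_checkpoints(top_level_usage(handle.read()))
    tib = 1024**4
    print(f"checkpoint directories: {ckpt / tib:.1f}T, everything else: {other / tib:.1f}T")
```

Fed the datastore listing above, this attributes 4 x 3.7T, about 14.8T, to the .ckpt_ directories and about 3.7T to the live VM folder. Because du -h rounds every figure, the parts only approximately add up to the 18.7T total.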

After more research, I found that these directories are created during EMC SAN root replication to preserve a replication pair that had no prior replication relationship. So technically this includes all 4 LUNs used by the virtual machine.

I found a few articles, here (EMC community) and here (by Justin Paul), that identify the culprit for the datastore space consumption and explain how to reclaim it.

As a workaround, we used Storage vMotion to move the smaller VMDK to another datastore; the VM can now be powered on and users can work with it.

The recommended plan is to contact EMC Support for guidance on reclaiming that space properly without losing any data.

Hope this helps you track down lost space on a VMFS datastore!

Share and care please !!
