Showing posts with label VMFS. Show all posts

Saturday, February 16, 2013

VOMA on ESXi 5.1 to find out Metadata Corruption on VMFS

I recently blogged about how to verify VMFS heartbeat corruption, and in this post I am going to show you how to use VOMA (vSphere On-disk Metadata Analyzer) to check whether there are any inconsistencies after an event such as a power outage, or anything else that caused everything to go down at the same time.

voma is available on ESXi 5.1 and is not available in any prior version.

You can see the help of the voma command here.





Here are the screenshots of how voma detects and identifies the errors found on the VMFS volume.

As you can see in these examples, the voma command was run against a file, e.g. a .bin file.

Before you run the command, you may want to take a dump of each affected volume and then run the command against the dump file(s).



e.g.

dd if=/vmfs/devices/disks/naa.0000000000000000000000000000:1 of=/tmp/naa.0000000000000000000000000000p1.dmp bs=1M count=1500

voma -m vmfs -f check -d /tmp/naa.0000000000000000000000000000p1.dmp
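
The important detail in the two commands above is that voma reads the same file that dd wrote. A minimal sketch that derives the dump path from the device path, so the two cannot drift apart, could look like this. The function names, the DRY_RUN switch and the /tmp default are hypothetical and only for illustration; the dd and voma options are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: build the dump file path from a device partition path,
# e.g. /vmfs/devices/disks/naa.X:1 -> /tmp/naa.Xp1.dmp
dump_path_for() {
    dev="$1"
    dir="${2:-/tmp}"             # assumption: /tmp has enough free space
    base="${dev##*/}"            # drop the directory part
    name="${base%:*}"            # device name before the ':'
    part="${base##*:}"           # partition number after the ':'
    printf '%s/%sp%s.dmp\n' "$dir" "$name" "$part"
}

# Hypothetical wrapper: dump the first COUNT MiB, then check the dump.
# With DRY_RUN=1 the commands are only printed, not executed.
dump_and_check() {
    dev="$1"
    count="${2:-1500}"
    dump="$(dump_path_for "$dev")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "dd if=$dev of=$dump bs=1M count=$count"
        echo "voma -m vmfs -f check -d $dump"
    else
        dd if="$dev" of="$dump" bs=1M count="$count" &&
            voma -m vmfs -f check -d "$dump"
    fi
}
```

Running it once with DRY_RUN=1 first is a cheap way to confirm the paths before anything is read from the device.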






The highlighted parts (with a gray background) in the above screenshots are the errors and inconsistencies found on the volumes. The white spaces are names that have been hidden because they are confidential. If you find errors on a volume, contact VMware Support for further action.

There are five phases of the disk analysis and @VMwareStorage (Cormac Hogan) has posted about voma here and here.

You can check out the capabilities of voma on your own on ESXi 5.1.

Help by sharing !!


Thanks for your time.

Friday, January 25, 2013

How to verify VMFS Heartbeat Region Corruption

Recently I was discussing an issue where messages in the logs report the HB (heartbeat) offset rather than the actual location of the bytes.

Upon research, I found out that heartbeat region corruption had occurred on the VMFS volume due to a power outage or some other reason.

You will see the following messages in the logs:


cpu2:2816)WARNING: HBX: 533: Volume 4f7217c1-3589352d-625d-001b22248a7a ("1TB-LAB") may be damaged on disk. Corrupt heartbeat detected at offset 3751936: [HB state 0 offset 0 gen 0 stampUS 0 uuid 00000000-00000000-00

cpu5:2980)WARNING: HBX: 533: Volume 4f7217c1-3589352d-625d-001b22248a7a ("1TB-LAB") may be damaged on disk. Corrupt heartbeat detected at offset 3751936: [HB state 0 offset 0 gen 0 stampUS 0 uuid 00000000-00000000-00
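
If the same warning repeats many times, it can help to reduce it to the volume UUID, the datastore label and the reported offset. The following is a rough sketch with sed; the function name is hypothetical and the pattern is modelled only on the two sample lines above, so it may need adjusting for other message variants.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print
# "UUID LABEL OFFSET" for every corrupt-heartbeat warning found.
parse_hb_warnings() {
    sed -n 's/.*HBX: [0-9]*: Volume \([0-9a-f-]*\) ("\([^"]*\)") may be damaged on disk\. Corrupt heartbeat detected at offset \([0-9]*\):.*/\1 \2 \3/p'
}

# Example use (the log file location is an assumption, adjust as needed):
#   parse_hb_warnings < /var/log/vmkernel.log | sort -u
```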


Now, on your ESXi 5.x host, run the following command from the ESXi console (via SSH or the DCUI):

hexdump -s 22626304 -n 2048 -C /vmfs/devices/disks/

The output will be similar to this:

hexdump -s 22626304 -n 2048 -C /vmfs/devices/disks/

01594000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594200  01 ef cd ab 00 42 39 00  00 00 00 00 de 00 00 00  |.....B9.........|
01594210  00 00 00 00 cd 31 82 11  00 00 00 00 00 00 00 00  |.....1..........|
01594220  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
01594230  0e 00 00 00 36 00 00 00  00 00 00 00 00 00 00 00  |....6...........|
01594240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594400  01 ef cd ab 00 44 39 00  00 00 00 00 00 00 00 00  |.....D9.........|
01594410  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594600  01 ef cd ab 00 46 39 00  00 00 00 00 00 00 00 00  |.....F9.........|
01594610  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594800

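Now if you look at the start of each of the four 512-byte sections above (highlighted in bold red in the original post), you can see that the 1st section, at 01594000, contains only zeros, while the other three begin with 01 ef cd ab.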
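
The addresses in the left column are hexadecimal, while the -s value on the command line is decimal. A small sketch of the arithmetic, using only numbers that appear in this post, shows how they line up. The 512-byte section size is an assumption read off the address column, and the variable and function names are made up for illustration.

```shell
#!/bin/sh
# Decimal start offset passed to "hexdump -s" in this post.
START=22626304
# Assumption: sections are 512 bytes apart, as the address column suggests.
SLOT_SIZE=512

# Print the hexdump -C address of section N (0-based).
slot_addr() {
    printf '%08x\n' $((START + $1 * SLOT_SIZE))
}

for n in 0 1 2 3; do
    echo "section $((n + 1)): $(slot_addr "$n")"
done

# The offset reported in the log message, converted to hex.
printf 'log offset 3751936 = 0x%08x\n' 3751936
```

In the healthy output later in this post, bytes 5 to 8 of the first section read 00 40 39 00, which as a little-endian number is 0x00394000, the same value as the offset in the log message. That is only an observation from the numbers shown here, not a documented layout.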

After the heartbeat region has been fixed, running the same command again shows the values as follows.


hexdump -s 22626304 -n 2048 -C /vmfs/devices/disks/

01594000  01 ef cd ab 00 40 39 00  00 00 00 00 00 00 00 00  |.....@9.........|
01594010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594200  01 ef cd ab 00 42 39 00  00 00 00 00 de 00 00 00  |.....B9.........|
01594210  00 00 00 00 cd 31 82 11  00 00 00 00 00 00 00 00  |.....1..........|
01594220  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
01594230  0e 00 00 00 36 00 00 00  00 00 00 00 00 00 00 00  |....6...........|
01594240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594400  01 ef cd ab 00 44 39 00  00 00 00 00 00 00 00 00  |.....D9.........|
01594410  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594600  01 ef cd ab 00 46 39 00  00 00 00 00 00 00 00 00  |.....F9.........|
01594610  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
01594800

As you can see in the above output (highlighted in bold red in the original post), all 4 sections contain hex values, which is a sign of a healthy VMFS heartbeat region. If you previously had an issue viewing a file for any VM and it was giving an error, try again; you should now be able to view the file on the affected datastore.

To fix the issue itself, I would suggest contacting VMware Support unless you are dealing with a lab/test environment, and certainly if you are not sure how to fix the corruption. This information is just to give you an idea of how to verify whether there is indeed corruption on the VMFS volume.
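
The visual comparison above can also be sketched as a small script that reads the hexdump -C output and reports which of the four sections start with 01 ef cd ab. Treating those four bytes as the marker of a populated section is an assumption based purely on the outputs in this post, and the function name is hypothetical. It only helps with verification; it does not repair anything.

```shell
#!/bin/sh
# Hypothetical checker: reads "hexdump -C" output on stdin.
# $1 is the decimal offset that was passed to "hexdump -s".
# Returns non-zero if any of the four sections lacks the marker.
check_hb_slots() {
    start="$1"
    dump="$(cat)"
    bad=0
    for i in 0 1 2 3; do
        # Address of section i as printed in the first hexdump -C column.
        addr="$(printf '%08x' $((start + i * 512)))"
        # A section squeezed into a '*' line has no address line at all,
        # so it is reported as missing the marker as well.
        if printf '%s\n' "$dump" | grep -q "^$addr  01 ef cd ab "; then
            echo "section $((i + 1)) at $addr: marker present"
        else
            echo "section $((i + 1)) at $addr: marker missing"
            bad=1
        fi
    done
    return "$bad"
}

# Example use, with DEVICE standing in for the device path:
#   hexdump -s 22626304 -n 2048 -C "$DEVICE" | check_hb_slots 22626304
```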

For more information refer to KB 1012036 and also refer to the blog by @VMwareStorage on VOMA which talks about the metadata corruption.


I hope you find this information useful; if so, please share it.

Sunday, April 29, 2012

Device or Resource Busy Error on vSphere client

Hardly anyone working with a VMware vSphere environment has not encountered the error "Device or Resource Busy".

There are several reasons why this error comes up.

The main instances can be listed as follows:

  • When you browse a datastore, it does not have any files or folders.
  • Under /vmfs/volumes/ there are no files or directories.
  • While deleting a datastore, you see the error:

    Device or Resource Busy
     
  • Deleting the partition table of this datastore does not resolve the issue.
  • Deleting a vmdk fails for a VM which is no longer in use.

The following articles discuss all these conditions and how to resolve them.


http://kb.vmware.com/kb/1015791   and  http://kb.vmware.com/kb/1039362

If you have encountered other instances which are not documented, you can always provide feedback on the article and it will be updated accordingly.

If you search http://kb.vmware.com/ you may find other articles which highlight issues apart from those listed above.

Hope you will find this information useful.

Wednesday, April 25, 2012

Data Recovery Services for Data Loss

Recently I attended an onsite session from Seagate Recovery Services, and a few weeks back one from Kroll Ontrack.

These two companies are pros in data recovery services.

They can recover data from any media that can store data. Yes, anything.

Now let's come to the point where you have definite corruption on a VMFS volume, or a major impact on the storage devices which are used to store your virtual machines.

Once you have opened a case and found that there is no alternative to contacting professional data recovery services to recover the lost data, refer to (and bookmark) our KB 1014513, which has the contact information for both companies.

If you know of any other company that provides services similar to Kroll and Seagate, provide feedback on the article with the contact information for that company.

Seagate offers a few goodies, and one of them lists some really critical steps which I would like to share with all VMware users. These steps are very useful for recovering data properly. Stick the printout on your desk so that visitors can read it :-)

Good Luck !!