vCenter Server Maintenance Best Practices
Virtual Center Roles (Yearly)
• VirtualCenter Administrators: super users who have all privileges on
all systems
• Virtual Machine Administrators: administrators on a subset of
servers; can perform all operations on their servers, including VM
provisioning, resource allocation and VMotion
• Virtual Machine User: access to a subset of VMs; can use remote
console, perform power operations, view performance graphs, but cannot
create/delete VMs, set resources or move VMs.
• Read-Only User: can only view information on a subset of VMs
• Privilege Management
• Administrators on the Windows system running the Management Server
are automatically assigned VirtualCenter Administrator privileges
• VirtualCenter Administrators can delegate privileges to other users
by accessing an existing Active Directory or Domain Controller
Best Practices for Templates (Quarterly)
Virtual machine templates are very powerful and versatile. The following best practices, culled from many different areas of IT infrastructure management, will enable you to derive the most value from templates and avoid starting ineffective habits.
• Install antivirus software and keep it up to date: In today’s world of viruses that are hyper-efficient at exploitation and replication, an OS installation routine merely has to initialize the network subsystem to be vulnerable to attack. Deploying virtual machines with up-to-date antivirus protection limits this exposure. Keep the antivirus software current every month by converting the templates to VMs, powering them on, and updating the signature files.
• Install the latest operating system patches, and stay current with the latest releases: Operating system vulnerabilities and out-of-date antivirus software can increase exposure to exploitation significantly, and current antivirus software alone isn’t enough to keep exposure to a minimum. When updating a template’s antivirus software, apply any relevant OS patches and hotfixes.
• Use the template notes field to store update records: A good habit to get into is to keep information about the maintenance of the template in the template itself, and the Notes field is a great place to keep informal update records.
• Plan for ESX Server capacity for template management: The act of converting a template to a virtual machine, powering it on, accessing the network to obtain updates, shutting down, and converting back to a template requires available ESX Server resources. Make sure there are ample resources for this very important activity.
• Use a quarantined network connection for updating templates: The whole point of keeping antivirus and operating systems up to date is to avoid exploitation, so leverage the ability of ESX Server to segregate different kinds of network traffic and apply updates in a quarantined network.
• Use the same datastore for storing templates and for powered-on templates: During the process of converting templates to virtual machines, do not deploy the template to another datastore. It is faster and more efficient to keep the template’s files in the same place before and after the update.
• Install the VMware Tools in the template: The VMware Tools include
optimized drivers for the virtualized hardware components that use fewer
physical host resources. Installing the VMware Tools in the template saves time
and reduces the chance that a suboptimally configured virtual machine will be
deployed to your production ESX Server infrastructure.
• Use a standardized naming convention for templates: Some inventory
panel views do not offer you the opportunity to sort by type, so create a
standard prefix for templates to help you intuitively identify them by sorting
by name. Also, be sure to include enough descriptive information in the
template name to know what is contained in the template.
• Defragment the guest OS filesystem before converting to template:
Most operating system installation programs create a highly fragmented
filesystem even before the system begins its useful life. Defragment the OS and
convert to template, and that way you won’t have to worry about it again until
the system has been in production for a while.
• Remove Nonpresent Hidden Devices from Templates: Windows can retain
information about certain devices, notably network devices, even after they
are removed from the system. Refer to Microsoft TechNet article 269155 for
removal instructions.
• Use Folders to Organize and Manage Templates: Folders can be both an
organizational and security container. Use them to keep templates organized and
secure.
• Create Active Directory groups that map to VirtualCenter roles:
Rather than assign VirtualCenter roles to individual user accounts, create
dedicated Active Directory groups, and place user accounts in those groups.
• Store templates on a shared VMFS volume on the SAN (dedicated LUN) and enable access to the SAN-based template volume from all ESX servers
• SAN templates may only be provisioned to target hosts connected to
SAN
• The VC Mgmt Server’s local template repository can be used to
provision VMs onto ESX Servers that are not connected to the SAN
• If template deployments to a LUN fail due to SCSI reservations,
increase the “Scsi.ConflictRetries” parameter to a value of “10” through the
Advanced Settings menu
Roll up Jobs (Quarterly)
1. Ensure that the jobs listed in this table are installed:
Note: For managing an Oracle vCenter Server database, you can use Oracle SQL Developer, and for managing a DB2 vCenter Server database, you can use DB2 Control Center.
Rollup Job | Corresponding File
Event Task Cleanup myDB | job_cleanup_events_DB.sql
Past Day stats rollup myDB | job_schedule1_DB.sql
Past Month stats rollup myDB | job_schedule3_DB.sql
Past Week stats rollup myDB | job_schedule2_DB.sql
Process Performance Data myDB | job_dbm_performance_data_DB.sql
Property Bulletin Daily Update myDB (Note: this job applies only to vCenter Server 5.x) | job_property_bulletin_DB.sql
Topn past day myDB | job_topn_past_day_DB.sql
Topn past month myDB | job_topn_past_month_DB.sql
Topn past week myDB | job_topn_past_week_DB.sql
Topn past year myDB | job_topn_past_year_DB.sql
where DB is db2, mssql, or oracle.
Note: Ensure that myDB references the vCenter Server database
and not the master or some other database. If these jobs reference any other
database, you must delete and recreate the jobs.
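The check against the table above can be sketched in a few lines of Python. This is only an illustration: the helper name, the case-insensitive comparison, and the sample job list are assumptions, and the installed job names would have to be copied by hand from your database tool.

```python
# Job name prefixes copied from the table above; the database name is appended.
EXPECTED_JOB_PREFIXES = [
    "Event Task Cleanup",
    "Past Day stats rollup",
    "Past Month stats rollup",
    "Past Week stats rollup",
    "Process Performance Data",
    "Property Bulletin Daily Update",  # vCenter Server 5.x only
    "Topn past day",
    "Topn past month",
    "Topn past week",
    "Topn past year",
]


def missing_rollup_jobs(installed_job_names, db_name):
    """Return the expected job names for db_name that are not installed."""
    # Assumption: job names are compared case-insensitively.
    installed = {name.strip().lower() for name in installed_job_names}
    expected = [f"{prefix} {db_name}" for prefix in EXPECTED_JOB_PREFIXES]
    return [job for job in expected if job.lower() not in installed]


if __name__ == "__main__":
    # Hypothetical job list; the second job points at the wrong database.
    installed = ["Event Task Cleanup VCDB", "Past Day stats rollup master"]
    for job in missing_rollup_jobs(installed, "VCDB"):
        print("missing:", job)
```

A job that references the master database shows up as missing for the vCenter Server database, which matches the note above about deleting and recreating such jobs.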
Stored Procedures (Quarterly)
Verifying the stored procedures installed in vCenter 5.5 and 6.0
To check the stored procedures installed in vCenter Server 5.5 and 6.0 using MS SQL:
1. Navigate to vCenter DB > Programmability > Stored Procedures.
2. Ensure that the stored procedures listed in this table are installed:
Stored Procedure | Corresponding File
calc_topn1_proc | calc_topn1_proc_DB.sql
calc_topn2_proc | calc_topn2_proc_DB.sql
calc_topn3_proc | calc_topn3_proc_DB.sql
calc_topn4_proc | calc_topn4_proc_DB.sql
cleanup_events_tasks_proc | cleanup_events_DB.sql
clear_topn1_proc | clear_topn1_proc_DB.sql
clear_topn2_proc | clear_topn2_proc_DB.sql
clear_topn3_proc | clear_topn3_proc_DB.sql
clear_topn4_proc | clear_topn4_proc_DB.sql
delete_stats_proc | delete_stats_proc_DB.sql
insert_stats_proc | insert_stats_proc_DB.sql
l_purge_stat2_proc | l_purge_stat2_proc_DB.sql
l_purge_stat3_proc | l_purge_stat3_proc_DB.sql
l_stats_rollup1_proc | l_stats_rollup1_proc_DB.sql
l_stats_rollup2_proc | l_stats_rollup2_proc_DB.sql
l_stats_rollup3_proc | l_stats_rollup3_proc_DB.sql
load_stats_proc | load_stats_proc_DB.sql
load_usage_stats_proc | load_usage_stats_proc_DB.sql
process_license_snapshot_proc | process_license_snapshot_DB.sql
process_performance_data_proc | process_performance_data_DB.sql
purge_stat2_proc | purge_stat2_proc_DB.sql
purge_stat3_proc | purge_stat3_proc_DB.sql
purge_usage_stat_proc | purge_usage_stats_proc_DB.sql
rule_topn1_proc | rule_topn1_proc_DB.sql
rule_topn2_proc | rule_topn2_proc_DB.sql
rule_topn3_proc | rule_topn3_proc_DB.sql
rule_topn4_proc | rule_topn4_proc_DB.sql
stats_rollup1_proc | stats_rollup1_proc_DB.sql
stats_rollup2_proc | stats_rollup2_proc_DB.sql
stats_rollup3_proc | stats_rollup3_proc_DB.sql
upsert_last_event_proc | upsert_last_event_proc_DB.sql
where DB is db2, mssql, or oracle.
If any of these jobs or stored procedures are missing, you must install them by running the corresponding .sql file on the vCenter Server database using a database management tool such as SQL Server Management Studio. For more information on running these .sql files, see the section Adding the SQL Server Agent Jobs in Updating rollup jobs after the error: Performance data is currently not available for this entity (1004382).
Notes:
- The upsert_last_event_proc procedure is not required for the Oracle database.
- If there is a custom schema, the following command must also be run: alter schema schema_name transfer dbo.stored_procedure_name
All SQL scripts are located in the vCenter Server installation folder:
- vCenter Server 5.1 and 5.5: C:\Program Files\VMware\Infrastructure\VirtualCenter Server\sql
- vCenter Server 6.0: C:\Program Files\VMware\vCenter Server\vpxd\sql
For more information on commonly used vCenter Server installation paths, see Common vCenter Server and vSphere Client Windows paths (1028185).
Ensure that the vCenter Server database is the target before executing the SQL file.
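For the custom-schema note above, the alter schema command has to be repeated for each stored procedure. A minimal Python sketch that only generates the statement text could look like this; the function name and the identifier check are assumptions made for illustration, and the output should still be reviewed before it is run against the vCenter Server database.

```python
import re

# Assumption: schema and procedure names are plain unquoted identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def schema_transfer_statements(schema_name, procedure_names):
    """Build one 'alter schema ... transfer dbo.<proc>' statement per procedure."""
    for name in [schema_name, *procedure_names]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return [
        f"alter schema {schema_name} transfer dbo.{proc};"
        for proc in procedure_names
    ]


if __name__ == "__main__":
    # "vcschema" is a hypothetical custom schema name.
    for statement in schema_transfer_statements(
        "vcschema", ["stats_rollup1_proc", "purge_stat2_proc"]
    ):
        print(statement)
```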
Growth of a Database (Quarterly)
Determining what is growing in the vCenter Server database
The vCenter Server database is a complex database and there are several areas that can cause problems. Of the many tables in vCenter Server, very few accumulate data during regular operation. These tables do:
- vpx_hist_stat1 to vpx_hist_stat4 in vCenter Server 4.x and vpx_hist_stat1_n to vpx_hist_stat4_n in vCenter Server 5.x – These tables store the collected performance data.
- vpx_sample_time1 to vpx_sample_time4 – These tables store the reference time frames for the performance data in the vpx_hist_stat tables.
- vpx_event and vpx_event_arg – These tables store the event information from the Tasks and Events tab in vCenter Server.
- vpx_task – This table stores the task information from the Tasks and Events tab in vCenter Server.
This small subset of tables accounts for the majority of cases of substantial database growth. If any other table is showing growth, file a support request with VMware Technical Support and note this KB Article ID in the Problem Description. For more information, see How to Submit a Support Request.
Microsoft SQL
If you are using Microsoft SQL, there are three ways to validate where space is being consumed within a Microsoft SQL database. Select one method.
- From the SQL Management Studio interface, navigate to the database, right-click the table, and select Properties. See the Data space in the Storage section of the screen.
- Manually run this SQL query against the vCenter Server database:
select object_name(id) [Table Name],
[Table Size] = convert (varchar, dpages * 8 / 1024) + 'MB'
from sysindexes where indid in (0,1)
order by dpages desc
This query lists all tables in the vCenter Server database by table size in MB.
- Manually run this SQL query for individual tables:
exec sp_spaceused tablename;
See the data column of the output.
Note: Querying the database one table at a time may be time consuming. To query all tables in one pass, use this SQL query:
EXEC sp_MSforeachtable @command1="EXEC sp_spaceused '?'"
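The size column in the first query is plain arithmetic: SQL Server data pages are 8 KB each, so dpages * 8 / 1024 gives megabytes, rounded down. The short Python sketch below mirrors that calculation on rows exported from the query; the function names and the page counts are made up for illustration.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB each


def pages_to_mb(dpages):
    """Mirror the query's integer arithmetic: dpages * 8 / 1024."""
    return dpages * PAGE_SIZE_KB // 1024


def largest_tables(rows, top=5):
    """rows: iterable of (table_name, dpages). Return the top entries as (name, MB)."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(name, pages_to_mb(pages)) for name, pages in ranked[:top]]


if __name__ == "__main__":
    # Hypothetical page counts for three of the tables named in this section.
    rows = [("vpx_event", 256), ("vpx_task", 128000), ("vpx_hist_stat1", 640000)]
    for name, size_mb in largest_tables(rows, top=2):
        print(f"{name}: {size_mb} MB")
```

Because of the integer division, any table smaller than 128 pages is reported as 0 MB, both here and in the SQL query.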
Growth of Transaction Logs (Quarterly)
vCenter Server Transaction log growth when using Microsoft SQL
The transaction log records all transactions that occur on the database.
Depending on the recovery model that is set on the database, you may notice growth of the transaction log. The recovery model can dramatically affect the growth of any database.
There are three different recovery models for Microsoft SQL:
- Full Recovery Model: This model logs all transactions, which makes full failure recovery possible. It provides the greatest amount of recovery potential in case of a failure that impacts the database, but it uses the most disk space of all of the models.
- Bulk-Logged Recovery Model: This model logs all transactions except for certain large-scale operations such as index creation or bulk load operations. A full backup is typically performed after a large insert of information, but this model does not consume as much disk space.
- Simple Recovery Model: This model logs all transactions, but the log entry is deleted after the transaction is complete. It uses the least amount of disk space of all the models, but it also offers the least amount of recovery. As such, regular full backups need to be taken.
By default, Microsoft SQL uses the full recovery model for databases. Because of the large number of transactions in the vCenter Server database, the vCenter Server installer displays a warning that indicates the recovery model set on the database.
Regardless of the recovery model, VMware recommends that you take regular backups of the database and truncate the transaction log at the same time as the backup. This regular maintenance keeps the transaction logs from using up the disk space available to the system. For more information on the transaction logs and how to shrink them, see:
- SQL Server Recovery Model Affects Transaction Log Disk Space Requirements (1001046)
- Troubleshooting transaction logs on a Microsoft SQL database server (1003980)
Reducing the size of SQL Database (Quarterly)
Reducing the size of the vCenter Server database
To reduce the size of the vCenter Server database:
Warning: This procedure erases all historical
data. If you want to retain some historical performance data instead of
deleting all of it, see Purging old data from the database used by vCenter Server
(1025914) or Purging old data from the database used by VirtualCenter 2.x
(1000125).
Note: The steps below do not apply to vCenter Server 5.1 and 5.5. To truncate performance data in a vCenter Server 5.1 or 5.5 database, see the section Truncating all performance data from vCenter Server 5.1.
To reduce the data, perform these steps:
1. If your database is Microsoft SQL Server, Oracle or PostgreSQL, obtain the vCenter Server database password. For information, see the Obtain the vCenter Server database password section in this Knowledge Base article.
2. Stop the vCenter Server service.
o If you installed vCenter Server on a Windows machine:
a. Log in as an administrator to the Windows machine on which vCenter Server is installed.
b. Navigate to Start > Administrative Tools > Services.
c. Right-click VMware VirtualCenter Server and select Stop.
Note: For more information, see Stopping, starting, or restarting VMware vCenter Server services (1003895) and Stopping, starting, or restarting VMware vCenter Server 6.0 services (2109881).
3. Back up the vCenter Server database.
o For MS SQL, see your database vendor's documentation.
4. Run the script for your database.
To obtain the vCenter Server database password when vCenter Server is installed on a Windows machine:
1. Log in as an administrator to the Windows machine on which vCenter Server is installed.
2. Locate the vcdb.properties file and open it in a text editor.
o For vCenter Server 5.1 and 5.5, the file is located in the C:\ProgramData\VMware\VMware VirtualCenter\ folder.
o For vCenter Server 6.0, the file is located in the C:\ProgramData\VMware\vCenterServer\cfg\vmware-vpx\ folder.
3. In the vcdb.properties file, locate the password of the vCenter Server database user and record it.
For information, see the Run the script for your database section in this KB article.
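If you would rather not read vcdb.properties by eye, a small Python sketch can pull the entries out of it. The parser below handles only simple key = value lines, and the key names in the sample text (username, password) are assumptions about the file layout, so check them against your own file.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines; lines starting with '#' or '!' are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")  # split on the first '=' only
        if sep:
            props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    # Hypothetical file content; real files may use different keys.
    sample = "# vCenter DB settings\nusername = vpxuser\npassword = example-only\n"
    props = parse_properties(sample)
    print("database user:", props.get("username"))
    print("password entry present:", "password" in props)
```

The sketch does not handle escaped characters or continuation lines, and it deliberately avoids printing the password itself.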
To run the script for Microsoft SQL Server:
1. Log in to the Microsoft SQL Server machine as an administrator.
2. Download and save the 2110031_MS_SQL_task_event_stat.sql script attached to this Knowledge Base article.
3. Open the command prompt and run the script:
sqlcmd -S IP-address-or-FQDN-of-the-database-machine\instance_name -U vCenter-Server-database-user -P password -d database-name -v TaskMaxAgeInDays=task-days -v EventMaxAgeInDays=event-days -v StatMaxAgeInDays=stat-days -i download-path\2110031_MS_SQL_task_event_stat.sql
The scripts contain three main parameters:
- TaskMaxAgeInDays: All tasks older than TaskMaxAgeInDays days are deleted.
- EventMaxAgeInDays: All events older than EventMaxAgeInDays days are deleted.
- StatMaxAgeInDays: All statistics older than StatMaxAgeInDays days are deleted.
The possible values for all of the parameters are:
-1 | Skips the respective historical data deletion. For example, TaskMaxAgeInDays = -1 means that no task records will be deleted.
0 | Deletes all historical data for the respective component. For example, TaskMaxAgeInDays = 0 deletes all task records.
1 and more | Deletes historical data older than the specified number of days for the respective component.
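Because a wrong value here deletes history, it can help to assemble the sqlcmd call programmatically and validate the three parameters first. The Python sketch below only builds the argument list from the flags shown in the command above. The function names, and the choice of -1 (skip) as the default for every parameter, are assumptions made for this illustration.

```python
def validate_max_age(name, days):
    """Accept -1 (skip), 0 (delete everything) or a positive number of days."""
    if isinstance(days, bool) or not isinstance(days, int) or days < -1:
        raise ValueError(f"{name} must be an integer >= -1, got {days!r}")
    return days


def build_purge_command(server, user, password, database, script_path,
                        task_days=-1, event_days=-1, stat_days=-1):
    """Return the sqlcmd argument list for the purge script."""
    ages = {
        "TaskMaxAgeInDays": task_days,
        "EventMaxAgeInDays": event_days,
        "StatMaxAgeInDays": stat_days,
    }
    cmd = ["sqlcmd", "-S", server, "-U", user, "-P", password, "-d", database]
    for name, days in ages.items():
        cmd += ["-v", f"{name}={validate_max_age(name, days)}"]
    cmd += ["-i", script_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical server, credentials and retention values.
    print(build_purge_command(r"db01\VIM_SQLEXP", "vpxuser", "secret", "VCDB",
                              r"C:\temp\2110031_MS_SQL_task_event_stat.sql",
                              task_days=180, event_days=180))
```

The resulting list can be passed to subprocess.run once the vCenter Server service is stopped and the database backup is complete.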
Rebuilding indexes (Quarterly)
To rebuild the vCenter Server database indexes:
Note: For vCenter Server 5.1 and 5.5 databases, download and extract the .sql files from the 2009918_rebuild_51.zip file attached to this article.
2. Connect to the vCenter Server database, for example using Management Studio for SQL Server or SQL*Plus for Oracle.
3. Execute the .sql file to create the REBUILD_INDEX stored procedure:
o Oracle: rebuild_indexes_oracle.sql or rebuild_indexes_oracle_51.sql
o SQL Server: rebuild_indexes_sql.sql or rebuild_indexes_sql_51.sql
4. Execute the stored procedure for either Oracle or SQL Server that was created in the previous step:
execute REBUILD_INDEX
Backup of vCenter SSL Certificates (Yearly)
· Windows 2003: %ALLUSERSPROFILE%\Application Data\VMware\VMware VirtualCenter
· Windows Vista and 2008 Server: %ALLUSERSPROFILE%\VMWare\VMware VirtualCenter
Windows OS Patches (Monthly)
Install Windows critical and security patches only after taking a snapshot of the virtual machine.
Test all functionality of the vCenter Server machine and check whether the recent patches had any other impact. If they did, roll back the recently installed patches and/or uninstall them from Add/Remove Programs.
Once the verification is finished and all applications run without issue, remove the snapshot from the vCenter Server.
A Change Request is required for applying patches and performing maintenance on the vCenter Server.
Upgrade to a new release of the same version, or apply patches (3-6 months)
Every six months, check VMware’s web site for new updates for vCenter Server, download them, and schedule a maintenance window with a Change Request to apply those patches or updates.
A major version upgrade is not discussed here, as it requires upgrading the whole environment.
As usual, take a snapshot of the virtual machine and make sure that a full backup of vCenter Server, along with the necessary databases, is taken before applying the patch or update.
Check the download site for critical issues that VMware has resolved, whether security fixes or other critical bugs flagged against the product. It is very important to apply such patches or updates with an Emergency Change Request to avoid any impact on the existing environment, e.g. an SSL issue or some other security flaw.
Check http://kb.vmware.com/kb/ and http://blogs.vmware.com/ frequently and look for recently created or modified vCenter Server articles, which carry similar information about known issues resolved by a single patch or update or by several.
vCenter Server Service Restart (as needed)
Depending on the nature of the issue, it may be enough to restart the vCenter Server service rather than the whole vCenter Server. In that case, verify all the dependent applications, services and other components that rely heavily on and integrate with the vCenter Server service. To avoid disrupting the existing production workload in the Private Cloud, inform the necessary stakeholders about the possible impact on the functionality and availability of the services offered by the vRealize suite together with vCenter Server. Do this only with a Change Request and after hours.
For more information, refer to http://kb.vmware.com/kb/1003895
Resource Availability on ESXi (upon scheduling maintenance)
Make sure there are enough resources on the ESXi host where the vCenter Server will power on and run (in case a disaster occurred and the vCenter Server was shut down).
If any resource pools or memory/CPU reservations are configured at the cluster level, make sure those resources are available to the vCenter Server virtual machine.
vCenter Server and vRealize Automation Center portal (as and when required)
During maintenance of vCenter Server, the consoles of the virtual machines are not available, and the vRA portal (if you are using vRA) cannot be used to deploy virtual machines.
Changing IP address of vCenter Server virtual machine (as and when required)
Whenever you need to change the IP address of the vCenter Server, proceed very carefully, because many other components depend on vCenter Server, e.g. plugins, vRA, vRO etc. First of all, create backups of the vCenter Server VM and the underlying SQL database.
· Set DRS to manual mode to avoid anything moving around (optional, as it depends on the configuration in place at the cluster level).
· Identify the ESXi host running the vCenter VM and connect directly to that host with the vSphere Client.
· Close any sessions you have open to the vCenter Server (Web Client, vSphere Client, etc.).
· Open a console window to the vCenter Server by way of the ESXi host.
· Stop all VMware services.
· Change the IPv4 address and IPv4 gateway.
· Ping the new default gateway as a test, then ping other ESXi hosts in the same cluster and in other clusters, virtual machines in the same datacenter and in other datacenters, and the Internet, to confirm end-to-end connectivity.
· Restart the vCenter Server.
· Put DRS back to fully automated (optional, based on your setup).
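The ping step in the list above is easy to script so that no target gets forgotten. The following Python sketch calls the operating system's ping command for each host and reports the ones that fail. The function names and the host list are placeholders, and the -n / -c switch is chosen on the assumption that the script runs on either Windows or a Unix-like system.

```python
import platform
import subprocess


def ping_command(host, count=2):
    """Windows ping uses -n for the echo count; most other systems use -c."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def unreachable_hosts(hosts, run=subprocess.run):
    """Ping every host and return those whose ping exited with a non-zero code."""
    failed = []
    for host in hosts:
        result = run(ping_command(host), stdout=subprocess.DEVNULL,
                     stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failed.append(host)
    return failed


if __name__ == "__main__":
    # Hypothetical targets: new gateway, an ESXi host, a VM in another datacenter.
    targets = ["192.0.2.1", "192.0.2.21", "198.51.100.40"]
    print("unreachable:", unreachable_hosts(targets))
```

On Windows, ping can exit with code 0 even when the reply is "Destination host unreachable", so treat an empty result as a hint and confirm suspicious targets by hand.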
Changing FQDN of vCenter Server virtual machine (as and when required)
The same recommendations as for changing the IP address apply to changing the FQDN of the vCenter Server virtual machine. Also make sure that Active Directory reflects the new name and that all DNS records are updated after the change. Check with the Server Team if you need help updating the DNS records.
Plugins and other Extensions of vCenter Server (as and when required)
Check and verify all plugin operations, e.g. storage plugin, Update Manager, external backup solution plugin or vRealize Orchestrator, whenever vCenter Server is patched, updated or upgraded from one version to another. A major version upgrade always requires updating the plugins and then testing their functionality.
Backup of vCenter Server (monthly)
Make sure the backup is done and includes a full backup of the vCenter Server virtual machine each month. Verify this with the Backup Team.
1) For troubleshooting or maintenance, first try restarting the vCenter Server service and see whether the issue is resolved. If it is not, shut down the service and restart the virtual machine.
For more information please refer
2) Make sure you have enough resources on the ESXi host in case a disaster occurs and the vCenter Server is shut down.
3) For patching, schedule a maintenance window with a proper Change Request, then patch the VM with the OS-specific critical and important patches.
4) For database maintenance, refer to the following KB and make sure a full backup is done before you proceed
5) For purging old data from the vCenter database, refer to http://kb.vmware.com/kb/1025914
6) During maintenance of vCenter Server, the consoles of the virtual machines are not available, and the vRA portal (if you use it to deploy VMs) cannot be used to deploy virtual machines.
7) Whenever you need to change the IP address of the vCenter Server please proceed very carefully as so many other components are involved and dependent on vCenter Server e.g. Plugins, vRA, vRO etc.
First of all, create backups of the vCenter Server VM and the underlying SQL database.
1. Set DRS to manual mode to avoid anything moving around (optional, as it depends on the configuration in place at the cluster level).
2. Identify the ESXi host running the vCenter VM and connect directly to that host with the vSphere Client.
3. Close any sessions you have open to the vCenter Server (Web Client, vSphere Client, RDP etc.).
4. Open a console window to the vCenter Server by way of the ESXi host.
5. Stop all VMware services.
6. Change the IPv4 address and IPv4 gateway.
7. Ping the new default gateway as a test, then ping other ESXi hosts in the same cluster and in other clusters, virtual machines in the same datacenter and in other datacenters, and the Internet, to confirm end-to-end connectivity.
8. Restart the vCenter Server.
9. Put DRS back to fully automated (optional, based on your setup).
8) The same recommendations as in Item #7 apply to changing the FQDN of the vCenter Server virtual machine. Also make sure that Active Directory reflects the new name and that all DNS records are updated after the change.
10) Check and verify all plugin operations, e.g. storage plugin, Update Manager, external backup solution plugin or vRealize Orchestrator.
11) Make sure the backup is done and includes a full backup of the vCenter Server virtual machine each month. Verify this with the Backup Team.
I hope you find the above useful in your environment. As vCenter Server is a crucial core component, the necessary steps need to be taken to make sure it runs smoothly without any issues.
I have not included anything about VCSA, but you can use the same information as for Windows-based vCenter: just replace the services portion accordingly and add SSH to the connection list.
If you have any feedback, please let me know.
Please share and care !
Enjoy !!