Although a Database Availability Group (DAG) provides automatic failover protection against database, server, or network failure, it still needs monitoring to ensure that it works and delivers high availability (HA) and site resilience when a disaster strikes.
Test-ReplicationHealth is one of the PowerShell cmdlets that you can use to monitor DAG replication health, quorum, network components, Active Manager, the status of the cluster service, and more. The cmdlet works in Microsoft Exchange Server 2010 SP1 and later versions. You can run it either locally on a DAG member server or remotely to monitor the DAG.
Below, you’ll learn how to use the Test-ReplicationHealth PowerShell cmdlet with its parameters and common pipeline filters to run health checks and monitor the DAG.
Before You Begin
To use the Test-ReplicationHealth cmdlet, you must have the “Monitoring” role assigned.
You can use the Exchange Admin Center (EAC) or PowerShell cmdlets in Exchange Management Shell (EMS) to assign the required role to the user. The command is as follows:
New-ManagementRoleAssignment -Role Monitoring -User Administrator
Check whether the administrator has been assigned the “Monitoring” role using the following command:
Get-ManagementRoleAssignment -Role Monitoring
How to Use the Test-ReplicationHealth Cmdlet in Exchange Server
You can check the DAG health by using the Test-ReplicationHealth cmdlet without any parameters.
Test-ReplicationHealth
If the Result column shows Passed for every check, the DAG is working fine and does not require any action. However, if any check shows a FAILED result, you must investigate the issue and fix it to ensure HA.
To see a description of each check listed in the output and its role in the DAG, run the following command in the EMS:
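For scripted monitoring, the pass/fail reading described above can be automated. The following is a minimal sketch rather than a definitive implementation. The function name Get-NotPassedCheck is hypothetical, and the sketch assumes that each result object exposes its status through Result.Value, as the filters used elsewhere in this article do.

```powershell
# Hypothetical helper: passes through only the checks whose status is not "Passed".
# Assumption: each input object exposes Result.Value (for example "Passed").
function Get-NotPassedCheck {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject
    )
    process {
        if ($InputObject.Result.Value -ne 'Passed') {
            $InputObject
        }
    }
}

# Usage in the Exchange Management Shell (requires the Exchange cmdlets):
# $problems = @(Test-ReplicationHealth | Get-NotPassedCheck)
# if ($problems.Count -eq 0) { 'All checks passed.' }
# else { $problems | Format-Table Server, Check, Result }
```

Keeping the filter in a small function lets you reuse it against the output of a single server or of a whole DAG.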
Test-ReplicationHealth | Format-List Check*
You can also run this command with the Identity parameter to test the health of a specific member server, including a remote one. For instance, replace the <ServerName> placeholder with the name of the member server:
Test-ReplicationHealth -Identity <ServerName>
If you see errors or failed status, investigate the issue and fix it before it leads to DAG failure.
You can also check all servers and the overall health status of your Database Availability Group (DAG) using the following command:
Get-DatabaseAvailabilityGroup | select -ExpandProperty:Servers | Test-ReplicationHealth
The command displays every check and its status for each server. This can be a long list if there are 3-4 Exchange Servers or more in your DAG. In such a case, you can use the following command to list only the servers and checks whose result is not ‘Passed’.
Get-DatabaseAvailabilityGroup | Select -ExpandProperty:Servers | Test-ReplicationHealth | Where {$_.Result.Value -ne "Passed"}
Once you know that some checks have failed on your DAG member servers, you may find that the error messages are truncated in the default table view. To resolve an error, you need the complete information about it. Use the following command to see the full error details.
Get-DatabaseAvailabilityGroup | Select -ExpandProperty:Servers | Test-ReplicationHealth | Where {$_.Result.Value -ne "Passed"} | Format-List
The command output displays the complete error details in list form.
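When several servers report problems, a per-server summary can be easier to scan than the full list. The sketch below is one possible way to build it, not the only one. The function name Get-FailedCheckSummary and the output property names are hypothetical, and the code assumes that the result objects carry Server, Check, and Result.Value as in the commands above. It uses New-Object instead of newer syntax so that it also runs on the older PowerShell versions that ship with early Exchange releases.

```powershell
# Hypothetical helper: groups non-passed checks by server.
# Assumption: each result object has Server, Check, and Result.Value properties.
function Get-FailedCheckSummary {
    param(
        [object[]]$Results
    )

    $Results |
        Where-Object { $_.Result.Value -ne 'Passed' } |
        Group-Object -Property Server |
        ForEach-Object {
            $group = $_
            # Collect the names of the checks that did not pass on this server.
            $checkNames = @($group.Group | ForEach-Object { $_.Check })
            New-Object PSObject -Property @{
                Server       = $group.Name
                FailedCount  = $group.Count
                FailedChecks = $checkNames -join ', '
            }
        }
}

# Usage in the Exchange Management Shell (requires the Exchange cmdlets):
# $all = Get-DatabaseAvailabilityGroup | Select -ExpandProperty:Servers | Test-ReplicationHealth
# Get-FailedCheckSummary -Results $all | Format-Table Server, FailedCount, FailedChecks
```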
Check the Failed Status
If one of the following checks is in the FAILED state, the databases on the server could be at risk of losing data due to inconsistency or damage.
| Check | Description |
| --- | --- |
| DatabaseRedundancy | Verifies that databases have sufficient redundancy. If this check fails, it means that some databases are at risk of losing data. |
| DatabaseAvailability | Verifies that databases have sufficient availability. If this check fails, it means that some databases are at risk of losing service. |
| DBCopySuspended | Checks if any database copies are in the 'Suspended' state. |
| DBCopyFailed | Checks if any database copies are in the 'Failed' state. |
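To watch only the four checks listed above, you could filter the results by check name. This is a hedged sketch: the function name Select-DataRiskFailure and its CheckName parameter are hypothetical, and the code assumes that the Check property holds the check name exactly as written in the table and that Result.Value holds the status.

```powershell
# Hypothetical helper: keeps only the data-risk checks that did not pass.
# Assumption: Check matches the names in the table and Result.Value holds the status.
function Select-DataRiskFailure {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject,

        [string[]]$CheckName = @(
            'DatabaseRedundancy',
            'DatabaseAvailability',
            'DBCopySuspended',
            'DBCopyFailed'
        )
    )
    process {
        $isDataRiskCheck = $CheckName -contains $InputObject.Check
        $didNotPass = $InputObject.Result.Value -ne 'Passed'
        if ($isDataRiskCheck -and $didNotPass) {
            $InputObject
        }
    }
}

# Usage in the Exchange Management Shell (requires the Exchange cmdlets):
# Test-ReplicationHealth | Select-DataRiskFailure | Format-List
```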
In such cases, you may need to recover and restore the database using EseUtil Soft Recovery or Hard Recovery. Make sure to back up the database before running Hard Recovery, as it can purge irrecoverable data, such as mailboxes and mail items, leading to data loss. If Soft Recovery fails, you can use an advanced Exchange database recovery tool, such as Stellar Repair for Exchange, to repair and recover mailboxes from corrupt or inconsistent databases directly to a new database on the live Exchange Server. You may also save mailboxes to PST or export them to an Office 365 tenant in a few clicks, with the original folder structure and integrity preserved.
Conclusion
In this article, you have learned how to use the Test-ReplicationHealth PowerShell cmdlet to check and monitor all member servers in a Database Availability Group (DAG) on Exchange Server 2010 SP1 and later versions. If you find issues with the member servers, such as failed replication or a dismounted database, you can use EseUtil to recover and restore the database. If that fails to fix the database, you can use Exchange recovery software, such as Stellar Repair for Exchange, to recover and restore mailboxes from a dismounted or corrupt Exchange database to PST. You may also export them directly to a new or existing mounted database on your Exchange Server. The software also provides options to export single or multiple mailboxes recovered from corrupt, dismounted, or inaccessible Exchange databases directly to Office 365 or Exchange Online (Microsoft 365) in a few clicks.