Troubleshooting a Broken TFS Data Tier


I have seen a number of questions about troubleshooting a broken data tier in TFS 2010, in cases where the application tier appears to be working correctly but the whole system is down because the AT can no longer reach the core databases. Here are the steps I have gone through to resolve the issue and get back into production.

Start at the Data Tier

  1. For now, let’s ignore anything that isn’t the App Tier (AT) or the Data Tier (DT). Almost everything else can be rebuilt, or you probably have a copy of it elsewhere (SharePoint documents are a possible exception).
  2. First, check your database backups and their logs, and make sure they are in a safe place. I am hoping you went through the real backup procedure (either the painful process with all the SQL statements or the October version of the Power Tools) rather than just clicking “backup” on the databases. Make a copy of the backups elsewhere in case the problem comes down to a bad disk or some other hardware fault. Having them in a safe place before you change anything gives you some measure of security.
  3. Check access to the databases with SQL Server Management Studio. Make sure you connect with the account that you expect TFS to use for data tier access. Try connecting to Tfs_Configuration and to each of your collection databases, Tfs_[Collection Name].
  4. Triage any issues in the event logs.
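
If you would rather script the access check in step 3 than click through Management Studio, here is a minimal sketch. It only builds the `sqlcmd` command lines to run on the server while logged on as the TFS service account; it does not run them. The server name `DT-SERVER` and the collection name `DefaultCollection` are placeholders, and the helper names are my own, not part of TFS.

```python
import subprocess


def tfs_database_names(collections):
    """Return the database names to probe: Tfs_Configuration plus one per collection."""
    return ["Tfs_Configuration"] + ["Tfs_" + name for name in collections]


def sqlcmd_probe(server, database):
    """Build a sqlcmd command that connects and runs a trivial query.

    -S names the server, -E uses Windows authentication (so run it as the
    account TFS uses), -d selects the database, -Q runs the query and exits.
    """
    return ["sqlcmd", "-S", server, "-E", "-d", database, "-Q", "SELECT 1"]


if __name__ == "__main__":
    # Placeholder server and collection names; substitute your own.
    for db in tfs_database_names(["DefaultCollection"]):
        print(subprocess.list2cmdline(sqlcmd_probe("DT-SERVER", db)))
```

If any of the printed commands fails when you run it, the error message usually tells you whether the problem is the login, the database, or the server itself.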

Validate Network Connectivity

  1. Log on to the AT server
  2. Validate that you can ping the Data Tier both by its short server name and by its fully qualified domain name (FQDN)
  3. Triage issues in the event log
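
As a complement to ping, the sketch below checks from the AT that both names resolve and that a TCP connection to SQL Server can be opened. It is a rough illustration, not a replacement for ping (ping uses ICMP, this uses name lookup plus TCP). The host names are placeholders, and port 1433 is an assumption: it is the default for a default SQL Server instance, but your instance may listen elsewhere.

```python
import socket


def resolve_names(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def can_connect(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder short name and FQDN for the data tier server.
    for name, address in resolve_names(["dt-server", "dt-server.example.local"]).items():
        print(name, "->", address or "lookup failed")
        if address:
            print("  SQL Server port reachable:", can_connect(name))
```

If the short name resolves but the FQDN does not (or the other way around), look at DNS before you look at TFS.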

Validate AT Connection

  1. Open the TFS Administration Console
  2. Check the TFS logs here
  3. Click on the Application Tier node and scroll down in the main pane to confirm that the data tier connection settings are correct
  4. At this point you can try resetting the database registration in TFS:
    Attempt to repair the connection: http://msdn.microsoft.com/en-us/library/ee349268.aspx
    Attempt to re-register from scratch with RegisterDB: http://msdn.microsoft.com/en-us/library/ms252443.aspx
    Use RemapDBs for more complex or split-server scenarios: http://msdn.microsoft.com/en-us/library/ee349262.aspx
  5. If you still have not found the problem, I would go through the steps as though this were a data tier move. That means you will try to reattach the working AT to the DT. Try this first with the existing databases; they may have gotten munged, in which case you would move on to the backups.

http://support.microsoft.com/default.aspx?scid=kb;EN-US;955601
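
For the re-registration options in step 4, here is a small sketch that assembles the TFSConfig command lines so you can review them before running anything on the AT. The switch spellings follow my reading of the MSDN pages linked above, so check them against those pages for your version. The server name `DT-SERVER` is a placeholder and the helper functions are hypothetical.

```python
import subprocess


def register_db(sql_instance, database_name="Tfs_Configuration"):
    """Build a TFSConfig RegisterDB command pointing the AT at a configuration database."""
    return ["TFSConfig", "RegisterDB",
            "/SQLInstance:" + sql_instance,
            "/databaseName:" + database_name]


def remap_dbs(config_server, sql_instances, database_name="Tfs_Configuration"):
    """Build a TFSConfig RemapDBs command listing every SQL instance that hosts TFS databases."""
    return ["TFSConfig", "RemapDBs",
            "/DatabaseName:" + config_server + ";" + database_name,
            "/SQLInstances:" + ",".join(sql_instances)]


if __name__ == "__main__":
    # Placeholder server name; this prints the commands rather than executing them.
    print(subprocess.list2cmdline(register_db("DT-SERVER")))
    print(subprocess.list2cmdline(remap_dbs("DT-SERVER", ["DT-SERVER"])))
```

Printing instead of executing is deliberate: these commands change how the AT finds its databases, so it is worth reading each one before you run it.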

If you didn’t have the right backups, you may have to recover the collections once you are connected. This can lose whatever data was not yet synchronized (usually only a few seconds’ worth): http://msdn.microsoft.com/en-us/library/ff407077.aspx
