Pulling Production data back to Staging for accurate testing



  • Hello all, thanks for your time,

    I'm working on an application that has four levels: one developer only, all developers, staging/QA, and production.  The goal is to port production data back to staging once a week so that our testing runs against relevant data across 50 databases.  I've thought of a few ways to approach this and wanted to get some other minds churning.

    We create databases dynamically, and I'd want to port those back while spending a minimum amount of effort maintaining the scripts.  Soon, administrators will be able to create databases on the fly, so any scripts need to account for new entries.  I know enough SQL to open a cursor over master..sysdatabases and build dynamic BACKUP DATABASE and RESTORE DATABASE statements, although I'm concerned about needing additional backup devices as we add more databases (to get the file locations correct when restoring).
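    Roughly what I have in mind is the sketch below; the backup path and the list of databases to skip are just placeholders:

    -- Rough sketch: loop over the user databases and build dynamic BACKUP statements.
    -- The E:\Backups path and the excluded system databases are placeholders.
    DECLARE @name sysname, @sql nvarchar(1000)

    DECLARE db_cursor CURSOR FOR
        SELECT name FROM master..sysdatabases
        WHERE name NOT IN ('master', 'model', 'msdb', 'tempdb')

    OPEN db_cursor
    FETCH NEXT FROM db_cursor INTO @name

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @sql = N'BACKUP DATABASE [' + @name + N'] TO DISK = N''E:\Backups\' + @name + N'.bak'' WITH INIT'
        EXEC sp_executesql @sql
        FETCH NEXT FROM db_cursor INTO @name
    END

    CLOSE db_cursor
    DEALLOCATE db_cursor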

    1) We own SQL Data Compare from Red Gate.  I could port each database back, save each script, and then find some way to execute them automatically.  What would the driver app be?  Can it handle grabbing a list of databases determined dynamically by a query on master..sysdatabases?

    2) Use SQL DTS and create a job to either a) dump out backups, copy them with FTP, and restore them over the top of the Staging databases, or b) use the export option, which overwrites the live Staging files directly.

    3) Write a Cold Fusion or .NET app to do either variant of option 2).

    Another consideration is that Staging will at times have additional columns and tables while we're QAing new work.  If testing is still going on when it's time to run the script, the DTS package could overwrite the entire database and its structure, losing those changes.  Wouldn't want that.  We very rarely change column names; it's mostly expansion work at this point in the app's life cycle, which leans me towards option 1).

    I'm curious to see if anyone else has already done this or has some suggestions.  Thanks for any input.



  • What's the problem with the (to me) obvious solution of loading the dev/test servers from the live server backup?  That's what you want to work with, after all, isn't it?  On the databases I've worked with, that was a fairly easy (and scriptable) process.



  • Yes, that's the path I'm looking down now.  I had some confusion about devices that made things more difficult: I thought I needed to create a backup device for dumping and restoring each database, but I've learned that fortunately isn't the case.

    The only drawback that will remain is during times when we're moving new functionality out to Staging for testing.  I'll have to remember to stop the job, otherwise I'll lose the new changes when Staging gets overwritten by the Production backup.  Other than that, I'll need a way to move the backup file from Production to Staging without mapping a drive (or creating and then removing one as part of the script), since I don't want to leave that open if I don't have to.  I might be able to FTP from Prod to Staging with the same result.
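    One possibility I may try first is a plain file copy over a UNC path from within the job, which avoids a persistent drive mapping; the server and share names below are made up for illustration, and the service account would need rights on the share:

    -- Sketch only: copy the finished backup file across with xp_cmdshell and a UNC path.
    -- \\StagingServer\SQLBackups is a placeholder share; no drive mapping is left behind.
    EXEC master..xp_cmdshell 'copy E:\Backups\Employees.bak \\StagingServer\SQLBackups\'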



  • I was using the "restore database from device" command.  It's so much easier to use the "restore database from disk" option; it really cuts down on the number of steps and simplifies the whole process.  If that's what you're talking about, that is a smart way to do this.



    Part of the issue I forgot to mention is that many of these databases have different logical file names.  The "restore database" commands I've seen all use the logical file name.  Here is an article on Microsoft's website about setting logical file names: http://support.microsoft.com/kb/817089.  Here are the results of running SQL Profiler while backing up and then restoring a database from Enterprise Manager (the restore creates a copy of the database):

    BACKUP DATABASE [Employees] TO DISK = N'E:\TEST.bak' WITH INIT , NOUNLOAD , NAME = N'Employees backup', NOSKIP , STATS = 10, NOFORMAT

    RESTORE DATABASE [Employees2] FROM DISK = N'E:\TEST.bak' WITH FILE = 1, NOUNLOAD , STATS = 10, RECOVERY , MOVE N'Employees_Data' TO N'E:\Program Files\Microsoft SQL Server\MSSQL\data\Employees_Data2.MDF', MOVE N'Employees_Log' TO N'E:\Program Files\Microsoft SQL Server\MSSQL\data\Employees_Log2.LDF'

    Fortunately, the Prod and Staging SQL Servers are set up so that the paths to the database data files are the same.  Otherwise, I'd need to have the two 'move' clauses in the 'restore' statements.  I'm testing restores that skip 'move' and simply overwrite.  If that doesn't work, I'll have to get the logical file names from the sysaltfiles table in master.
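    If it comes to that, I believe a query along these lines would pull the logical names for a given database (the database name here is just an example):

    -- Sketch: get the logical and physical file names for one database from master..sysaltfiles,
    -- which could then be used to build the MOVE clauses dynamically.
    SELECT af.name AS logical_name, af.filename AS physical_name
    FROM master..sysaltfiles af
    JOIN master..sysdatabases db ON af.dbid = db.dbid
    WHERE db.name = 'Employees'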



  • When you're trying to push schema changes from development on up the stack, it's probably a good idea to keep a running deployment script handy.  That way, when you push production data and schema back down to development, you can run the script.  You're both making deployment easier on yourself by scripting it in the first place and also providing yourself with several places to test your deployment.



  • Thanks for the response, Oscar.

    Your idea makes good sense.  I've been using a GUI tool called SQL Compare to reconcile database structures.  I like your idea of using a script rather than click-click-clicking to redeploy to development/staging, which requires manual intervention.

    Here's what I think I'm going to do: use SQL Compare when first rolling out new features and have it generate a script, then optionally use SQL Data Compare to generate a script if any changes require data (usually for new columns).  I think it's important to know enough SQL to understand what the tool is doing; I use it to save a good amount of time and to avoid missing anything.  I'll sew the scripts together and give the file a consistent name.  Then, at the end of the job, after porting the database back from Production, check whether that SQL script exists and execute it if so.  Whenever I roll the new features out to Production, I can archive the script.

    That gives me automation for bringing Production data back, a way to overlay the new development changes and data, and a history of those changes via the archived scripts.  The development script going up the stack needs to be ALTERs against the Production baseline; I'll dump and restore the entire database when coming back from Production to avoid duplicating effort writing un-alter scripts.
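    For the "check whether the script exists" step, I'm imagining something like the sketch below; the path, file name, and server/database names are placeholders, and xp_fileexist is an undocumented extended procedure:

    -- Rough sketch: after the restore, look for a pending deployment script and run it if present.
    DECLARE @exists int
    EXEC master..xp_fileexist 'E:\Deploy\StagingChanges.sql', @exists OUTPUT

    IF @exists = 1
        EXEC master..xp_cmdshell 'osql -S StagingServer -E -d Employees -i E:\Deploy\StagingChanges.sql'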

    I like automation whenever possible, and that sits well with your scripting idea.  Thanks for your input.  It's helped me to work out the details.



  • Our very own Alex Papadimoulis has an article on why you should really use a script, not the pointy-clicky tool, to deploy database changes: http://weblogs.asp.net/alex_papadimoulis/archive/2006/05/03/444973.aspx

