In the past I have used snapshot replication to present data to clients for reporting purposes. It hasn’t mattered that the data is not the most current version, so a snapshot generated every 24 hours has fitted the reporting requirements.
Now, one of the ‘gotchas’ I have discovered with this implementation is that the transaction log can keep growing and, even though you may be performing regular T-log backups, the log does not ‘clear down’. You can end up in a situation where the transaction log is vastly larger than the data file. This usually wouldn’t happen were it not for the fact that Snapshot Replication is enabled. So what is happening?
Well, to start, let’s take a basic look at how the transaction log works.
The transaction log is made up of lots of smaller internal sections called virtual log files (VLFs).
In the above example, let’s presume the T-log is 6GB in total, with each virtual log file representing 1GB. The transaction log is approximately 70% full. If you tried to shrink the file, it would not shrink down to VLF no. 1 because VLFs 2, 3, 4 and a portion of 5 are active. You would have to manually run a CHECKPOINT command (if the database is using the SIMPLE recovery model), or perform a T-log backup if the database is using the FULL recovery model.
This would flush the data to disk, the active log records would then show as free (all black in the above diagram), and you could shrink the log file down and resize it to fit your requirements.
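For reference, the two commands look like this (the database name and backup path are placeholders):

```sql
-- SIMPLE recovery model: a manual CHECKPOINT lets the inactive
-- portion of the log be marked reusable
CHECKPOINT;

-- FULL recovery model: a T-log backup is what frees the inactive VLFs
BACKUP LOG [db_name]
TO DISK = N'D:\Backups\db_name_log.trn';
```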
However, if Snapshot Replication is enabled, you may find that running a T-log backup or issuing the CHECKPOINT command doesn’t free up the records.
A highly likely reason for this behaviour is that a schema modification (DDL) statement has been issued against a database involved in Snapshot Replication. You can find out if this is the case by using a system function called fn_dblog, which should reveal the DDL statement responsible. When this change is stored in the transaction log it is not ‘unmarked’ after the changes are propagated, and this is what causes the transaction log growth.
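As a sketch (fn_dblog is undocumented, so column names can vary between versions), a query along these lines lists the transaction-start records still held in the active log, which should include the offending DDL:

```sql
-- fn_dblog(NULL, NULL) reads the active portion of the current
-- database's transaction log (undocumented function - use with care)
SELECT [Current LSN],
       [Operation],
       [Transaction Name],
       [Begin Time]
FROM   fn_dblog(NULL, NULL)
WHERE  [Operation] = 'LOP_BEGIN_XACT'   -- transaction start records
ORDER  BY [Current LSN];
```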
If you were to run the following command (query 1)
SELECT name AS [Database Name]
,log_reuse_wait_desc AS [LOG Usage]
,recovery_model_desc AS [Recovery Model]
,state_desc AS [Database State]
FROM sys.databases
WHERE name = 'db_name'
you will see the LOG Usage column with a value of REPLICATION.
If you run DBCC LOGINFO in the context of the database in question, you will see in the result set that the virtual log files have a status of 2, indicating they are still active. No matter how many T-log backups or CHECKPOINT commands you issue, they will remain at status 2.
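The VLF status check itself is a one-liner. DBCC LOGINFO is undocumented but widely used; on SQL Server 2016 SP2 and later, the documented sys.dm_db_log_info DMV returns similar information. The database name below is a placeholder:

```sql
USE [db_name];
-- Status = 2 means the VLF is active; 0 means it can be reused
DBCC LOGINFO;

-- Documented alternative on SQL Server 2016 SP2+:
SELECT vlf_sequence_number, vlf_active, vlf_status
FROM   sys.dm_db_log_info(DB_ID());
```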
How to Fix
First, to fix your transaction log and clean things up, you need to mark the pending transactions as distributed by executing:
CAUTION: Make sure the subscriber is indeed up to date with the latest snapshot.
EXEC sp_repldone NULL, NULL, 0, 0, 1
Within the publication in question, under Subscription Options, there is an option under Schema Replication – Replicate schema changes. This needs setting to False. Any schema modifications will then automatically be picked up when you generate your daily snapshot and copied across to your subscribers. You can change this setting to False either by T-SQL
EXEC sp_changepublication
@publication = N'PublicationName',
@property = N'replicate_ddl',
@value = 0
Or by the GUI, by right-clicking the publication in question and selecting Properties.
Once this is done, re-run your snapshot/sync job.
If you now run query 1 again, you will see that the log is no longer waiting on REPLICATION, and you are now able to clear your log file after running a T-log backup or CHECKPOINT command.
Running DBCC LOGINFO again should now show a status of 0, indicating that the VLFs are inactive (data flushed to disk).
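If the log has already ballooned, it can then be shrunk and resized to fit your requirements, as described earlier. A typical sequence looks like this (the logical log file name and target sizes are placeholders):

```sql
USE [db_name];
-- Shrink the now-clearable log file down (target size in MB)
DBCC SHRINKFILE (N'db_name_log', 1024);

-- Then grow it back in one step to a sensible working size,
-- which avoids leaving the log heavily fragmented into tiny VLFs
ALTER DATABASE [db_name]
MODIFY FILE (NAME = N'db_name_log', SIZE = 4096MB);
```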