<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: How to Keep Binlogs in Sync?</title>
	<atom:link href="http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/</link>
	<description>a blog about databases and stuff</description>
	<pubDate>Wed, 03 Dec 2008 20:55:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
		<item>
		<title>By: Gregory Haase</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-1482</link>
		<dc:creator>Gregory Haase</dc:creator>
		<pubDate>Fri, 24 Oct 2008 14:27:04 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-1482</guid>
		<description>Here's a clarification: I just looked at my intermediate layers and compared a single command/timestamp between them. They all had a different binlog position:
server 1: 111949922
server 2: 111957284
server 3: 112010726

These servers have been running for 98 days. I think considering all the water that's gone under the bridge, they are really close. To be honest, they are a lot closer than I personally thought they'd be.

Can I tell you why they are off? No. I do know of two separate occasions where the slave SQL thread stopped due to a bad query. In both cases there was some manual intervention. This could possibly have had an effect on the binlogs.

I guess I'll be doing a lot more investigation into this in the next couple of days.  :-/</description>
		<content:encoded><![CDATA[<p>Here&#8217;s a clarification: I just looked at my intermediate layers and compared a single command/timestamp between them. They all had a different binlog position:<br />
server 1: 111949922<br />
server 2: 111957284<br />
server 3: 112010726</p>
<p>These servers have been running for 98 days. I think considering all the water that&#8217;s gone under the bridge, they are really close. To be honest, they are a lot closer than I personally thought they&#8217;d be.</p>
<p>Can I tell you why they are off? No. I do know of two separate occasions where the slave SQL thread stopped due to a bad query. In both cases there was some manual intervention. This could possibly have had an effect on the binlogs.</p>
<p>I guess I&#8217;ll be doing a lot more investigation into this in the next couple of days.  :-/</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregory Haase</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-1480</link>
		<dc:creator>Gregory Haase</dc:creator>
		<pubDate>Fri, 24 Oct 2008 14:06:49 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-1480</guid>
		<description>I read through your bug report, and I understand your issue, but it doesn't really apply to my scenario. I have 1 master which replicates to 3 slaves, which each replicate down. Although I am using a lot of transactions, and my 3 intermediate slaves would not have the same binlog name and position as the master, they are in sync with each other.

Even then, I don't have a system for automatic failover of the intermediate nodes, and I'm not sure I would trust the binlog positions to be absolutely the same.

My current procedure in the event of a failure is to assign the standard replicated servers with one of the other intermediate servers as master. Before manually issuing a change master on each slave, I will compare the relay-log on the slave to the bin-log on the new master.

First, run SHOW SLAVE STATUS on the slave you want to change master on. Make a note of RELAY_MASTER_LOG_FILE, EXEC_MASTER_LOG_POS, RELAY_LOG_FILE and RELAY_LOG_POS.  On the slave, run mysqlbinlog on RELAY_LOG_FILE and look at the timestamp and the command at RELAY_LOG_POS. On the new master, run mysqlbinlog on RELAY_MASTER_LOG_FILE and look at the timestamp and the command at EXEC_MASTER_LOG_POS.  If these commands are the same, it's a safe bet I can just point the slave to its new master and use its current coordinates. If the commands are different, then we need to find the coordinate on the new master that corresponds to the next command the slave needs before we can do a CHANGE MASTER.

There was lots of talk in our shop about using a VIP in our intermediate layer, so that if one of those servers failed, we could just point the VIP to one of the still functioning intermediate servers. I put that conversation on hold for two reasons - I don't trust the binlog positions to be in sync, and I only have n+1 intermediate servers, and if I assign all of my orphaned slaves to only one of those servers, it's going to have double the load of all the others.  There's probably a way to carefully automate it all, but we have bigger fish to fry right now.</description>
		<content:encoded><![CDATA[<p>I read through your bug report, and I understand your issue, but it doesn&#8217;t really apply to my scenario. I have 1 master which replicates to 3 slaves, which each replicate down. Although I am using a lot of transactions, and my 3 intermediate slaves would not have the same binlog name and position as the master, they are in sync with each other.</p>
<p>Even then, I don&#8217;t have a system for automatic failover of the intermediate nodes, and I&#8217;m not sure I would trust the binlog positions to be absolutely the same.</p>
<p>My current procedure in the event of a failure is to assign the standard replicated servers with one of the other intermediate servers as master. Before manually issuing a change master on each slave, I will compare the relay-log on the slave to the bin-log on the new master.</p>
<p>First, run SHOW SLAVE STATUS on the slave you want to change master on. Make a note of RELAY_MASTER_LOG_FILE, EXEC_MASTER_LOG_POS, RELAY_LOG_FILE and RELAY_LOG_POS.  On the slave, run mysqlbinlog on RELAY_LOG_FILE and look at the timestamp and the command at RELAY_LOG_POS. On the new master, run mysqlbinlog on RELAY_MASTER_LOG_FILE and look at the timestamp and the command at EXEC_MASTER_LOG_POS.  If these commands are the same, it&#8217;s a safe bet I can just point the slave to its new master and use its current coordinates. If the commands are different, then we need to find the coordinate on the new master that corresponds to the next command the slave needs before we can do a CHANGE MASTER.</p>
<p>There was lots of talk in our shop about using a VIP in our intermediate layer, so that if one of those servers failed, we could just point the VIP to one of the still functioning intermediate servers. I put that conversation on hold for two reasons - I don&#8217;t trust the binlog positions to be in sync, and I only have n+1 intermediate servers, and if I assign all of my orphaned slaves to only one of those servers, it&#8217;s going to have double the load of all the others.  There&#8217;s probably a way to carefully automate it all, but we have bigger fish to fry right now.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sascha Curth</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-1479</link>
		<dc:creator>Sascha Curth</dc:creator>
		<pubDate>Fri, 24 Oct 2008 13:32:22 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-1479</guid>
		<description>How do you ensure that the binlog positions stay synchronized? Some time ago I filed a bug with MySQL regarding this problem, but unfortunately didn't get a proper solution:

http://bugs.mysql.com/bug.php?id=36541</description>
		<content:encoded><![CDATA[<p>How do you ensure that the binlog positions stay synchronized? Some time ago I filed a bug with MySQL regarding this problem, but unfortunately didn&#8217;t get a proper solution:</p>
<p><a href="http://bugs.mysql.com/bug.php?id=36541" rel="nofollow">http://bugs.mysql.com/bug.php?id=36541</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregory Haase</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-649</link>
		<dc:creator>Gregory Haase</dc:creator>
		<pubDate>Tue, 15 Jul 2008 19:52:33 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-649</guid>
		<description>I've edited this procedure to reflect the fact that START SLAVE UNTIL is asynchronous (see &lt;a href="http://blog.onefreevoice.com/2008/07/15/creating_intermediate_slaves/#comment-647" rel="nofollow"&gt;comment&lt;/a&gt; from Mats Kindahl).</description>
		<content:encoded><![CDATA[<p>I&#8217;ve edited this procedure to reflect the fact that START SLAVE UNTIL is asynchronous (see <a href="http://blog.onefreevoice.com/2008/07/15/creating_intermediate_slaves/#comment-647" rel="nofollow">comment</a> from Mats Kindahl).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: One Free Voice &#187; Blog Archive &#187; Creating an Intermediate Replication Layer</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-646</link>
		<dc:creator>One Free Voice &#187; Blog Archive &#187; Creating an Intermediate Replication Layer</dc:creator>
		<pubDate>Tue, 15 Jul 2008 17:52:45 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-646</guid>
		<description>[...] few weeks ago, I discussed how to keep binlogs in sync in a tree or pyramid replication scheme. That thread discussed how to re-distribute load in case of [...]</description>
		<content:encoded><![CDATA[<p>[...] few weeks ago, I discussed how to keep binlogs in sync in a tree or pyramid replication scheme. That thread discussed how to re-distribute load in case of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregory Haase</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-634</link>
		<dc:creator>Gregory Haase</dc:creator>
		<pubDate>Thu, 26 Jun 2008 00:06:54 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-634</guid>
		<description>I revised the process outlined above to include Mark's suggestion about START SLAVE UNTIL.

I also changed MASTER_LOG_FILE to RELAY_MASTER_LOG_FILE reflecting Baron's comment. 

RELAY_MASTER_LOG_FILE is the name of the master binary log file containing the most recent event executed by the SQL thread. Its companion EXEC_MASTER_LOG_POS is the position of the last event executed by the SQL thread from the master's binary log. See &lt;a href="http://dev.mysql.com/doc/refman/5.1/en/show-slave-status.html" title="show slave status" rel="nofollow"&gt;Documentation&lt;/a&gt;.

I would note that in the case of one of my intermediate slaves going down, the SQL thread on the end slaves is not going to stop. It's going to process all the way through the relay logs. The I/O thread would keep trying to connect to the master every 60 seconds.  If the slaves were guaranteed in sync, one could theoretically map the IP Address of the failed server to one of your working servers and never miss a beat.</description>
		<content:encoded><![CDATA[<p>I revised the process outlined above to include Mark&#8217;s suggestion about START SLAVE UNTIL.</p>
<p>I also changed MASTER_LOG_FILE to RELAY_MASTER_LOG_FILE reflecting Baron&#8217;s comment. </p>
<p>RELAY_MASTER_LOG_FILE is the name of the master binary log file containing the most recent event executed by the SQL thread. Its companion EXEC_MASTER_LOG_POS is the position of the last event executed by the SQL thread from the master&#8217;s binary log. See <a href="http://dev.mysql.com/doc/refman/5.1/en/show-slave-status.html" title="show slave status" rel="nofollow">Documentation</a>.</p>
<p>I would note that in the case of one of my intermediate slaves going down, the SQL thread on the end slaves is not going to stop. It&#8217;s going to process all the way through the relay logs. The I/O thread would keep trying to connect to the master every 60 seconds.  If the slaves were guaranteed in sync, one could theoretically map the IP Address of the failed server to one of your working servers and never miss a beat.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-633</link>
		<dc:creator>Xaprb</dc:creator>
		<pubDate>Wed, 25 Jun 2008 20:20:24 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-633</guid>
		<description>MASTER_LOG_FILE, EXEC_MASTER_LOG_POS is quicksand for the unwary :-) Master_log_file is the I/O thread's position, and you care about what updates have been applied, not which have been read from the master.

I'm still working on automating this with mk-slave-move from Maatkit.  You can already do it *before* the server crashes, with the current code.  But after it crashes -- that's harder.  Even while the servers are all functioning normally there are a lot of tricky cases to cover.</description>
		<content:encoded><![CDATA[<p>MASTER_LOG_FILE, EXEC_MASTER_LOG_POS is quicksand for the unwary <img src='http://blog.onefreevoice.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> Master_log_file is the I/O thread&#8217;s position, and you care about what updates have been applied, not which have been read from the master.</p>
<p>I&#8217;m still working on automating this with mk-slave-move from Maatkit.  You can already do it *before* the server crashes, with the current code.  But after it crashes &#8212; that&#8217;s harder.  Even while the servers are all functioning normally there are a lot of tricky cases to cover.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregory Haase</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-632</link>
		<dc:creator>Gregory Haase</dc:creator>
		<pubDate>Wed, 25 Jun 2008 12:18:06 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-632</guid>
		<description>That's a good point. I keep forgetting about START SLAVE UNTIL.</description>
		<content:encoded><![CDATA[<p>That&#8217;s a good point. I keep forgetting about START SLAVE UNTIL.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Leith</title>
		<link>http://blog.onefreevoice.com/2008/06/24/how-to-keep-binlogs-in-sync/#comment-631</link>
		<dc:creator>Mark Leith</dc:creator>
		<pubDate>Wed, 25 Jun 2008 10:12:52 +0000</pubDate>
		<guid isPermaLink="false">http://blog.onefreevoice.com/?p=100#comment-631</guid>
		<description>Yes, there is a chance that there could be a statement issued from the slave SQL thread between the MASTER_POS_WAIT and the RESET MASTER.

You want to look into STOP SLAVE; START SLAVE UNTIL instead. This guarantees that no statements run after the point that you want to be at.

This was discussed a little before on Baron's blog (and in the comments) here:

http://www.xaprb.com/blog/2007/01/20/how-to-make-mysql-replication-reliable/</description>
		<content:encoded><![CDATA[<p>Yes, there is a chance that there could be a statement issued from the slave SQL thread between the MASTER_POS_WAIT and the RESET MASTER.</p>
<p>You want to look into STOP SLAVE; START SLAVE UNTIL instead. This guarantees that no statements run after the point that you want to be at.</p>
<p>This was discussed a little before on Baron&#8217;s blog (and in the comments) here:</p>
<p><a href="http://www.xaprb.com/blog/2007/01/20/how-to-make-mysql-replication-reliable/" rel="nofollow">http://www.xaprb.com/blog/2007/01/20/how-to-make-mysql-replication-reliable/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
