The reason is that isolation between concurrent transactions cannot be guaranteed, meaning that two concurrent transactions are not guaranteed to be serializable (the result of a concurrent transaction that has committed can "leak" into an ongoing transaction). Since they are not serializable, it means they cannot be written to the binary log in an order that produce the same result on the slave as on the master.
However, when using row-based replication they are serializable, because whatever values are written to the tables are also written to the binary log, so if data "leaks" into an ongoing transaction, this is what is written to the binary log as well, so that when the transaction commits, the values written to the table are the same as those written to the binary log.
It is a rational decision, but I hope that Falcon will support statement-based replication as well in the future.
4 comments:
Is there a description with more details that describes why this can't be done?
Why can't they use the technique that is used by InnoDB - for statements like 'insert into Foo select * from Bar' read locks are taken on the rows from Bar?
Well this is pretty foobar in my opinion. I use statement binary logs for auditing purposes like determining the insert/update/delete rate for individual tables, etc.
As Mark said, Falcon should be able to set locks like InnoDB does to ensure serializable order.
Swany, you can use RBR can still be able to get SBR like statements. Just take a look at:
http://jan.kneschke.de/2008/5/30/mysql-proxy-rbr-to-sbr-decoding
If column names aren't too important the released code is already good enough. If later releases we will mix in scripting to allow users to map in the original column-names.
I still want the original statements for audit and debugging purposes. I have spent a lot of time getting comments into SQL statements to identify application owners. With the scheme you have proposed, I will get the row changes but have no idea what the original statement and I won't have the comment. Why can't there be an option to create a new no-op event that includes the original SQL statement?
Post a Comment