Friday, March 24, 2006

Row-based replication and user defined functions

A little more than a year ago, I was hired to implement row-based replication to the MySQL database server. Since the principles are easy enough, I thought this would be a straightforward task to be done in a few months, tops. As always, I quickly got punished for my hybris: getting a basic row-based replication up and running was relatively straightforward but as the saying goes, the devil is in the details. Row-based replication is now safely tucked away in MySQL 5.1 for anybody who wishes to use it, but the obvious question is then what does it give me and why should I use it? When using statement based replication, the replication is accomplished by replicating the actual SQL statements to the slave server directly. That works fine for most statements, but in some situations, this does not work as expected. This time, we will only look into one such situation. Consider the following SQL statements:
UPDATE account
 SET   balance = balance + 100
 WHERE name = "Sakila";
INSERT INTO transactions
      VALUES ('Sakila', 'deposit', 100, UUID(), NULL);
The purpose is to update the balance and add a line to the transaction log that to keep track of all the transactions done to the accounts, including the user-id clerk that performed the transaction. This works fine when executed on the master, but when the slave thread executes the statements, it will be the UUID() of the SQL thread on the slave, which is really not what we want. It is, of course, possible to handle this for the built-in SQL functions, but for user defined functions (UDFs), there is no chance at all that we can make it work. Instead of logging the entire statement, we can log just the row change made to the table, and this is what row-based replication is about. That way, we will insert exactly the same row at the slave as we did on the master. To control the logging format, we introduced a new server variable BINLOG_FORMAT that can take the values STATEMENT, MIXED, and ROW. The formats STATEMENT and ROW do what you expect, use statement-based replication or row-based replication respectively, while the MIXED mode will temporarily switch to row-based replication for the statement if it uses a function that will give a different results when executed on the master and by the SQL thread. So, assume that I've got the following two tables:
master> describe transactions;
+--------+------------------------------+------+-----+-------------------+-------+
| Field  | Type                         | Null | Key | Default           | Extra |
+--------+------------------------------+------+-----+-------------------+-------+
| name   | char(20)                     | YES  |     |                   |       |
| kind   | enum('deposit','withdrawal') | YES  |     |                   |       |
| amount | decimal(10,2)                | YES  |     |                   |       |
| clerk  | int(11)                      | YES  |     |                   |       |
| time   | timestamp                    | YES  |     | CURRENT_TIMESTAMP |       |
+--------+------------------------------+------+-----+-------------------+-------+
5 rows in set (0.00 sec)

master> describe account;
+---------+---------------+------+-----+---------+-------+
| Field   | Type          | Null | Key | Default | Extra |
+---------+---------------+------+-----+---------+-------+
| name    | char(20)      | YES  |     |         |       |
| balance | decimal(10,2) | YES  |     |         |       |
+---------+---------------+------+-----+---------+-------+
2 rows in set (0.01 sec)
Here's the how you do to set replication to use MIXED binlog format and execute the statements above:
master> SET BINLOG_FORMAT=MIXED;
Query OK, 0 rows affected (0.00 sec)

master> UPDATE account
    -> SET balance = balance + 100
    -> WHERE name = 'Sakila';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

master> INSERT INTO transactions
    -> VALUES ('Sakila', 'deposit', 100, UUID(), NULL);
Query OK, 1 row affected, 1 warning (0.00 sec)

master> SHOW BINLOG EVENTS FROM 975;
+-------------------+------+------------+-----------+-------------+------------------------------------------------------------------------------+
| Log_name          | Pos  | Event_type | Server_id | End_log_pos | Info                                                                         |
+-------------------+------+------------+-----------+-------------+------------------------------------------------------------------------------+
| master-bin.000001 | 975  | Query      | 1         | 1102        | use `test`; UPDATE account SET balance = balance + 100 WHERE name = 'Sakila' |
| master-bin.000001 | 1102 | Table_map  | 1         | 1155        | table_id: 17 (test.transactions)                                             |
| master-bin.000001 | 1155 | Write_rows | 1         | 1206        | table_id: 17 flags: STMT_END_F                                               |
+-------------------+------+------------+-----------+-------------+------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
The UPDATE statement is logged as a statement using a query event, while the INSERT statement is logged using a table map event and an event containing the rows inserted (one, in this case). I'll explain more about the table map events and the various forms of row-containing events, but for now you have to trust me when I say that there are rows in the Write_rows event. :)

Tuesday, March 21, 2006

First day back from the Developers Conference in Sorrento, and the first post to this blog (thought I might just as well get started, I've been thinking about it for a while but always been distracted). Since I've been to Italy before, I knew part of what was waiting: excellent food, terrific coffee, and nice friendly people. What I wasn't prepared for was the intense discussions that were continously going on. The last year, in Prague, it was to a large part just the down-to-earth work of getting the code ready for the 5.0 release; this time, however, the conference was vibrant with ideas for the future and ways to leverage the skills of the people in MySQL to produce even better services. Before the conference, I was mostly busy with getting the row-based replication in shape for shipping. All it all, I would say that it holds up very well. Since it's a fresh feature, there's (of course) a list of bugs, but these are mostly annoyances like the occational extra events in the binary log that doesn't really affect the result, but that are not needed. I feel I have a pretty good control of the code right now, and know where the problematic spots are. There's, however, nothing like community testing to find the really difficult problems. I really want this feature to be rock-solid, so I'm prepared to get swamped with bug reports. Please give me your worst. :)