Friday, April 21, 2006

Replication of ALTER TABLE with AUTO_INCREMENT

Just got Bug#16993 closed, which provides a good lesson into the complications of replication. The original title was RBR: ALTER TABLE ZEROFILL AUTO_INCREMENT is not replicated correctly, but the problem is not related to row-based replication (RBR) nor to ZEROFILL. The culprit is adding an AUTO_INCREMENT column to a table, and it is not actually a bug (to be frank, it depends on your point of view, but "fixing" this bug causes more headaces than it solves as you'll see in a moment). What the example code in the bug description does is creating a table on the slave and the master, but inserting (identical) rows into the tables in different order on the slave and the master; for example in this way:
master> CREATE TABLE ages(name CHAR(30), age INT);
master> SET SQL_LOG_BIN=FALSE;
master> INSERT INTO ages SET name='Mats', age=37;
master> INSERT INTO ages SET name='Lill', age=25;
master> INSERT INTO ages SET name='Jon',  age=4;
master> SET SQL_LOG_BIN=TRUE;
 slave> INSERT INTO ages SET name='Mats', age=37;
 slave> INSERT INTO ages SET name='Jon',  age=4;
 slave> INSERT INTO ages SET name='Lill', age=25;
Now, if you look at the tables, you will see that the same rows are present in both tables (I'm just showing the result on one of the servers since they are identical):
mysql> SELECT * FROM ages ORDER BY name, age;
+------+------+
| name | age  |
+------+------+
| Mats |   37 | 
| Jon  |    4 | 
| Lill |   20 | 
+------+------+
3 rows in set (0.00 sec)
Why the ORDER BY? Well, potentially the rows could be listed in different order on the master and slave because of one of the following reasons:
  • If this were a real database, we would have been running the master and the slave separately for a while before deciding that we should replicate them and, even though the tables contain the same rows now, the rows would potentially be listed in different order.
  • We are using NDB Cluster as storage engine, and there we have no guarantee on the order of the rows, regardless of the order they were inserted.
Now we (or management) decides that we need to assign a unique id to each person in the table. That is easy, just add a column with an AUTO_INCREMENT option. Since we have replication running, the same change will be made to the table on the slave.
master> ALTER TABLE ages
    ..>   ADD id INT AUTO_INCREMENT PRIMARY KEY;
To our surprise, we do not get the same id:s assigned to the people on the master as on the slave. Look here:
master> SELECT * FROM ages ORDER BY name, age;
+------+------+----+
| name | age  | id |
+------+------+----+
| Jon  |    4 |  3 | 
| Lill |   20 |  2 | 
| Mats |   37 |  1 | 
+------+------+----+

slave> SELECT * FROM ages ORDER BY name, age;
+------+------+----+
| name | age  | id |
+------+------+----+
| Jon  |    4 |  2 | 
| Lill |   20 |  3 | 
| Mats |   37 |  1 | 
+------+------+----+
So, what is happening under the hood? When executing an ALTER TABLE:
  1. a new table is created with the extra column
  2. the rows are copied over one by one from the old table to the new table in a storage engine-specific order
  3. the old table is dropped
  4. the new table is renamed to the name of the old table
Observe that the ALTER TABLE statement is replicated by statement, so it gets executed on both the slave and the master. So, if we're using MyISAM as storage engine, the rows are copied in the order they were inserted. For other storage engines, you get other row orders. So much for the problem, what about the solution and why is this not considered a bug? I will start with the solution, since this explains why we decided not to produce a bug fix for it, but rather adding a caveat to the documentation. The solution is simple: sort the rows before when adding them to the table. To do this, you have to repeat the steps above yourself, but add an ORDER BY clause when inserting the rows. So, the steps to add a column are:
  1. Create a new table with an extra column. Since you might want to add the new column at an arbitrary place, I give a generic solution.
    CREATE TABLE new_ages LIKE ages;
    ALTER TABLE new_ages
      ADD id INT AUTO_INCREMENT PRIMARY KEY;
    
  2. Copy the rows into the new table.
  3. INSERT INTO new_ages(name,age) 
      SELECT name,age FROM ages ORDER BY name,age;
    
  4. Drop the old table.
    DROP TABLE ages;
    
  5. Rename the old table to the new table.
    ALTER TABLE new_ages RENAME ages;
    
Now, why do we not write a patch to fix this "bug"? Simply put, the ORDER BY involves a complete sort of the table so implementing a patch to do it this way would force sorting the table for every execution of an ALTER TABLE. In addition, it might not be necessary to sort the rows on every column in the table. Forcing this solution on every user that needs to do an ALTER TABLE would be a bigger evil than leaving the implementation as it is. And of course, if don't care where the new column is, you can replace the first two steps with:
CREATE TABLE new_ages(id INT AUTO_INCREMENT PRIMARY KEY)
  SELECT name,age FROM ages ORDER BY name,age;

No comments: