ZEROFILL. The culprit is adding an
AUTO_INCREMENTcolumn to a table, and it is not actually a bug (to be frank, it depends on your point of view, but "fixing" this bug causes more headaces than it solves as you'll see in a moment). What the example code in the bug description does is creating a table on the slave and the master, but inserting (identical) rows into the tables in different order on the slave and the master; for example in this way:
Now, if you look at the tables, you will see that the same rows are present in both tables (I'm just showing the result on one of the servers since they are identical):master> CREATE TABLE ages(name CHAR(30), age INT); master> SET SQL_LOG_BIN=FALSE; master> INSERT INTO ages SET name='Mats', age=37; master> INSERT INTO ages SET name='Lill', age=25; master> INSERT INTO ages SET name='Jon', age=4; master> SET SQL_LOG_BIN=TRUE; slave> INSERT INTO ages SET name='Mats', age=37; slave> INSERT INTO ages SET name='Jon', age=4; slave> INSERT INTO ages SET name='Lill', age=25;
Why themysql> SELECT * FROM ages ORDER BY name, age; +------+------+ | name | age | +------+------+ | Mats | 37 | | Jon | 4 | | Lill | 20 | +------+------+ 3 rows in set (0.00 sec)
ORDER BY? Well, potentially the rows could be listed in different order on the master and slave because of one of the following reasons:
- If this were a real database, we would have been running the master and the slave separately for a while before deciding that we should replicate them and, even though the tables contain the same rows now, the rows would potentially be listed in different order.
- We are using NDB Cluster as storage engine, and there we have no guarantee on the order of the rows, regardless of the order they were inserted.
AUTO_INCREMENToption. Since we have replication running, the same change will be made to the table on the slave.
To our surprise, we do not get the same id:s assigned to the people on the master as on the slave. Look here:master> ALTER TABLE ages ..> ADD id INT AUTO_INCREMENT PRIMARY KEY;
So, what is happening under the hood? When executing anmaster> SELECT * FROM ages ORDER BY name, age; +------+------+----+ | name | age | id | +------+------+----+ | Jon | 4 | 3 | | Lill | 20 | 2 | | Mats | 37 | 1 | +------+------+----+ slave> SELECT * FROM ages ORDER BY name, age; +------+------+----+ | name | age | id | +------+------+----+ | Jon | 4 | 2 | | Lill | 20 | 3 | | Mats | 37 | 1 | +------+------+----+
- a new table is created with the extra column
- the rows are copied over one by one from the old table to the new table in a storage engine-specific order
- the old table is dropped
- the new table is renamed to the name of the old table
ALTER TABLEstatement is replicated by statement, so it gets executed on both the slave and the master. So, if we're using MyISAM as storage engine, the rows are copied in the order they were inserted. For other storage engines, you get other row orders. So much for the problem, what about the solution and why is this not considered a bug? I will start with the solution, since this explains why we decided not to produce a bug fix for it, but rather adding a caveat to the documentation. The solution is simple: sort the rows before when adding them to the table. To do this, you have to repeat the steps above yourself, but add an
ORDER BYclause when inserting the rows. So, the steps to add a column are:
- Create a new table with an extra column. Since you might want to add the new column at an arbitrary place, I give a generic solution.
CREATE TABLE new_ages LIKE ages; ALTER TABLE new_ages ADD id INT AUTO_INCREMENT PRIMARY KEY;
- Copy the rows into the new table.
- Drop the old table.
DROP TABLE ages;
- Rename the old table to the new table.
ALTER TABLE new_ages RENAME ages;
INSERT INTO new_ages(name,age) SELECT name,age FROM ages ORDER BY name,age;
ORDER BYinvolves a complete sort of the table so implementing a patch to do it this way would force sorting the table for every execution of an
ALTER TABLE. In addition, it might not be necessary to sort the rows on every column in the table. Forcing this solution on every user that needs to do an
ALTER TABLEwould be a bigger evil than leaving the implementation as it is. And of course, if don't care where the new column is, you can replace the first two steps with:
CREATE TABLE new_ages(id INT AUTO_INCREMENT PRIMARY KEY) SELECT name,age FROM ages ORDER BY name,age;