Friday, December 22, 2006

The invisible I/O thread failures are no more

To get the status of a replication slave, you can check the Last_Error and Last_Errno fields in the output of SHOW SLAVE STATUS. Unfortunately, they only give information about the status of the SQL thread (and not always that either). If the I/O thread fails, for example because the server configuration is not set up correctly or because the connection to the master is lost in a network outage, you have to dig through the error log to find out the reason. This is possible, although annoying, for a DBA, who has access to the files on the machine where the server is running. It is not practical for automatic recovery applications that watch the status of the replication. It is also easier to see the status of the server through a normal client connection than to log into the machine and start locating the files.

This is actually quite stupid, especially since it is already possible to check individually whether each thread is running. So, to make it possible to check the status of the threads from a client (an application or a user connecting directly to the server), I just added four new fields to the output of SHOW SLAVE STATUS: Last_SQL_Error, Last_SQL_Errno, Last_IO_Error, and Last_IO_Errno. The new fields were added last, and the two old fields Last_Error and Last_Errno are now just aliases for Last_SQL_Error and Last_SQL_Errno, respectively. Adding the new fields last and keeping the two old fields intact allows old applications to work as before, whether they read the columns by position or find them by name. New applications, however, can take advantage of the new fields.
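
As a rough illustration of how a monitoring application might use the new fields, here is a minimal Python sketch. It assumes the application has already fetched the single row of SHOW SLAVE STATUS as a dictionary keyed by column name (most client libraries offer some dict-style cursor for this). The function name replication_errors and the sample row, including its error number and message, are made up for the example; only the six field names come from the server.

```python
def replication_errors(status_row):
    """Return {thread: (errno, message)} for every thread reporting an error.

    status_row is assumed to be the single row of SHOW SLAVE STATUS as a
    dict keyed by column name (an assumption about the client library).
    """
    # On servers without the new fields, fall back to the old names,
    # which describe the SQL thread only.
    sql_errno = status_row.get("Last_SQL_Errno", status_row.get("Last_Errno", 0))
    sql_error = status_row.get("Last_SQL_Error", status_row.get("Last_Error", ""))
    io_errno = status_row.get("Last_IO_Errno", 0)
    io_error = status_row.get("Last_IO_Error", "")

    errors = {}
    if int(sql_errno or 0) != 0:
        errors["SQL"] = (int(sql_errno), sql_error)
    if int(io_errno or 0) != 0:
        errors["IO"] = (int(io_errno), io_error)
    return errors


# Hypothetical row: the I/O thread failed, the SQL thread is fine.
# The error number and message are invented for illustration.
sample_row = {
    "Last_Errno": 0,
    "Last_Error": "",
    "Last_SQL_Errno": 0,
    "Last_SQL_Error": "",
    "Last_IO_Errno": 2003,
    "Last_IO_Error": "error connecting to master (made-up message)",
}

for thread, (errno, message) in replication_errors(sample_row).items():
    print(f"{thread} thread failed with error {errno}: {message}")
```

The fallback to Last_Errno and Last_Error is a design choice for the sketch: it lets the same code run against an older server, where it can report SQL thread errors only.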