-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter updates to certain tables
CREATION DATE..: Mon, 10 Aug 2009, 13:25
CREATED BY.....: Psergey
SUPERVISOR.....: Monty
LEAD ARCHITECT.: 
ARCH REVIEW....: 
IMPLEMENTOR....: 
1st CODE REVIEW: 
2nd CODE REVIEW: 
QA.............: 
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 40 (http://askmonty.org/worklog/?tid=40)
VERSION........: Server-9.x
STATUS.........: Cancelled
PRIORITY.......: 60

DEPENDS ON.....:

DEPENDANT......:

DESCRIPTION:

A replication slave can be configured to filter updates to certain tables
with the --replicate-[wild-]{do,ignore}-table options.

This task is about adding similar functionality to mysqlbinlog.


PROGRESS NOTES:

-=-=(Guest - Thu, 17 Jun 2010, 00:39)=-=-
Dependency deleted: 39 no longer depends on 40

-=-=(Guest - Tue, 16 Feb 2010, 10:23)=-=-
Status updated.
--- /tmp/wklog.40.old.18300     2010-02-16 10:23:20.000000000 +0200
+++ /tmp/wklog.40.new.18300     2010-02-16 10:23:20.000000000 +0200
@@ -1 +1 @@
-Assigned
+Cancelled

-=-=(Guest - Wed, 25 Nov 2009, 11:41)=-=-
Status updated.
--- /tmp/wklog.40.old.5760      2009-11-25 11:41:09.000000000 +0200
+++ /tmp/wklog.40.new.5760      2009-11-25 11:41:09.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned

-=-=(Guest - Wed, 25 Nov 2009, 11:41)=-=-
Category updated.
--- /tmp/wklog.40.old.5737      2009-11-25 11:41:03.000000000 +0200
+++ /tmp/wklog.40.new.5737      2009-11-25 11:41:03.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint

-=-=(Bothorsen - Tue, 17 Nov 2009, 17:20)=-=-
Alex is closer to a working patch now.

-=-=(Bothorsen - Thu, 12 Nov 2009, 13:13)=-=-
Work hours by Alexi and Bo + estimated time for the task.

-=-=(Alexi - Sun, 08 Nov 2009, 15:18)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.15787     2009-11-08 15:18:11.000000000 +0200
+++ /tmp/wklog.40.new.15787     2009-11-08 15:18:11.000000000 +0200
@@ -62,7 +62,7 @@
    it considers the query to extend to the end of the event.
 2. For 'db' (current db) the trailing zero is redundant since the length
    is already known.
-3. db_len = 0 means that this is the current db.
+3. In tables_info, db_len = 0 means that this is the current db.
 
 When reading Query events from binary log, we can recognize its format
 by its post-header length: in extended case the post-header includes 4
@@ -75,6 +75,77 @@
 + #define Q_QUERY_LEN_OFFSET              (Q_STATUS_VARS_LEN_OFFSET + 2)
 + #define Q_QUERY_TABLES_INFO_LEN_OFFSET  (Q_QUERY_LEN_OFFSET + 2)
 
+
+***********************************************************************
+HELP NEEDED
+***********************************************************************
+The QUERY_HEADER_LEN is used in the definition of MAX_LOG_EVENT_HEADER:
+
+log_event.h
+~~~~~~~~~~~
+#define MAX_LOG_EVENT_HEADER   ( /* in order of Query_log_event::write */ \
+  LOG_EVENT_HEADER_LEN + /* write_header */ \
+  QUERY_HEADER_LEN  + /* write_data */   \
+  EXECUTE_LOAD_QUERY_EXTRA_HEADER_LEN + /*write_post_header_for_derived */ \
+  MAX_SIZE_LOG_EVENT_STATUS + /* status */ \
+  NAME_LEN + 1)
+
+which is used only for setting
+
+  thd->variables.max_allowed_packet
+  mysql->net.max_packet_size
+
+It looks as though (but I am not quite sure) QUERY_HEADER_LEN can simply
+be replaced by QUERY_HEADER_LEN_EXT in this definition, without making
+any other changes.
+
+Below I list all places where MAX_LOG_EVENT_HEADER is used:
+
+slave.cc
+~~~~~~~~
+static int init_slave_thread(...)
+{ ...
+  /*
+    Adding MAX_LOG_EVENT_HEADER_LEN to the max_allowed_packet on all
+    slave threads, since a replication event can become this much larger
+    than the corresponding packet (query) sent from client to master.
+  */
+  thd->variables.max_allowed_packet= global_system_variables.max_allowed_packet
+    + MAX_LOG_EVENT_HEADER;  /* note, incr over the global not session var */
+  ...
+}
+pthread_handler_t handle_slave_io(...)
+{ ...
+  /*
+    Adding MAX_LOG_EVENT_HEADER_LEN to the max_packet_size on the I/O
+    thread, since a replication event can become this much larger than
+    the corresponding packet (query) sent from client to master.
+  */
+  mysql->net.max_packet_size= thd->net.max_packet_size+= MAX_LOG_EVENT_HEADER;
+  ...
+}
+
+sql_repl.cc
+~~~~~~~~~~~
+void mysql_binlog_send(...)
+{ ...
+  /*
+    Adding MAX_LOG_EVENT_HEADER_LEN, since a binlog event can become
+    this larger than the corresponding packet (query) sent 
+    from client to master.
+  */
+  thd->variables.max_allowed_packet+= MAX_LOG_EVENT_HEADER;
+  ...
+}
+bool mysql_show_binlog_events(...)
+{ ...
+  /*
+    to account binlog event header size
+  */
+  thd->variables.max_allowed_packet+= MAX_LOG_EVENT_HEADER;
+  ...
+}
+
 3. Changes in log events
 ************************
 
@@ -84,7 +155,7 @@
 This setting is done in Format description event constructor which creates
 the event for writing to binary log:
 
-  if (binlog_with_tables_info)
+  if (opt_binlog_with_tables_info)
       post_header_len[QUERY_EVENT - 1] = QUERY_HEADER_LEN_EXT;
   else
       post_header_len[QUERY_EVENT - 1] = QUERY_HEADER_LEN;
@@ -99,12 +170,12 @@
 following manner:
 
   switch (binlog_ver) {
-  case 4: /* MySQL 5.0 and higher */
 + #ifndef MYSQL_CLIENT
+  case 4: /* MySQL 5.0 and higher */
     ...
-+ #else
-+   <error>
+    break;
 + #endif
+
   case 1:
   case 3:
     ...
@@ -132,7 +203,7 @@
 --------------------------------
 [Creates the event for binlogging]
 
-In case of binlog_with_tables_info = TRUE, set additionally query_len,
+In case of opt_binlog_with_tables_info = TRUE, set additionally query_len,
 tables_info_len, and tables_info members (the constructor is to have
 an additional 'tables_info' argument).
 
@@ -140,7 +211,7 @@
 ----------------
 [Writes the event to binlog]
 
-In case of binlog_with_tables_info = TRUE, write additional members
+In case of opt_binlog_with_tables_info = TRUE, write additional members
 (query_len, tables_info_len, and tables_info) to binary log. Also
 write corresponding whole event length to the common-header.
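
A minimal sketch of the recognition rule described in this note (the
post-header of an extended Query event is 4 bytes longer and carries
query_len and tables_info_len after status_vars_len). All names below are
hypothetical and not taken from the server sources. The sketch assumes the
2-byte fields are stored little-endian like other binlog integers, and that
any post-header of at least 17 bytes is treated as extended.

```cpp
#include <cstddef>
#include <cstdint>

// Offsets inside the Query event post-header, following the layout in the
// LLD: status_vars_len sits at offset 11, the two new fields follow it.
constexpr std::size_t kStatusVarsLenOffset = 11;
constexpr std::size_t kQueryLenOffset = kStatusVarsLenOffset + 2;    // 13
constexpr std::size_t kTablesInfoLenOffset = kQueryLenOffset + 2;    // 15
constexpr std::size_t kExtPostHeaderLen = kTablesInfoLenOffset + 2;  // 17

// Hypothetical holder for the fields that exist only in the extended format.
struct QueryExtFields {
  bool extended;                  // true if the post-header is extended
  std::uint16_t query_len;        // 0 when not extended
  std::uint16_t tables_info_len;  // 0 when not extended
};

// Read a 2-byte little-endian integer (assumed byte order, see lead-in).
static std::uint16_t read_u16_le(const std::uint8_t* p) {
  return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// post_header_len is the value taken from the Format description event.
QueryExtFields read_query_ext_fields(const std::uint8_t* post_header,
                                     std::size_t post_header_len) {
  QueryExtFields f = {false, 0, 0};
  if (post_header_len >= kExtPostHeaderLen) {
    f.extended = true;
    f.query_len = read_u16_le(post_header + kQueryLenOffset);
    f.tables_info_len = read_u16_le(post_header + kTablesInfoLenOffset);
  }
  return f;
}
```

With a usual-format post-header the function reports extended = false and
the caller keeps treating the query as extending to the end of the event.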
 

-=-=(Alexi - Sun, 08 Nov 2009, 10:40)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.5055      2009-11-08 08:40:02.000000000 +0000
+++ /tmp/wklog.40.new.5055      2009-11-08 08:40:02.000000000 +0000
@@ -3,6 +3,7 @@
 
 1. Adding --binlog-with-tables-info option
 ******************************************
+GLOBAL, read-only option.
 
 When set, Query events are to be written in the extended binary
 format which contains tables_info. When not set, Query events

-=-=(Alexi - Thu, 05 Nov 2009, 12:37)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.11441     2009-11-05 12:37:16.000000000 +0200
+++ /tmp/wklog.40.new.11441     2009-11-05 12:37:16.000000000 +0200
@@ -1,9 +1,18 @@
 OPTION: 2.5 Extend Query Events With Tables Info
 ================================================
 
-1. Query_log_event Binary Format
-********************************
-Changes to be done:
+1. Adding --binlog-with-tables-info option
+******************************************
+
+When set, Query events are to be written in the extended binary
+format which contains tables_info. When not set, Query events
+are to be written in usual format (without any changes).
+
+2. Query event extended binary format
+*************************************
+
+When --binlog-with-tables-info is set, Query events are written
+to binary log in the following (extended) format.
 
   Query_log_event binary format
   ---------------------------------
@@ -24,12 +33,12 @@
   error_code        2
   status_vars_len   2
 + query_len         2  (see Note 1)
-+ tables_info_len   2  (see Note 2)
++ tables_info_len   2
   ---------------------------------
   BODY:
   status_vars       status_vars_len
 - db                db_len + 1
-+ db                db_len  (see Note 3)
++ db                db_len  (see Note 2)
   query             query_len
 + tables_info
 
@@ -37,7 +46,7 @@
   ---------------------------------
   Name              Size (bytes)
   ---------------------------------
-  db_len            1  (see Note 4)
+  db_len            1  (see Note 3)
   db                db_len
   table_name_len    1
   table_name        table_name_len
@@ -48,19 +57,99 @@
   table_name        table_name_len
 
 NOTES
-1. Currently Query_log_event format doesn't include 'query_len' because
+1. In usual format, Query_log_event doesn't include 'query_len' because
    it considers the query to extend to the end of the event.
-2. If tables_info is not included in the event (--binlog-with-tables-info
-   option), tables_info_len = 0.
-3. The trailing zero is redundant since the length is already known.
-4. In case of db = current db, db_len = 0 and db = empty, because
-   current db is already included in the current event format.
+2. For 'db' (current db) the trailing zero is redundant since the length
+   is already known.
+3. db_len = 0 means that this is the current db.
+
+When reading Query events from binary log, we can recognize its format
+by its post-header length: in extended case the post-header includes 4
+additional bytes.
+
+  #define QUERY_HEADER_LEN                (QUERY_HEADER_MINIMAL_LEN + 4)
++ #define QUERY_HEADER_LEN_EXT            (QUERY_HEADER_LEN + 4)
+  ...
+  #define Q_STATUS_VARS_LEN_OFFSET        11
++ #define Q_QUERY_LEN_OFFSET              (Q_STATUS_VARS_LEN_OFFSET + 2)
++ #define Q_QUERY_TABLES_INFO_LEN_OFFSET  (Q_QUERY_LEN_OFFSET + 2)
+
+3. Changes in log events
+************************
+
+3.1. Format description event
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Changes needed here concern setting post-header length for Query events.
+This setting is done in Format description event constructor which creates
+the event for writing to binary log:
+
+  if (binlog_with_tables_info)
+      post_header_len[QUERY_EVENT - 1] = QUERY_HEADER_LEN_EXT;
+  else
+      post_header_len[QUERY_EVENT - 1] = QUERY_HEADER_LEN;
+
+This change is to be done only for case binlog_ver = 4.
+
+NOTE. The constructor referred to above may be invoked in a client
+context to create "artificial" Format description events in the case of
+MySQL < 5.0 (e.g. see the mysqlbinlog code). To avoid compilation problems
+(because of 'binlog_with_tables_info') and taking into account the
+"MySQL < 5.0" restriction, we have to #ifdef out the above code in the
+following manner:
+
+  switch (binlog_ver) {
+  case 4: /* MySQL 5.0 and higher */
++ #ifndef MYSQL_CLIENT
+    ...
++ #else
++   <error>
++ #endif
+  case 1:
+  case 3:
+    ...
+  }
+
+3.2. Query event
+~~~~~~~~~~~~~~~~
+Changes needed here include adding tables_info and tables_info_len
+members (member for query length already exists) and modifying the
+following function-members:
+
+Query_log_event(buf) constructor
+--------------------------------
+[Parses binary format written to the 'buf']
+
+Using the post-header length from the Format description event (passed
+to the constructor as an argument), determine whether buf contains an
+extended or usual Query event and parse the buf contents accordingly.
+
+NOTE. Determining the Query event format here must take into account
+that this constructor can be called within a Query-derived
+event with the event_type argument != QUERY_EVENT.
+
+Query_log_event(thd) constructor
+--------------------------------
+[Creates the event for binlogging]
+
+In case of binlog_with_tables_info = TRUE, set additionally query_len,
+tables_info_len, and tables_info members (the constructor is to have
+an additional 'tables_info' argument).
+
+write() function
+----------------
+[Writes the event to binlog]
+
+In case of binlog_with_tables_info = TRUE, write additional members
+(query_len, tables_info_len, and tables_info) to binary log. Also
+write corresponding whole event length to the common-header.
+
+<To be continued>
 
-2. Where to get tables info from?
+4. Where to get tables info from?
 *********************************
 
-2.1. Case  study: CREATE TABLE
-******************************
+4.1. Case  study: CREATE TABLE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 *** CREATE TABLE table [SELECT ...]
 
@@ -129,4 +218,4 @@
       }
     } 
 
-To be continued
+<To be continued>
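
A minimal sketch of how the tables_info block from this revision could be
encoded and decoded, assuming it is simply a sequence of
(db_len, db, table_name_len, table_name) entries with 1-byte lengths and no
trailing zeros, and that db_len = 0 stands for the event's current db. The
struct and function names are hypothetical; this is not the server code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of one tables_info entry.
struct TableRef {
  std::string db;     // empty means "current db" (written as db_len = 0)
  std::string table;
};

// Append one name as a 1-byte length followed by the bytes (no trailing zero).
static void put_name(std::vector<std::uint8_t>& out, const std::string& name) {
  assert(name.size() <= 255);  // a 1-byte length cannot describe more
  out.push_back(static_cast<std::uint8_t>(name.size()));
  out.insert(out.end(), name.begin(), name.end());
}

std::vector<std::uint8_t> encode_tables_info(const std::vector<TableRef>& tables) {
  std::vector<std::uint8_t> out;
  for (const TableRef& t : tables) {
    put_name(out, t.db);
    put_name(out, t.table);
  }
  return out;
}

// Read one length-prefixed name; returns false if the buffer is truncated.
static bool get_name(const std::vector<std::uint8_t>& buf, std::size_t& pos,
                     std::string& name) {
  if (pos >= buf.size()) return false;
  std::size_t len = buf[pos++];
  if (len > buf.size() - pos) return false;
  name.assign(buf.begin() + pos, buf.begin() + pos + len);
  pos += len;
  return true;
}

// Decode the whole block; entries with db_len = 0 get current_db filled in.
bool decode_tables_info(const std::vector<std::uint8_t>& buf,
                        const std::string& current_db,
                        std::vector<TableRef>& tables) {
  tables.clear();
  std::size_t pos = 0;
  while (pos < buf.size()) {
    TableRef t;
    if (!get_name(buf, pos, t.db) || !get_name(buf, pos, t.table)) return false;
    if (t.db.empty()) t.db = current_db;
    tables.push_back(t);
  }
  return true;
}
```

A reader such as mysqlbinlog could then compare the decoded (db, table)
pairs against its filter rules without parsing the query text.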

-=-=(Alexi - Wed, 04 Nov 2009, 10:21)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.6734      2009-11-04 10:21:20.000000000 +0200
+++ /tmp/wklog.40.new.6734      2009-11-04 10:21:20.000000000 +0200
@@ -21,9 +21,9 @@
   slave_proxy_id    4
   exec_time         4
   db_len            1
-+ query_len         2  (see Note 1)
   error_code        2
   status_vars_len   2
++ query_len         2  (see Note 1)
 + tables_info_len   2  (see Note 2)
   ---------------------------------
   BODY:

	------------------------------------------------------------

		-=-=(View All Progress Notes -> 18 total)=-=-
	http://askmonty.org/worklog/index.pl?tid=40&nolimit=1


-----------------------------------------------------------------------
WorkLog (v4.0.0)