WorkLog Frontpage Log in / Register
High-Level Description | Task Dependencies | High-Level Specification | Low-Level Design | File Attachments | User Comments | Time Estimates | Funding and Votes | Progress Reports

 Parallel replication of transactions in distinct databases
Title
Task ID169
Queue
Version N/A
Status
PriorityN/A
Copies toSergei

Created byKnielsen13 Dec 2010Done
Supervisor N/A  
Lead Architect    
Architecture Review  
Implementor  
Code Review  
QA  
Documentation  
 High-Level Description
The main challenge in parallel replication is to determine which events are
safe and valid to run in parallel on the slave.

This worklog is about parallel replication of transactions that each access
only tables in a single database. Two such transactions can be safely applied
in parallel on a slave if the two databases they each access are distinct.

The term "database" is used here in the usual MySQL sense which is similar to
"schema" in other RDBMS.

There will be a server option "--fail-cross-database" to disallow
multi-database statements and cross-database transactions. Ie. cause such
queries to fail on the master so that the assumption remains valid on the
slave that transactions do not span databases. This option defaults to FALSE.

There will be a fixed number of apply threads on the slave, configured by the
user with the option "--slave-apply-threads". This will default to 1, which
will give the old non-parallel slave apply behaviour. Setting this to >1 will
enable parallel slave apply.

For this first version of parallel slave apply, it should probably be an error
to set --slave-apply-threads >1 without also setting --fail-across-database to
TRUE. If other options for discovering opportunities for parallel apply are
implemented later, this restriction can be lifted.

For this worklog, it is permissible, but not required, to use a simple
scheduling of parallel events to apply threads: take a numeric hash of the
database name and queue it for execution by thread (HASH mod N). This might be
simpler to implement than a more clever scheduling, though it might reduce the
opportunity for parallelism in case of hash collisions between distinct
database names.

When two transactions are applied in parallel on the slave, there is the
question whether they are allowed to commit in different order than they did
on the master. If commit in different order is allowed, this is called "full
independence". If not, this is called "binlog order".

With "binlog order", it is necessary to implement some mechanism to delay the
commit of one thread until other threads have applied and committed all prior
transactions. This waiting also has the potential to reduce the opportunities
for parallelism.

For "full independence", as a consequence of committing in different order,
there will be database states seen on the slave that never existed on the
master; the application will have to be tolerant of this. Also, the binlogs on
the slaves will have events in different order than on the master; this
complicates the promotion of a new master, as different servers will in
general have incommensurable states, neither of which is a subset of the
other.

It still needs to be determined which of "binlog order" and "full
independence" should be implemented. Probably a good idea is to start with
just "full independence" due to its simplicity. Then "binlog order" and/or
automated master promotion can be added as follow-up projects later.

Related worklogs:

MWL#170: Promotion of new master with full independence parallel replication.
 Task Dependencies
Others waiting for Task 169Task 169 is waiting forGraph
170 Promotion of new master in "full independence" parallel replication.
206 Parallel replication
 
 High-Level Specification
 Low-Level Design
 File Attachments
 NameTypeSizeByDate
 User Comments
 Time Estimates
NameHours WorkedLast Updated
Knielsen215 Dec 2010
Total2 
 Hrs WorkedProgressCurrentOriginal
This Task200
Total200
 
 Funding and Votes
Votes: 1: 67%
 Make vote: Useless    Nice to have    Important    Very important    

Funding: 0 offers, total 0 Euro
 Progress Reports
(Monty - Thu, 30 Jun 2011, 14:32
    
Dependency created: WL#206 now depends on WL#169

(Sergei - Fri, 17 Dec 2010, 21:07
    
Observers changed: Sergei

(Knielsen - Wed, 15 Dec 2010, 13:48
    
Writeup worklog
Worked 2 hours and estimate 0 hours remain (original estimate increased by 2 hours).

(Knielsen - Mon, 13 Dec 2010, 13:56
    
Dependency created: WL#170 now depends on WL#169

(Knielsen - Mon, 13 Dec 2010, 13:56
    
High Level Description modified.
--- /tmp/wklog.169.old.31776	2010-12-13 13:56:13.000000000 +0000
+++ /tmp/wklog.169.new.31776	2010-12-13 13:56:13.000000000 +0000
@@ -55,6 +55,6 @@
 
 Related worklogs:
 
-MWL#XXX: Promotion of new master with full independence parallel replication.
+MWL#170: Promotion of new master with full independence parallel replication.
 
 


Report Generator:
 
Saved Reports:

WorkLog v4.0.0
  © 2010  Sergei Golubchik and Monty Program AB
  © 2004  Andrew Sweger <yDNA@perlocity.org> and Addnorya
  © 2003  Matt Wagner <matt@mysql.com> and MySQL AB