Let DS-MRR/CPK implementation use buffer space more efficiently. See HLS for details.
1. Context ---------- After looking at MWL#121 implementation, Igor had a comment that DS-MRR/CPK code does unnecessary data copying, as shown in this example: Consider a query: SELECT * FROM t1, t2 WHERE t2.primary_key=t1.col Suppose the join order is (t1, t2), and t2's access method is eq_ref(t2.primary_key=t1.col), which done with Batched Key Access + DS-MRR/CPK. The execution will proceed according to this scenario: 1. Scan table t1, accumulate values of t1.* in the join buffer 2. Start an MRR scan on table t2 on t2.primary_key=t1.col. If t2's primary key is clustered, DS-MRR/CPK will be used. 2.1 DS-MRR/CPK will run through the provided sequence(t1.col), collect values of t1.col in an array and then sort it. 2.2 DS-MRR/CPK will then do a set of t2.primary_key=t1.col lookups in disk order. It was stated that we have a problem on step 2.1, where we run through the sequence of lookup keys and copy them into array: This step is - necessary when BKA module calculates lookup keys on the fly, but - looks redundant when BKA module's buffer contains array{ t1.colX, ... t1.col[=needed lookup key] ,... } in its buffer, i.e. there is already a directly-addressable array (w/ gaps) of lookup keys somewhere in the memory. 2. Suggested remedy ------------------- The suggested remedy was as follows: - Let BKA module inform MRR that lookup keys are directly addressable in its buffers - Let DS-MRR/CPK implementation take advantage of that by storing/sorting key pointers. This strategy will be used only when BKA code actually has the lookup keys in the array. If it doesn't, current MWL#121 code is used. 2.1 SQL Layer part ~~~~~~~~~~~~~~~~~~ Within the scope of this WL - multi_range_read_info() will inform the SQL layer about whether MRR implementation would benefit from materialized materialized keys - SQL layer will inform MRR implementation in multi_range_read_init() about whether the sequence has materialized ranges. If that is the case, this WL's code will be used. 2.2 MRR part ~~~~~~~~~~~~ When MRR implementation is provided with materialized keys, it will take advantage of that by sorting not (key,range_id) pairs but (key_pointer, range_id) pairs.
Dependency created: WL#125 now depends on WL#123
Dependency created: WL#91 now depends on WL#123
Code Review updated: -> Igor
High-Level Specification modified. --- /tmp/wklog.123.old.4441 2010-07-13 22:45:09.000000000 +0000 +++ /tmp/wklog.123.new.4441 2010-07-13 22:45:09.000000000 +0000 @@ -1,75 +1,59 @@ +1. Context +---------- +After looking at MWL#121 implementation, Igor had a comment that DS-MRR/CPK +code does unnecessary data copying, as shown in this example: -1. Buffer usage ---------------- +Consider a query: -1.1 Efficient buffer space use problem -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Suppose we're given a sequence of $N ranges, and $SIZE kbytes of memory. The -buffer will be used for -1. storage/sorting of lookup keys -2. storage/sorting of rowids - -If we take a simple approach, we divide the buffer into two parts, one for -lookup keys and another for rowids. The problem is that we don't know for -certain how many rowids we will get for each lookup key, so we will either -not be able use the full buffer or will run out of buffer space and will -have to do two disk-ordered reads instead of one. - -We could re-use the space occupied by lookup keys to store rowids (actually, -<rowid, range_id> pairs) but that doesn't solve the problem of not being able -to tell how many lookup keys we should have in one sort+sweep-read step. + SELECT * FROM t1, t2 WHERE t2.primary_key=t1.col -1.2 Solution -~~~~~~~~~~~~ -The buffer should be divided in proportion of expected sizes: lookup keys -occupy - - (sizeof(key) + sizeof(range_id)) * n_keys (1) - -rowids will occupy +Suppose the join order is (t1, t2), and t2's access method is +eq_ref(t2.primary_key=t1.col), which done with Batched Key Access + DS-MRR/CPK. +The execution will proceed according to this scenario: - (sizeof(rowid) + sizeof(range_id)) * rec_per_key * n_keys (2) +1. Scan table t1, accumulate values of t1.* in the join buffer +2. Start an MRR scan on table t2 on t2.primary_key=t1.col. If t2's primary key + is clustered, DS-MRR/CPK will be used. +2.1 DS-MRR/CPK will run through the provided sequence(t1.col), collect values + of t1.col in an array and then sort it. +2.2 DS-MRR/CPK will then do a set of t2.primary_key=t1.col lookups in disk + order. -(TODO: what to do if rec_per_key value is unknown?) -We can divide both (1) and (2) by n_keys and that will give the needed -proportion. +It was stated that we have a problem on step 2.1, where we run through the +sequence of lookup keys and copy them into array: This step is + - necessary when BKA module calculates lookup keys on the fly, but + - looks redundant when BKA module's buffer contains + array{ t1.colX, ... t1.col[=needed lookup key] ,... } -We will take advantage of the fact that {key, range_id} pairs for which we got -{rowid, range_id} pairs are not needed anymore. This will be done as follows: + in its buffer, i.e. there is already a directly-addressable array (w/ gaps) + of lookup keys somewhere in the memory. -- Store {key, range_id} pairs from the start of the buffer -- sort them in reverse order -- start index-ordered read (that would require traversing the sorted array - backwards) -- keep track of how much more space becomes free after we've done lookup for - the next {key, range_id} pair. -- let the accumulated array of {rowid, range_id} pairs to grow from the end of - the buffer. +2. Suggested remedy +------------------- -Using this scheme, one array will be at start of the buffer and gradually -shrinking and the other at end of the buffer and gradually growing. Overflow -detection code should make sure that arrays never have any intersection with -one another. +The suggested remedy was as follows: +- Let BKA module inform MRR that lookup keys are directly addressable in its + buffers +- Let DS-MRR/CPK implementation take advantage of that by storing/sorting key + pointers. -2. Grouping identical keys --------------------------- -BKA could possibly supply multiple identical keys, that is, MRR implementation -will get many pairs like: +This strategy will be used only when BKA code actually has the lookup keys in +the array. If it doesn't, current MWL#121 code is used. - {keyX, rangeY1}, {keyX, rangeY2}, {keyX, rangeY3}, ... +2.1 SQL Layer part +~~~~~~~~~~~~~~~~~~ +Within the scope of this WL +- multi_range_read_info() will inform the SQL layer about whether MRR + implementation would benefit from materialized materialized keys -since we do index reads in key order, we'll be able to detect this case -(compare current key with the next/previous) and avoid making multiple lookups -with the same value of keyX. +- SQL layer will inform MRR implementation in multi_range_read_init() about + whether the sequence has materialized ranges. If that is the case, this WL's + code will be used. -This, however, will not be implemented within scope of this WL entry. We'll -rely on the fact that storage engine makes multiple repetitive index lookups to -be sufficiently fast. - -3. User control ---------------- -Introduce an @@optimizer_switch flag to control the behavior. -Flag name: mrr_sort_keys -Default value: ON +2.2 MRR part +~~~~~~~~~~~~ +When MRR implementation is provided with materialized keys, it will take +advantage of that by sorting not (key,range_id) pairs but (key_pointer, +range_id) pairs.
High-Level Specification modified. --- /tmp/wklog.123.old.4251 2010-07-13 22:38:08.000000000 +0000 +++ /tmp/wklog.123.new.4251 2010-07-13 22:38:08.000000000 +0000 @@ -1,44 +1,75 @@ -1. Context ----------- -After looking at MWL#121 implementation, Igor had a comment that DS-MRR/CPK -code does unnecessary data copying, as shown in this example: - -Consider a query: - - SELECT * FROM t1, t2 WHERE t2.primary_key=t1.col - -Suppose the join order is (t1, t2), and t2's access method is -eq_ref(t2.primary_key=t1.col), which done with Batched Key Access + DS-MRR/CPK. -The execution will proceed according to this scenario: - -1. Scan table t1, accumulate values of t1.* in the join buffer -2. Start an MRR scan on table t2 on t2.primary_key=t1.col. If t2's primary key - is clustered, DS-MRR/CPK will be used. -2.1 DS-MRR/CPK will run through the provided sequence(t1.col), collect values - of t1.col in an array and then sort it. -2.2 DS-MRR/CPK will then do a set of t2.primary_key=t1.col lookups in disk - order. - -It was stated that we have a problem on step 2.1, where we run through the -sequence of lookup keys and copy them into array: This step is - - necessary when BKA module calculates lookup keys on the fly, but - - looks redundant when BKA module's buffer contains - - array{ t1.colX, ... t1.col[=needed lookup key] ,... } - - in its buffer, i.e. there is already a directly-addressable array (w/ gaps) - of lookup keys somewhere in the memory. - -The suggested remedy was as follows: - -1.1 AccessKeysByPointersProposal -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- Let BKA module inform MRR that lookup keys are directly addressable in its - buffers -- Let DS-MRR/CPK implementation take advantage of that by storing/sorting key - pointers. -This strategy will be used only when BKA code actually has the lookup keys in -the array. If it doesn't, current MWL#121 code is used. +1. Buffer usage +--------------- +1.1 Efficient buffer space use problem +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Suppose we're given a sequence of $N ranges, and $SIZE kbytes of memory. The +buffer will be used for +1. storage/sorting of lookup keys +2. storage/sorting of rowids + +If we take a simple approach, we divide the buffer into two parts, one for +lookup keys and another for rowids. The problem is that we don't know for +certain how many rowids we will get for each lookup key, so we will either +not be able use the full buffer or will run out of buffer space and will +have to do two disk-ordered reads instead of one. + +We could re-use the space occupied by lookup keys to store rowids (actually, +<rowid, range_id> pairs) but that doesn't solve the problem of not being able +to tell how many lookup keys we should have in one sort+sweep-read step. + +1.2 Solution +~~~~~~~~~~~~ +The buffer should be divided in proportion of expected sizes: lookup keys +occupy + + (sizeof(key) + sizeof(range_id)) * n_keys (1) + +rowids will occupy + + (sizeof(rowid) + sizeof(range_id)) * rec_per_key * n_keys (2) + +(TODO: what to do if rec_per_key value is unknown?) +We can divide both (1) and (2) by n_keys and that will give the needed +proportion. + + +We will take advantage of the fact that {key, range_id} pairs for which we got +{rowid, range_id} pairs are not needed anymore. This will be done as follows: + +- Store {key, range_id} pairs from the start of the buffer +- sort them in reverse order +- start index-ordered read (that would require traversing the sorted array + backwards) +- keep track of how much more space becomes free after we've done lookup for + the next {key, range_id} pair. +- let the accumulated array of {rowid, range_id} pairs to grow from the end of + the buffer. + +Using this scheme, one array will be at start of the buffer and gradually +shrinking and the other at end of the buffer and gradually growing. Overflow +detection code should make sure that arrays never have any intersection with +one another. + +2. Grouping identical keys +-------------------------- +BKA could possibly supply multiple identical keys, that is, MRR implementation +will get many pairs like: + + {keyX, rangeY1}, {keyX, rangeY2}, {keyX, rangeY3}, ... + +since we do index reads in key order, we'll be able to detect this case +(compare current key with the next/previous) and avoid making multiple lookups +with the same value of keyX. + +This, however, will not be implemented within scope of this WL entry. We'll +rely on the fact that storage engine makes multiple repetitive index lookups to +be sufficiently fast. + +3. User control +--------------- +Introduce an @@optimizer_switch flag to control the behavior. +Flag name: mrr_sort_keys +Default value: ON
Privacy level updated. --- /tmp/wklog.123.old.8178 2010-07-05 00:13:46.000000000 +0000 +++ /tmp/wklog.123.new.8178 2010-07-05 00:13:46.000000000 +0000 @@ -1 +1 @@ -y +n
High-Level Specification modified. --- /tmp/wklog.123.old.27909 2010-07-03 16:41:16.000000000 +0000 +++ /tmp/wklog.123.new.27909 2010-07-03 16:41:16.000000000 +0000 @@ -39,6 +39,6 @@ pointers. This strategy will be used only when BKA code actually has the lookup keys in -the array. If it doesn't, current MWL#21 code is used. +the array. If it doesn't, current MWL#121 code is used.
High-Level Specification modified. --- /tmp/wklog.123.old.27673 2010-07-03 16:38:04.000000000 +0000 +++ /tmp/wklog.123.new.27673 2010-07-03 16:38:04.000000000 +0000 @@ -1 +1,44 @@ +1. Context +---------- +After looking at MWL#121 implementation, Igor had a comment that DS-MRR/CPK +code does unnecessary data copying, as shown in this example: + +Consider a query: + + SELECT * FROM t1, t2 WHERE t2.primary_key=t1.col + +Suppose the join order is (t1, t2), and t2's access method is +eq_ref(t2.primary_key=t1.col), which done with Batched Key Access + DS-MRR/CPK. +The execution will proceed according to this scenario: + +1. Scan table t1, accumulate values of t1.* in the join buffer +2. Start an MRR scan on table t2 on t2.primary_key=t1.col. If t2's primary key + is clustered, DS-MRR/CPK will be used. +2.1 DS-MRR/CPK will run through the provided sequence(t1.col), collect values + of t1.col in an array and then sort it. +2.2 DS-MRR/CPK will then do a set of t2.primary_key=t1.col lookups in disk + order. + +It was stated that we have a problem on step 2.1, where we run through the +sequence of lookup keys and copy them into array: This step is + - necessary when BKA module calculates lookup keys on the fly, but + - looks redundant when BKA module's buffer contains + + array{ t1.colX, ... t1.col[=needed lookup key] ,... } + + in its buffer, i.e. there is already a directly-addressable array (w/ gaps) + of lookup keys somewhere in the memory. + +The suggested remedy was as follows: + +1.1 AccessKeysByPointersProposal +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +- Let BKA module inform MRR that lookup keys are directly addressable in its + buffers +- Let DS-MRR/CPK implementation take advantage of that by storing/sorting key + pointers. + +This strategy will be used only when BKA code actually has the lookup keys in +the array. If it doesn't, current MWL#21 code is used. +