-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
CREATED BY.....: Monty
SUPERVISOR.....: Monty
LEAD ARCHITECT.: 
ARCH REVIEW....: Igor
IMPLEMENTOR....: Igor
1st CODE REVIEW: Igor
2nd CODE REVIEW: Igor
QA.............: Igor
COPIES TO......: 
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Complete
PRIORITY.......: 60

DEPENDS ON.....:

DEPENDANT......:

DESCRIPTION:

Eliminate not needed tables from SELECT queries..
This will speed up some views and automatically generated queries.

Example:

CREATE TABLE B (id int primary key);

select
 A.colA
from
 tableA A
left outer join
 tableB B
on
 B.id = A.id;

In this case we can remove table B and the join from the query.


PROGRESS NOTES:

-=-=(Sergei - Sun, 05 Sep 2010, 14:01)=-=-
Version updated.
--- /tmp/wklog.17.old.16512     2010-09-05 14:01:15.000000000 +0000
+++ /tmp/wklog.17.new.16512     2010-09-05 14:01:15.000000000 +0000
@@ -1,2 +1,2 @@
-Server-9.x
+Server-5.1
 

-=-=(Sergei - Sun, 05 Sep 2010, 14:01)=-=-
Status updated.
--- /tmp/wklog.17.old.16512     2010-09-05 14:01:15.000000000 +0000
+++ /tmp/wklog.17.new.16512     2010-09-05 14:01:15.000000000 +0000
@@ -1,2 +1,2 @@
-Assigned
+Complete
 

-=-=(Guest - Thu, 17 Jun 2010, 03:56)=-=-
Dependency deleted: 29 no longer depends on 17

-=-=(Guest - Thu, 01 Apr 2010, 01:56)=-=-
New attachment: '16_bitMICRO-PROCESSOR.htm'

-=-=(Guest - Wed, 24 Mar 2010, 05:58)=-=-
Status updated.
--- /tmp/wklog.17.old.13223     2010-03-24 05:58:26.000000000 +0000
+++ /tmp/wklog.17.new.13223     2010-03-24 05:58:26.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Assigned

-=-=(Guest - Wed, 24 Mar 2010, 05:58)=-=-
Privacy level updated.
--- /tmp/wklog.17.old.13223     2010-03-24 05:58:26.000000000 +0000
+++ /tmp/wklog.17.new.13223     2010-03-24 05:58:26.000000000 +0000
@@ -1 +1 @@
-n
+y

-=-=(Guest - Wed, 28 Oct 2009, 02:01)=-=-
Version updated.
--- /tmp/wklog.17.old.24041     2009-10-28 02:01:58.000000000 +0200
+++ /tmp/wklog.17.new.24041     2009-10-28 02:01:58.000000000 +0200
@@ -1 +1 @@
-9.x
+Server-9.x

-=-=(Guest - Sun, 16 Aug 2009, 16:16)=-=-
Category updated.
--- /tmp/wklog.17.old.24882     2009-08-16 16:16:49.000000000 +0300
+++ /tmp/wklog.17.new.24882     2009-08-16 16:16:49.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-Sprint

-=-=(Guest - Sun, 16 Aug 2009, 16:16)=-=-
Version updated.
--- /tmp/wklog.17.old.24882     2009-08-16 16:16:49.000000000 +0300
+++ /tmp/wklog.17.new.24882     2009-08-16 16:16:49.000000000 +0300
@@ -1 +1 @@
-Server-5.1
+9.x

-=-=(Guest - Wed, 29 Jul 2009, 21:41)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.26011     2009-07-29 21:41:04.000000000 +0300
+++ /tmp/wklog.17.new.26011     2009-07-29 21:41:04.000000000 +0300
@@ -2,163 +2,146 @@
 ~maria-captains/maria/maria-5.1-table-elimination tree.
 
 <contents>
-1. Conditions for removal
-1.1 Quick check if there are candidates
-2. Removal operation properties
-3. Removal operation
-4. User interface
-5. Tests and benchmarks
-6. Todo, issues to resolve
-6.1 To resolve
-6.2 Resolved
-7. Additional issues
+1. Elimination criteria
+2. No outside references check
+2.1 Quick check if there are tables with no outside references
+3. One-match check
+3.1 Functional dependency source #1: Potential eq_ref access
+3.2 Functional dependency source #2: col2=func(col1)
+3.3 Functional dependency source #3: One or zero records in the table
+3.4 Functional dependency check implementation
+3.4.1 Equality collection: Option1
+3.4.2 Equality collection: Option2
+3.4.3 Functional dependency propagation - option 1
+3.4.4 Functional dependency propagation - option 2
+4. Removal operation properties
+5. Removal operation
+6. User interface
+6.1 @@optimizer_switch flag
+6.2 EXPLAIN [EXTENDED]
+7. Miscellaneous adjustments
+7.1 Fix used_tables() of aggregate functions
+7.2 Make subquery predicates collect their outer references
+8. Other concerns
+8.1 Relationship with outer->inner joins converter
+8.2 Relationship with prepared statements
+8.3 Relationship with constant table detection
+9. Tests and benchmarks
 </contents>
 
 It's not really about elimination of tables, it's about elimination of inner
 sides of outer joins.
 
-1. Conditions for removal
--------------------------
-We can eliminate an inner side of outer join if:
-1. For each record combination of outer tables, it will always produce
-   exactly one record.
-2. There are no references to columns of the inner tables anywhere else in
+1. Elimination criteria
+=======================
+We can eliminate inner side of an outer join nest if:
+
+1. There are no references to columns of the inner tables anywhere else in 
    the query.
+2. For each record combination of outer tables, it will always produce
+   exactly one matching record combination.
+
+Most of effort in this WL entry is checking these two conditions.
 
-#1 means that every table inside the outer join nest is:
- - is a constant table:
-    = because it can be accessed via eq_ref(const) access, or
-    = it is a zero-rows or one-row MyISAM-like table  [MARK1]
- - has an eq_ref access method candidate.
-
-#2 means that WHERE clause, ON clauses of embedding outer joins, ORDER BY,
-  GROUP BY and HAVING do not refer to the inner tables of the outer join
-  nest.
-
-1.1 Quick check if there are candidates
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Before we start to enumerate join nests, here is a quick way to check if
-there *can be* something to be removed:
+2. No outside references check
+==============================
+Criterion #1 means that the WHERE clause, ON clauses of embedding/subsequent
+outer joins, ORDER BY, GROUP BY and HAVING must have no references to inner
+tables of the outer join nest we're trying to remove.
+
+For multi-table UPDATE/DELETE we also must not remove tables that we're
+updating/deleting from or tables that are used in UPDATE's SET clause.
+
+2.1 Quick check if there are tables with no outside references
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Before we start searching for outer join nests that could be eliminated, 
+we'll do a quick and cheap check if there possibly could be something that
+could be eliminated:
 
-  if ((tables used in select_list |  
+  if (there are outer joins && 
+      (tables used in select_list |  
        tables used in group/order by UNION |
-       tables used in where) != bitmap_of_all_tables)
+       tables used in where) != bitmap_of_all_join_tables)
   {
      attempt table elimination;
   }
 
-2. Removal operation properties
--------------------------------
-* There is always one way to remove (no choice to remove either this or that)
-* It is always better to remove as much tables as possible (at least within 
-  our cost model).
-Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
 
-3. Removal operation
---------------------
-* Remove the outer join nest's nested join structure (i.e. get the
-  outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
-  $OJ->embedding->nested_join. Update table_map's of all ancestor nested
-  joins).  [MARK2]
+3. One-match check
+==================
+We can eliminate inner side of outer join if it will always generate exactly
+one matching record combination.
 
-* Move the tables and their JOIN_TABs to front like it is done with const
-  tables, with exception that if eliminated outer join nest was within
-  another outer join nest, that shouldn't prevent us from moving away the
-  eliminated tables. 
+By definition of OUTER JOIN, a NULL-complemented record combination will be
+generated when the inner side of outer join has not produced any matches.
 
-* Update join->table_count and all-join-tables bitmap.
+What remains to be checked is that there is no possiblity that inner side of
+the outer join could produce more than one matching record combination.
 
-* That's it. Nothing else?
+We'll refer to one-match property as "functional dependency":
 
-4. User interface
------------------
-* We'll add an @@optimizer switch flag for table elimination. Tentative
-  name: 'table_elimination'.
-  (Note ^^ utility of the above questioned ^, as table elimination can never 
-   be worse than no elimination. We're leaning towards not adding the flag)
-
-* EXPLAIN will not show the removed tables at all. This will allow to check
-  if tables were removed, and also will behave nicely with anchor model and
-  VIEWs: stuff that user doesn't care about just won't be there.
+- A outer join nest is functionally dependent [wrt outer tables] if it will
+  produce one matching record combination per each record combination of
+  outer tables
 
-5. Tests and benchmarks
------------------------
-Create a benchmark in sql-bench which checks if the DBMS has table
-elimination.
-[According to Monty] Run
- - queries that would use elimination
- - queries that are very similar to one above (so that they would have same
-   QEP, execution cost, etc) but cannot use table elimination.
-then compare run times and make a conclusion about whether dbms supports table
-elimination.
+- A table is functionally dependent wrt certain set of dependency tables, if
+  record combination of dependency tables uniquely identifies zero or one
+  matching record in the table
 
-6. Todo, issues to resolve
---------------------------
+- Definitions of functional dependency of keys (=column tuples) and columns are
+  apparent.
 
-6.1 To resolve
-~~~~~~~~~~~~~~
-- Relationship with prepared statements. 
-  On one hand, it's natural to desire to make table elimination a
-  once-per-statement operation, like outer->inner join conversion. We'll have
-  to limit the applicability by removing [MARK1] as that can change during
-  lifetime of the statement.
-
-  The other option is to do table elimination every time. This will require to
-  rework operation [MARK2] to be undoable.
-  
-  I'm leaning towards doing the former. With anchor modeling, it is unlikely
-  that we'll meet outer joins which have N inner tables of which some are 1-row
-  MyISAM tables that do not have primary key.
-
-6.2 Resolved
-~~~~~~~~~~~~
-* outer->inner join conversion is not a problem for table elimination. 
-  We make outer->inner conversions based on predicates in WHERE. If the WHERE
-  referred to an inner table (requirement for OJ->IJ conversion) then table
-  elimination would not be applicable anyway.
-
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
-  - affected tables must not be eliminated
-  - tables that are used on the right side of the SET x=y assignments must 
-    not be eliminated either.
+Our goal is to prove that the entire join nest is functionally-dependent.
 
-* Aggregate functions used to report that they depend on all tables, that is,
+Join nest is functionally dependent (on the otside tables) if each of its
+elements (those can be either base tables or join nests) is functionally
+dependent.
    
-     item_agg_func->used_tables() == (1ULL << join->tables) - 1
+Functional dependency is transitive: if table A is f-dependent on the outer 
+tables and table B is f.dependent on {A, outer_tables} then B is functionally
+dependent on the outer tables.
+
+Subsequent sections list cases when we can declare a table to be
+functionally-dependent.
+
+3.1 Functional dependency source #1: Potential eq_ref access
+------------------------------------------------------------
+This is the most practically-important case. Taking the example from the HLD
+of this WL entry:
+
+  select
+    A.colA
+  from
+    tableA A
+  left outer join
+    tableB B
+  on
+    B.id = A.id;
 
-  always. Fixed it, now aggregate function reports it depends on
-  tables that its arguments depend on. In particular, COUNT(*) reports
-  that it depends on no tables (item_count_star->used_tables()==0). 
-  One consequence of that is that "item->used_tables()==0" is not 
-  equivalent to "item->const_item()==true" anymore (not sure if it's 
-  "anymore" or this has been already happening). 
-
-* EXPLAIN EXTENDED warning text was generated after the JOIN object has
-  been discarded. This didn't allow to use information about join plan 
-  when printing the warning. Fixed this by keeping the JOIN objects until 
-  we've printed the warning  (have also an intent to remove the const
-  tables from the join output).
- 
-7. Additional issues
---------------------
-* We remove ON clauses within outer join nests. If these clauses contain
-  subqueries, they probably should be gone from EXPLAIN output also?
-  Yes. Current approach: when removing an outer join nest, walk the ON clause 
-  and mark subselects as eliminated. Then let EXPLAIN code check if the 
-  SELECT was eliminated before the printing (EXPLAIN is generated by doing
-  a recursive descent, so the check will also cause children of eliminated
-  selects not to be printed)
-
-* Table elimination is performed after constant table detection (but before
-  the range analysis). Constant tables are technically different from 
-  eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
-  Considering we've already done the join_read_const_table() call, is there any
-  real difference between constant table and eliminated one? If there is, should
-  we mark const tables also as eliminated?
-  from user/EXPLAIN point of view: no. constant table is the one that we read
-  one record from. eliminated table is the one that we don't acccess at all.
+and generalizing it:  a table TBL is functionally-dependent if the ON 
+expression allows to construct a potential eq_ref access to table TBL that
+uses only outer or functionally-dependent tables.
+
+In other words: table TBL will have one match if the ON expression can be 
+converted into this form 
+
+    TBL.unique_key=func(one_match_tables) AND .. remainder ...
+
+(with appropriate extension for multi-part keys),  where 
+
+  one_match_tables= { 
+    tables that are not on the inner side of the outer join in question, and
+    functionally dependent tables
+  }
+
+Note that this will cover constant tables, except those that are constant because 
+they have 0/1 record or are partitioned and have no used partitions.
+
+
+3.2 Functional dependency source #2: col2=func(col1)
+----------------------------------------------------
+This comes from the second example in the HLS:
 
-* What is described above will not be able to eliminate this outer join 
   create unique index idx on tableB (id, fromDate);
   ...
   left outer join
@@ -169,32 +152,331 @@
     B.fromDate = (select max(sub.fromDate)
                   from tableB sub where sub.id = A.id);
 
-  This is because condition "B.fromDate= func(tableB)" cannot be used.
-  Reason#1: update_ref_and_keys() does not consider such conditions to 
-            be of any use (and indeed they are not usable for ref access)
-            so they are not put into KEYUSE array.
-  Reason#2: even if they were put there, we would need to be able to tell
-            between predicates like
-              B.fromDate= func(B.id)  // guarantees only one matching row as 
-                                      // B.id is already bound by B.id=A.id
-                                      // hence B.fromDate becomes bound too.
-            and
-              "B.fromDate= func(B.*)" // Can potentially have many matching
-                                      // records.
-  We need to
-  - Have update_ref_and_keys() create KEYUSE elements for such equalities
-  - Have eliminate_tables() and friends make a more accurate check.
-  The right check is to check whether all parts of a unique key are bound.
-  If we have keypartX to be bound, then t.keypartY=func(keypartX) makes 
-  keypartY to be bound.
-  The difficulty here is that correlated subquery predicate cannot tell what
-  columns it depends on (it only remembers tables). 
-  Traversing the predicate is expensive and complicated. 
-  We're leaning towards making each subquery predicate have a List<Item> with
-  items that 
-  - are in the current select
-  - and it depends on.
-  This list will be useful in certain other subquery optimizations as well, 
-  it is cheap to collect it in fix_fields() phase, so it will be collected 
-  for every subquery predicate. 
+Here it is apparent that tableB can be eliminated. It is not possible to
+construct eq_ref access to tableB, though, because for the second part of the
+primary key (fromDate column) we only got a condition in this form:
+
+   B.fromDate= func(tableB)
+
+(we write "func(tableB)" because ref optimizer can only determine which tables
+the right part of the equality depends on). 
+
+In general case, equality like this doesn't guarantee functional dependency. 
+For example, if func() == { return fromDate;}, i.e the ON expression is
+
+  ... ON B.id = A.id and B.fromDate = B.fromDate
+
+then that would allow table B to have multiple matches per record of table A. 
+
+In order to be able to distinguish between these two cases, we'll need to go 
+down to column level:
+
+- A table is functionally dependent if it has a unique key that's functionally
+  dependent
+
+- A unique key is functionally dependent when all of its columns are
+  functionally dependent
+
+- A table column is functionally dependent if the ON clause allows to extract
+  an AND-part in this form:
+  
+  tbl.column = f(functionally-dependent columns or columns of outer tables)
+
+3.3 Functional dependency source #3: One or zero records in the table
+---------------------------------------------------------------------
+A table with one or zero records cannot generate more than one matching
+record. This source is of lesser importance as one/zero-record tables are only
+MyISAM tables.
+
+3.4 Functional dependency check implementation
+----------------------------------------------
+As shown above, we need something similar to KEYUSE structures, but not
+exactly that (we need things that current ref optimizer considers unusable and
+don't need things that it considers usable).
+
+3.4.1 Equality collection: Option1
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We could 
+- extend KEYUSE structures to store all kinds of equalities we need 
+- change update_ref_and_keys() and co. to collect equalities both for ref 
+  access and for table elimination
+   = [possibly] Improve [eq_]ref access to be able to use equalities in 
+     form keypart2=func(keypart1) 
+- process the KEYUSE array both by table elimination and by ref access
+  optimizer.
+
++ This requires less effort. 
+- Code will have to be changed all over sql_select.cc
+- update_ref_and_keys() and co. already do several unrelated things. Hooking
+  up table elimination will make it even worse.
+
+3.4.2 Equality collection: Option2
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Alternatively, we could process the WHERE clause totally on our own. 
++ Table elimination is standalone and easy to detach module.
+- Some code duplication with update_ref_and_keys() and co.
+
+Having got the equalities, we'll to propagate functional dependency property
+to unique keys, tables and, ultimately, join nests.
+
+3.4.3 Functional dependency propagation - option 1
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Borrow the approach used in constant table detection code: 
+  
+  do 
+  {
+    converted= FALSE;
+    for each table T in join nest 
+    {
+      if (check_if_functionally_dependent(T))
+        converted= TRUE;
+    }
+  } while (converted == TRUE);
+
+  check_if_functionally_dependent(T)
+  {
+    if (T has eq_ref access based on func_dep_tables)
+      return TRUE;
+
+    Apply the same do-while loop-based approach to available equalities
+      T.column1=func(other columns)
+    to spread the set of functionally-dependent columns. The goal is to get 
+    all columns of a certain unique key to be bound.
+  }
+
+
+3.4.4 Functional dependency propagation - option 2
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Analyze the ON expression(s) and build a list of  
+
+  tbl.field = expr(...) 
+
+equalities. tbl here is a table that belongs to a join nest that could
+potentially be eliminated.
+
+besides those, add to the list 
+ - An element for each unique key in the table that needs to be eliminated
+ - An element for each table that needs to be eliminated
+ - An element for each join nest that can be eliminated (i.e. has no 
+   references from outside).
+
+Then, setup "reverse dependencies": each element should have pointers to
+elements that are functionally dependent on it:
+
+- "tbl.field=expr(...)" equality is functionally dependent on all fields that
+   are used in "expr(...)" (here we take into account only fields that belong 
+   to tables that can potentially be eliminated).
+- a unique key is dependent on all of its components
+- a table is dependent on all of its unique keys
+- a join nest is dependent on all tables that it contains
+
+These pointers are stored in form of one bitmap, such that:
+
+  "X depends on Y" == test( bitmap[(X's number)*n_objects + (Y's number)] )
+
+Each object also stores a number of dependencies it needs to be satisfied
+before it itself is satisfied:
+
+- "tbl.field=expr(...)" needs all its underlying fields (if a field is
+  referenced many times it is counted only once)
+
+- a unique key needs all of its key parts
+
+- a table needs only one of its unique keys
+
+- a join nest needs all of its tables
+
+(TODO: so what do we do when we've marked a table as constant? We'll need to
+update the "field=expr(....)" elements that use fields of that table. And the 
+problem is that we won't know how much to decrement from the counters of those 
+elements. 
+
+Solution#1: switch to table_map() based approach.
+Solution#2: introduce separate elements for each involved field.
+  field will depend on its table,
+  "field=expr" will depend on fields.
+)
+
+Besides the above, let each element have a pointer to another element, so that
+we can have a linked list of elements. 
+
+After the above structures have been created, we start the main algorithm.
+
+The first step is to create a list of functionally-dependent elements. We walk
+across array of dependencies and mark those elements that are already bound
+(i.e. their dependencies are satisfied). At the moment those immediately-bound
+are only "field=expr" dependencies that don't refer to any columns that are
+not bound. 
+
+The second step is the loop
+
+  while (bound_list is not empty)
+  {
+    Take the first bound element F off the list.
+    Use the bitmap to find out what other elements depended on it
+    for each such element E
+    {
+      if (E becomes bound after F is bound)
+        add E to the list;
+    }
+  }
+  
+The last step is to walk through elements that represent the join nests. Those
+that are bound can be eliminated.
+
+4. Removal operation properties
+===============================
+* There is always one way to remove (no choice to remove either this or that)
+* It is always better to remove as much tables as possible (at least within 
+  our cost model).
+Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
+
+
+5. Removal operation
+====================
+(This depends a lot on whether we make table elimination a one-off rewrite or
+conditional)
+
+At the moment table elimination is re-done for each join re-execution, hence
+the removal operation is designed not to modify any statement's permanent
+members.
+
+* Remove the outer join nest's nested join structure (i.e. get the
+  outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
+  $OJ->embedding->nested_join. Update table_map's of all ancestor nested
+  joins). [MARK2]
+
+* Move the tables and their JOIN_TABs to the front of join order, like it is
+  done with const tables, with exception that if eliminated outer join nest
+  was within another outer join nest, that shouldn't prevent us from moving
+  away the eliminated tables. 
+
+* Update join->table_count and all-join-tables bitmap.
+   ^ TODO: not true anymore ^
+
+* That's it. Nothing else?
+
+6. User interface
+=================
+
+6.1 @@optimizer_switch flag
+---------------------------
+Argument againist adding the flag:
+* It is always better to perform table elimination than not to do it.
+
+Arguments for the flag:
+* It is always theoretically possible that the new code will cause unintended
+  slowdowns.
+* Having the flag is useful for QA and comparative benchmarking.
+
+Decision so far: add the flag under #ifdef. Make the flag be present in debug
+builds.
+
+6.2 EXPLAIN [EXTENDED]
+----------------------
+There are two possible options:
+1. Show eliminated tables, like we do with const tables.
+2. Do not show eliminated tables. 
+
+We chose option 2, because:
+- the table is not accessed at all (besides locking it)
+- it is more natural for anchor model user - when he's querying an anchor-
+  and attributes view, he doesn't care about the unused attributes.
+
+EXPLAIN EXTENDED+SHOW WARNINGS won't show the removed table either. 
+
+NOTE: Before this WL, the warning text was generated after all JOIN objects
+have been destroyed. This didn't allow to use information about join plan 
+when printing the warning. We've fixed this by keeping the JOIN objects until
+the warning text has been generated.
+ 
+Table elimination removes inner sides of outer join, and logically the ON
+clause is also removed. If this clause has any subqueries, they will be 
+also removed from EXPLAIN output.
+
+An exception to the above is that if we eliminate a derived table, it will
+still be shown in EXPLAIN output. This comes from the fact that the FROM
+subqueries are evaluated before table elimination is invoked.
+TODO: Is the above ok or still remove parts of FROM subqueries?
+
+7. Miscellaneous adjustments
+============================
+
+7.1 Fix used_tables() of aggregate functions
+--------------------------------------------
+Aggregate functions used to report that they depend on all tables, that is,
+   
+   item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+always. Fixed it, now aggregate function reports that it depends on the
+tables that its arguments depend on. In particular, COUNT(*) reports that it
+depends on no tables (item_count_star->used_tables()==0). One consequence of
+that is that "item->used_tables()==0" is not equivalent to
+"item->const_item()==true" anymore (not sure if it's "anymore" or this has 
+been already so for some items). 
+
+7.2 Make subquery predicates collect their outer references
+-----------------------------------------------------------
+Per-column functional dependency analysis requires us to take a 
+  
+  tbl.field = func(...)
+
+equality and tell which columns of which tables are referred from func(...)
+expression. For scalar expressions, this is accomplished by Item::walk()-based
+traversal. It should be reasonably cheap (the only practical Item that can be 
+expensive to traverse seems to be a special case of "col IN (const1,const2,
+...)". check if we traverse the long list for such items). 
+
+For correlated subqueries, traversal can be expensive, it is cheaper to make
+each subquery item have a list of its outer references. The list can be
+collected at fix_fields() stage with very little extra cost, and then it could
+be used for other optimizations.
+
+
+8. Other concerns
+=================
+
+8.1 Relationship with outer->inner joins converter
+--------------------------------------------------
+One could suspect that outer->inner join conversion could get in the way
+of table elimination by changing outer joins (which could be eliminated)
+to inner (which we will not try to eliminate). 
+This concern is not valid: we make outer->inner conversions based on
+predicates in WHERE. If the WHERE referred to an inner table (this is a
+requirement for the conversion) then table elimination would not be
+applicable anyway.
+
+8.2 Relationship with prepared statements
+-----------------------------------------
+On one hand, it's natural to desire to make table elimination a 
+once-per-statement operation, like outer->inner join conversion. We'll have
+to limit the applicability by removing [MARK1] as that can change during
+lifetime of the statement.
+
+The other option is to do table elimination every time. This will require to
+rework operation [MARK2] to be undoable.
+  
+
+8.3 Relationship with constant table detection
+----------------------------------------------
+Table elimination is performed after constant table detection (but before
+the range analysis). Constant tables are technically different from 
+eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+Considering we've already done the join_read_const_table() call, is there any
+real difference between constant table and eliminated one? If there is, should
+we mark const tables also as eliminated?
+from user/EXPLAIN point of view: no. constant table is the one that we read
+one record from. eliminated table is the one that we don't acccess at all.
+TODO
+
+9. Tests and benchmarks
+=======================
+Create a benchmark in sql-bench which checks if the DBMS has table
+elimination.
+[According to Monty] Run
+ - query Q1 that would use elimination
+ - query Q2 that is very similar to Q1 (so that they would have same
+   QEP, execution cost, etc) but cannot use table elimination.
+then compare run times and make a conclusion about whether the used dbms 
+supports table elimination.
 

	------------------------------------------------------------

		-=-=(View All Progress Notes -> 35 total)=-=-
	http://askmonty.org/worklog/index.pl?tid=17&nolimit=1


-----------------------------------------------------------------------
WorkLog (v4.0.0)