.. Factorized databases III. Jakub Zavodny (University of Oxford, UK). Palacky University, Olomouc, Czech Republic. September 3, 2013.
September 3, 2013, Symposium on Relational Data Analysis, Olomouc
Factorised Databases 3. Recent Developments: Factorisations with Pointers
Jakub Závodný, University of Oxford
Based on joint work with PhD supervisor Dan Olteanu
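Factorisations
$Q(A,B,C,D,E,F) = R(A,B,C) \bowtie S(A,B,D) \bowtie T(A,E) \bowtie U(E,F)$
[Figure: f-tree with root A; A has children B and E; B has children C and D; E has child F. The tree is annotated with the relations R, S, T, U.]
$\bigcup_{a \in A} \Big( \langle a \rangle \times \bigcup_{b \in B} \big( \langle b \rangle \times (\bigcup_{c \in C} \langle c \rangle) \times (\bigcup_{d \in D} \langle d \rangle) \big) \times \bigcup_{e \in E} \big( \langle e \rangle \times (\bigcup_{f \in F} \langle f \rangle) \big) \Big)$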
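To make the nested expression concrete, here is a minimal Python sketch that groups a flat join result along the f-tree A -> {B -> {C, D}, E -> {F}} and compares sizes. The toy relation contents and all helper names (`flat_join`, `factorise`, `size`, `expand`) are assumptions made for illustration; they do not come from the talk.

```python
from itertools import product

# Hypothetical toy database for Q = R(A,B,C) join S(A,B,D) join T(A,E) join U(E,F).
R = {(1, 1, 1), (1, 1, 2), (2, 1, 1)}
S = {(1, 1, 1), (1, 1, 2), (2, 1, 3)}
T = {(1, 1), (1, 2), (2, 1)}
U = {(1, 1), (1, 2), (2, 1), (2, 2)}


def flat_join():
    """Flat join result as a set of (a, b, c, d, e, f) tuples."""
    return {
        (a, b, c, d, e, f)
        for (a, b, c) in R
        for (a2, b2, d) in S if (a2, b2) == (a, b)
        for (a3, e) in T if a3 == a
        for (e2, f) in U if e2 == e
    }


def factorise(flat):
    """Group the flat result along the f-tree: a -> ({b -> (cs, ds)}, {e -> fs})."""
    fact = {}
    for (a, b, c, d, e, f) in flat:
        b_part, e_part = fact.setdefault(a, ({}, {}))
        cs, ds = b_part.setdefault(b, (set(), set()))
        cs.add(c)
        ds.add(d)
        e_part.setdefault(e, set()).add(f)
    return fact


def size(fact):
    """Number of singletons in the nested expression."""
    n = 0
    for b_part, e_part in fact.values():
        n += 1  # the singleton <a>
        n += sum(1 + len(cs) + len(ds) for cs, ds in b_part.values())
        n += sum(1 + len(fs) for fs in e_part.values())
    return n


def expand(fact):
    """Multiply the nested expression back out into flat tuples."""
    out = set()
    for a, (b_part, e_part) in fact.items():
        for b, (cs, ds) in b_part.items():
            for e, fs in e_part.items():
                out |= {(a, b, c, d, e, f) for c, d, f in product(cs, ds, fs)}
    return out


FLAT = flat_join()
FACT = factorise(FLAT)
print(len(FLAT) * 6, "values flat vs", size(FACT), "singletons factorised")
```

On this toy input the flat result has 18 tuples (108 values), while the nested form has 19 singletons and expands back to exactly the same tuples.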
Repeating Subexpressions in Factorisations
The entire expression $\bigcup_{f \in F} \langle f \rangle$ depends only on the preceding $e$.
Given $e$, the expression $\bigcup_{f \in F} \langle f \rangle$ is the same for each $a$.
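Using Pointers in Factorisations
Store the expression $\bigcup_{f \in F} \langle f \rangle$ once for each $e$ and use pointers!
$\bigcup_{a \in A} \Big( \langle a \rangle \times \bigcup_{b \in B} \big( \langle b \rangle \times (\bigcup_{c \in C} \langle c \rangle) \times (\bigcup_{d \in D} \langle d \rangle) \big) \times \bigcup_{e \in E} \big( \langle e \rangle \times U_e \big) \Big)$
$\{ U_e = \bigcup_{f \in F_e} \langle f \rangle \}$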
Total size of the factorisation with pointers is $O(|D|)$.
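The effect of sharing can be sketched in Python, where an object reference plays the role of the pointer to $U_e$. The toy data, the assumption that no relation contains dangling tuples, the choice to count each pointer as size 1, and the helper names are all illustrative assumptions rather than anything stated in the talk.

```python
# Hypothetical toy database in which every A-value joins with every E-value.
A_VALUES, E_VALUES, F_VALUES = range(4), range(3), range(5)
R = {(a, 0, 0) for a in A_VALUES}                  # R(A,B,C)
S = {(a, 0, 0) for a in A_VALUES}                  # S(A,B,D)
T = {(a, e) for a in A_VALUES for e in E_VALUES}   # T(A,E)
U = {(e, f) for e in E_VALUES for f in F_VALUES}   # U(E,F)


def build_with_pointers():
    """Build a -> ({b -> (cs, ds)}, {e -> reference to U_e}).

    Assumes no dangling tuples, so every value in R, S, T, U appears in the join.
    """
    shared = {}  # one definition U_e per e-value, stored exactly once
    for e, f in U:
        shared.setdefault(e, set()).add(f)
    fact = {}
    for a, b, c in R:
        b_part, _ = fact.setdefault(a, ({}, {}))
        b_part.setdefault(b, (set(), set()))[0].add(c)
    for a, b, d in S:
        fact[a][0][b][1].add(d)
    for a, e in T:
        fact[a][1][e] = shared[e]  # a reference, not a copy
    return fact, shared


def size_inlined(fact):
    """Size if every pointer were replaced by a copy of its target."""
    n = 0
    for b_part, e_part in fact.values():
        n += 1
        n += sum(1 + len(cs) + len(ds) for cs, ds in b_part.values())
        n += sum(1 + len(fs) for fs in e_part.values())
    return n


def size_with_pointers(fact, shared):
    """Size when each U_e is stored once and each pointer counts as 1."""
    n = 0
    for b_part, e_part in fact.values():
        n += 1
        n += sum(1 + len(cs) + len(ds) for cs, ds in b_part.values())
        n += 2 * len(e_part)  # singleton <e> plus one pointer
    return n + sum(len(fs) for fs in shared.values())


FACT, SHARED = build_with_pointers()
print(size_inlined(FACT), "without pointers vs", size_with_pointers(FACT, SHARED), "with pointers")
```

With these numbers the inlined size is 88 and the shared size is 55; the gap widens as more A-values reuse the same E-values.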
Flat vs. Factorisation vs. Factorisation with Pointers
For a conjunctive query $Q$ and any database $D$:
$|Q(D)|$ is $O(|D|^{\rho^*(Q)})$. [AGM 08]
$Q(D)$ has a factorisation of size $O(|D|^{s(Q)})$. [OZ 11]
$Q(D)$ has a factorisation with pointers of size $O(|D|^{s^{\uparrow}(Q)})$. [OZ 13]
$\rho^*(Q)$ = fractional edge cover number of the entire query.
$s(Q)$ = fractional edge cover number of root-to-leaf paths in the best f-tree.
$s^{\uparrow}(Q)$ = fractional edge cover number of dependency paths in the best f-tree.
$1 \le s^{\uparrow}(Q) \le s(Q) \le \rho^*(Q) \le |Q|$
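As a worked check on the running query $Q = R(A,B,C) \bowtie S(A,B,D) \bowtie T(A,E) \bowtie U(E,F)$ with the f-tree A -> {B -> {C, D}, E -> {F}}, the cover numbers can be computed by hand. The sketch assumes that the dependency path of an attribute consists of that attribute together with the ancestors it depends on; this is a reading of the slides, not a quoted definition.

```latex
\begin{align*}
\rho^*(Q) &= 3
  && \text{$C$, $D$, $F$ occur only in $R$, $S$, $U$, so each needs weight 1;
           these three relations cover all attributes} \\
\text{root-to-leaf paths } \{A,B,C\},\ \{A,B,D\} &: 1
  && \text{covered by $R$ and by $S$ respectively} \\
\text{root-to-leaf path } \{A,E,F\} &: 2
  && \text{$F$ needs $U$ with weight 1, and $A$ needs total weight 1 from $R$, $S$, $T$} \\
\text{dependency paths } \{A,B,C\},\ \{A,B,D\},\ \{A,E\},\ \{E,F\} &: 1
  && \text{covered by $R$, $S$, $T$, $U$ respectively}
\end{align*}
```

Under these assumptions the flat result is $O(|D|^3)$, the factorisation over this f-tree is $O(|D|^2)$, and the factorisation with pointers is $O(|D|)$, in line with the linear size stated for the pointer-based representation.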
(Hyper)Tree Decompositions
F-trees are closely related to tree decompositions and path decompositions.
[Figure: a hypergraph over the attributes A to H with relations R, S, T, U, V; a decomposition with bags {A,B,C}, {B,C,E}, {C,D,E}, {B,G,E}, {E,G,H}, {B,F,G}; and an f-tree over the same attributes.]
$s^{\uparrow}(Q)$ = fractional hypertree width of the hypergraph of $Q$
$s(Q) \ge$ fractional hyperpath width of the hypergraph of $Q$
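Road Map of Query Decomposition Parameters
$\rho^*(Q)$: size bounds for results of $Q$.
$s(Q)$: size bounds for factorisations.
$s^{\uparrow}(Q)$: size bounds for factorisations with pointers.
$\mathrm{fhw}(Q)$: fractional hypertree width.
$\mathrm{fhpw}(Q)$: fractional hyperpath width.
$1 \le s^{\uparrow}(Q) = \mathrm{fhw}(Q) \le \mathrm{fhpw}(Q) \le s(Q) \le \rho^*(Q) \le |Q|$
(An underbrace on part of this chain in the original slide is labelled "factor $O(\log |Q|)$".)
Each inequality can express a gap of any size permitted by the other inequalities.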
Factorised Databases 3. Future Directions: Instance-based Factorisation
Instance-based Factorisation
All previous work was about joins / query results, which have predictable structure:
  factorisable using f-trees,
  nice size bounds,
  allows for fast querying.
What about general relations? Given a relation R, find a good factorisation:
  over an f-tree?
  as small as possible?
  allowing for fast querying?
Instance-based Factorisation

chain   country
Tesco   Czech Republic
Tesco   Slovakia
Tesco   Hungary
Tesco   Poland
Tesco   UK
Tesco   Spain
Lidl    Czech Republic
Lidl    Slovakia
Lidl    Hungary
Lidl    Poland
Lidl    UK
Aldi    Hungary
Aldi    Poland
Aldi    UK
Aldi    Spain

$(\text{Tesco} \cup \text{Lidl}) \times (\text{CzechRep} \cup \text{Slovakia}) \;\cup\; (\text{Tesco} \cup \text{Lidl} \cup \text{Aldi}) \times (\text{Hungary} \cup \text{Poland} \cup \text{UK}) \;\cup\; (\text{Tesco} \cup \text{Aldi}) \times \text{Spain}$
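A short Python check confirms that the three products multiply out to exactly the fifteen tuples of the relation, and compares the number of stored values. The variable and function names are illustrative assumptions.

```python
from itertools import product

# The chain/country relation from the slide.
RELATION = {
    ("Tesco", "Czech Republic"), ("Tesco", "Slovakia"), ("Tesco", "Hungary"),
    ("Tesco", "Poland"), ("Tesco", "UK"), ("Tesco", "Spain"),
    ("Lidl", "Czech Republic"), ("Lidl", "Slovakia"), ("Lidl", "Hungary"),
    ("Lidl", "Poland"), ("Lidl", "UK"),
    ("Aldi", "Hungary"), ("Aldi", "Poland"), ("Aldi", "UK"), ("Aldi", "Spain"),
}

# The factorisation from the slide: a union of three products.
PRODUCTS = [
    ({"Tesco", "Lidl"}, {"Czech Republic", "Slovakia"}),
    ({"Tesco", "Lidl", "Aldi"}, {"Hungary", "Poland", "UK"}),
    ({"Tesco", "Aldi"}, {"Spain"}),
]


def expand(products):
    """Multiply out every product and take the union."""
    tuples = set()
    for chains, countries in products:
        tuples |= set(product(chains, countries))
    return tuples


FLAT_VALUES = 2 * len(RELATION)                                 # two values per tuple
FACTORISED_VALUES = sum(len(c) + len(k) for c, k in PRODUCTS)   # one per singleton

print(expand(PRODUCTS) == RELATION, FLAT_VALUES, FACTORISED_VALUES)
```

The flat table stores 30 values; the factorised form stores 13.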
Instance-based Factorisation
The smallest possible number of products corresponds to a minimum biclique cover.
[Figure: bipartite graph between the chains Tesco, Lidl, Aldi and the countries CzechRep, Slovakia, Hungary, Poland, Spain, UK, with the three bicliques of the factorisation highlighted.]
Finding a minimum biclique cover is NP-hard.
Instance-based Factorisation
Heuristics:
  Find large, maximal bicliques (formal concepts).
  Remove them and factorise the rest.
+ Adaptations for higher dimensions.
+ Polynomial-time factorisation algorithm for exact products.
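The "find a large biclique, remove it, factorise the rest" idea can be sketched as a greedy loop in Python. This is only one possible reading of the heuristic, not the authors' algorithm: it scores bicliques by area, enumerates every subset of chains (exponential, so suitable only for tiny inputs), and uses hypothetical helper names.

```python
from itertools import combinations

RELATION = {
    ("Tesco", "Czech Republic"), ("Tesco", "Slovakia"), ("Tesco", "Hungary"),
    ("Tesco", "Poland"), ("Tesco", "UK"), ("Tesco", "Spain"),
    ("Lidl", "Czech Republic"), ("Lidl", "Slovakia"), ("Lidl", "Hungary"),
    ("Lidl", "Poland"), ("Lidl", "UK"),
    ("Aldi", "Hungary"), ("Aldi", "Poland"), ("Aldi", "UK"), ("Aldi", "Spain"),
}


def best_biclique(edges):
    """Return the biclique (chains, countries) of largest area within `edges`.

    For each group of chains, the countries common to all of them form the
    largest biclique with that chain set. Brute force over all chain subsets.
    """
    chains = sorted({c for c, _ in edges})
    best = None
    for k in range(1, len(chains) + 1):
        for group in combinations(chains, k):
            common = set.intersection(
                *({n for c, n in edges if c == g} for g in group)
            )
            area = len(group) * len(common)
            if best is None or area > best[0]:
                best = (area, set(group), common)
    return best[1], best[2]


def greedy_factorise(relation):
    """Repeatedly take the largest biclique of the remaining tuples."""
    remaining = set(relation)
    products = []
    while remaining:
        chains, countries = best_biclique(remaining)
        products.append((chains, countries))
        remaining -= {(c, n) for c in chains for n in countries}
    return products


PRODUCTS = greedy_factorise(RELATION)
for chains, countries in PRODUCTS:
    print(sorted(chains), "x", sorted(countries))
```

On this relation the greedy loop also ends with three products, but a different split from the one on the slide: it starts with {Tesco, Lidl} x {Czech Republic, Slovakia, Hungary, Poland, UK} and stores 14 values instead of 13, which illustrates why such a procedure is a heuristic and not an exact method.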
Thank you!