Minimization of XML Tree Pattern Queries in the Presence of Integrity Constraints
Yangjun Chen* and Dunren Che**
*Department of Applied Computer Science, University of Winnipeg, 515 Portage Avenue, Winnipeg, Manitoba R3B 2E9, Canada
**Department of Computer Science, Southern Illinois University, Carbondale, IL 62901, USA
In this paper, we provide a polynomial-time tree pattern query minimization algorithm whose efficiency stems from two key observations: (i) Inherent redundant “components” usually exist inside the rudimentary query provided by the user. (ii) Irredundant nodes may become redundant when constraints such as co-occurrence and required child/descendant are given. We show the result that the algorithm obtained by first augmenting the input tree pattern using the constraints, and then applying minimization, always finds the unique minimal equivalent to the original query. We complement our analytical results with an experimental study that shows the effectiveness of our tree pattern minimization techniques.
-  J. McHugh and J. Widom, “Query Optimization for XML,” Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.
-  W. Fan and J. Simeon, “Integrity Constraints for XML,” Proceedings of ACM PODS Conference, pp. 23-34, 2000.
-  S. Amer-Yahia et al., “Minimization of Tree Pattern Queries,” Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 497-508, 2001.
-  S. Chakravathy, J. Grant, and J. Minker, “Foundations of Semantic Query Optimization for Deductive Databases,” Foundations of DD and LP, 1988.
-  D. Florescu, A. Levy, and D. Suciu, “Query Containment for Conjunctive Queries with Regular Expressions,” Proceedings of the 17th ACM Symp., Principles of Database Systems, pp. 139-148, 1998.
-  D. Calvanese, G. DeGiacomo, and M. Lenzerini, “Decidability of Query Containment under Constraints,” Proceedings of 17th ACM Symp., Principles of Database Systems, pp. 149-158, 1998.
-  M. R. Garey and D. S. Johnson, “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., NY, 1979.
-  P. T. Wood, “Optimizing Web Queries Using Document Type Definitions,” Proceedings of 2nd ACM CIKM International Workshop on We Information and Data Management, pp. 28-32, 1999.
-  P. T. Wood, “On the Equivalence of XML Patterns,” Proceedings of 1st International Conference on Computational Logic, Lecture Notes in Artificial Intelligence 1861, pp. 1152-1166, Springer-Verlag, NY, 2000.
-  P. T. Wood, “Rewriting XQL Queries on XML Repositories,” Proceedings of 17th British National Conf. on Databases, Lecture Notes in Computer Science 1832, pp. 209-226, Springer-Verlag, NY, 2000.
-  P. T. Wood, “Minimizing Simple XPath Expressions,” WebDB 2001.
-  P. T. Wood, “Containment for XPath Fragments under DTD Constraints,” Proceedings of 9th International Conference of Database Theory, pp. 300-314, 2003.
-  World Wide Web Consortium. XML Path Language (XPath), W#C Recommendation, Version 1.0, November 1999. See
-  G. Miklau and D. Suciu, “Containment and Equivalence for an XPath Fragment,” Proceedings of 21st ACM Symp., Principles of Database Systems, 2002.
-  D. Chamberlin, J. Clark, D. Florescu, and M. Stefanescu, “XQuery1.0: An XML Query Language,”
-  A. Deutch, M. Fernandex, D. Florescu, A. Levy, and D. Suciu, “A Query Language for XML,” WWW’99.
-  S. Al-Khalifa, H. V. Jagadish, N. Koudas, J. M. Patel, D. Srivastava, and Y. Wu, “Structural Joins: A Primitive for Efficient XML Query Pattern Matching,” Proceedings of the 18th International Conference on Data Engineering, 2002.
-  D. Che and K. Aberer, “Query Processing and Optimization in XML Structured-Document Databases,” In preparation for the VLDB Journal.
-  Y. R. S. Yalamanchili, “Empirical Study of XML Query Optimization,” SIUC thesis collection, Spring 2005.
-  S. Flesca, F. Furfaro, and E. Masciari, “On the Minimization of Xpath Queries,” Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003
-  D. Lee and W. W. Chu, “Constraints-preserving Transformation from XML Document Type Definition to Relational Schema,” Proceedings of the 19th International Conference on Conceptual Modeling, pp. 323-338, 2000.
-  C. Yu and L. Popa, “Constraint-Based XML Query Rewriting for Data Integration,” Proceedings of the ACM SIGMOD International Conference, Paris, France, 2004.
-  Y. Chen and D. Che, “Efficient Processing of XML Tree Pattern Queries,” Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol.10, No.5, pp. 738-743, 2006.