Benchmarking XML Data Management Systems
The need to efficiently store and manage large amounts of XML data is rapidly increasing due to the growing use of XML as an improved web format, as the native data format for a variety of applications, and as a standard interchange format, especially in the e-business domain.
Two main types of systems are promoted to manage XML data, namely
- native XML data stores, and
- relational/object-relational DBMS augmented with an extension to store and manipulate XML data.
Native data stores are tailored to XML requirements and thus promise performance benefits and improved support for specific XML requirements (e.g., complex document structure, fast path navigation, text search). Relational and object-relational systems, on the other hand, typically provide good scalability and a large repertoire of performance-improving techniques, e.g. for query processing, that can be exploited for at least certain usage forms of XML data. Furthermore, they may avoid having separate data management systems for SQL and XML.
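To make the two XML-specific operations named above (path navigation and text search) concrete, here is a minimal sketch using Python's standard library. The sample document and its element names are hypothetical and chosen only for illustration. The sketch does not reflect how XMach-1 or any particular XML data store implements these operations.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative assumptions.
DOC = """
<document id="d1">
  <title>Benchmarking XML</title>
  <chapter>
    <section><paragraph>Native stores promise fast path navigation.</paragraph></section>
    <section><paragraph>Relational systems scale well.</paragraph></section>
  </chapter>
</document>
""".strip()

root = ET.fromstring(DOC)

# Path navigation: follow a fixed element path starting at the root.
paragraphs = root.findall("./chapter/section/paragraph")


def search(elements, phrase):
    """Text search: return the text of elements that contain the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [e.text for e in elements if e.text and needle in e.text.lower()]


hits = search(paragraphs, "path navigation")
```

An in-memory scan like this is only practical for small documents. Index support for such path and text predicates is one of the areas where the two system types differ.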
We are currently evaluating several XML data management systems using a newly developed benchmark called XMach-1 [4]. This is the first version of the benchmark; we will improve it as practical results and new requirements become available. The current version of the paper is a revised version replacing the first draft from September 2000.
Since the first public draft of the XQuery query language for XML has been published, we have formulated the queries [5] from our specification in this language.
We have now updated the queries to fix some minor bugs and to reflect the changes made in the XQuery specification draft from 7 June 2001.
Implementation
We have now released the complete XMach-1 benchmark reference implementation, consisting of a data generator to populate the XML benchmark database and a query execution framework. Please download it here [6]. The archive contains the sources, class files and documentation of the data generator and the generic query framework. For further information, see the README file in the directory doc.
Publications
- Böhme, T.; Rahm, E.: Supporting Efficient Streaming and Insertion of XML Data in RDBMS. [7] Proc. 3rd Int. Workshop Data Integration over the Web (DIWeb), LNCS, 2004
- Böhme, T.; Rahm, E.: Multi-User Evaluation of XML Data Management Systems with XMach-1. [8] Lecture Notes in Computer Science (LNCS) 2590, pp. 148-159, Springer, 2003
- Böhme, T., Rahm, E.: Benchmarking von XML-Datenbanksystemen. In: Web & Datenbanken. Dpunkt-Verlag, Sep. 2002
- Rahm, E., Böhme, T.: XMach-1: A Multi-User Benchmark for XML Data Management. [9] Proc. VLDB workshop Efficiency and Effectiveness of XML Tools and Techniques (EEXTT2002), Hong Kong, Aug. 2002 (Invited Talk)
- Böhme, T.: XML-Datenbanksysteme: Architekturen und Benchmarks. [10] In Bullinger, H.-J.; Weisbecker, A. (eds.): Proc. Innovationsforum "Content Management - Digitale Inhalte als Bausteine einer vernetzten Welt", Fraunhofer IRB Verlag, Stuttgart, 2002
- T. Böhme, E. Rahm: Benchmarking XML Database Systems -- First Experiences, [11] Position Paper, Ninth International Workshop on High Performance Transaction Systems (HPTS) [12], Pacific Grove, California, 14.-17. October, 2001
- T. Böhme, E. Rahm: XMach-1: A Benchmark for XML Data Management. [13] In Proceedings of German database conference BTW2001, Oldenburg, 7.-9. March, Springer, Berlin 2001 (HTML version [14])
Project Members
- Prof. Dr. E. Rahm [15]
- T. Böhme [16]
Links
- XML specification (W3C) [17]
- E-commerce benchmark TPC-W [18]
- Benchmark handbook (Jim Gray) [19]
- XML Database Products (Ronald Bourret) [20]
Other XML Database Benchmarks
- The Michigan Benchmark [21]
- XBench - A Family of Benchmarks for XML DBMSs [22]
- XMark - An XML Benchmark Project [23]
- The XOO7 Benchmark [24]