DeWitt, David J.;
An Effective Change Detection Algorithm for XML
Proc. 19th Intl. Conf. Data Engineering, March 5-8, 2003,
IEEE Computer Society;
Abstract: XML has become the de facto standard format for web
publishing and data transportation. Since online information
changes frequently, being able to quickly detect changes in XML
documents is important to Internet query systems, search engines,
and continuous query systems. Previous work in change detection on
XML, or other hierarchically structured documents, used an ordered
tree model, in which left-to-right order among siblings is
important and can affect the change result. This paper argues
that an unordered model (only ancestor relationships are
significant) is more suitable for most database applications.
Change detection under an unordered model is substantially harder
than under the ordered model, but the change result it generates
is more accurate. This paper proposes X-Diff, an
effective algorithm that integrates key XML structure
characteristics with standard tree-to-tree correction techniques.
The algorithm is analyzed and compared with XyDiff [CAM02], a
published XML diff algorithm. An experimental evaluation of both
algorithms is provided.
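
The distinction the abstract draws between the ordered and unordered tree
models can be illustrated with a small sketch. This is not X-Diff and does
not compute an edit script; it only tests whether two XML trees are equal
under each model, assuming Python's standard xml.etree.ElementTree parser.
The helper names (canonical, ordered_equal, unordered_equal) are
hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET


def _text(elem):
    """Return an element's own text with surrounding whitespace removed."""
    return (elem.text or "").strip()


def canonical(elem):
    """Build an order-insensitive signature: children are sorted, so
    sibling order does not influence the result (unordered model)."""
    children = sorted(canonical(child) for child in elem)
    attributes = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attributes, _text(elem), tuple(children))


def ordered_equal(a, b):
    """Compare two trees position by position (ordered model)."""
    if (a.tag, a.attrib, _text(a)) != (b.tag, b.attrib, _text(b)):
        return False
    if len(a) != len(b):
        return False
    return all(ordered_equal(x, y) for x, y in zip(a, b))


def unordered_equal(a, b):
    """Compare two trees ignoring left-to-right order among siblings."""
    return canonical(a) == canonical(b)


if __name__ == "__main__":
    old = ET.fromstring("<book><title>T</title><author>A</author></book>")
    new = ET.fromstring("<book><author>A</author><title>T</title></book>")
    print("ordered model sees a change:  ", not ordered_equal(old, new))
    print("unordered model sees a change:", not unordered_equal(old, new))
```

In this sketch, swapping two siblings counts as a change under the ordered
model and as no change under the unordered model. That is the sense in
which the unordered model treats only ancestor relationships as
significant.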