XML Specifications 1.0/1.1
Introduction
In most scenarios with long-lived documents, we encounter the problem of document versions. Consider W3C specifications such as XQuery 1.0/1.1, XPath 1/2, XML 1.0/1.1, or even MathML 1.0/1.0.1/2.0/2.0(2e)/3.0. They are encoded in the XML format XMLSpec, so TNTBase and VDocs apply. Usually some parts of a specification remain the same between versions while other parts change, and tracking the differences is an important task. For this use case we are experimenting with the XML 1.0 and 1.1 specifications, supplying the user with a view that shows only the relevant changes in the formal parts of the specification branches.
It is rather simple to provide a Diff VDoc via an XQuery that summarizes the changes in the formal parts (the rules of the XML grammar are marked up by special elements in XMLSpec), ignores document order (grammars are sets, not lists, of rules), and presents the result as an XMLSpec document augmented with difference alternatives. Our XQuery-based XML diff comes in handy here. This VDoc gives a user a better understanding of the direction in which development is going, and of which changes are intended and which were made by mistake. The Diff VDoc is also editable, which allows a user to fix obvious bugs on the spot without navigating to the source files. Once a Diff VDoc is set up in TNTBase, it can be reused to filter out only the relevant differences and to edit them transparently, all in one place.
Currently the W3C stores its specifications in a CVS repository but does not use CVS's differencing facilities for version tracking, because that diff is text-based and reports even the least relevant differences. Note that the Diff VDoc encapsulates a particular notion of relevance in its filtering part, which may need to be explained in a document preamble. The representational form of a VDoc, which mixes document parts and queries, is therefore beneficial.
Moreover, there can be multiple Diff VDocs for tracking (and editing) various aspects of the differences between the specifications. Such Diff VDocs may even take over the role of the conflict editors we currently have in version-control-aware IDEs.
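To illustrate why a text-based diff is a poor fit for grammars, here is a minimal Python sketch (not part of TNTBase). It compares two illustrative rule lists that contain the same rules in a different order: a line-based diff reports changes, while a set comparison reports none. The rule lists and variable names are assumptions made for this example.

```python
import difflib

# Two illustrative revisions of a tiny grammar: same rules, different order.
old_rules = [
    "S ::= (#x20 | #x9 | #xD | #xA)+",
    "Name ::= NameStartChar (NameChar)*",
]
new_rules = [
    "Name ::= NameStartChar (NameChar)*",
    "S ::= (#x20 | #x9 | #xD | #xA)+",
]

# A line-based diff treats the reordering as a change.
# Keep only added/removed lines, skipping the "---"/"+++" file headers.
text_diff = [
    line
    for line in difflib.unified_diff(old_rules, new_rules, lineterm="", n=0)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

# Comparing the rules as sets ignores order (symmetric difference).
set_diff = set(old_rules) ^ set(new_rules)

print("lines reported by text diff:", len(text_diff))
print("rules reported by set diff:", len(set_diff))
```

The set comparison is only a stand-in for the notion of relevance that a Diff VDoc encodes in its query; a real filter would compare structured productions, not strings.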
Realization
The realization is rather simple, given that we already have the XQuery function tntx:tnt-diff($source as element(), $target as element()) as element(). The main query for the Diff VDoc is:
import module namespace tntx = 'http://tntbase.mathweb.org/ns/workaround' at '/info/kwarc/tntbase/core/xq/xml-tnt-diff.xq';
declare function tnt:ebnf($node as node()) as element()* {
  $node//*[@lang='ebnf']
};
declare function tnt:spec1() as node() {
  tnt:doc('/spec-1.0.xml')
};
declare function tnt:spec2() as node() {
  tnt:doc('/spec-1.1.xml')
};
(: Formal parts that are present in both specs :)
let $ids := data(tnt:spec1()//prod[@id = tnt:spec2()//prod/@id]/@id)
return
  for $i in $ids return
    let $f1 := tnt:spec1()//prod[@id = $i]
    let $f2 := tnt:spec2()//prod[@id = $i]
    let $diff := tntx:tnt-diff($f1, $f2)
    return if (not(empty($diff))) then
      <diff>
      {
        tnt:make-editable($f1), tnt:make-editable($f2)
      }
      </diff>
    else ()
Note that we expose our XML diff under a slightly different namespace. DB XML has problems importing multiple modules from the same namespace, so we came up with this workaround.
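For readers who want to follow the logic of the query without a TNTBase installation, the following Python sketch mirrors its structure: join the productions of both specifications on their id, compare each pair, and keep only the pairs that differ. The two inline documents are heavily abridged stand-ins for spec-1.0.xml and spec-1.1.xml, and the `signature` function is a simplistic substitute for tntx:tnt-diff, not its actual algorithm; all function names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Abridged, illustrative stand-ins for /spec-1.0.xml and /spec-1.1.xml.
SPEC_1_0 = """<spec>
  <prod id="NT-Char"><lhs>Char</lhs>
    <rhs>#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]</rhs></prod>
  <prod id="NT-S"><lhs>S</lhs><rhs>(#x20 | #x9 | #xD | #xA)+</rhs></prod>
</spec>"""

SPEC_1_1 = """<spec>
  <prod id="NT-S"><lhs>S</lhs><rhs>(#x20 | #x9 | #xD | #xA)+</rhs></prod>
  <prod id="NT-RestrictedChar"><lhs>RestrictedChar</lhs>
    <rhs>[#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]</rhs></prod>
  <prod id="NT-Char"><lhs>Char</lhs>
    <rhs>[#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]</rhs></prod>
</spec>"""


def productions(xml_text):
    """Map each prod/@id to its element, discarding document order."""
    root = ET.fromstring(xml_text)
    return {prod.get("id"): prod for prod in root.iter("prod")}


def signature(prod):
    """Whitespace-normalized (tag, text) pairs of the direct children.

    A simplistic stand-in for tntx:tnt-diff, used only for this sketch.
    """
    return tuple((c.tag, " ".join((c.text or "").split())) for c in prod)


def changed_productions(old_xml, new_xml):
    """Ids present in both specs whose productions differ."""
    old, new = productions(old_xml), productions(new_xml)
    shared = old.keys() & new.keys()
    return sorted(i for i in shared if signature(old[i]) != signature(new[i]))


changed = changed_productions(SPEC_1_0, SPEC_1_1)
print(changed)
```

As in the query, a production that exists in only one specification (here RestrictedChar) is not reported, and reordering alone (here S) does not count as a change.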
The VDoc specification is straightforward:
<tnt:virtualdocument xmlns:tnt="http://tntbase.mathweb.org/ns">
  <tnt:skeleton xml:id="exercises">
    <diffs xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Diffs of specifications formal parts</dc:title>
      <dc:creator>Vyacheslav Zholudev</dc:creator>
      <tnt:xqinclude>
        <tnt:query href="tntbase:/query.xq"/>
        <tnt:return><tnt:result/></tnt:return>
      </tnt:xqinclude>
    </diffs>
  </tnt:skeleton>
</tnt:virtualdocument>
Finally, the VDoc content might look like this:
<diffs xmlns:tnt="http://tntbase.mathweb.org/" tnt:revnum="120">
  ...
  <diff>
    <prod id="NT-Char" num="2" tnt:xpath="/spec[1]/body[1]/div1[2]/div2[2]/scrap[1]/prodgroup[1]/prod[1]" tnt:doc="/spec-1.0.xml">
      <lhs>Char</lhs>
      <rhs>
        #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
      </rhs>
      <com>
        any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
      </com>
    </prod>
    <prod id="NT-Char" num="2" tnt:xpath="/spec[1]/body[1]/div1[2]/div2[2]/scrap[1]/prodgroup[1]/prod[1]" tnt:doc="/spec-1.1.xml">
      <lhs>Char</lhs>
      <rhs>
        [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
      </rhs>
      <com>
        any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
      </com>
    </prod>
  </diff>
  ...
</diffs>
Note that parts from both specifications are editable.
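A client that consumes such a Diff VDoc could summarize it as follows. This is a hedged Python sketch, assuming the output has the shape shown above (with the elided parts left out) and that the tnt prefix is bound to the namespace URI used in that sample; the `summarize` function is hypothetical and not part of TNTBase.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the sample VDoc content.
TNT = "http://tntbase.mathweb.org/"

# Abridged copy of the sample output (elided parts and <com> omitted).
VDOC = """<diffs xmlns:tnt="http://tntbase.mathweb.org/" tnt:revnum="120">
  <diff>
    <prod id="NT-Char" num="2" tnt:doc="/spec-1.0.xml">
      <lhs>Char</lhs>
      <rhs>
        #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
      </rhs>
    </prod>
    <prod id="NT-Char" num="2" tnt:doc="/spec-1.1.xml">
      <lhs>Char</lhs>
      <rhs>
        [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
      </rhs>
    </prod>
  </diff>
</diffs>"""


def summarize(vdoc_text):
    """Return (production id, source document, right-hand side) per entry."""
    root = ET.fromstring(vdoc_text)
    rows = []
    for diff in root.iter("diff"):
        for prod in diff.findall("prod"):
            # Collapse the whitespace around the right-hand side.
            rhs = " ".join((prod.findtext("rhs") or "").split())
            # Namespaced attributes are addressed as "{uri}localname".
            rows.append((prod.get("id"), prod.get("{%s}doc" % TNT), rhs))
    return rows


rows = summarize(VDOC)
for prod_id, source_doc, rhs in rows:
    print(prod_id, source_doc, rhs, sep="\t")
```

The tnt:doc attribute tells the client which specification each alternative comes from, so both sides of a difference can be shown next to each other.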
