This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Re: [docbook] Re: [DBX5] Is this a DocBook document?


Norman Walsh wrote:

/ Tobias Reif <tobiasreif@pinkjuice.com> was heard to say:
| Processing tools such as validators need to be able to find out the
| main language of a document, so that they can know which schema(s)
| apply.

Another view is that schemas are "out of band" for validators. A
document can be validated (even successfully!:-) under many different
schemas.

Sure! I think there's a misunderstanding; we don't disagree, AFAICS. The whole thing comes from the (hopefully temporary) problems that arise when a document doesn't reference a schema (as documents did/do with DTD doctype declarations). I think a document should *not* have to reference a schema, so I'm trying to find a way for my validator to work out which schema(s) to use when validating such a document (the same question you posed in your blog, AFAICS).


| The namespace plus a version attribute seems to be a possible solution.

Versioning is hard. I humbly point to some ongoing work in the TAG
on this subject: http://www.w3.org/2001/tag/doc/versioning.html

I'll check this, thanks for the link.


But a version attribute doesn't have to solve all problems of versioning:

SVG for example has a version attribute which seems to work well so far (same namespace name across versions), eg:

  <svg xmlns="http://www.w3.org/2000/svg" version="1.2">...

  <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
    baseProfile="tiny">...


http://www.w3.org/TR/SVG11/struct.html#SVGElement :


"version = "<number>"

Indicates the SVG language version to which this document fragment conforms.

[...] For SVG 1.1, the attribute should have the value "1.1"."


| So I propose that the next DocBook XML be in a namespace, and that
| each DBX document should have a required version attribute on its
| root element, plus an optional profile attribute.

Putting DocBook in a namespace is a possibility. It's going to be
hugely backwards incompatible, though. It's not a step I'd want to
take lightly.

Absolutely.


I have mixed feelings about a version attribute. There are lots of
DocBook documents that are valid under many different versions. If I
write a 4.2 document today and it's still valid under 4.3, what does
version=4.2 mean?

It means that the document is written in 4.2, and the assertion holds true. If the assertion that the document is written in 4.3 also holds true, then this doesn't invalidate the first assertion.


If a document is written in a subset of 5.3 that is also a subset of 5.2, then that's not a problem for my validator or transformation tool, AFAICS.

The problem for which I need a solution is the following:

My validator or transformation tool needs to know what kind of document it is dealing with, so that it knows whether it can handle it.

If the document is in the DocBook namespace and has a version attribute with value "5.0" then the validator can validate the document against (a) DBX 5.0 schema(s), and the transformation tool can generate XHTML etc.

If the transformation tool gets a document with a version attribute higher than 5.0, it can raise an error "I only support DocBook up to version 5.0, please try a later version of me."

(If the author knows that the doc is written using a subset of 5.3 that also is a subset of 5.0, and he wants to process it using tools which only support 5.0, then he can change the version attribute to specify "5.0" or ask the tools to ignore the version attribute specifying "5.3".)
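
The version check described above could be sketched roughly like this in Python (standard library only). This is just an illustration under assumptions: DocBook has no namespace yet, so DOCBOOK_NS is a made-up placeholder, and check_supported, parse_version, UnsupportedVersion and the override_version parameter are hypothetical names, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace name; DocBook is not in a namespace yet.
DOCBOOK_NS = "http://example.org/ns/docbook"
# Highest DocBook version this hypothetical tool understands.
MAX_SUPPORTED = (5, 0)


class UnsupportedVersion(Exception):
    """Raised when the document declares a newer version than we support."""


def parse_version(text):
    """Turn '5.3' into (5, 3) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def check_supported(xml_text, override_version=None):
    """Return the declared version as a tuple, or raise if we can't handle it.

    override_version lets the user tell the tool to ignore the version
    attribute in the document and assume the given version instead.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced tag as '{namespace-uri}local-name'.
    if not root.tag.startswith("{" + DOCBOOK_NS + "}"):
        raise ValueError("root element is not in the DocBook namespace")
    declared = override_version or root.get("version")
    if declared is None:
        raise ValueError("root element has no version attribute")
    version = parse_version(declared)
    if version > MAX_SUPPORTED:
        raise UnsupportedVersion(
            "I only support DocBook up to version 5.0, "
            "please try a later version of me."
        )
    return version
```

Passing override_version="5.0" corresponds to asking the tool to ignore a version attribute that says "5.3"; whether the document really stays within the 5.0 subset is then for the validator to find out.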

| Ideally this should be standardized on the XML level:
|
| 1.
| xmlns=""
| (exists)

That's putting DocBook in "no namespace". That's not quite the same as
xmlns="http://www.oasis-open.org/docbook/" or something like that.

I know.


Just as
  xml:version=""
below,
  xmlns=""
stands for
  xmlns="[value here]"

xmlns="" was meant to mean
"This is an example of how to put the whole document in a namespace. It could also be done in a myriad of alternative ways, for example by using prefixes or mixing them with default namespace declarations."


Sorry for the confusion.

| 2.
| xml:version=""
| (doesn't exist yet)
| especially needed if the language is not version 1.0 or if the
| namespace name doesn't contain version info (eg is the same for all
| versions, which I prefer).

I don't think xml:version is very likely.

How can my XSLT find out if it supports the version of DocBook that the document is written in? Various languages might choose different names for the version attribute (as with profile and SVG's baseProfile), thus standardization would help.


When the document doesn't have a DTD document type declaration (no FPI etc), then how can tools find out which language is used? What schema(s) should my validator apply?

If there will be no "xml:version" attribute, then I still see the need for a "version" attribute in DBX5+.
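
To make the point concrete: with a namespace plus a version attribute on the root element, a tool needs nothing but the root element to work out which language it is looking at. Here is a small Python sketch (standard library only). The function name identify is hypothetical, and the lookup of a "profile" attribute assumes the attribute name proposed in this thread, next to SVG's existing baseProfile.

```python
import xml.etree.ElementTree as ET


def identify(xml_text):
    """Return (namespace, local_name, version, profile) for the root element.

    Missing pieces come back as "" (no namespace) or None (no attribute).
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced tag as '{namespace-uri}local-name'.
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag
    # 'baseProfile' is SVG's attribute name; 'profile' is only the name
    # proposed in this thread (an assumption, not an existing attribute).
    profile = root.get("profile") or root.get("baseProfile")
    return namespace, local_name, root.get("version"), profile
```

Without a standardized attribute name, each language needs its own special case, as the two profile names here already show.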

| 3.
| optional xml:profile=""
| (doesn't exist yet)
| Useful if a subset is used.

How does profile differ from version and namespace?

The optional profile attribute specifies the profile, which is a subset of the language identified by the namespace name and the version attribute.


What sorts of values does it hold?

SVG for example:


http://www.w3.org/TR/SVG11/struct.html#SVGElement

"baseProfile = profile-name

Describes the minimum SVG language profile that the author believes is necessary to correctly render the content. The attribute does not specify any processing restrictions; It can be considered metadata. For example, the value of the attribute could be used by an authoring tool to warn the user when they are modifying the document beyond the scope of the specified baseProfile. Each SVG profile should define the text that is appropriate for this attribute.

If the attribute is not specified, the effect is as if a value of "none" were specified."

http://www.w3.org/TR/SVGMobile/

http://www.w3.org/TR/SVGMobile/#sec-structure

"
<?xml version="1.0" standalone="yes"?>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:svg="http://www.w3.org/2000/svg">
<head>
<title xml:lang="en">Sample XHTML + SVG document</title>
</head>
<body>
<svg:svg width="4cm" height="8cm" version="1.1" baseProfile="tiny" >
<svg:ellipse cx="2" cy="4" rx="2" ry="1" />
</svg:svg>
</body>
</html>
"


"The 'baseProfile' attribute on the outermost 'svg' element must have the value "tiny" for SVG Tiny content, and "basic" for SVG Basic content. The 'baseProfile' attribute on nested child 'svg' elements is ignored. The SVG 1.1 specification states that the 'version' attribute of the outermost 'svg' element in SVG 1.1 content must have the value "1.1"."


But the profile attribute would be optional, and would only really make sense if the DB TC itself would specify profiles (subsets).


| Catalog entries could look like this:
| (SVG is used as example since it is namespaced, uses a version
| attribute, and also shows the requirement for specification of the
| profile attribute)
|
| <language
|    name="SVG"
|    ns="http://www.w3.org/2000/svg"
|    version="1.1">
|    <schemas>
|      <schema
|        official="yes"
|        schema-lang="DTD"
|        location="http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"/>
|      <schema
|        official="no"
|        schema-lang="RNG"
|        location="http://www.w3.org/Graphics/SVG/1.1/rng/svg11.rng"/>
|    </schemas>
| </language>

This looks more like a RDDL document than a catalog.

OASIS catalogs don't only map online URLs to local paths, but also FPIs to schemas. The latter is the reason why I use the name "catalog" for resources that map language identifiers (or language descriptions) to schemas.


Catalogs suffer from the limitation that the resolver doesn't really
know why the resource is being retrieved, so simple URI comparison is
about all it can do.

My validator needs to know which schema(s) to use when validating XML documents that don't reference a schema (eg no doctype declaration present). Here's what I might use for now:
(draft of a draft, most likely includes errors and design flaws :)


<?xml version="1.0"?>
<catalog xmlns="http://www.pinkjuice.com/catalog/" version="0.1">
<!--
validator:
1. try to find out lang, look for local schema(s), validate
2. else DTD doctype declaration / OASIS catalog route
-->
  <language name="SVG 1.1">
    <doc>
      <has>local-name(/*)='svg'</has>
      <has>namespace-uri(/*)='http://www...'</has>
      <has>/*/@version='1.1'</has>
      <has>not(/*/@baseProfile)</has>
    </doc>
    <schemas>
      <official>
        <schema language="DTD">
          <home>http://</home>
          <local>file://</local>
        </schema>
      </official>
      <inofficial>
        <schema language="RNG">
          <home>http://</home>
          <local>file://</local>
        </schema>
      </inofficial>
    </schemas>
  </language>
  <!--
  put DBX5 in a namespace, require version attr on root
  element, add optional profile attribute
  -->
  <language name="DBX 5.0">
    <doc>
      <has>namespace-uri(/*)='http://...'</has>
      <has>/*/@version='5.0'</has>
    </doc>
  </language>
  <!-- ... -->
</catalog>
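
The lookup step that the draft catalog above implies could be sketched in Python like this (standard library only). It is a simplification, not the catalog format itself: the XPath tests in the <has> elements are replaced by plain field comparisons, because the standard library's ElementTree doesn't evaluate XPath functions such as namespace-uri(). CATALOG and find_language are hypothetical names, and the schema locations are the ones from the SVG catalog entry quoted earlier.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Each entry mirrors one <language> element of the draft catalog.
CATALOG = [
    {
        "name": "SVG 1.1",
        "namespace": SVG_NS,
        "root": "svg",
        "version": "1.1",
        "profile": None,  # i.e. no baseProfile attribute on the root
        "schemas": [
            ("DTD", "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"),
            ("RNG", "http://www.w3.org/Graphics/SVG/1.1/rng/svg11.rng"),
        ],
    },
]


def find_language(xml_text, catalog=CATALOG):
    """Return the first catalog entry the document's root element matches."""
    root = ET.fromstring(xml_text)
    for entry in catalog:
        # ElementTree spells a namespaced tag as '{namespace-uri}local-name'.
        expected_tag = "{%s}%s" % (entry["namespace"], entry["root"])
        if (
            root.tag == expected_tag
            and root.get("version") == entry["version"]
            and root.get("baseProfile") == entry["profile"]
        ):
            return entry
    # No match: the caller falls back to the doctype / OASIS catalog route.
    return None
```

A validator would then pick one of the schemas in the returned entry, or fall back to step 2 of the comment in the draft when nothing matches.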

I'm mostly dealing with documents that contain nothing from other namespaces, or only tiny portions that can be pragmatically ignored (in my scenarios) when validating; I try to keep stuff simple.

The above "catalog" mechanism is not a proposal, it's just a draft for a local solution.
But it should demonstrate that putting DBX5 in a namespace and requiring a version attribute on the root element of each document (that's my request/proposal) would be useful for humans and for various types of tools such as validators.


If we want to escape reliance on DTD, we need to find new ways for providing the information that we provided via DTD doctype declarations. Namespace name and version attribute will also have the advantage of being accessible from XSLT, thus are also useful in documents which do have a doctype declaration (and ns/version can also be used with inlined fragments).

Tobi

--
http://www.pinkjuice.com/

