xmlvalid
Name
xmlvalid - validate XML documents [version 1.2]
Synopsis
xmlvalid [OPTION...] [URL...]
Usage
xmlvalid -c --sum --novalidate --names foo*.xml
Description
xmlvalid is a command-line utility for validating XML files.
It is produced and maintained by ElCel Technology.
For each URL, xmlvalid validates the XML contents against
a DTD. The DTD is determined by the DOCTYPE declaration within the XML file
or may be specified explicitly by using the --dtd option.
xmlvalid does not perform validation against an XML schema.
Each filename passed on the command line is treated as a URL.
If no protocol is present in the URL, it is assumed to refer to a local file.
For example, 'c:\test.xml' is treated as being equivalent to the URL 'file:///c:\test.xml'.
To read from the standard input specify a URL of '-'.
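The rules above can be sketched in a few lines of Python. This is only an illustration: the function name and the test for "a protocol is present" are assumptions, since the manual does not say how xmlvalid detects one. The sketch requires a scheme of at least two characters so that a Windows drive letter such as 'c:' is not mistaken for a protocol.

```python
import re

# Assumption: a protocol is a scheme of two or more characters followed by a
# colon. A single letter followed by a colon is taken to be a drive letter.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]+:")


def classify_argument(arg):
    """Classify one command-line argument as 'stdin', 'url' or 'file'."""
    if arg == "-":
        return "stdin"   # a URL of '-' means standard input
    if _SCHEME.match(arg):
        return "url"     # an explicit protocol such as http: or file:
    return "file"        # no protocol: assumed to be a local file
```

With this sketch, 'c:\test.xml' is classified as a local file, while 'file:///c:\test.xml' and 'http://example.com/test.xml' are classified as URLs.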
xmlvalid can be configured to route HTTP network requests
via a proxy server. See Network Access Options for further details.
On Microsoft Windows platforms, to compensate for the lack of shell file name expansion, xmlvalid
automatically expands file names containing wild-card characters ('*' and '?') into a list of matching files.
This expansion only occurs when the URL(s) look like file names, i.e. they do not contain a
protocol such as "file:".
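The wild-card expansion described above might look roughly like the following Python sketch. It is not xmlvalid's implementation. The function name is hypothetical, the list of candidate file names is passed in so that the example is self-contained (a real tool would consult the file system), and passing an unmatched pattern through unchanged is an assumption.

```python
import fnmatch
import re

# Assumption: same protocol test as a drive-letter-aware scheme check.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]+:")


def expand_arguments(args, existing_files):
    """Expand '*' and '?' in arguments that look like file names."""
    expanded = []
    for arg in args:
        looks_like_file = arg != "-" and not _SCHEME.match(arg)
        if looks_like_file and ("*" in arg or "?" in arg):
            matches = fnmatch.filter(existing_files, arg)
            # Assumption: a pattern with no matches is kept as given.
            expanded.extend(matches or [arg])
        else:
            # URLs with a protocol, and '-', are never expanded.
            expanded.append(arg)
    return expanded
```

Given the candidate files foo1.xml, bar.xml and foo2.xml, the argument 'foo*.xml' expands to foo1.xml and foo2.xml, while 'http://example.com/foo*.xml' is left untouched.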
Messages produced by xmlvalid can be translated into your native language.
This is described in the section titled Native Language Support.
General Options
Single character option names may be concatenated together. POSIX-style option
names (those beginning with --) must be specified separately, but may be abbreviated.
- -c --context
Enhance error messages by displaying a fragment of the input document with an
arrow beneath indicating the position in the input document where the error
was detected.
- --dtd DTD
Treat each document as if it contains a reference to the specified DTD URL.
This option may be useful when a document needs to be validated but is lacking
a DOCTYPE declaration. It may also be useful when the DTD referenced by the
document has moved (or the document itself has moved) making the DTD inaccessible.
However, this situation may be better addressed by using
an XML catalog; see the section titled Entity Resolution.
This option has no effect on the internal DTD subset that
may be present in the input file(s).
- -d --nonsdecl
Disable the validation of namespace declaration attributes
(e.g. xmlns:foo="xxx"). By default, when validating, all
attributes are validated - including those prefixed with the
xmlns namespace prefix. This is the correct behaviour according to the XML 1.0
and XML namespace recommendations. This option only takes effect when the
--namespaces option is specified.
- -e --noextentities
Prevent the reading of external entities, including the external DTD subset.
- -E --nopes
Disable the processing of parameter entities, including those in the internal DTD subset.
- -h --help
Display a brief help page with available options and exit.
- -i --interop
Enable additional tests that check the XML for
interoperability with SGML-based systems.
- -I --interactive
Prompt for console input after each input file that contained errors. Useful
when validating many files in one go in environments that do not support the
more command.
- -m --maxerrors COUNT
Limit the number of error messages displayed for each input file to COUNT.
- -n --namespaces
Enable XML namespace processing. When specified, xmlvalid
becomes namespace aware and checks that element and attribute names conform
to the QName production and that namespace prefixes are correctly
specified.
- -q --quiet
Prevent the display of status messages.
- -Q --veryquiet
Prevent the display of all messages, even errors. This option
may be useful when calling xmlvalid from a batch
script when you only need the final return code or when using the
--summary option.
- -s --summary
After processing all input files, display a summary of the warnings, errors,
and fatal errors encountered.
- -v --novalidate
Disable validation tests. This effectively turns xmlvalid
into a well-formedness checker. However, the external DTD subset and other
external entities are still read in the same way as when xmlvalid
is acting as a validating processor.
- --verbose
Display progress information.
- -V --version
Display the version number and exit.
- -w --warnings
The XML 1.0 recommendation specifies a number of conditions
that XML processors may report as warnings but that
are not errors. This option enables these tests, which are
disabled by default.
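The abbreviation and concatenation rules stated at the start of this section can be illustrated with a short Python sketch. The long option names are those listed in this manual. The function names are hypothetical, and rejecting an ambiguous abbreviation with an error is an assumption, because the manual does not say how xmlvalid treats one.

```python
# Long option names taken from this manual.
LONG_OPTIONS = [
    "--context", "--dtd", "--nonsdecl", "--noextentities", "--nopes",
    "--help", "--interop", "--interactive", "--maxerrors", "--namespaces",
    "--quiet", "--veryquiet", "--summary", "--novalidate", "--verbose",
    "--version", "--warnings", "--httpproxy", "--password",
    "--proxypassword", "--user", "--proxyuser", "--catalog",
    "--nocatalogpis", "--prefer",
]


def resolve_long_option(arg):
    """Expand an abbreviated long option to its full name."""
    if arg in LONG_OPTIONS:
        return arg
    candidates = [opt for opt in LONG_OPTIONS if opt.startswith(arg)]
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise ValueError(f"unknown option: {arg}")
    # Assumption: an ambiguous abbreviation is an error.
    raise ValueError(f"ambiguous option {arg}: {', '.join(candidates)}")


def split_short_options(arg):
    """Split concatenated flag options, e.g. '-cs' into '-c' and '-s'."""
    # Only handles options that take no argument.
    return ["-" + ch for ch in arg[1:]]
```

Under these assumptions, the abbreviations in the Usage line resolve as expected: --sum becomes --summary and --names becomes --namespaces, whereas --no would be rejected because several option names begin with it.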
Network Access Options
xmlvalid uses the capabilities of the ElCel Technology library
to access files from the Internet. In some organizations access to the Internet
is provided via a proxy server, sometimes requiring authentication.
The following options can be used to control how xmlvalid
accesses network resources.
- --httpproxy SERVER[:PORT]
This option, or the use of the ET_HTTP_PROXY environment variable,
causes xmlvalid to use the specified HTTP proxy server
to satisfy HTTP network requests. If a port number is not specified then 8080
is used by default.
- -p --password PASSWORD
This option, or the use of the ET_HTTP_PASSWORD environment variable,
specifies the password to send to origin HTTP servers for authentication.
- -P --proxypassword PASSWORD
This option, or the use of the ET_HTTP_PROXY_PASSWORD environment variable,
specifies the password to send to the HTTP proxy server for authentication.
- -u --user USER
This option, or the use of the ET_HTTP_USER environment variable,
specifies the user name to send to origin HTTP servers for authentication.
- -U --proxyuser USER
This option, or the use of the ET_HTTP_PROXY_USER environment variable,
specifies the user name to send to the HTTP proxy server for authentication.
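As an illustration of the SERVER[:PORT] form accepted by --httpproxy, the following Python sketch splits such a value and applies the documented default port of 8080. The function name is hypothetical and the sketch does not handle IPv6 literals.

```python
DEFAULT_PROXY_PORT = 8080  # documented default when no port is given


def parse_proxy(value):
    """Split a SERVER[:PORT] string into (server, port)."""
    server, _, port = value.partition(":")
    return server, int(port) if port else DEFAULT_PROXY_PORT
```

For example, 'proxy.example.com' yields port 8080, and 'proxy.example.com:3128' yields port 3128. The host name is a placeholder.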
XML Catalog Options
The DTD and other external entities referenced within the input file(s) can be
resolved using an XML Catalog. This is further described in the section titled
Entity Resolution.
- -g --catalog CATALOG
This option, or the use of the ET_XMLCAT_CATALOG environment variable,
causes xmlvalid to use the specified XML catalog entry file
for Entity Resolution.
- --nocatalogpis
Disable the processing of <?oasis-xml-catalog?>
processing instructions.
- --prefer [system|public]
This option, or the ET_XMLCAT_PREFER environment variable,
is used to set the application
preference for system or public identifiers. The default value is 'public',
which means that public catalog entries may be used to resolve
external entities even when a system identifier exists for the resource. Note
that system catalog entries still take precedence over public catalog entries
even when this option is set to 'public'.
Return Codes
0 - Success
1 - Warnings issued
2 - Errors detected
3 - Fatal errors detected
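A calling script can act on these return codes. The Python sketch below is one possible wrapper, assuming that an executable named xmlvalid is on the search path; the function names are hypothetical. It uses the -Q option so that only the return code is reported.

```python
import subprocess

# Return codes as documented above.
RETURN_CODES = {
    0: "success",
    1: "warnings issued",
    2: "errors detected",
    3: "fatal errors detected",
}


def describe_return_code(code):
    """Translate an xmlvalid return code into a short description."""
    return RETURN_CODES.get(code, f"unexpected return code {code}")


def validate(urls, executable="xmlvalid"):
    """Run xmlvalid with all messages suppressed and report the outcome."""
    # -Q (--veryquiet) suppresses all messages, even errors.
    completed = subprocess.run([executable, "-Q", *urls])
    return completed.returncode, describe_return_code(completed.returncode)
```

A build script could, for instance, call validate(["doc.xml"]) and treat any code of 2 or above as a failure.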
Entity Resolution
xmlvalid can use an XML catalog to look up and
resolve public and system identifiers. This important feature
is present in SGML systems but was originally absent from most
XML tools. The XML catalog file is specified with the
--catalog option, the
ET_XMLCAT_CATALOG environment variable or the
<?oasis-xml-catalog?> processing instruction embedded
in the prolog (before the DOCTYPE declaration) of your XML document.
The format and semantics of the XML catalog entries follow the
OASIS XML Catalog specification.
Control over whether public or system identifiers are preferred is provided by
means of the --prefer option or the
ET_XMLCAT_PREFER environment variable.
The OASIS XML Catalog specification
describes a powerful set of features which cannot adequately be described here. However,
a brief example of a valid catalog entry file is shown below:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="docbookx.dtd"/>
  <system systemId="docbookx.dtd" uri="docbookx.dtd"/>
  <delegateSystem systemIdStartString="doc"
                  catalog="http://www.acme.org/DocBook/catalog"/>
  <delegatePublic publicIdStartString="-//OASIS"
                  catalog="file:///usr/doc/oasis/catalog.xml"/>
</catalog>
Note how the example makes use of the default namespace. All catalog elements
must be within the
urn:oasis:names:tc:entity:xmlns:xml:catalog
namespace.
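The namespace requirement can be checked before a catalog is put to use. The following Python sketch uses the standard xml.etree.ElementTree module to list any elements that fall outside the catalog namespace. The function name is hypothetical, and the embedded catalog is a shortened copy of the example above.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Shortened version of the example catalog entry file.
CATALOG_XML = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="docbookx.dtd"/>
  <system systemId="docbookx.dtd" uri="docbookx.dtd"/>
</catalog>"""


def elements_outside_catalog_namespace(xml_text):
    """Return the tags of all elements not in the catalog namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as '{namespace}localname'.
    prefix = "{" + CATALOG_NS + "}"
    return [el.tag for el in root.iter() if not el.tag.startswith(prefix)]
```

An empty result means every element is in the required namespace. A catalog that omits the xmlns declaration would have all of its elements reported.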
If no catalog is specified (by the --catalog option, the
ET_XMLCAT_CATALOG environment variable or the <?oasis-xml-catalog?>
processing instruction), or if no catalog match occurs, xmlvalid reads external entities by dereferencing
the system identifiers. The <?oasis-xml-catalog?> processing instruction
can be disabled by specifying the --nocatalogpis option.
Note: it is possible to see the effect that the XML Catalog is having on entity
resolution by using the --verbose option.
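The interaction between system entries, public entries, the --prefer setting and the fallback to the system identifier can be summarised in a simplified Python sketch. It is not xmlvalid's implementation: it handles only exact-match system and public entries (no delegation or other entry types), the function and parameter names are hypothetical, and the treatment of prefer 'system' follows a reading of the OASIS XML Catalog specification rather than anything stated in this manual.

```python
def resolve_external_id(public_id, system_id, system_entries,
                        public_entries, prefer="public"):
    """Return the URI to read for an external identifier.

    system_entries and public_entries map identifiers to URIs.
    """
    # System catalog entries always take precedence.
    if system_id is not None and system_id in system_entries:
        return system_entries[system_id]
    # Public entries are consulted when prefer is 'public', or when there
    # is no system identifier to fall back on (per the OASIS specification).
    if public_id is not None and (prefer == "public" or system_id is None):
        if public_id in public_entries:
            return public_entries[public_id]
    # No catalog match: dereference the system identifier directly.
    return system_id
```

With prefer set to 'public', a document whose system identifier has no catalog entry can still be resolved through its public identifier. With prefer set to 'system', the same document is read from its system identifier.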
Native Language Support
By default, xmlvalid produces error messages in the
English language but these can be replaced with messages written in a
native language of your choice.
The ElCel Technology Native Language Authoring Kit
contains message catalogs translated into
other languages. You may be lucky and find that messages for your native language
have already been translated. More likely, if you wish to use native language messages
you will need to undertake the translation work yourself. The Kit contains
pro-forma message catalogs written in English which form the basis for the
native language versions. It is not necessary to translate every message for the
exercise to be useful; translating just the common messages is feasible.
Once you have the necessary message catalogs, it is quite straightforward to configure
xmlvalid to use them. This is achieved by setting
two environment variables: LANG and ET_MSG_DIR.
- LANG
This is used by many programs and utilities to determine the locale category for
native language, local customs and coded characters. It normally contains
a language and region code such as "en_GB", "en_AU" or "fr_FR".
- ET_MSG_DIR
This is used to specify the base directory under which the
native language message catalogs are located. On UNIX systems
this is commonly /usr/share or /usr/share/locale but may be
any valid directory name.
When searching for message catalogs, xmlvalid concatenates
the environment variables like this: $ET_MSG_DIR/elcel/$LANG/.
Within the message directory, the message catalog files have a suffix of
.msg and are named in accordance with
the library or application to which they refer.
For example, if ET_MSG_DIR=/usr/share
and LANG=fr_FR the message catalog file
containing XML validation messages (in the French language) would be
/usr/share/elcel/fr_FR/xml.msg. These messages may be shared
by other tools built with the same ElCel Technology library. The catalog file containing messages
specific to xmlvalid would be named
/usr/share/elcel/fr_FR/xmlvalid.msg.
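The path construction described above can be expressed as a small Python sketch. The function names are hypothetical; the directory layout and the .msg suffix are those given in this section.

```python
import posixpath


def message_catalog_path(msg_dir, lang, name):
    """Build $ET_MSG_DIR/elcel/$LANG/<name>.msg."""
    return posixpath.join(msg_dir, "elcel", lang, name + ".msg")


def catalog_path_from_environment(name, environ):
    """Build the catalog path from ET_MSG_DIR and LANG in a mapping."""
    return message_catalog_path(environ["ET_MSG_DIR"], environ["LANG"], name)
```

Calling message_catalog_path("/usr/share", "fr_FR", "xml") gives /usr/share/elcel/fr_FR/xml.msg, matching the example above. In a script, os.environ can be passed as the environ argument.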
Feedback
We welcome feedback about our products. If you have a bug report or a suggested
enhancement, or simply enjoy using xmlvalid, please
let us know at support@elcel.com.
xmlvalid version 1.2, March 2003