fn:unparsed-text
The fn:unparsed-text function reads an external resource (for example, a file) and returns a string representation of the resource.
Signatures
fn:unparsed-text($href as xs:string?) as xs:string?
fn:unparsed-text(
$href as xs:string?,
$encoding as xs:string
) as xs:string?
Properties
This function is deterministic, context-dependent, and focus-independent. It depends on the static base URI.
Rules
The $href argument must be a string in the form of a URI reference, which must contain no fragment identifier, and must identify a resource for which a string representation is available. If the URI is a relative URI reference, then it is resolved relative to the static base URI property from the static context.
The mapping of URIs to the string representation of a resource is the mapping defined in the available text resources component of the dynamic context.
If the value of the $href argument is an empty sequence, the function returns an empty sequence.
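As an illustration of the rules for $href, the following is a minimal sketch of a complete stylesheet, assuming an XSLT 3.0 processor. The parameter name href and the file name notes.txt are hypothetical and are not part of the specification.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: a relative URI reference with no fragment
       identifier. It is resolved against the static base URI, which for
       an expression in a stylesheet is normally the stylesheet's location. -->
  <xsl:param name="href" as="xs:string?" select="'notes.txt'"/>

  <xsl:template match="/">
    <!-- If $href is the empty sequence, unparsed-text returns the empty
         sequence rather than raising an error. -->
    <xsl:variable name="text" as="xs:string?" select="unparsed-text($href)"/>
    <xsl:value-of select="if (exists($text)) then $text else 'no resource requested'"/>
  </xsl:template>

</xsl:stylesheet>
```

Declaring the variable as xs:string? mirrors the declared return type of the function, so the empty-sequence case is handled explicitly instead of being silently converted to a zero-length string.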
The $encoding argument, if present, is the name of an encoding. The values for this argument follow the same rules as for the encoding attribute in an XML declaration. The only values which every implementation is required to recognize are utf-8 and utf-16.
The encoding of the external resource is determined as follows:
-
external encoding information is used if available, otherwise
-
if the media type of the resource is text/xml or application/xml (see [RFC 2376]), or if it matches the conventions text/*+xml or application/*+xml (see [RFC 7303] and/or its successors), then the encoding is recognized as specified in [XML 1.0], otherwise
-
the value of the $encoding argument is used if present, otherwise
-
the processor may use implementation-defined heuristics to determine the likely encoding, otherwise
-
UTF-8 is assumed.
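The order of precedence in the list above can be sketched as a stylesheet function. This is only a model of the decision order, not something a stylesheet would normally need, because the processor makes this choice internally. The function local:pick-encoding, its namespace urn:example:local, and all five parameters are hypothetical stand-ins for information that the processor holds; the sketch also assumes that the media type is supplied without parameters such as charset.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <xsl:output method="text"/>

  <!-- Hypothetical model of the precedence rules. Each parameter is the
       empty sequence when that source of information is unavailable. -->
  <xsl:function name="local:pick-encoding" as="xs:string">
    <xsl:param name="external" as="xs:string?"/>     <!-- e.g. from a protocol header -->
    <xsl:param name="media-type" as="xs:string?"/>   <!-- bare media type, no parameters -->
    <xsl:param name="xml-detected" as="xs:string?"/> <!-- result of XML 1.0 detection -->
    <xsl:param name="requested" as="xs:string?"/>    <!-- the $encoding argument -->
    <xsl:param name="guessed" as="xs:string?"/>      <!-- implementation-defined heuristic -->
    <xsl:sequence select="
      if (exists($external)) then $external
      else if (matches($media-type, '^(text|application)/(xml|[^/]+\+xml)$'))
        (: XML 1.0 detection falls back to UTF-8 when there is no byte
           order mark and no encoding declaration :)
        then ($xml-detected, 'UTF-8')[1]
      else ($requested, $guessed, 'UTF-8')[1]"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- No external information, a non-XML media type, and a requested
         encoding: the requested encoding is selected. -->
    <xsl:value-of select="local:pick-encoding((), 'text/plain', (), 'iso-8859-1', ())"/>
  </xsl:template>

</xsl:stylesheet>
```

The sketch makes one consequence of the ordering visible: the $encoding argument is consulted only when there is no external encoding information and the resource does not have an XML media type, so it acts as a fallback and not as an override.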
The result of the function is a string containing the string representation of the resource retrieved using the URI.
Error Conditions
A dynamic error is raised [err:FOUT1170] if $href contains a fragment identifier, or if it cannot be resolved to an absolute URI (for example, because the base-URI property in the static context is absent), or if it cannot be used to retrieve the string representation of a resource.
A dynamic error is raised [err:FOUT1190] if the value of the $encoding argument is not a valid encoding name, if the processor does not support the specified encoding, if the string representation of the retrieved resource contains octets that cannot be decoded into Unicode characters using the specified encoding, or if the resulting characters are not permitted XML characters.
A dynamic error is raised [err:FOUT1200] if $encoding is absent and the processor cannot infer the encoding using external information and the encoding is not UTF-8.
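The following sketch shows one way a stylesheet might recover from these errors, assuming an XSLT 3.0 processor that supports xsl:try and xsl:catch. The error codes FOUT1170, FOUT1190, and FOUT1200 are matched in the standard error namespace, bound here to the prefix err. The parameter name href, the file name notes.txt, and the wording of the messages are hypothetical.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    exclude-result-prefixes="xs err">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter naming the resource to read. -->
  <xsl:param name="href" as="xs:string" select="'notes.txt'"/>

  <xsl:template match="/">
    <xsl:try>
      <!-- No $encoding argument is supplied, so the processor determines
           the encoding itself. -->
      <xsl:value-of select="unparsed-text($href)"/>

      <!-- Fragment identifier present, URI not resolvable, or resource
           not retrievable. -->
      <xsl:catch errors="err:FOUT1170">
        <xsl:text>Resource could not be retrieved: </xsl:text>
        <xsl:value-of select="$err:description"/>
      </xsl:catch>

      <!-- Content could not be decoded, or the encoding could not be
           determined. -->
      <xsl:catch errors="err:FOUT1190 err:FOUT1200">
        <xsl:text>Resource could not be decoded: </xsl:text>
        <xsl:value-of select="$err:description"/>
      </xsl:catch>
    </xsl:try>
  </xsl:template>

</xsl:stylesheet>
```

Listing the codes explicitly, instead of catching every error, leaves unrelated failures visible to the caller.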
Notes
If it is appropriate to use a base URI other than the static base URI (for example, when resolving a relative URI reference read from a source document) then it is advisable to resolve the relative URI reference using the fn:resolve-uri function before passing it to the fn:unparsed-text function.
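A minimal sketch of this advice, assuming an XSLT 3.0 processor and a hypothetical source vocabulary in which an include element carries a relative URI reference in its href attribute:

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical source element: <include href="chapter1.txt"/> -->
  <xsl:template match="include">
    <!-- Resolve @href against the base URI of the element in the source
         document, then pass the resulting absolute URI to unparsed-text.
         If the element has no base URI, the call to resolve-uri fails. -->
    <xsl:value-of select="unparsed-text(resolve-uri(@href, base-uri(.)))"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the argument passed to fn:unparsed-text is already absolute, the location of the stylesheet no longer affects which resource is read.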
There is no essential relationship between the sets of URIs accepted by the two functions fn:unparsed-text and fn:doc (a URI accepted by one may or may not be accepted by the other), and if a URI is accepted by both there is no essential relationship between the results (different resource representations are permitted by the architecture of the web).
There are no constraints on the MIME type of the resource.
The fact that the resolution of URIs is defined by a mapping in the dynamic context means that in effect, various aspects of the behavior of this function are implementation-defined. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:
-
The set of URI schemes that the implementation recognizes is implementation-defined. Implementations may allow the mapping of URIs to resources to be configured by the user, using mechanisms such as catalogs or user-written URI handlers.
-
The handling of media types is implementation-defined.
-
Implementations may provide user-defined error handling options that allow processing to continue following an error in retrieving a resource, or in reading its content. When errors have been handled in this way, the function may return a fallback document provided by the error handler.
-
Implementations may provide user options that relax the requirement for the function to return deterministic results.
The rules for determining the encoding are chosen for consistency with [XInclude 1.0]. Files with an XML media type are treated specially because there are use cases for this function where the retrieved text is to be included as unparsed XML within a CDATA section of a containing document, and because processors are likely to be able to reuse the code that performs encoding detection for XML external entities.
If the text file contains characters such as < and &, these will typically be output as &lt; and &amp; if the string is serialized as XML or HTML. If these characters actually represent markup (for example, if the text file contains HTML), then an XSLT stylesheet can attempt to write them as markup to the output file using the disable-output-escaping attribute of the xsl:value-of instruction. Note, however, that XSLT implementations are not required to support this feature.
Examples
This XSLT example attempts to read a file containing 'boilerplate' HTML and copy it directly to the serialized output file:
<xsl:output method="html"/>

<xsl:template match="/">
  <xsl:value-of select="unparsed-text('header.html', 'iso-8859-1')"
                disable-output-escaping="yes"/>
  <xsl:apply-templates/>
  <xsl:value-of select="unparsed-text('footer.html', 'iso-8859-1')"
                disable-output-escaping="yes"/>
</xsl:template>