Problem with createProcessingInstruction() function
cold
2005-06-20 14:07:25 UTC
I'm using VB6 and MSXML3. I create an XML file using the XML DOM but I have a
problem with the createProcessingInstruction() function, because I need to
specify that the file is encoded with UTF-8 encoding. My code is very
simple:

Dim pi As IXMLDOMProcessingInstruction
Set pi = doc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""")
Call doc.appendChild(pi)

Now, after this call, the pi.Text value is

version="1.0" encoding="UTF-8"

but the pi.xml property value is

<?xml version="1.0"?>

Why? I tried with UTF-16 encoding, and it works fine...is UTF-8 the default,
so MSXML ignores my specification? Anyway, I need to specify the encoding,
as the program that is supposed to read the XML file won't be able to parse
the file correctly without that.

Thank you in advance for your help

Sebastiano
Martin Honnen
2005-06-20 15:48:49 UTC
The problem is that the .xml property returns a string, and a string is
UTF-16 encoded, so it would be somewhat wrong for the .xml property to
include encoding="UTF-8". If you call the save method on the whole
document, however, the document should be properly serialized with an
XML declaration and the encoding as specified, e.g.
doc.save "file.xml"
should give a document with
<?xml version="1.0" encoding="UTF-8"?>
I think.
On the other hand, any XML parser needs to be able to deal with UTF-8 and
UTF-16 without the encoding being explicitly specified in the XML declaration.
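
A minimal VB6 sketch of the suggestion above, for anyone who wants to try it. It uses late binding so no project reference is needed. The procedure name and the path "C:\temp\file.xml" are hypothetical, and the behaviour noted in the comments is what this thread reports or expects for MSXML3, not something verified here.

```vb
' Sketch only: assumes MSXML3 is installed and C:\temp exists (hypothetical path).
Sub SaveWithUtf8Declaration()
    Dim doc As Object
    Dim pi As Object

    Set doc = CreateObject("Msxml2.DOMDocument.3.0")
    doc.async = False
    doc.loadXML "<root/>"

    ' Same declaration as in the original question.
    Set pi = doc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""")
    ' Insert before the root element so the declaration comes first.
    doc.insertBefore pi, doc.documentElement

    ' In-memory string view: the thread reports the encoding attribute is dropped here.
    Debug.Print doc.xml

    ' Serializing to a file: the declaration's encoding is expected to be kept.
    doc.save "C:\temp\file.xml"
End Sub
```

Opening the saved file in a text editor shows whether the encoding attribute made it into the declaration.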
--
Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
G***@verizon.net
2005-06-20 18:00:17 UTC
I believe you can change it from UTF-8 to UTF-16 (and vice versa) using the
SAX Reader/Writer...

this is from the MSXML SDK...

encoding Property [Visual Basic]
Sets and gets encoding for the output.

[Visual Basic]
Implementation Syntax
Property Let IMXWriter_encoding(ByVal RHS As String)
Property Get IMXWriter_encoding() As String
Usage Syntax
oMXXMLWriter.encoding = strValue
strValue = oMXXMLWriter.encoding
Remarks
String. Read/write. The default string is dependent on implementation.
Microsoft® Visual Basic® strings are always UTF-16 encoded.


See Also
Character Encoding, XML, and MSXML | MXHTMLWriter CoClass | MXXMLWriter
CoClass

and some sample code...

#DefineFunction ParseXML(xml)
rdr = ObjectOpen("Msxml2.SAXXMLReader.3.0")
wrt = ObjectOpen("Msxml2.MXXMLWriter.3.0")
wrt.byteOrderMark = @False
wrt.omitXMLDeclaration = @False ; <---- key
wrt.indent = @True
;'set the writer to the content handler
rdr.contentHandler = wrt
rdr.dtdHandler = wrt
rdr.PutProperty("http://xml.org/sax/properties/lexical-handler", wrt)
rdr.PutProperty("http://xml.org/sax/properties/declaration-handler", wrt)
rdr.Parse(xml)
newxml = wrt.output
objectclose(wrt)
objectclose(rdr)
return(newxml)
#EndFunction


xmlDoc = ObjectOpen("Msxml2.DOMDocument.3.0")
xmlDoc.async = @False
xmlDoc.loadXML(`<root/>`)

pri = xmlDoc.createProcessingInstruction("xml", "version='1.0'")
xmlDoc.insertBefore(pri, xmlDoc.documentElement)

clipput(xmlDoc.xml)
message("Debug", xmlDoc.xml)

produces...

<?xml version="1.0"?>
<root/>


; now run it thru SAX Reader/Writer
xmlDoc.loadXML(ParseXML(xmlDoc.xml))
clipput(xmlDoc.xml)


; produces...
<?xml version="1.0" encoding="UTF-16" standalone="no"?>
<root/>

exit
cold
2005-06-22 12:02:54 UTC
Thanks a lot

Sebastiano
