(PHP 5, PHP 7, PHP 8)
DOMDocument::saveHTMLFile — Dumps the internal document into a file using HTML formatting
Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.
filename
The path to the saved HTML document.
Returns the number of bytes written or false if an error occurred.
Example #1 Saving an HTML tree into a file
<?php

$doc = new DOMDocument('1.0');
// we want a nice output
$doc->formatOutput = true;

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo 'Wrote: ' . $doc->saveHTMLFile("/tmp/test.html") . ' bytes';
// Wrote: 129 bytes

?>
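Because the method returns false on failure, the result is best compared with a strict `=== false` before it is used as a byte count. The following is a minimal sketch of that check; the helper name `saveHtmlOrFail()` and the output path under `sys_get_temp_dir()` are assumptions made for illustration and are not part of the DOM extension.

```php
<?php
// Hypothetical helper: wraps saveHTMLFile() and turns a false
// return value into an exception instead of a silent failure.
function saveHtmlOrFail(DOMDocument $doc, string $path): int
{
    $bytes = $doc->saveHTMLFile($path);
    if ($bytes === false) {
        throw new RuntimeException("Could not write HTML to {$path}");
    }
    return $bytes;
}

$doc = new DOMDocument('1.0');
$doc->appendChild($doc->createElement('html'));

// Assumed output location; any writable path would do.
$path = sys_get_temp_dir() . '/savehtmlfile-demo.html';
$written = saveHtmlOrFail($doc, $path);
echo 'Wrote: ' . $written . " bytes\n";
```

Throwing an exception is only one possible design; logging the failure and returning early would work just as well.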
saveHTMLFile() always saves the file in UTF-8, even if DOMDocument->encoding explicitly specifies a different encoding. All "non-Latin" characters are converted to HTML entities. Tested in PHP 5.2.9-2 and PHP 5.2.17. Example:
<?php
$document = new DOMDocument('1.0', 'WINDOWS-1251');
$document->loadHTML('<html><head><title>Russian language</title></head><body>Русский язык</body></html>');
$document->formatOutput = true;
$document->encoding = 'WINDOWS-1251';
echo "Записано байт. Recorded bytes: " . $document->saveHTMLFile('html.html');
?>
The method wrote the file in UTF-8 encoding. The contents of the file html.html:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Russian language</title>
</head>
<body>Ðóññêèé ÿçûê</body>
</html>
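A commonly used workaround, which does not come from the note above, is to declare the charset inside the markup passed to loadHTML(), since the parser otherwise tends to treat the input as ISO-8859-1. The sketch below assumes the source file itself is saved as UTF-8 and uses a hypothetical output path. The exact bytes written can vary with the PHP and libxml versions, so the script prints what was parsed and what was saved rather than assuming a particular result.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$html = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
    . '<title>Russian language</title></head>'
    . '<body>Русский язык</body></html>';

$document = new DOMDocument();
$document->loadHTML($html);

// Inspect what the parser actually stored before writing anything.
$body = $document->getElementsByTagName('body')->item(0);
var_dump($body->textContent);

// Hypothetical output path in the system temp directory.
$path = sys_get_temp_dir() . '/savehtmlfile-utf8-demo.html';
$bytes = $document->saveHTMLFile($path);

if ($bytes === false) {
    echo "Saving failed\n";
} else {
    echo "Recorded bytes: {$bytes}\n";
    // Print the saved file so the result can be checked by eye.
    echo file_get_contents($path);
}
```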
Not mentioned in the documentation is the fact that DOMDocument::saveHTMLFile() automatically overwrites the contents of an existing file, with no notice, warning or error thrown.
Make sure you check the filename before using this function so that you don't accidentally overwrite important files.
Example:
<?php
$file = fopen('test.html', 'w');
fwrite($file, 'this is some text');
fclose($file);

$doc = new DOMDocument();
$doc->formatOutput = true;
$doc->loadHTML('<html><head><title>Test</title></head><body></body></html>');
$doc->saveHTMLFile('test.html');

// test.html
/*
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test</title>
</head>
<body></body>
</html>
*/
?>
If you're dynamically generating a series of pages using DOMDocument objects, make sure you also generate the file or directory names dynamically, using something that can't easily be confused with an existing file or folder, or check whether the desired path already exists before saving so that you don't accidentally overwrite previous files.
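One way to follow this advice is to check the path before saving. The sketch below is only an illustration: `saveHtmlIfAbsent()` is a hypothetical helper, and the `uniqid()`-based file name in the temp directory is an assumed naming scheme. Note that a `file_exists()` check followed by a write is not atomic, so another process could still create the file in between.

```php
<?php
// Hypothetical helper: saves the document only when nothing
// exists at $path yet; returns false instead of overwriting.
function saveHtmlIfAbsent(DOMDocument $doc, $path)
{
    if (file_exists($path)) {
        return false;
    }
    return $doc->saveHTMLFile($path);
}

$doc = new DOMDocument();
$doc->loadHTML('<html><head><title>Test</title></head><body></body></html>');

// Assumed naming scheme: a unique name inside the temp directory.
$path = sys_get_temp_dir() . '/' . uniqid('page-', true) . '.html';

$first = saveHtmlIfAbsent($doc, $path);  // path is free, the file is written
$second = saveHtmlIfAbsent($doc, $path); // path now exists, nothing is written

var_dump($first !== false, $second);
```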
I foolishly assumed that this function was equivalent to
<?php
file_put_contents($filename, $document->saveHTML());
?>
but there are differences in the generated HTML:
<?php
$doc = new DOMDocument();
$doc->loadHTML(
'<html><head><title>Test</title></head><body></body></html>'
);
$doc->encoding= 'iso-8859-1';
echo $doc->saveHTML();
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html>
#<head><title>Test</title></head>
#<body></body>
#</html>
$doc->saveHTMLFile('output.html');
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><title>Test</title></head><body></body></html>
?>
Note that saveHTMLFile() adds a UTF-8 meta tag despite the ISO-8859-1 document encoding.