Load HTML file and force UTF-8 with PHP


I am accessing an external URL to extract specific content with XPath.

I tried several different ways to achieve this, but all of them end up presenting some little problem. After a lot of research, this is the way I am doing it now:

I create a stream context to open the file with the right headers, asking for UTF-8:

$opts = array('http' => array('header' => 'Accept-Charset: utf-8, *;q=0'));
$context = stream_context_create($opts);
$html = file_get_contents($url, false, $context);

Then, inside my class, I create a DOMDocument object and load the fetched HTML string as follows:

$this->dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

It works fine in almost every case, but it strips away accented characters such as á, ó, ç, etc.

Example: "gobierno marroquí para" turns into "gobierno marroqu para".

I also tried loading the HTML as plain text with the prefix <?xml encoding... , which works fine, but then I have issues in further HTMLPurifier operations.
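For reference, this is roughly what the prefix attempt looks like as a minimal sketch. The sample markup, the variable names, and the step that removes the processing instruction after loading are my own assumptions for illustration, not the code from my class. I have not confirmed whether removing that node is enough to avoid the HTMLPurifier issues.

```php
<?php
// Hypothetical sample markup standing in for the fetched page (assumed to be valid UTF-8).
$html = '<div><p>gobierno marroquí para</p></div>';

$dom = new DOMDocument();

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

// The XML declaration prefix hints to libxml that the input bytes are UTF-8.
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);
libxml_clear_errors();

// Assumption: drop the processing instruction node again so that it
// does not show up in later serialization or processing steps.
foreach ($dom->childNodes as $node) {
    if ($node->nodeType === XML_PI_NODE) {
        $dom->removeChild($node);
        break; // stop right after modifying the live node list
    }
}

// Read the paragraph text back through XPath.
$xpath = new DOMXPath($dom);
$text = $xpath->query('//p')->item(0)->textContent;

echo $text, PHP_EOL;
```

With this input, the accented character should survive the round trip through DOMDocument.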

Any kind of information is appreciated. I am not looking for someone to do the task for me, just for the right and efficient way. I need to understand how I can make it work.

Peace.

