Trying to get values from a html page with PHP DOMDocument

There is a web page, accessible to me via an intranet URL, that I cannot edit. It contains various span elements whose text I want to capture and use elsewhere. Each span I want has a unique id, so I would like to use that id to identify the element and capture its text. I'm trying to use PHP's DOMDocument to do this.

Here is an example of the html from the url.


Note: if I visit the url in a browser, I can see it's a full HTML document, the above is just a snippet.

Here is some of the PHP code I'm trying to use to grab the various values.

    // scrape the page to pull data.
    $page = file_get_contents([full url I have pulled from database here including http bit etc]);
    $doc = new DOMDocument();
    $doc->validateOnParse = true;
    $doc->preserveWhiteSpace = false;
    $doc->loadHTML($page);

    // define id attributes
    foreach ($doc->getElementsByTagName('span') as $element) {
        $element->setIdAttribute('id', true);
    }

    // now work out from the table which ids we need to scrape and how many.
    $Column1Name = $ReadIDMapsRow['column1_name'];
    $Column1Value = $doc->getElementById($ReadIDMapsRow['column1_id']);
    $Column1ValueText = $Column1Value->textContent;

(In the above code, $ReadIDMapsRow['column1_id'] contains the id of the element I'm trying to capture: the string 'car7'.)

But when I look at the get_defined_vars() debug printout on the output page I'm putting all this into, I can see that the variable $Column1ValueText is empty (along with any others I'm getting the same way).

    [Column1Name] => CAR
    [Column1Value] =>
    [Column1ValueText] =>

It might be relevant that, when I look at my debug info, the entry for $doc says:

    [doc] => DOMDocument Object
        (
            [doctype] => (object value omitted)  <- this is a lie, it does have a doc type!
            [implementation] => (object value omitted)
            [documentElement] => (object value omitted)
            [actualEncoding] =>
            [encoding] =>
            [xmlEncoding] =>
            [standalone] => 1

But if I inspect the page in Chrome, it does have a doctype declaration at the top, and it's not just Chrome being generous and adding it, because I can also see it in the $page variable in my debug output:

 [page] =>    ... 
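
For comparison, here is a minimal sketch of looking up a span's text by id with DOMXPath instead of setIdAttribute() and getElementById(). It is only an illustration under assumptions: the sample markup and the helper name spanTextById are hypothetical (the real page's HTML is not shown above), and only the id 'car7' comes from the question. It also assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical sample markup; the real intranet page is not shown in the question.
$page = '<!DOCTYPE html><html><body>'
      . '<span id="car7">Ford</span><span id="car8">Audi</span>'
      . '</body></html>';

// Collect libxml parser warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($page);
libxml_clear_errors();

/**
 * Return the trimmed text of the first <span> with the given id,
 * or null when no such element exists.
 * Hypothetical helper, not part of the original code.
 */
function spanTextById(DOMDocument $doc, string $id): ?string
{
    $xpath = new DOMXPath($doc);
    // Assumes $id contains no double quotes, since it is placed
    // directly inside the XPath string literal.
    $nodes = $xpath->query('//span[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

$Column1ValueText = spanTextById($doc, 'car7');
var_dump($Column1ValueText);
```

Returning null for a missing id makes it possible to tell "element not found" apart from "element found but empty", which the debug output above cannot distinguish.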