Get file content from multiple web pages using DOM XPath

Chandimal Harischandra

When I execute this code, only the second iteration's details are shown.
How can I iterate through both page 1 and page 2?

I have the following code, which queries the img elements inside a div:

  $ratings = array();       
for ($pageNum = 1; $pageNum < 3; $pageNum++) {

    $html = file_get_contents("http://www.example.com/store/abc/page/$pageNum");            
    // loadHTML() is an instance method; calling it statically fails on PHP 8
    $dom = new DOMDocument();
    @$dom->loadHTML($html);

    //Init the XPath object
    $xpath = new DOMXpath($dom);

    //Query the DOM
    $rating = $xpath->query( '//div[contains(@class, "rating fl")]//img' );

    //Display the results as in the previous example

    foreach ($rating as $link) {
        //echo  $link->getAttribute('title'),'<br>';            
        $ratings[] = $link->getAttribute('title');                    
        if (sizeof($ratings) == 15) {
            //  var_dump($ratings);
        }
    }
}
chris85

You reset $ratings on every iteration, so only the last pass's values end up in the $ratings array.

Simplified version:

for($pageNum=1; $pageNum<3;$pageNum++){
    $ratings = array();
    $rating = array(0,1,2);
    foreach($rating as $link){
        $ratings[]  = $pageNum;
        echo $pageNum;
     }
}
print_r($ratings);

Output:

111222Array
(
    [0] => 2
    [1] => 2
    [2] => 2
)

If you comment out the $ratings initialization or move it outside the loop, it should work as expected.

$ratings = array(); // initialise once, before the loop
for($pageNum=1; $pageNum<3;$pageNum++){
    $rating = array(0,1,2);
    foreach($rating as $link){
        $ratings[]  = $pageNum; // appends across both passes
        echo $pageNum;
     }
}
print_r($ratings);

Output:

111222Array
(
    [0] => 1
    [1] => 1
    [2] => 1
    [3] => 2
    [4] => 2
    [5] => 2
)
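
The same fix can be applied to the scraping loop from the question. What follows is only a minimal sketch: `collectRatings()` is a hypothetical helper name, and the inline sample markup is an assumption standing in for the pages that `file_get_contents()` would fetch. The XPath expression is the one from the question.

```php
<?php
// Hypothetical helper (not from the original post): gathers the title
// attribute of every matching img across several HTML documents.
function collectRatings(array $htmlPages): array
{
    $ratings = array(); // initialised once, so results accumulate across pages

    foreach ($htmlPages as $html) {
        $dom = new DOMDocument();

        // Collect libxml warnings internally instead of printing them;
        // real-world markup is rarely perfectly valid.
        $previous = libxml_use_internal_errors(true);
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($dom);
        $images = $xpath->query('//div[contains(@class, "rating fl")]//img');

        foreach ($images as $img) {
            $ratings[] = $img->getAttribute('title');
        }
    }

    return $ratings;
}

// Assumed sample markup; in the question each string would come from
// file_get_contents("http://www.example.com/store/abc/page/$pageNum").
$pages = array(
    '<html><body><div class="rating fl"><img title="4 stars"></div></body></html>',
    '<html><body><div class="rating fl"><img title="5 stars"></div></body></html>',
);

print_r(collectRatings($pages));
```

Because $ratings is created before the outer loop, the returned array should hold the titles from both pages rather than only the last one.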
