Hi,
Today I am happy to share a PHP code snippet that scrapes web content from websites far faster than you might think. Although this is just a sample, you can build on it to code an awesome extension for your website or system that reads multiple websites in no time, looking for new and better content.
Simple PHP Scraper
PHP has a DOMXPath class. I’m not going to explain how this class works, but with the script below you can easily scrape a list of URLs. Since it is PHP, use a cron job to scrape the desired data hourly, daily or weekly. If you are not used to writing XPath expressions, use a Chrome plugin: select the data point and see its XPath reference directly.
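For anyone who has not used XPath with PHP before, here is a minimal sketch of what a query does. It runs against an inline HTML string, so no network access is needed. The sample markup, the id "prices" and the variable names are all made up for illustration.

```php
<?php
// Hypothetical sample markup standing in for a downloaded page.
$html = '<html><body><h1>Price list</h1>'
      . '<ul id="prices"><li>Apple: 1.20</li><li>Pear: 0.95</li></ul>'
      . '<a href="/next">Next page</a></body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select every <li> inside the element with id="prices".
$items = [];
foreach ($xpath->query('//ul[@id="prices"]/li') as $li) {
    $items[] = trim($li->textContent);
}

// Select the href attribute of every link.
$links = [];
foreach ($xpath->query('//a/@href') as $href) {
    $links[] = $href->nodeValue;
}

print_r($items);
print_r($links);
```

The first expression returns element nodes and the second returns attribute nodes. Both come back as a DOMNodeList that you can loop over with foreach.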
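<?php
// Collect HTML parse warnings internally instead of silencing every PHP error.
libxml_use_internal_errors(true);

// Insert the list of URLs to scrape.
$urls = array('URL_LINKS_GOES_HERE1', 'URL_LINKS_GOES_HERE2');

echo "<table>";
foreach ($urls as $url) {
    $doc = new DOMDocument();
    // loadHTMLFile() can fetch URLs when allow_url_fopen is enabled.
    // Skip any URL that cannot be loaded.
    if (!$doc->loadHTMLFile($url)) {
        continue;
    }
    $xpath = new DOMXPath($doc);
    $elements = $xpath->query("XPATH_OF_CONTENT_ELEMENT"); // insert XPath reference
    // query() returns false (not null) when the expression is invalid.
    if ($elements !== false) {
        echo "<tr>";
        echo "<td>" . htmlspecialchars($url) . "</td>";
        foreach ($elements as $element) {
            foreach ($element->childNodes as $node) {
                // Escape scraped text so it cannot break the table markup.
                echo "<td>" . htmlspecialchars((string) $node->nodeValue) . "</td>\n";
            }
        }
        echo "</tr>";
    }
}
echo "</table>";
?>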
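If you want to reuse the scraping step elsewhere, or try out an XPath expression without hitting a live site, the extraction can be pulled into a function that returns an array instead of echoing a table. This is only a sketch: scrapeValues() and the sample markup are hypothetical names invented for this example and are not part of the script above.

```php
<?php
// Hypothetical helper: returns the trimmed text of every node matching $expression.
function scrapeValues(string $html, string $expression): array
{
    if ($html === '') {
        return []; // nothing to parse
    }

    // Keep parse warnings quiet, then restore the previous libxml setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return [];
    }

    $xpath = new DOMXPath($doc);
    $nodes = @$xpath->query($expression); // false on an invalid expression
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath expression: $expression");
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->textContent);
    }
    return $values;
}

// Usage with made-up markup.
$sample = '<html><body><div class="post"><h2>First</h2><h2>Second</h2></div></body></html>';
print_r(scrapeValues($sample, '//div[@class="post"]/h2'));
```

Returning plain arrays makes it easy to decide later whether the data goes into a table, a file or a database when the script runs from cron.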