There are times when a PHP script needs to fetch the HTML from a remote page, or even post some data to a remote location. Learning how to use cURL or fsockopen() for this can be time-consuming and is often unnecessary. A PHP class called Snoopy takes care of both tasks with very little code.


First, head on over to the project page and download a copy of the script. Once you extract the archive, you will only need Snoopy.class.php. Upload it to your favorite web host, and create a blank PHP script to get started with some examples.

Fetching a Page

Fetching data from a remote URL takes only a few lines with Snoopy, with no need to deal with cURL or fsockopen() directly.

<?php
/* load the snoopy class and initialize the object */
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();

/* load the page and print the results */
$snoopy->fetch('http://snoopy.sourceforge.net/');
echo '<pre>' . htmlspecialchars($snoopy->results) . '</pre>';
?>
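
The example above assumes the request always succeeds. Here is a minimal sketch of a guard around the call. It assumes that fetch() returns false when the connection fails and that Snoopy fills in its error and status properties, as the readme describes. The helper name fetch_body() is hypothetical and not part of Snoopy.

```php
<?php
/**
 * Hypothetical helper (not part of Snoopy): fetch a URL and return the body,
 * or throw if the request failed. $client is expected to be a Snoopy object.
 */
function fetch_body($client, $url)
{
    /* assumption: fetch() returns false when the connection fails */
    if (!$client->fetch($url)) {
        throw new RuntimeException('Fetch failed: ' . trim((string) $client->error));
    }

    /* a request can connect fine and still come back as a 404 or 500 */
    $status = (int) $client->status;
    if ($status >= 400) {
        throw new RuntimeException('HTTP status ' . $status . ' for ' . $url);
    }

    return $client->results;
}
```

With the class loaded as before, calling fetch_body($snoopy, 'http://snoopy.sourceforge.net/') inside a try/catch block would return the page body or report why the request failed.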

Fetching a web page on its own may be of little use, but the same call can also retrieve remote XML or CSV files. Here is a more practical example.


<?php
/* load the snoopy class and initialize the object */
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();

/* fetch the data */
$snoopy->fetch('http://www.weather.gov/xml/current_obs/KLOT.xml');

/* parse the XML data */
$xml = new SimpleXMLElement($snoopy->results);

/* output some of the elements */
echo '<h3>Some Parsed Information</h3>';
echo '<img src="' . $xml->icon_url_base . '/' . $xml->icon_url_name . '" /><br />';
echo '<b>Reporting Station:</b> ' . $xml->location . '<br />';
echo '<b>Observation Time:</b> ' . $xml->observation_time_rfc822 . '<br />';
echo '<b>Temp:</b> ' . $xml->temperature_string . '<br />';
echo '<b>Wind:</b> ' . $xml->wind_string . '<br />';


echo '<hr /><h3>Raw XML</h3>';
echo '<pre>' . htmlspecialchars($snoopy->results) . '</pre>';
?>
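
The SimpleXMLElement constructor throws an exception if the remote server returns something that is not valid XML. Here is a rough sketch of a more forgiving approach that uses simplexml_load_string() instead. The helper name parse_xml_or_null() and the sample data are made up for illustration; only the element names come from the example above.

```php
<?php
/**
 * Hypothetical helper: parse an XML string and return null, rather than
 * throwing or emitting warnings, when the document is malformed.
 */
function parse_xml_or_null($raw)
{
    $previous = libxml_use_internal_errors(true); /* collect errors quietly */
    $xml = simplexml_load_string($raw);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        /* restore the old setting */

    return ($xml === false) ? null : $xml;
}

/* made-up sample that reuses two element names from the example above */
$sample = '<current_observation>'
        . '<location>Sample Station</location>'
        . '<temperature_string>70.0 F (21.1 C)</temperature_string>'
        . '</current_observation>';

$xml = parse_xml_or_null($sample);
if ($xml !== null) {
    /* cast to string and escape before printing remote data into HTML */
    echo htmlspecialchars((string) $xml->location) . "\n";
    echo htmlspecialchars((string) $xml->temperature_string) . "\n";
}
```

In the weather script, $snoopy->results would take the place of $sample.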

Sending Post & Cookie Data

Some APIs, like PayPal’s Instant Payment Notification tool, require data to be sent to their servers as an HTTP POST request. Snoopy makes this as easy as putting the data in an array and submitting it to the URL.

Here is a script that sends some POST data and a couple of cookies:

<?php
/* load the snoopy class and initialize the object */
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();

/* set some values */
$p_data['color'] = 'Red';
$p_data['fruit'] = 'apple';

$snoopy->cookies['vegetable'] = 'carrot';
$snoopy->cookies['something'] = 'value';

/* submit the data and get the result */
$snoopy->submit('http://phpstarter.net/samples/118/data_dump.php', $p_data);

/* output the results */
echo '<pre>' . htmlspecialchars($snoopy->results) . '</pre>';
?>

And here is the result data on the server side:

/* $_POST */
array(2) {
  ["color"]=>
  string(3) "Red"
  ["fruit"]=>
  string(5) "apple"
}
/* $_COOKIE */
array(2) {
  ["vegetable"]=>
  string(6) "carrot"
  ["something"]=>
  string(5) "value"
}
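
The receiving script is not shown here, but output like this needs very little code. The following is a hypothetical stand-in for data_dump.php, not the actual script on the server; the function name dump_section() is made up.

```php
<?php
/**
 * Hypothetical stand-in for data_dump.php: label an array and return
 * its var_dump() output as a string.
 */
function dump_section($label, array $data)
{
    ob_start();
    echo '/* ' . $label . ' */' . "\n";
    var_dump($data);
    return ob_get_clean();
}

/* print whatever POST fields and cookies arrived with the request */
echo dump_section('$_POST', $_POST);
echo dump_section('$_COOKIE', $_COOKIE);
```

Pointing submit() at a script like this on your own server is a quick way to check exactly what Snoopy sends.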

Fetching Links

When you need to crawl a page for links, Snoopy can parse them out of the page for you.

<?php
/* load the snoopy class and initialize the object */
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();

/* fetch the links from the page and print them */
$snoopy->fetchlinks('http://google.com/');
echo '<pre>' . var_export($snoopy->results, true) . '</pre>';
?>

Script output:

array (
  0 => 'http://images.google.com/imghp?hl=en&tab=wi',
  1 => 'http://maps.google.com/maps?hl=en&tab=wl',
  2 => 'http://news.google.com/nwshp?hl=en&tab=wn',
  3 => 'http://www.google.com/prdhp?hl=en&tab=wf',
  4 => 'http://mail.google.com/mail/?hl=en&tab=wm',
  5 => 'http://www.google.com/intl/en/options/',
  6 => 'http://video.google.com/?hl=en&tab=wv',
  7 => 'http://groups.google.com/grphp?hl=en&tab=wg',
  8 => 'http://books.google.com/bkshp?hl=en&tab=wp',
  9 => 'http://scholar.google.com/schhp?hl=en&tab=ws',
  10 => 'http://finance.google.com/finance?hl=en&tab=we',
  11 => 'http://blogsearch.google.com/?hl=en&tab=wb',
  12 => 'http://www.youtube.com/?hl=en&tab=w1',
  13 => 'http://www.google.com/calendar/render?hl=en&tab=wc',
  14 => 'http://picasaweb.google.com/home?hl=en&tab=wq',
  15 => 'http://docs.google.com/?hl=en&tab=wo',
  16 => 'http://www.google.com/reader/view/?hl=en&tab=wy',
  17 => 'http://sites.google.com/?hl=en&tab=w3',
  18 => 'http://www.google.com/intl/en/options/',
  19 => 'http://www.google.com/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Den%26source%3Diglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg',
  20 => 'http://www.google.com/intl/en/ads/',
  21 => 'http://www.google.com/services/',
  22 => 'http://www.google.com/intl/en/about.html',
  23 => 'http://www.google.com/intl/en/privacy.html',
  24 => 'http://www.google.com/advanced_search?hl=en',
  25 => 'http://www.google.com/preferences?hl=en',
  26 => 'http://www.google.com/language_tools?hl=en',
)
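
The list contains duplicates and links to other hosts, so a crawler will usually want to filter it. Here is a small sketch that keeps only the unique links for one host. The helper name links_for_host() is made up, and the sample array is a handful of entries copied from the output above.

```php
<?php
/**
 * Hypothetical helper: keep only the unique links whose host matches $host.
 */
function links_for_host(array $links, $host)
{
    $kept = array();
    foreach (array_unique($links) as $link) {
        $linkHost = parse_url($link, PHP_URL_HOST);
        /* skip anything parse_url() cannot find a host in */
        if (is_string($linkHost) && strcasecmp($linkHost, $host) === 0) {
            $kept[] = $link;
        }
    }
    return $kept;
}

/* a few entries copied from the output above, including the duplicate */
$links = array(
    'http://maps.google.com/maps?hl=en&tab=wl',
    'http://www.google.com/intl/en/options/',
    'http://www.youtube.com/?hl=en&tab=w1',
    'http://www.google.com/intl/en/options/',
    'http://www.google.com/services/',
);

print_r(links_for_host($links, 'www.google.com'));
```

After a call to fetchlinks(), $snoopy->results can be passed in place of $links.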

More Options

We can set various other options and values such as:

  • User name/password for basic HTTP authentication
  • Proxy hosts & ports
  • Raw headers
  • Maximum redirects

<?php
/* load the snoopy class and initialize the object */
require('../includes/Snoopy.class.php');
$snoopy = new Snoopy();

/* make the request through a proxy server */
$snoopy->proxy_host = "my.proxy.host";
$snoopy->proxy_port = "8080";

/* change the user agent or referer URL */
$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
$snoopy->referer = "http://www.microsnot.com/";

/* set some cookies */
$snoopy->cookies["SessionID"] = "238472834723489";
$snoopy->cookies["favoriteColor"] = "RED";

/* set some raw headers */
$snoopy->rawheaders["Pragma"] = "no-cache";

/* set some redirect options */
$snoopy->maxredirs = 2;
$snoopy->offsiteok = false; /* do not follow redirects to a different domain */

/* set user/pass for basic HTTP authentication */
$snoopy->user = "me";
$snoopy->pass = "p@ssw0rd";
?>

There are even more options not covered here. To see what those options are, see the Snoopy.class.php file and the readme.txt file contained in the downloaded package.


5 Responses to “How to Post Data and Fetch Remote Pages from PHP Scripts”

  • Granit Luzhnica

    Thanks for the tutorial.
    But posting data over proxy with snoopy is extremely slow.
    I mean it could take till 20 minutes for a post.
    How could this be fixed?

     

  • Andrew

    The problem is likely the proxy server that you’re posting through. Try using the proxy server through another program (like Firefox or Internet Explorer) to test the speed and reliability before using it with Snoopy.

     

  • geby

    Hello,
    your little Tutorial helped me, thanks for that first.
    But I am having trouble with the POST data.
    If I do it as you described, everything works fine for most of the form elements.
    But if the element names or values contain special characters like “[“, “]” or German umlauts like “Ö”, they are ignored.
    Any idea how to handle this? I tried some string functions (htmlentities, etc.) and tried writing &Ouml; instead of “Ö” in my code.
    Any experience with that?
    Greets
    Geby
     
    PS: Plz Excuse my bad englisch 😉

     

  • Andrew

    As long as you have your data in quotes, the special characters should be OK.  Are you getting a PHP error/warning?

     

  • geby

    <?php
    require('Snoopy.class.php');
    $snoopy = new Snoopy();
    $p_data['season'] = "one";
    $p_data['land'] = "Österreich";
    $p_data["row[1]"] = '8';
    $snoopy->submit('http://www.example.com', $p_data);
    echo $snoopy->results;
    ?>
     
    This is what my code looks like. $p_data['season'] = "one"; works fine.
    $p_data['land'] = "Österreich"; and $p_data["row[1]"] = '8'; don't. I think it is because of the "Ö" and the "[", "]".
     

     


About the Author

Andrew has been coding PHP applications since 2006, and has plenty of experience with PHP, MySQL, and Apache. He prefers Ubuntu Linux on his desktop and has plenty of experience managing CentOS web servers. He is the owner of Wells IT Solutions LLC, develops PHP applications full time for anyone who needs them, and provides desktop computer support in his local area. He spends most of his free time exploring new programming concepts and posting on The Webmaster Forums.