Using curl and a user agent string for web scraping, pt. 2: now with PHP

First off, Happy Thanksgiving to everyone, US readers especially. I know I should have written this sooner, but the holiday kept me pretty busy this year, so I hope you can accept my late well wishes.

After you've had your fill of turkey, stuffing, and maybe some pumpkin pie, perhaps you got the chance to play around with what I posted last week. Now let's build on that. Let's change the situation a little and say that a command line utility isn't your thing for this project, and you would like to use PHP instead.

First off, you'll want to make sure that the PHP on your web server has curl compiled in. The easiest way is to call phpinfo() and look for a section named curl that lists cURL support as enabled.

If you see that section, you're good. Otherwise you'll either need to recompile your PHP to include libcurl, or complain to your hosting provider. Afterwards we can accomplish what we want with a couple of lines of code:

<?php
// Send a browser-like user agent, and have curl_exec() return the page
// as a string instead of printing it.
$curl_options = array(
    CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4) Gecko/20091030 Firefox/3.5.4",
    CURLOPT_RETURNTRANSFER => 1
);

$url = "http://profiles.us.playstation.com/playstation/psn/profiles/L_Cypher";

$ch = curl_init($url);
curl_setopt_array($ch, $curl_options);
$content = curl_exec($ch);
curl_close($ch);
?>
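
If you'd rather have the script tell you when something goes wrong, here is a minimal sketch of the same request wrapped in a function. The name fetch_page, the 10 second connect timeout, and the choice to throw a RuntimeException are assumptions made for illustration; none of them come from the script above. It also assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper (not part of the original script): fetch a page with a
// custom user agent and throw an exception instead of quietly returning false.
function fetch_page(string $url, string $userAgent): string
{
    // The same thing you would check by eye in the phpinfo() output.
    if (!extension_loaded('curl')) {
        throw new RuntimeException('The curl extension is not loaded.');
    }

    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('curl_init() failed.');
    }

    curl_setopt_array($ch, array(
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 10, // seconds; an arbitrary choice
    ));

    $content = curl_exec($ch);
    $error   = curl_error($ch); // read the message before closing the handle
    curl_close($ch);

    if ($content === false) {
        throw new RuntimeException('curl_exec() failed: ' . $error);
    }

    return $content;
}

// Usage, with the URL and user agent from the post:
// $content = fetch_page(
//     "http://profiles.us.playstation.com/playstation/psn/profiles/L_Cypher",
//     "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4) Gecko/20091030 Firefox/3.5.4"
// );
```

Throwing on failure means a missing extension or a dead URL shows up as an error message rather than as an empty $content further down the line.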

Now your PHP code has a variable called $content that contains all of the HTML from the page. How you want to parse information out of it is up to you. Of course, if anyone has questions, I will be happy to help out as best I can.
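
As one possible starting point for the parsing step, here is a rough sketch that pulls the page title out of an HTML string with PHP's built-in DOMDocument class. The function name extract_title and the sample HTML are made up for illustration, and it assumes the dom extension is enabled (it usually is) and PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: return the text of the first <title> element,
// or null when the HTML is empty or has no title.
function extract_title(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() refuses an empty string
    }

    // Real-world pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length === 0) {
        return null;
    }

    return trim($titles->item(0)->textContent);
}

// Stand-in for the $content string that curl_exec() returns.
$sample = '<html><head><title> Example profile </title></head><body><p>Hi</p></body></html>';
echo extract_title($sample), "\n"; // prints "Example profile"
```

The same approach works for other tags: swap 'title' for whatever element holds the data you are after, or move on to DOMXPath if you need to match on attributes.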

Web site design

Yes ... the design clearly needs to be changed :)
What would be brighter , nebudu (

If I understand correctly, it's not possible to get the information with cURL. Maybe with file_get_contents?

Problem with your technique

Hi,
Great method, but I think that it does not work anymore.

You can try it on my test website. I copied your code and added: "echo $content;"

Hi sebqas,

Sorry about that, but you are correct. Sony changed up their website a little while ago (maybe 6-8 months or so), and they put it into a format that doesn't lend itself to easy web scraping. I added an update to this post, but sadly I forgot to update this one; sorry.

If you made it this far down into the article, hopefully you liked it enough to share it with your friends. Thanks if you do, I appreciate it.
