Using Zyte Smart Proxy Manager with PHP

Modified on Tue, 18 Jan, 2022 at 1:27 PM

cURL


Using the PHP binding for the libcurl library:


<?php

$ch = curl_init();

$url = 'https://httpbin.zyte.com/get';
$proxy = 'proxy.zyte.com:8011';
$proxy_auth = '<API KEY>:';

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxy_auth);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_TIMEOUT, 180);
curl_setopt($ch, CURLOPT_CAINFO, '/path/to/crawlera-ca.crt'); //required for HTTPS
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1); //required for HTTPS

$scraped_page = curl_exec($ch);

if($scraped_page === false)
{
    echo 'cURL error: ' . curl_error($ch);
}
else
{
    echo $scraped_page;
}

curl_close($ch);

?>


Be sure to download the certificate from the settings page of your Zyte Smart Proxy Manager (formerly Crawlera) account (https://app.zyte.com/o/<ORG_ID>/crawlera/setup) and set the correct path to the file in your script.


Refer to the curl_multi_exec function to take advantage of Smart Proxy Manager's concurrency feature and process requests in parallel (within the limits set for your Smart Proxy Manager plan).
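A minimal sketch of parallel requests with the curl_multi functions, reusing the proxy settings from the example above. The helper names build_proxy_handle and fetch_all are hypothetical (they are not part of cURL or of Smart Proxy Manager), and the number of URLs you pass in should stay within your plan's concurrency limit:

```php
<?php

// Hypothetical helper: one cURL handle configured like the single-request example.
function build_proxy_handle(string $url, string $proxy, string $proxy_auth, string $ca_path)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxy_auth);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($ch, CURLOPT_TIMEOUT, 180);
    curl_setopt($ch, CURLOPT_CAINFO, $ca_path); // required for HTTPS
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1); // required for HTTPS
    return $ch;
}

// Hypothetical helper: fetches all URLs in parallel and returns
// an array of response bodies (or error messages) keyed like $urls.
function fetch_all(array $urls, string $proxy, string $proxy_auth, string $ca_path): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $handles[$key] = build_proxy_handle($url, $proxy, $proxy_auth, $ca_path);
        curl_multi_add_handle($mh, $handles[$key]);
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            // Block until there is activity instead of busy-looping.
            curl_multi_select($mh);
        }
    } while ($running && $status == CURLM_OK);

    // Each finished transfer is reported once, together with its result code.
    $results = [];
    while (($info = curl_multi_info_read($mh)) !== false) {
        $ch = $info['handle'];
        $key = array_search($ch, $handles, true);
        $results[$key] = $info['result'] === CURLE_OK
            ? curl_multi_getcontent($ch)
            : 'cURL error: ' . curl_strerror($info['result']);
    }

    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

For example, fetch_all(['a' => 'https://httpbin.zyte.com/get', 'b' => 'https://httpbin.zyte.com/headers'], 'proxy.zyte.com:8011', '<API KEY>:', '/path/to/crawlera-ca.crt') would return an array with the keys 'a' and 'b', in the order the transfers completed.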


Guzzle


Using Guzzle, a PHP HTTP client, within the Symfony framework:


<?php

namespace AppBundle\Controller;

use GuzzleHttp\Client;
use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;
use Symfony\Component\HttpFoundation\Response;

class CrawleraController extends Controller
{
    /**
     * @Route("/crawlera", name="crawlera")
     */
    
    public function crawlAction()
    {
        $url = 'https://twitter.com';
        $client = new Client(['base_uri' => $url]);
        $crawler = $client->get($url, [
            'proxy' => 'http://<API KEY>:@proxy.zyte.com:8011',
            'verify' => '/path/to/crawlera-ca.crt', // required for HTTPS
        ])->getBody();

        return new Response(
            '<html><body> '.$crawler.' </body></html>'
        );
    }
}


Another Guzzle example, this time as a standalone script that sends custom request headers:


<?php

use GuzzleHttp\Client as GuzzleClient;

$proxy_host = 'proxy.zyte.com';
$proxy_port = '8011';
$proxy_user = '<API KEY>';
$proxy_pass = '';
$proxy_url = "http://{$proxy_user}:{$proxy_pass}@{$proxy_host}:{$proxy_port}";

$url = 'https://httpbin.org/headers';

$guzzle_client = new GuzzleClient();
$res = $guzzle_client->request('GET', $url, [
    'proxy' => $proxy_url,
    'verify' => '/path/to/crawlera-ca.crt', // required for HTTPS
    'headers' => [
        'X-Crawlera-Cookies' => 'disable',
        'Accept-Encoding' => 'gzip, deflate, br',
    ]
]);

echo $res->getBody();

?>

To save an image, write the response body to a file. Here $res is the Guzzle response returned by the request above:

$stream = \GuzzleHttp\Psr7\Utils::streamFor($res->getBody());

file_put_contents('test.jpg', $stream->getContents());


