Using Crawlera with browsermob

I want to use Crawlera with Selenium and browsermob proxy, similar to the polipo post. Here is my code, which times out on all requests:

 

import os
import browsermobproxy
from selenium import webdriver

binary_path = os.path.expanduser('~/browsermob-proxy-2.1.4/bin/browsermob-proxy')
server = browsermobproxy.Server(binary_path)
server.start()
mob_proxy = 'http://login:password@proxy.crawlera.com:8010'
mob = server.create_proxy(params={'httpProxy': mob_proxy})

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=' + mob.proxy)
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get('http://google.com')

 

Any ideas?


Similar code with urllib2 works fine:


  

import urllib2

login = 'http://login:password@proxy.crawlera.com:8010'
proxy = urllib2.ProxyHandler({'http': login})
auth = urllib2.HTTPBasicAuthHandler()
opener = urllib2.build_opener(proxy, auth, urllib2.HTTPHandler)
urllib2.install_opener(opener)
conn = urllib2.urlopen('http://google.com')
return_str = conn.read()

  


Best Answer

Sorry, we don't have a solution for this question. No one from the community has answered. You can try posting your question on Stack Overflow: https://stackoverflow.com/questions/tagged/scrapinghub


Can this be marked as unanswered?


Can we get some documentation on how to use browsermob instead of polipo for Selenium and Python? Polipo is deprecated, and browsermob lets you control headers, so we could potentially use a Crawlera session.
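A minimal sketch of the header idea above, not a tested recipe: the Python browsermob-proxy client has a `headers()` method that adds headers to every request passing through the proxy, and Crawlera sessions are controlled with the `X-Crawlera-Session` header. The helper names `crawlera_session_headers` and `apply_crawlera_session` are hypothetical, and this assumes the upstream connection to Crawlera already works.

```python
def crawlera_session_headers(session_id=None):
    """Build the header dict for a Crawlera session.

    With no session_id, ask Crawlera to create a new session;
    otherwise reuse the given session id.
    """
    return {'X-Crawlera-Session': session_id if session_id else 'create'}


def apply_crawlera_session(mob, session_id=None):
    """Hypothetical helper: attach the session header to a browsermob client.

    `mob` is expected to be the object returned by server.create_proxy().
    """
    headers = crawlera_session_headers(session_id)
    mob.headers(headers)  # browsermob adds these to every proxied request
    return headers
```

After the first request, the session id that Crawlera returns in the response would have to be read back (for example from the HAR) and passed in as `session_id` for later requests.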

Does browsermob support proxy chaining, or an upstream proxy with authentication?

Apparently it does: https://github.com/lightbody/browsermob-proxy#rest-api


"to authenticate with the chained proxy"

I don't think that works; it doesn't let you set a parent proxy. I believe those settings are for when you are behind a proxy.

Maybe a working Java example will be helpful:

// Chain browsermob to Crawlera as the upstream proxy.
BrowserMobProxy proxy = new BrowserMobProxyServer();
proxy.setChainedProxy(InetSocketAddress.createUnresolved("proxy.crawlera.com", 8010));
// Crawlera takes the API key as the username, with an empty password.
proxy.chainedProxyAuthorization("MY_API_KEY", "", AuthType.BASIC);
proxy.start(0);  // port 0 picks any free local port
// Point Chrome at the local browsermob instance.
Proxy seleniumProxy = ClientUtil.createSeleniumProxy(proxy);
ChromeOptions options = new ChromeOptions();
options.setCapability(CapabilityType.PROXY, seleniumProxy);
proxy.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT,....);
proxy.newHar("foo");
WebDriver driver = new ChromeDriver(options);
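For Python, a rough analogue of the Java example might look like the sketch below. It assumes the browsermob REST API expects `httpProxy` as a bare `host:port` and takes the upstream credentials separately as `proxyUsername` and `proxyPassword` (the parameters the README describes as authenticating with the chained proxy), so a full `http://login:password@host:port` URL is split apart first. The helper names `upstream_proxy_params` and `start_chained_browser` are hypothetical, and this has not been verified against Crawlera.

```python
from urllib.parse import urlsplit


def upstream_proxy_params(proxy_url):
    """Split an upstream proxy URL into browsermob REST parameters.

    Assumption: httpProxy takes host:port only, and credentials go in
    proxyUsername / proxyPassword.
    """
    parts = urlsplit(proxy_url)
    params = {'httpProxy': '%s:%d' % (parts.hostname, parts.port)}
    if parts.username:
        params['proxyUsername'] = parts.username
        params['proxyPassword'] = parts.password or ''
    return params


def start_chained_browser(binary_path, proxy_url):
    """Sketch: start browsermob chained to an upstream proxy, then Chrome."""
    # Imported here so the helper above can be used without these packages.
    import browsermobproxy
    from selenium import webdriver

    server = browsermobproxy.Server(binary_path)
    server.start()
    mob = server.create_proxy(params=upstream_proxy_params(proxy_url))

    options = webdriver.ChromeOptions()
    options.add_argument('--proxy-server=' + mob.proxy)
    driver = webdriver.Chrome(options=options)
    return server, mob, driver
```

With a Crawlera API key as the username and an empty password, the URL would be `http://MY_API_KEY:@proxy.crawlera.com:8010`.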
