Issues using Crawlera with Selenium

Posted almost 5 years ago by Chuan H

Hi,

I followed the instructions to implement Crawlera with Selenium.

With Chrome, I get the warning NET::ERR_CERT_AUTHORITY_INVALID.

With Firefox, the browser goes to the URL with a security exception, but the page loads very slowly and then times out.

My code is below:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.proxy import Proxy, ProxyType

# Address of the local headless proxy that forwards requests to Crawlera
headless_proxy = "localhost:3128"

# Chrome
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': headless_proxy,
    'ftpProxy': headless_proxy,
    'sslProxy': headless_proxy,
    'noProxy': ''
})
chrome_options = Options()
chrome_options.add_argument('--start-fullscreen')
chrome_options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors"])
capabilities = dict(DesiredCapabilities.CHROME)
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(desired_capabilities=capabilities, executable_path='chromedriver', options=chrome_options)
driver.set_page_load_timeout(600)

# Firefox
# Copy the defaults so the shared DesiredCapabilities.FIREFOX dict is not modified
firefox_capabilities = DesiredCapabilities.FIREFOX.copy()
firefox_capabilities['marionette'] = True
firefox_capabilities['proxy'] = {
    "proxyType": "MANUAL",
    "httpProxy": headless_proxy,
    "ftpProxy": headless_proxy,
    "sslProxy": headless_proxy
}
driver = webdriver.Firefox(capabilities=firefox_capabilities)
driver.set_page_load_timeout(600)
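
One possible cause of NET::ERR_CERT_AUTHORITY_INVALID, assuming the local headless proxy re-signs HTTPS traffic with its own certificate, is that the browser does not trust that certificate. Note that the "excludeSwitches" option in the Chrome code above excludes the ignore-certificate-errors switch rather than adding it. The sketch below shows the settings that tell each browser to accept such certificates: the Chrome switches --proxy-server and --ignore-certificate-errors, and the W3C WebDriver capability acceptInsecureCerts for Firefox. The helper names (chrome_arguments, firefox_capabilities, start_chrome) are made up for illustration, and this is untested against Crawlera itself.

```python
from typing import Dict, List

# Assumption: same local proxy address as in the post.
HEADLESS_PROXY = "localhost:3128"


def chrome_arguments(proxy: str) -> List[str]:
    """Chrome switches: route traffic via the proxy, skip certificate validation."""
    return [
        f"--proxy-server={proxy}",
        "--ignore-certificate-errors",
    ]


def firefox_capabilities(proxy: str) -> Dict[str, object]:
    """W3C WebDriver capabilities: manual proxy, insecure certificates accepted."""
    return {
        "browserName": "firefox",
        "acceptInsecureCerts": True,
        "proxy": {
            "proxyType": "manual",
            "httpProxy": proxy,
            "sslProxy": proxy,
        },
    }


def start_chrome(proxy: str = HEADLESS_PROXY):
    """Start Chrome with the switches above (needs selenium and chromedriver)."""
    # Imported lazily so the two helpers above work without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in chrome_arguments(proxy):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Skipping certificate validation is only reasonable when all traffic goes through a proxy you control; installing the proxy's CA certificate in the browser would be the stricter alternative.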

I am using Scrapy with Crawlera and have also tried Splash with scrapy-splash, but that produces the warning "scrapy-splash Call to deprecated function to_native_str. Use to_unicode instead". I followed these instructions: https://support.scrapinghub.com/support/solutions/articles/22000234854-how-to-use-crawlera-with-headless-browsers
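
The to_native_str message is a deprecation warning rather than an error, so on its own it should not stop a crawl. If it only clutters the log, a minimal sketch of filtering it with Python's standard warnings module could look like the following. The function name silence_to_native_str_warning is made up for illustration, and the message pattern is copied from the warning quoted above.

```python
import warnings

# Assumption: pattern taken from the warning text quoted in the post.
# The leading ".*" allows for any prefix before the message.
DEPRECATION_PATTERN = r".*Call to deprecated function to_native_str"


def silence_to_native_str_warning() -> None:
    """Ignore only warnings whose message matches the pattern above."""
    warnings.filterwarnings("ignore", message=DEPRECATION_PATTERN)
```

Calling silence_to_native_str_warning() once before the crawl starts leaves every other warning visible.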

I just need a solution that works for either Scrapy + Crawlera + rendering or Crawlera + Selenium.
