A tip for everyone writing web scrapers

  • October 6, 2019
  • Notes

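I recently stumbled on a really handy site that turns a copied browser request into code with all of its request headers filled in. That saves a lot of the time we would otherwise spend collecting headers by hand, so let's take a look at the site and how to use it!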

URL: https://curl.trillworks.com/

Opening the URL brings up a page that includes usage instructions:

Get a curl command from Chrome

1) Open the network tab in Chrome DevTools (Cmd + Opt + I)

2) Control-click a request and navigate to "Copy as cURL".

3) Paste it in the curl command box.

Let's try it with Douban Movies:

Select the circled request, right-click it, choose "Copy", then choose "Copy as cURL".

Then paste the copied content into the box:

The generated code:

import requests

headers = {
    'Origin': 'https://movie.douban.com',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
    'Accept': '*/*',
    'Referer': 'https://movie.douban.com/',
    'Connection': 'keep-alive',
}

params = (
    ('include', 'anony_home'),
)

response = requests.get('https://m.douban.com/j/puppy/frodo_landing', headers=headers, params=params)

print(response.text)

# NB. Original query string below. It seems impossible to parse and
# reproduce query strings 100% accurately so the one below is given
# in case the reproduced version is not "correct".
# response = requests.get('https://m.douban.com/j/puppy/frodo_landing?include=anony_home', headers=headers)
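
The NB comment in the generated code warns that the rebuilt query string may not match the original exactly. One quick way to check this without sending a request is to rebuild the URL locally and compare it with the original. The sketch below uses the standard library's `urlencode` as a stand-in for the encoding that `requests` performs (an assumption, not something the site documents), and the variable names are made up for illustration:

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# Values copied from the generated code above.
base_url = 'https://m.douban.com/j/puppy/frodo_landing'
params = (
    ('include', 'anony_home'),
)

# The original URL from the commented-out fallback line.
original_url = 'https://m.douban.com/j/puppy/frodo_landing?include=anony_home'

# Rebuild the URL by URL-encoding the parameters (stand-in for requests' own encoding).
rebuilt_url = base_url + '?' + urlencode(params)

# Compare the decoded key/value pairs as well as the raw strings.
same_pairs = (
    parse_qsl(urlsplit(rebuilt_url).query)
    == parse_qsl(urlsplit(original_url).query)
)

print(rebuilt_url)
print(rebuilt_url == original_url, same_pairs)
```

If the two URLs differ, falling back to the commented-out line with the original query string is the safer choice.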

Pretty convenient, isn't it?

No more hunting down the headers one by one by hand!
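
To get a rough feel for what such a conversion involves, here is a minimal sketch that pulls the URL and headers out of a simple curl command by hand. It is not the site's actual implementation: the `curl_command` string is a shortened, hypothetical example assembled from the request above, `parse_curl` is a made-up helper, and it only handles `-H`/`--header` flags, skipping any other flag on the assumption that it takes no argument.

```python
import shlex

# Hypothetical, shortened example of a command copied via "Copy as cURL".
curl_command = (
    "curl 'https://m.douban.com/j/puppy/frodo_landing?include=anony_home' "
    "-H 'Referer: https://movie.douban.com/' "
    "-H 'Accept: */*' --compressed"
)


def parse_curl(command):
    """Return (url, headers) from a simple `curl URL -H 'Name: value'` command."""
    tokens = shlex.split(command)  # shell-style splitting that respects quotes
    url = None
    headers = {}
    i = 1  # skip the leading "curl"
    while i < len(tokens):
        token = tokens[i]
        if token in ('-H', '--header'):
            # Split "Name: value" at the first colon only, since values may contain colons.
            name, _, value = tokens[i + 1].partition(':')
            headers[name.strip()] = value.strip()
            i += 2
        elif token.startswith('-'):
            i += 1  # skip other flags such as --compressed (assumed to take no argument)
        else:
            url = token
            i += 1
    return url, headers


url, headers = parse_curl(curl_command)
print(url)
print(headers)
```

The resulting `headers` dict can be passed to `requests.get(url, headers=headers)` in the same way as in the generated code.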