Simple Data Analysis 08 | Web Scraper Pagination: Clicking a "Load More" Button

  • October 3, 2019
  • Notes

This is the 8th article in the Simple Data Analysis series.

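In "Web Scraper Pagination: Batch Scraping by Controlling Links", we covered how to scrape data in batches by controlling the page URL.

On some sites, however, you will find that as you scroll down you have to click something like a "load more" button to get further data, while the page URL never changes.

The URL-based approach no longer works on such pages, so we need to simulate clicks on the "load more" button to fetch more data.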

Today's topic is exactly that: using the Element click selector in Web Scraper to simulate clicks on "load more" and load more data.

For this exercise we use the hot articles on the sspai (少数派) site as our practice target. The URL is:

https://sspai.com/tag/%E7%83%AD%E9%97%A8%E6%96%87%E7%AB%A0#home
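
The last path segment of this URL is a percent-encoded tag name. As a quick check of what the link points to, here is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from urllib.parse import unquote, urlsplit

START_URL = "https://sspai.com/tag/%E7%83%AD%E9%97%A8%E6%96%87%E7%AB%A0#home"

parts = urlsplit(START_URL)
# The tag name is the last path segment, percent-encoded as UTF-8.
tag = unquote(parts.path.rsplit("/", 1)[-1])

print(tag)             # 热门文章 ("hot articles")
print(parts.fragment)  # home
```

Either form of the URL can be pasted into Web Scraper; the encoded one is simply what the browser shows when the address is copied.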

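To review the previous section, while simulating click-based pagination we will also scrape several fields at once: author, title, number of likes, and number of comments.

Let's start collecting the data.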

1.Create the sitemap

First, create a new sitemap. Here I name it sspai_hot and set the start URL to https://sspai.com/tag/%E7%83%AD%E9%97%A8%E6%96%87%E7%AB%A0#home.

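2.Create the container selector

From the previous section we know that to scrape several kinds of data with Web Scraper, we first need a container that holds all of them. The second step is therefore to create the container selector.

Note that the Type of this selector must be set to Element click. As the name suggests, this type loads more data by clicking an element.

Compared with the container selectors we created before, this one has an extra field, Click selector, which is used to select the "load more" button.

The Element click type also comes with a few more options, explained one by one below.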

1.Click type

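The click type. click more means clicking the button repeatedly. Since we want to scrape data in bulk, we choose click more here. The other option, click once, clicks only one time.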

2.Click element uniqueness

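This option determines when Web Scraper stops clicking the button. Here we choose Unique Text, which means scraping stops once the button text changes.

A site's data is never endless, so at some point everything has been loaded. When that happens, the text of the "load more" button may change to something like "no more" or "all loaded". Once the text changes, Web scraper knows there is no more data and stops scraping.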

3.Multiple

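We have covered this option before: it controls whether multiple records are selected. Since we are scraping a list of articles, tick it.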

4.Discard initial elements

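Whether to discard the elements that were on the page before the first click. It is mainly used to remove duplicate data and we do not need it here, so choose Never discard.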

5.Delay

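The wait time after each click. Loading data over the network takes a moment, and delay is how long Web Scraper waits for it. A value of at least 2000 is recommended, that is, 2s between clicks; on a slow network, set a larger value.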
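
The way Click type, Click element uniqueness and Delay work together can be illustrated with a toy model. The sketch below is not Web Scraper's implementation: `FakeLoadMorePage` and `scrape_with_click_more` are hypothetical names, and the stop rule (stop when the button text changes or a click brings in nothing new) is an assumption based on the description of the options above.

```python
import time


class FakeLoadMorePage:
    """Hypothetical stand-in for a page with a "load more" button."""

    def __init__(self, total_items, page_size):
        self.total_items = total_items
        self.page_size = page_size
        self.loaded = min(page_size, total_items)

    def button_text(self):
        # The label changes once everything has been loaded.
        return "Load more" if self.loaded < self.total_items else "No more data"

    def click(self):
        self.loaded = min(self.loaded + self.page_size, self.total_items)

    def items(self):
        return list(range(self.loaded))


def scrape_with_click_more(page, delay_ms=2000, sleep=time.sleep, max_clicks=1000):
    """Click repeatedly; stop when the button text changes or nothing new loads."""
    clicks = 0
    while clicks < max_clicks:
        text_before = page.button_text()
        count_before = len(page.items())
        page.click()
        clicks += 1
        sleep(delay_ms / 1000)  # Delay: give the new items time to load
        if page.button_text() != text_before or len(page.items()) == count_before:
            break
    return page.items(), clicks


page = FakeLoadMorePage(total_items=25, page_size=10)
# Pass a no-op sleep so the demo finishes instantly.
items, clicks = scrape_with_click_more(page, delay_ms=2000, sleep=lambda s: None)
print(len(items), clicks)  # 25 2
```

With a delay that is too short, a real page may not have rendered the new items before the next check, which is why a generous Delay is the safer choice.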

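3.Create the child selectors

The child selectors go inside the container and are created the same way as in the previous article, so the steps are not repeated here. This time we need four of them: author, title, number of likes, and number of comments.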
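
Put together, the sitemap from steps 1 to 3 might look like the sketch below, which builds the JSON that Web Scraper's Import Sitemap dialog accepts. The key names follow the export format as I understand it and are worth checking against a sitemap exported from your own installation. Every CSS selector here is a hypothetical placeholder, not taken from the sspai page, and `text_selector` is an illustrative helper.

```python
import json

START_URL = "https://sspai.com/tag/%E7%83%AD%E9%97%A8%E6%96%87%E7%AB%A0#home"


def text_selector(selector_id, css):
    """Build a Text-type child selector nested under the container."""
    return {
        "id": selector_id,
        "type": "SelectorText",
        "parentSelectors": ["container"],
        "selector": css,
        "multiple": False,
        "regex": "",
        "delay": 0,
    }


sitemap = {
    "_id": "sspai_hot",
    "startUrl": [START_URL],
    "selectors": [
        {
            "id": "container",
            "type": "SelectorElementClick",             # Type: Element click
            "parentSelectors": ["_root"],
            "selector": "div.article-card",              # hypothetical CSS
            "clickElementSelector": "div.load-more",     # Click selector, hypothetical CSS
            "clickType": "clickMore",                    # Click type
            "clickElementUniquenessType": "uniqueText",  # Click element uniqueness
            "multiple": True,                            # Multiple
            "discardInitialElements": "do-not-discard",  # Never discard
            "delay": 2000,                               # Delay in milliseconds
        },
        # Hypothetical CSS selectors for the four fields.
        text_selector("author", "span.author"),
        text_selector("title", "div.title"),
        text_selector("likes", "span.like-count"),
        text_selector("comments", "span.comment-count"),
    ],
}

print(json.dumps(sitemap, ensure_ascii=False, indent=2))
```

Replace the placeholder CSS with the selectors you pick by clicking elements in the Web Scraper panel; the structure stays the same.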

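4.Scrape the data

Click Sitemap sspai_hot -> Scrape to start scraping.

Today we learned how to use Web Scraper to scrape pages that load more data on click. In practice you will notice that with this kind of page you cannot control how many records are scraped, unlike Douban TOP250, which has exactly 250 records, no more and no less. In the next article we will look at how to make Web Scraper control the number of records it scrapes.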

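Recommended reading

Simple Data Analysis 04 | Web Scraper First Try: Scraping Douban's Top-Rated Movies

Simple Data Analysis 06 | How to Import a Web Scraper Crawler Written by Someone Else

Simple Data Analysis 07 | Web Scraper: Scraping Multiple Kinds of Content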