爬虫助手WebScraper中文网

How do I fix a case where some pages can be scraped but others cannot?

Posted on 2022-5-16 16:00:12

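I am scraping the list of people who liked the posts on LinkedIn company pages. For some company pages (those with few posts) the like lists are scraped correctly, but for others (those with many posts) nothing is scraped at all, and I can't tell what is going wrong. Could someone please take a look?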

PS: A VPN may be needed to access the company pages.
Works: https://www.linkedin.com/company/robustel/posts/?feedView=all
Fails: https://www.linkedin.com/company/milesightiot/posts/?feedView=all

{"_id":"likelist","startUrl":["https://www.linkedin.com/company/milesightiot/posts/?feedView=all"],"selectors":[{"delay":0,"id":"scroll","multiple":true,"parentSelectors":["_root"],"selector":".feed-shared-update-v2 > div","type":"SelectorElementScroll"},{"clickElementSelector":"span.social-details-social-counts__social-proof-text,span.social-details-social-counts__reactions-count","clickElementUniquenessType":"uniqueCSSSelector","clickType":"clickOnce","delay":0,"discardInitialElements":"do-not-discard","id":"dianzan","multiple":true,"parentSelectors":["_root"],"selector":"div.artdeco-modal","type":"SelectorElementClick"},{"clickElementSelector":"button:contains('Show more results')","clickElementUniquenessType":"uniqueText","clickType":"clickMore","delay":1000,"discardInitialElements":"do-not-discard","id":"showmore","multiple":true,"parentSelectors":["dianzan"],"selector":"div.artdeco-entity-lockup","type":"SelectorElementClick"},{"delay":0,"id":"profile","multiple":true,"parentSelectors":["dianzan"],"selector":"li","type":"SelectorElement"},{"clickElementSelector":"button.artdeco-modal__dismiss","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":1000,"discardInitialElements":"do-not-discard","id":"close","multiple":false,"parentSelectors":["dianzan"],"selector":".artdeco-modal__dismiss li-icon","type":"SelectorElementClick"},{"delay":0,"id":"name","multiple":false,"parentSelectors":["profile"],"regex":"","selector":".artdeco-entity-lockup__title span","type":"SelectorText"},{"delay":0,"id":"headline","multiple":false,"parentSelectors":["profile"],"regex":"","selector":"div.artdeco-entity-lockup__caption","type":"SelectorText"},{"delay":0,"id":"link","multiple":false,"parentSelectors":["profile"],"selector":"a","type":"SelectorLink"},{"delay":0,"id":"avatar","multiple":false,"parentSelectors":["profile"],"selector":"img.ivm-view-attr__img--centered","type":"SelectorImage"}]}
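
Since the sitemap is pasted as a single line, its parent/child hierarchy is hard to read. The sketch below is one possible way to print that hierarchy as an indented outline; it is a debugging aid, not a fix. It uses a trimmed copy of the sitemap that keeps only `id`, `type`, and `parentSelectors` (the ids and types are taken from the post; the helper name `selector_tree` is made up for illustration).

```python
import json
from collections import defaultdict

# Trimmed copy of the posted sitemap: only the fields that define the hierarchy.
SITEMAP = json.loads("""
{"_id": "likelist", "selectors": [
  {"id": "scroll",   "type": "SelectorElementScroll", "parentSelectors": ["_root"]},
  {"id": "dianzan",  "type": "SelectorElementClick",  "parentSelectors": ["_root"]},
  {"id": "showmore", "type": "SelectorElementClick",  "parentSelectors": ["dianzan"]},
  {"id": "profile",  "type": "SelectorElement",       "parentSelectors": ["dianzan"]},
  {"id": "close",    "type": "SelectorElementClick",  "parentSelectors": ["dianzan"]},
  {"id": "name",     "type": "SelectorText",          "parentSelectors": ["profile"]},
  {"id": "headline", "type": "SelectorText",          "parentSelectors": ["profile"]},
  {"id": "link",     "type": "SelectorLink",          "parentSelectors": ["profile"]},
  {"id": "avatar",   "type": "SelectorImage",         "parentSelectors": ["profile"]}
]}
""")


def selector_tree(sitemap):
    """Return the selector hierarchy as a list of indented lines (hypothetical helper)."""
    # Map each parent id to the selectors that list it in parentSelectors.
    children = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            children[parent].append(sel)

    lines = []

    def walk(parent_id, depth, path):
        for sel in children.get(parent_id, []):
            if sel["id"] in path:
                continue  # guard against selectors that reference each other
            lines.append("  " * depth + f'{sel["id"]} ({sel["type"]})')
            walk(sel["id"], depth + 1, path | {sel["id"]})

    walk("_root", 0, frozenset())
    return lines


if __name__ == "__main__":
    print("\n".join(selector_tree(SITEMAP)))
```

The outline shows that `scroll` has no child selectors and sits next to `dianzan` under `_root`, while `showmore`, `profile`, and `close` all hang off `dianzan`. Whether this layout is related to the failure on pages with many posts is not established by the thread.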

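Original poster | Posted on 2022-5-16 16:29:47
A quick outline of the structure: first, keep loading new posts on the current page until there are none left; then click each post's like count to open a modal; inside the modal, load everyone who liked the post (by clicking the hidden 'Show more results' button); finally, scrape each person's details.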
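
The intended flow can be sketched as plain control flow. This is only a model of the steps described above, not Web Scraper's actual execution: `FakeFeed`, `FakePost`, and `collect_likers` are hypothetical in-memory stand-ins, and no browser or network access is involved.

```python
from dataclasses import dataclass, field


@dataclass
class FakePost:
    """Hypothetical stand-in for one post; each inner list is one page of likers."""
    liker_pages: list


@dataclass
class FakeFeed:
    """Hypothetical stand-in for a company feed that reveals posts in batches."""
    batches: list
    loaded: list = field(default_factory=list)

    def load_more(self):
        """Simulate one scroll: reveal the next batch, or return False if none is left."""
        if not self.batches:
            return False
        self.loaded.extend(self.batches.pop(0))
        return True


def collect_likers(feed):
    # Step 1: keep loading posts until nothing new appears.
    while feed.load_more():
        pass

    results = []
    # Step 2: for each loaded post, open its like list.
    for post in feed.loaded:
        # Step 3: each page stands for one click on 'Show more results'.
        for page in post.liker_pages:
            # Step 4: record every person listed on that page.
            results.extend(page)
    return results


if __name__ == "__main__":
    feed = FakeFeed(batches=[
        [FakePost([["A", "B"], ["C"]])],  # first scroll reveals one post
        [FakePost([["D"]])],              # second scroll reveals another
    ])
    print(collect_likers(feed))
```

In this model every step finishes before the next begins, which is the ordering the description assumes. When comparing against a real run, one thing worth checking is whether the scraper also waits for step 1 to finish before moving on to step 2.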