[Python] Getting a Crawler Past a CAPTCHA

1. Download the CAPTCHA image to a local file

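    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    # Fetch the HTML of the page that shows the CAPTCHA
    url = 'http://www.example.com/a.html'
    resp = requests.get(url)
    soup = BeautifulSoup(resp.content.decode('UTF-8'), 'html.parser')
    # Find the CAPTCHA image tag and read its address
    src = soup.select_one('div.captcha-row img')['src']
    # Download the image to a local file (urljoin also handles a relative src)
    resp = requests.get(urljoin(url, src))
    with open('../images/verify.png', 'wb') as f:
        f.write(resp.content)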
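Many sites tie a CAPTCHA to the visitor's session cookie, so the page request, the image download, and the later form submission generally need to share cookies. The sketch below is one possible way to do that, not the original author's method: the helper name `save_captcha` and its parameters are hypothetical, and it only assumes that `session` offers `get()` the way `requests.Session` does.

```python
import os
from urllib.parse import urljoin


def save_captcha(session, page_url, src, out_path):
    """Download the image at `src` (possibly relative to `page_url`) via `session`.

    `session` is assumed to behave like requests.Session, so cookies set by
    the CAPTCHA page are sent again with the image request.
    """
    img_url = urljoin(page_url, src)  # an absolute src passes through unchanged
    resp = session.get(img_url, timeout=10)
    resp.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    os.makedirs(os.path.dirname(out_path) or '.', exist_ok=True)
    with open(out_path, 'wb') as f:
        f.write(resp.content)
    return img_url


# Usage sketch (assumes `requests` is installed and `src` was extracted
# from the page with BeautifulSoup as in the steps above):
#   import requests
#   session = requests.Session()
#   page = session.get('http://www.example.com/a.html')
#   save_captcha(session, page.url, src, '../images/verify.png')
```

Reusing the same `session` for the final POST keeps the cookie that the server presumably uses to remember which CAPTCHA it issued.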

2. Recognize the CAPTCHA

    pip install ddddocr

    import ddddocr

    # Read the saved image and run OCR on it
    ocr = ddddocr.DdddOcr()
    with open('../images/verify.png', 'rb') as f:
        img = f.read()
    code = ocr.classification(img)
    print(code)
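OCR output is not always clean, so it can help to sanity-check the recognized text before submitting it. The following is a minimal sketch under stated assumptions: the helper `normalize_code` is hypothetical, and the expected format (4 ASCII letters or digits) is only an example that has to be adapted to the target site.

```python
import re


def normalize_code(raw, length=4):
    """Return the cleaned code, or None if it does not look like a valid answer.

    Assumption: the CAPTCHA consists of exactly `length` ASCII letters/digits.
    """
    cleaned = re.sub(r'\s+', '', raw or '')  # drop spaces and newlines
    if re.fullmatch(r'[0-9A-Za-z]{%d}' % length, cleaned):
        return cleaned
    return None


# Usage sketch: code = normalize_code(ocr.classification(img))
# A None result suggests fetching a fresh CAPTCHA instead of submitting.
```

Rejecting obviously malformed results locally avoids spending a submission attempt on an answer that cannot be right.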

3. Submit the CAPTCHA

    # Get the token; the CAPTCHA form usually carries a hidden token field
    token = soup.find('input', {'name': 'csrfToken'}).get('value')
    # URL that the submit button posts to
    verify_url = 'https://www.example.com/verify'
    # The exact form fields can be checked in the browser when submitting (F12)
    data = {'vcode': code, 'token': token, 'btnPost': ''}
    # Request headers (copied from the request shown in F12)
    headers = {
        'content-type': 'application/x-www-form-urlencoded',
        'user-agent': 'Mozilla/5.0 (Macintosh;) AppleWebKit/537.36 (KHTML, like Gecko) Chrome',
    }
    response = requests.post(verify_url, data=data, headers=headers)
    if response.status_code == 200:
        print('CAPTCHA verification - success')
    else:
        print('CAPTCHA verification - fail')
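A status code of 200 does not necessarily mean the CAPTCHA was accepted, since some sites return 200 together with an error message, and a single OCR attempt may simply be wrong. The sketch below shows one possible way to check the response body and retry with a fresh image. Everything in it is an assumption for illustration: the function names, the callables passed in, and the failure markers, which have to be taken from the real site's error page.

```python
def looks_accepted(status_code, body, failure_markers):
    """Heuristic success check: HTTP 200 and no known failure text in the body."""
    if status_code != 200:
        return False
    lowered = body.lower()
    return not any(marker.lower() in lowered for marker in failure_markers)


def solve_with_retries(fetch_image, recognize, submit, max_attempts=3):
    """Fetch a new CAPTCHA, recognize it, and submit, up to `max_attempts` times.

    fetch_image() -> bytes, recognize(bytes) -> str, submit(str) -> bool are
    caller-supplied callables. Returns the attempt number that succeeded,
    or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        code = recognize(fetch_image())
        if code and submit(code):
            return attempt
    return None


# Usage sketch, wiring in the steps above (the marker text is hypothetical):
#   def submit(code):
#       data = {'vcode': code, 'token': token, 'btnPost': ''}
#       r = requests.post(verify_url, data=data, headers=headers)
#       return looks_accepted(r.status_code, r.text, ('incorrect',))
```

Passing the three steps in as callables keeps the retry logic independent of any particular site and makes it easy to try out with stand-in functions.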

Reposted from: https://blog.csdn.net/weixin_42364929/article/details/143657691
Copyright belongs to the original author, 茉菇.
