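[Original work. Please do not plagiarize; credit the source when reposting. Infringement will be pursued.]
Overview
This article uses the open-source Tesseract OCR library and tries three ways to build a browser extension that recognizes a character-based CAPTCHA, fills it in, and submits the form. The result is not general-purpose: you will need to adapt it to your own page, but the approach should carry over.
- Method 1: run the JS directly in the browser console;
- Method 2: move the script from method 1 into a browser extension that injects it into specific pages, with the CDN resources replaced by offline copies (pure JS);
- Method 3: method 2 can be blocked by the browser's Content Security Policy (CSP), which refuses to load external scripts; this article works around that by having the JS call an external service implemented with Flask (Python + JS).
Assume the CAPTCHA form looks like the following. It is used throughout the article and will not be repeated.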
<div class="captcha">
<div class="api">
<img id="captchaImg" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAARgAAABQCAYAAADC8mo5AAALNUlEQVR42u2dC/BXRRXH9w/9fUCCSqig4x9GSEUdDZW/KZLoiDo+Ssc3UPiAVKASFdAMU16CoqiAGJhhkpqDNEWoZD7gj49QCo2wEnyAYFmYgiCi5p65h5H5ec6P3713d+/D72dmx5H/vbtn9+zZu4+z52cMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAXLDYpv97TmM8yPCBTW/xu9NsusCmVh7baQ+b3ldkua4g+uiSQx1MFvJd4Vh35yny7wnz90u9TZsCDDCnBpJhnU232bSTh7a6p0q5swugj//ZVJdDHSwU8pjlWHfjhDL+DfP3T9cAgwultoFlWG7TQQ7b6TCbPq1S3ooC6OPRHOqgjgekynevddzP5wllPAbz98/FAQaXVzOSgabueztoozrlK1uZWudcHz/NoQ46K++d7LifvyOUMQ7m758pQsP/NgcyNG3jnRbcOckwXqrSwec6kK9PjcZ0dIH1kZUOzlLeae+wbnspZZwL8/fPc0LDX58DGSbHeL+5TbdW6eDfTCFbS5tWVeS3waZXhHIGF1gfWelgjPDsvxzX7VRFpn1h/n5pzsZS2fDfyYEM30+Q12+UjnRLCvlGKQYvnXzcXVB9ZKmDuTH3ipIwQihjvU3NMAT45UClMzTkQIYjEuTVRcnrLwll62DTxoq8VvHS4BKhnBcKqo8sdbBaeHas4/rNFsp4Bubvn+8KDb82BzLQac1XE+b3qpDfewnzmiXk1Zf/dpTwNxqMvlJAfWSlg7bKYHS24/q9lnL5BxIyUWj4P+ZAhn+myO8RpdPWx8ynp5DHn8znfiStlXIOKKA+stJBL+W5zg7rtotSRn+Yv3/mCw1/cw5kSONkNUvpUC1i7klIpyJHVjz3hvBM7wLqIysdDDOyk16dw7r1VGQ5DObvlzqetkoGsnOgDTBNhhEp8nxCyO+TmJ32MiGP+4Xn5gjP3eRBH2XVwf3Ccwsc12+IUMZmm3bAEOCXrxv9rsq7vAan/5I35iITeT1ShyB/iVGsuH42nWZTd36P7ups70CGNKcmS4X83og5pf6P+eKxtOQsNlYoa54HfWTRD0LoYJnw3O2O6/dLoYyXYf7+OVdo+A28Tq7mEl/rBbiVNi2x6UmeMtNFuPE2DbdpgE1n2jRSeb9jwjq15i9lGme724X3RyrPnm/c3W/R9NE8g34QQgc72vSx8NwFjuv3V6GMe2H+/hkvNPxzvDxa62CQSZo+5S8bueb/zqYZJtqEpCn7IDbqE23qZlMnE92G3WKEZyh5Dq+xTbrw9LnS1b2l8vxBSnntHOoji37wfop9kFp10Kg8d4jDuu2gDGKXw/z987jQ8FO3+jvtwezKRtyNjZquvA+06Scm8tqcwYPAQh4U3rbpowwHps3K3x7n5R05wt1hojsoFF7hKh60LuQv+Z+Vzrgrf3ErqVfqe5IHfYTsB00p8mtSdNCp4jnJj4huc2/nsG7dFFl6wvz981/PBj+Jv0bH8nKIjgWHsXFP42VTVoNRmkGMln/v8J7CMiN7wC7ipdaNPPO6kjeOac+KfDxOsek4E7nPH2yiY9l3PcjbK2E/mJKwT52vyPGI8OxU4bnFjvv4AEWeXWD+fulgso3/Uk2Gq3ip0p03kL/HswjaB5nMM5HH2IiXZ7iUK8JgeDZ/rRt5ObcPL9925pmCpoMkVwToCH+9kBctUboKz0t3n+523M/vFMp4DebvnzNMtvFfqskQ52Li9jz4aIGSqMMeb9O3eXl3ES+JhpoodMF4nmm9ILxLM4q/82Y1feU3lnAQ+kT598W8KUuzTDqFuYuXxKNt+jG3OQ1CfXlp+bMqy9MbBL0145lg5bODHPdzaRCbDfP3zyiTbfwX
TYZtuadThLQG3uMYxxuwmgwza9yobGe+GAZzo5Hv/zTjDd+2/Pf9bfqhUo8reWOTDOxmnn1RRLwHed+KPHSfNdFJ29qSz6ZoKUzR9NbwrPMfRo8KOIH7xjU2/YiXOX34g0T7gD1M5CRHs1w66dqd+0XlaZt2gXMEzN8/0g3W0TmQwdVXebyp3VFwRsq2aFDkaEzZFjSA7sbLmC5sVD3YyMjYevO+Fg1wV/MScgIvC6hOD9n0exM5vdGX/CUe+FezsX9UwoFsE8883zKyl/WWqxdTeNC/gT8CPzBRTBtq09NtOsFEsX0O5Y9IA39UWhrcwK6JNSb74DtrPHSwFSZe0Kduwh7OahP/kt97KfcxpLY4s6A62MRtuIbbZXMJB7IPeda5imdjdFOcbmfTiRwFB3vApp/zrJU8u6/nA47BvEw/j5fttHw/iveo9jORM+fXTHSloq6og0s74/6SnisZkqb5PEDGudBYx0sUF45eC1OcxGht0alEOqB/a81l7qO0F82y6OiaPMQpHi8FoprI+zv32fSwiU6jnuYN/qW8YUvBqdYZ2d+l6ImWeDsWbYA5Rfnq1Gcsw7bSRv4qLuPOPJE7dNJYKX2FMl5MOAWWjlyfSdEWaRzdiqCDt4W8hzmoE2360xF0e27/yjJomdjPRC4DtEdG/lx03YN+AWEa79vRPhCdUi7g/kB1fd1EHtrrq2yM+zoJLNxMZoTS8FnLEMItfgstlQ3ipDF1B5rkt4KltmgqsQ60mVMvx+VI/j03OcqbPITJ+ZJi/dJdLvL3omN68m0i94xzeCZM/YLcLq7jPTVy8qSTzV+ZKPLfPNY1ndpRCNY3TXQPbgMPLhuKuESSQhrOzIEMzwYsf0ygL1DnhG0xqcQ6ONG4u16hsZ8pfpDvuiIujwyPkknv6viUIVSEsY68SRdigDkrYVv0L7EOrjDuIg5q9Ff0sbcBXmljwvwGTRIZLgxU/uyAa+hRCdvi8BLrYLpxFzNZQwrR8DrM3z9aiMKGHMhwSICyjzVhTwHmJGiLjwNMjbPUgeTz8weH+dMVCMlxcQrM3z/DA0xPk8hASxbfp1i0efmyUHZXR/kPFvJemaAtlpZYB4R0RO3yZ0q0KyjHwfz982uh4RfmQIZFAcodJJT7cIBZQZuYbTGzxDognhTKft5h/o8K+dN+EzxwAyD9nMTUL4EMdKRYeWxJx4AHOiyjIcGXU2qLoSXvBw8IZdPRrAt/jx6KDobC9P3TysihDQblQIYBnsudZGoL4p0GMhDpct2QmG3Rq6Q62MJoZRD4Rsp863kWJg1erWD+/jlGUey3ciDDoR7LpFlKpRs5/b+P3yVeYmqP/6q1xW4l1MHWdFfKn5Ey36lKvpfB9MMwJMEeQQgZ6Fbvdh7LlEJC3hNwb2NJjLZYXVIdbE0zI/vgkAv+aQnyo1/SnKD07aew9xKO+zLq0NuSYbHH8k5XjKmjp/JGxjBeqS3mllAHEr2NfrkvzlKNgpMtUPKiE7z2MPtw/M2E8f3YM6YM0z3Vly6+LTdhNzP7xNhfkNpibEb9YHoG/XFOlT5ER/V0GfEIE8Vhof0Vuj9Gnrh0X4yi6jVVeZ8uZB4Akw9HCxPmOvvKBDJc6qnO1xjZ12Mvj+18uNIu/Wpsi3My6geXZtAnyZlwvoc+2ORxhgoUjgw0e3kogQyNHupLF+fWCWXd5rmdWyl1nFhjW+ybUT9ozKhf0oBHN4xdBG6nUBJXY88lGwYGGmCuiCmDL7f4e438a5O7B2hrKUrcUzW0xQcBjCOkDuLQg5dMSWKtUNApiiXTBmYOAKhGBxNFtfuFiS5AUkhKCvJEG+X0W1QUnvJ5/juFJD0YMxYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABKyWf98hQWMoBrHAAAAABJRU5ErkJggg==" style="width:230px;height:70px;">
</div>
<div class="button">
<button type="button" class="reflash" id="btnReload">Re Flash</button>
<button type="button" class="speaker" id="btnAudio">Speaker</button>
</div>
</div>
1 Implementing it directly in the browser console with JS
There are two variants: online and offline.
1.1 Online
function autoCaptcha() {
  // Load Tesseract dynamically if it is not already available
  if (typeof Tesseract === 'undefined') {
    var script = document.createElement('script');
    script.src = 'https://cdn.jsdelivr.net/npm/tesseract.js@2.1.4/dist/tesseract.min.js';
    script.onload = function() {
      console.log('Tesseract.js loaded');
      recognizeCaptcha();
    };
    document.head.appendChild(script);
  } else {
    recognizeCaptcha();
  }

  function recognizeCaptcha() {
    var captchaImg = document.getElementById('captchaImg');
    if (!captchaImg) {
      console.error('CAPTCHA image not found');
      return;
    }
    var imgSrc = captchaImg.src;
    console.log('CAPTCHA image src:', imgSrc);
    Tesseract.recognize(
      imgSrc,
      'eng',
      {
        logger: m => console.log(m)
      }
    ).then(({ data: { text } }) => {
      console.log('Full recognition result:', text);
      // Keep only the first 6 characters as the CAPTCHA
      var captcha = text.trim().substring(0, 6);
      console.log('First 6 characters used as the CAPTCHA:', captcha);
      var captchaInput = document.getElementById('label-for-captcha');
      if (captchaInput) {
        captchaInput.value = captcha;
        console.log('CAPTCHA filled into the input box');
      } else {
        console.error('CAPTCHA input box not found');
      }
      var submitBtn = document.getElementById('btnComplete');
      if (submitBtn) {
        console.log('Clicking the submit button automatically');
        submitBtn.click();
      } else {
        console.error('Submit button not found');
      }
    }).catch(err => {
      console.error('Recognition failed:', err);
    });
  }
}
// Call the function whenever it is needed:
autoCaptcha();
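1.2 Offline
This is what Section 2 covers. The improvement over Section 1 is that the script is bundled into a browser extension. You first need to download the latest tesseract.js from jsDelivr: https://www.jsdelivr.com/package/npm/tesseract.js
2 Moving the script from Section 1 into a browser extension and injecting it into specific pages
(Bundling the script into the extension, as described in this section, was an unsuccessful attempt. If you get it working, please open-source it and leave a link in the comments.)
2.1 Chrome extension manifest.json
Remember to download the resources from 1.2 first. The configuration below is for reference only; adjust it to your own setup.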
{
  "manifest_version": 3,
  "name": "TestPlugin",
  "description": "Extension to auto-verify-submit the website Captcha",
  "version": "1.0",
  "permissions": [
    "activeTab",
    "tabs",
    "storage"
  ],
  "web_accessible_resources": [
    {
      "resources": ["scripts/melonticket/package/dist/tesseract.min.js"],
      "matches": ["<all_urls>"]
    }
  ],
  "content_scripts": [
    {
      "matches": [
        "https://tkglobal.melon.com/reservation/popup/onestop.htm/*",
        "https://tkglobal.melon.com/reservation/popup/onestop.htm"
      ],
      "js": ["scripts/melonticket/autoCaptcha.js"],
      "run_at": "document_end"
    }
  ]
}
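Key point 1: bring in the downloaded offline copy of tesseract. Move it into the extension's script directory, scripts/<your project>/, and reference it in the extension with the line below. Why not load the online CDN copy? Because the browser's Content Security Policy blocks scripts from outside origins and produces the following error, so an offline resource is needed.
Refused to load the script '<URL>' because it violates the following Content Security Policy directive: "script-src 'self' 'wasm-unsafe-eval' 'inline-speculation-rules' <URL> <URL> <URL>". Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
"resources": ["scripts/melonticket/package/dist/tesseract.min.js"],
Note: if you are able to modify the page's own code, you can skip key point 1. Add the snippet below to the page and the CDN resource is allowed to load, provided no stricter policy is also sent in the HTTP response headers (when several policies are present, all of them are enforced).
<meta http-equiv="Content-Security-Policy" content="script-src 'self' https://cdn.jsdelivr.net 'wasm-unsafe-eval' 'inline-speculation-rules';">
Key point 2: this entry injects the JS into pages that contain the CAPTCHA, where it recognizes the CAPTCHA, fills it in, and submits automatically.
"js": ["scripts/melonticket/autoCaptcha.js"]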
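The error message quoted under key point 1 says that script-src is used as a fallback because script-src-elem was not set. As a rough illustration of that lookup, here is a minimal Python sketch. It is not part of the extension, and the function names parse_csp and effective_script_sources are made up for this example. It only resolves which directive applies to script elements; it does not implement real source matching (wildcards, nonces, hashes).

```python
def parse_csp(policy):
    """Split a CSP string into {directive_name: [source expressions]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        name = tokens[0].lower()
        # A repeated directive is ignored: the first occurrence wins.
        directives.setdefault(name, tokens[1:])
    return directives


def effective_script_sources(policy):
    """Return the source list that governs <script> elements, or None if unrestricted."""
    directives = parse_csp(policy)
    # Fallback chain: script-src-elem, then script-src, then default-src.
    for name in ("script-src-elem", "script-src", "default-src"):
        if name in directives:
            return directives[name]
    return None


policy = ("script-src 'self' https://cdn.jsdelivr.net "
          "'wasm-unsafe-eval' 'inline-speculation-rules';")
sources = effective_script_sources(policy)
print(sources)
print("https://cdn.jsdelivr.net" in sources)
```

With the meta tag from the note above, the CDN origin appears in the effective list, which is why the script is no longer refused in that setup.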
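2.2 Script implementation
Here is how autoCaptcha.js is implemented. Note that Tesseract is loaded from the offline resource. As in Section 1, you can also paste the code into the browser console on the target page to check it; in that case switch to the commented-out CDN line, because chrome.runtime.getURL is an extension API and is normally not available to code run from the page console.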
function autoCaptcha() {
  // Load Tesseract dynamically if it is not already available
  if (typeof Tesseract === 'undefined') {
    var script = document.createElement('script');
    // Online loading: requires edit access to the page and the meta tag shown above
    // script.src = 'https://cdn.jsdelivr.net/npm/tesseract.js@2.1.4/dist/tesseract.min.js';
    // Offline loading from the extension package
    script.src = chrome.runtime.getURL("scripts/melonticket/package/dist/tesseract.min.js");
    script.onload = function() {
      console.log('Tesseract.js loaded');
      recognizeCaptcha();
    };
    document.head.appendChild(script);
  } else {
    recognizeCaptcha();
  }

  function recognizeCaptcha() {
    var captchaImg = document.getElementById('captchaImg');
    if (!captchaImg) {
      console.error('CAPTCHA image not found');
      return;
    }
    var imgSrc = captchaImg.src;
    console.log('CAPTCHA image src:', imgSrc);
    Tesseract.recognize(
      imgSrc,
      'eng',
      {
        logger: m => console.log(m)
      }
    ).then(({ data: { text } }) => {
      console.log('Full recognition result:', text);
      // Keep only the first 6 characters as the CAPTCHA
      var captcha = text.trim().substring(0, 6);
      console.log('First 6 characters used as the CAPTCHA:', captcha);
      var captchaInput = document.getElementById('label-for-captcha');
      if (captchaInput) {
        captchaInput.value = captcha;
        console.log('CAPTCHA filled into the input box');
      } else {
        console.error('CAPTCHA input box not found');
      }
      var submitBtn = document.getElementById('btnComplete');
      if (submitBtn) {
        console.log('Clicking the submit button automatically');
        submitBtn.click();
      } else {
        console.error('Submit button not found');
      }
    }).catch(err => {
      console.error('Recognition failed:', err);
    });
  }
}
autoCaptcha();
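3 Calling a local Flask OCR service from JS to get around the "CSP blocks external scripts" problem
3.1 Installing a Python environment with Anaconda
Not covered here; there are plenty of tutorials online.
3.2 Installing the Tesseract program locally
Running the code in 3.3 without it produces the following error:
TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
So the Tesseract program has to be installed. The steps are:
(1) Install the tesseract and pytesseract packages in Python (the service in 3.3 only imports pytesseract)
pip install tesseract -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pytesseract -i https://pypi.tuna.tsinghua.edu.cn/simple
(2) Download tesseract-ocr. I installed it to D:\software\Tesseract-OCR; choose a location that suits you.
(3) After installation, add the install path to the system Path variable and create a new system variable.
Add the Tesseract installation directory (here D:\software\Tesseract-OCR) to Path.
Create a new system variable:
TESSDATA_PREFIX
D:\software\Tesseract-OCR\tessdata
(4) If you need to recognize other scripts (Chinese, for example): by default only the English data, i.e. ASCII characters, is used.
Download the trained data and put it in the D:\software\Tesseract-OCR\tessdata directory.
(5) Open pytesseract.py under Lib\site-packages\pytesseract in your Anaconda installation (I use a virtual environment)
and change the code as follows:
# tesseract_cmd='tesseract'
tesseract_cmd = r'D:\software\Tesseract-OCR\tesseract.exe'
(6) Test
Press Win+R, type cmd, and press Enter. Then run tesseract -v and tesseract --list-langs. On success they print the version and the list of installed languages.
Note: the entry highlighted in red in the original screenshot is the Chinese language data added in step (4).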
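Step (5) edits a file inside the installed package, and that change is overwritten whenever pytesseract is reinstalled or upgraded. As a possible alternative, the sketch below locates the executable at runtime and parses the text printed by tesseract --list-langs from step (6). The helper names find_tesseract and parse_langs are assumptions for this example and are not part of pytesseract; the candidate path is simply the install location used in this article.

```python
import os
import shutil


def find_tesseract(name="tesseract", candidates=()):
    """Return a path to the Tesseract executable, or None if it cannot be found."""
    # First try the PATH, which is what step (3) configures.
    found = shutil.which(name)
    if found:
        return found
    # Then fall back to explicit locations supplied by the caller.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is a header such as "List of available languages (2):".
    return [line for line in lines if not line.lower().startswith("list of")]


cmd = find_tesseract(candidates=[r"D:\software\Tesseract-OCR\tesseract.exe"])
print("tesseract executable:", cmd)
print(parse_langs("List of available languages (2):\nchi_sim\neng\n"))
```

If cmd is not None, it can be assigned to pytesseract.pytesseract.tesseract_cmd in your own script, which is what the service in 3.3 does with a fixed path.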
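3.3 Creating the Python service
For example, create app.py in PyCharm as the local OCR service. The service script is given below; install any missing libraries yourself.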
from flask import Flask, request, jsonify
import pytesseract
from PIL import Image
from io import BytesIO
import base64
from flask_cors import CORS

app = Flask(__name__)
# Enable CORS for the frontend origin
CORS(app, resources={r"/ocr": {"origins": "https://tkglobal.melon.com"}})

pytesseract.pytesseract.tesseract_cmd = 'D:/software/Tesseract-OCR/tesseract.exe'
tessdata_dir_config = '--tessdata-dir "D:/software/Tesseract-OCR/tessdata"'


def change_background(img):
    """Paste the image onto a white background (optional preprocessing, unused by default)."""
    try:
        # img.show()
        x, y = img.size
        new_img = Image.new('RGBA', img.size, (255, 255, 255))
        new_img.paste(img, (0, 0, x, y), img)
        return new_img
    except Exception:
        print('Failed to replace the image background')
        # Fall back to the original image instead of returning None
        return img


@app.route('/ocr', methods=['POST'])
def ocr():
    try:
        # Get the base64 image data from the request
        data = request.get_json()
        img_data = data.get('image')
        if not img_data:
            return jsonify({'error': 'No image data provided'}), 400
        # Remove the "data:image/png;base64," part of the string if it exists
        if img_data.startswith('data:image/png;base64,'):
            img_data = img_data.replace('data:image/png;base64,', '')
        # Decode the base64 string
        img_bytes = base64.b64decode(img_data)
        # Create an image from the byte data
        img = Image.open(BytesIO(img_bytes))
        # img = change_background(img)
        # Perform OCR using Tesseract (--psm 7 treats the image as a single text line)
        text = pytesseract.image_to_string(img, lang="eng", config="--psm 7")
        # Return the extracted text
        return jsonify({'text': text.strip()})
    except Exception as e:
        return jsonify({'error': str(e)}), 500


if __name__ == '__main__':
    app.run(debug=True)
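The service above only strips the data:image/png;base64, prefix, and the browser scripts take the first six characters of whatever Tesseract returns, including any spaces or punctuation it misreads. The sketch below shows one way to make both steps a little more tolerant using only the standard library. The helper names decode_image_payload and clean_captcha, and the assumption that the CAPTCHA consists of six ASCII letters or digits, are made up for this example; adjust them to the CAPTCHA you are dealing with.

```python
import base64
import re


def decode_image_payload(payload):
    """Decode a base64 image that may or may not carry a data-URL prefix."""
    if payload.startswith("data:"):
        # A data URL looks like "data:<mime>;base64,<data>"; keep the part after the comma.
        header, _, payload = payload.partition(",")
        if not header.endswith(";base64"):
            raise ValueError("only base64-encoded data URLs are supported")
    return base64.b64decode(payload)


def clean_captcha(text, length=6):
    """Keep ASCII letters and digits only, then truncate to the expected length."""
    return re.sub(r"[^A-Za-z0-9]", "", text)[:length]


raw = decode_image_payload("data:image/jpeg;base64," + base64.b64encode(b"abc").decode())
print(raw)
print(clean_captcha(" AB C1\n23xyz "))
```

In the ocr() handler, decode_image_payload could stand in for the prefix check plus base64.b64decode, and clean_captcha could be applied to the text before it is returned.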
Test the service with Postman; the request is equivalent to the following curl command:
curl --location 'http://127.0.0.1:5000/ocr' \
--header 'Content-Type: application/json' \
--data ' {"image":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAARgAAABQCAYAAADC8mo5AAAM2klEQVR42u2dC7CVVRXHF6KIgoBICiUCkj3IlFdYjOL7NZUQhSCOgpo6GjGRiUCgpAjI41oyBpogkEQhIoilhegUII1EDOqVV6k8BAFJUVFRHu01Z904Hdb+Xnvvc8899/+bWQMD56y9v//Z336s/SICAAAAAAAAAAAAqJUcY6wNZAAAuFBHKpLuxkYYm2NsrbH98icAACSiibFzjN1qbIqxF429b+ygxfZJTwYAAP7HkcbaGetjbLSxp41tjKhIbJXLWgyTAKjdNDd2ibGfGZthbJWxT1JWJjuMPWeswth1xjoZqw9pAag98FCls1QA9xtbLBVDmoqEK55/SkV0m7GLpYICANQSooKuaSoTHhItNHavDJXaydAJAFBLSBt01Yw/v0y+z37ONtYY0gJQe/AZdOUezXBjV0hPpw7kBaD2gKArAMAZBF0BAM4g6FqmNJUfAYBigaBrLaCd/LD75U8AfIOgay3vuezL+xGbQhLgAIKu4DBezPtx+0AOkAAEXUFiRuT96DNiPvs1Y4ONzTT2Z2Orjb1t7FNju4xtkAprkrG+xj4fKM+nGLvG2FRjLxh72dhbxvYa22Nsq7HXjD1hbIixs4qkZano4wsEXYEznfMKwnZlXNvN2GRjm1IWqqph11xj53pqNQfKS3owg3FreVUA/Yqlz50Zn7tUba+UtzXGfke5IO3xJa7DZ4QjFTK1UtvzROwk/97R2LMefxxu1ZtkyF9dY7cX5NHF5iYsyHEUW5+nSqhy2B/I727KBXGPKFEdVqG6yMaMPBHHGfu9sQMBfiDuKrdKkS+uCP4SIB//MHZcRq1OriZ9tlbji7WlIOj6ZOD0FkT0FKpTh4dRVWSjT4rW6X2pybkleVC6rOONTZdhSNz3eYhzUoI8fcnYv2KGF6sk3QrJxzjpbi8x9mZMPh7PoFNXiakUW58vVHOP5caC/GwuQpqzFP1LTQeQkGtjhP1YhhY9jNWL8cXd/FuMvRHh74kYH42MrYuIpXB+GySspCZFvNQXptCov8QNqkMfjpM9IkFWHpYtN1ZpbFvG4cp7xtYbW2psnsSQOAB7j8Xye1UtLD7vi/h+vvGs02wJzr8bk8/LC3Q4NWEa+faAxff4DL5aoarIxgTLj3BACkSW1ZE8S/DLiMJzQcR351le4gEZn+8Ky4s4L+H3R0fEIioC6bNSKoGPPbbAO8l9pWsPxe8bDoH7fhE91Wc8lO1LFL9v4ZUvHkNiCuXfKDctnJWBFr9PWj5/qfLZD8h9qnmypdI6OqM+lQnyVFdiNl1lGHq79KgWSE/sHc8zHE/RoZWuFQ4VatrKdrajzxOlN1bod4+H/I4IpANIwDWWwrpdie5f55DOLEvhKQzk1ZEXr/Cz3/XwrJ0sz3paBn0ek6EQB6HPlPzxNOvYvBjQRnnpfVQePJTgtT68zP7XxoYau5pye3l4/01DJe8LFT93eNBxseJ3oAe/d1ue3bXHFUoHEMPpxj6yBD+Pkpaw8AXhxWIPUW49ScsUabW2DFHOK/jc+cpnFnp6Xn4mbdl6F8vnO1qGJxzkfU16Vb7Wg2wkfVaK4z5ZZ7u0af3zHTWsI41NUg1dh14HE8baiq0DiOFYeUm0l/movM/xEOD1iJdjk3SPOTbSQYYENlYp3+9b8JmHlM983+NzVyr+rzf2U4k1cXB1Bflbc3NAKqSXJAjMaQySZ+KXsjkdWtiYRJ+ktLHEi45z1K8d6Uv+63n4bboqvj9y9BlKBxDDGEV4DijWt7TkuxK+UNyqLzI2UoYNjfL8TFc+f1uC1qZthufjdHnZPs9C3GxsFOUWsn1KfqctP5BKi2d2HpZeXz9pIdsmiO9QSn2S0kfx9YqHctNf8bvMU5nsFcB3KB1ABF8hfbo1rtvYV4KHI6USSXp2B7cYO8U+lO+
9K5UWxyoekMChbQx+mcRJWkisoZ4MubpJHGKoBHCfljjFexRmDUTVcvbvGDuD/KwGzofjGMsLrH9GX/cr+X/EQx61QPlET8//oOJ7lKPPUDqAlEGvRzP4qSvDogEyTNpENWsvzDsSUJ4vlRzP8PSm3NoQH/pUJ8sozCKxlYrfH3jw21AaoMLh5RdLVAdgob1lnNvMk/+W0i2dQuH2qyQxfqZ10tOaJr0ujrXwkQBflhhUdehTDDiGpgWnz3T0yzN+2qzYyR7yPErxO6NEdQARzFEE/03A9BrJS9tcWiP++zkSG+ExN099/9jYMBkm7SH7StmddGimhf/kYw/+QLkVmeyjh8SLmtUgfUKgTcfvoegAfBK0IOwWD/n9ntIYbSP3s2FC6QAsfI70IGcp1eiTY2I5vLz8BgpzjkpN0CcJtyrP8FcPfgeRn/1cVfCEwnClcuE43bdKWAdg4SeK4EtKsBKM259yUCqC5yVu8vVapE8SpivPMc6D39mUfZaLew0cFOe9RN0lP9qMIfdSv1niOgALf6cwKzB904vSTydvkaEMry9pXOb6xLGGwgRiXw8YM9svvdemNUAHoNCE9KDr6Q49jbsc7aII/7zLeTdl35PDXWGevv5qmepjozHpK4JbOpafZgErF+7JdPZc3kPpACICadrp7VmvfejloWD1j0mDz0S5mw6fvsxyVGbPMtRH42LFz1YP5efbFHbWb6Xk3RehdAAWxngO0FV4KFRJL3qrLwWcdyFvcEiP17s0KEN98hlOyXetp+EXVJzlBZPIz11HoXQAFuYrgo918LfcsSDxat4jMqbNS/B/RLkFgx+mTHcx6dOU5aKPdl7tEA/l51nLC5vmsCZe68LbKRZR9F4vHyttQ+kAUgS8BmX0xUv1P3F8gRZ7eq6jJV7Da2HWJkx7WBnro724F3jQWduL1sHBX33R1xZn6+SY31A6AAvapVhXFSltLkyFe5/GBEqLF9rx6XxRBzntUoZK5aBPa9JnZxo55u800lc3+7jXyBarmungM5QOIAKtRb2sSGlrZ7z0CJzmCZTbImCrZG4qQ316K35e9ZC/qxW/Sz0+v3ZrxAoHf6F0ABFoU7A9i5T2RCXtYt1oOM5SwUwpQ320wPJUD/nTDs2u8Pj82j6kHQ7+QukAUo5Jry9S2usL0t0s/85BzLoF5hve8PZv5dmfqwH6pGVpgp5aFrQFiD7vMO9H+v1QVGI6gAheVkQfU4R0zyP7lRyLlP87JUAeHlPSmVMD9EnDkaQff9reMX+2gHUbjxrcofh/KaOvUDqAGLSX+U9FSPcFsh+4rO05qhcgD+OVdEbXAH3S0JH0ncOugdhvkL5fyCfTEjQA1a0DiOFXpK+1qB8wzZ6W+Ae32g1JP/w6BNoal+4lrk9abiH9uhlXBih+/+hZhxUehzShdAAxXG4pzN0DpcddaO3oyqoDl0+05OdYz/ngPSmFhw7toP8/2LwU9UnLo6TfXOjKTMXvnR516GbRPesQLJQOIIZjSD/da1mAtPigoFctBeeVvLGythmti+e8jKVkZ8iWmj5p0W6I6OUhr9r1vT6n75d67iGF0gEkYIGlUF/rMQ0O0q4n+/qT/OlCbeZmlMe8cOtYeOwDL7JrVkP0SdNL0ypr14B5E4vfEzxpcYNFg6yHTYXSASSkC9kPb/LRKvHisLidzzfnff5x5f956fhJHvLCx3JqQeQba5A+SblI8bPNQ361O503eCqLvEp6n+L/tw4+Q+kAUjCP7IdkD6ZsszgtLWNfzfKnC/uQ/XiFrOfqcv7vshRe3gBXpwbpk5Sfk75z3BVtR/IsR59tSb9GuOperoYOvkPpAFLAdyJF7UBeJ61oixg/HCQ9W14c7fQ5XldyH0XvX2EflZZ8cEvPO6aTnnDGNwXcQ/Zdukvp8HuwS10fl6HvUA9lRbve5nlKd2jWSMoFWWeQvtYo/6yWdo75DaUDSMmlFH8kJY9lV0hrM1F+qHspd7Urb92Pupe
Zhz68mbBwD8sSS5xkH0WfvcuBz2cot42fzyUZJi9nVV7+E/MsvNbk+BqqTxLeVtK40EM58XV9bpzxSutTPeQ3lA4gA3xL42eeCwqvG7mpIOaQ5AbAPjGVTFZjnyMo27kqpaRPFK0slZ/rzuHWRapc5ifoDVanDsCBTtI7cC0k3NpPpcPPOy0MuvWOCSi+6bHg8srcrmWkj40rlfQqPZSNKwNXLKvJ7xGZoXQAjnDQ8ocSWM3StR1L9oOUz6Lc2pIqax2TlwbS48ha0eyWmMcZZaqPxgQl3WkennuCp4pkvwxdVsswd7DEy3wTSgfgET6hjO8Gmku5hWDbpPXloOdmecmmU+4qjw4B88EzPedSblaAj2VcI3nhIOhe+XulxCz4rh6+1bE9ZT+Cs6bpAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADlxH8BuyGmi0P6dyMAAAAASUVORK5CYII="}'
The service returns the recognized text as JSON. Tesseract's accuracy is not 100%, but most CAPTCHAs were recognized correctly.
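The same request can also be built from Python, which is convenient when testing with an image file instead of pasting a long base64 string. A minimal standard-library sketch follows. The function names build_ocr_request and recognize and the file name captcha.png are hypothetical; the URL is the local address used in the curl command.

```python
import base64
import json
import urllib.request


def build_ocr_request(image_bytes, url="http://127.0.0.1:5000/ocr"):
    """Build a POST request carrying the image as a PNG data URL in JSON."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    body = json.dumps({"image": data_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recognize(path, url="http://127.0.0.1:5000/ocr"):
    """Send an image file to the running service and return the parsed JSON reply."""
    with open(path, "rb") as f:
        req = build_ocr_request(f.read(), url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Requires the Flask service from 3.3 to be running:
# print(recognize("captcha.png"))
```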
Last step: rewrite the JS with a new implementation of autoCaptcha. Because Tesseract's accuracy is not 100%, the code below refreshes the CAPTCHA and tries again until recognition succeeds.
function autoCaptcha() {
  // Get the captcha image as base64
  var captchaImg = document.getElementById('captchaImg');
  if (!captchaImg) {
    console.error('CAPTCHA image not found');
    return;
  }
  var imgBase64 = captchaImg.src; // This is the base64 data URL
  // Send the base64 string to the Python server
  fetch('http://127.0.0.1:5000/ocr', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ image: imgBase64 })
  })
    .then(response => response.json())
    .then(data => {
      if (data.text) {
        console.log('OCR result:', data.text);
        var captcha = data.text.trim().substring(0, 6); // Extract the captcha text
        console.log('First 6 characters used as the CAPTCHA:', captcha);
        // Fill the captcha input
        var captchaInput = document.getElementById('label-for-captcha');
        if (captchaInput) {
          captchaInput.value = captcha;
          console.log('CAPTCHA filled into the input box');
        }
        // Submit the form if needed
        var submitBtn = document.getElementById('btnComplete');
        if (submitBtn) {
          submitBtn.click();
        }
      } else {
        console.error('OCR failed:', data.error);
      }
    })
    .catch(error => console.error('Request error:', error));
}

async function sleep(t) {
  return await new Promise(resolve => setTimeout(resolve, t));
}

async function loopVerifyCaptcha() {
  autoCaptcha();
  await sleep(15000);
  // Keep retrying while the verification panel is still visible
  while (document.getElementById("certification").style.display != "none") {
    // Refresh the CAPTCHA, wait for the new image, then recognize it again
    document.getElementById('btnReload').click();
    await sleep(10000);
    autoCaptcha();
  }
}
loopVerifyCaptcha();