0


python + Selenium 实现自动化查询问财数据及下载Excel数据并存入到mysql

问财的数据海量及时,作为量化交易的数据来源也是一种不错的选择,之前也用过一些免费的api,但是数据不是很完整,时效性也不高,试了一下问财觉得很不错,不过下载有次数限制,非会员每天只能导出2次,会员每天200次。因为不想每天手工操作,所以想通过该方法实现全自动下载导出存入mysql

安装selenium库

pip install selenium

安装google浏览器,查看版本号

下载对应版本驱动

下载地址:CNPM Binaries Mirror

解压后将文件放在 python/scripts 目录下

用cmd命令来启动google浏览器,也可以做成一个bat文件,方便下次快速打开浏览器

C:\Users\Administrator\AppData\Local\Google\Chrome\Application\chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\selenum\AutomationProfile"

用上面的命令打开浏览器的好处是,运行程序不用产生新的浏览器,只会使用一个浏览器,这样就可以事先将账号登录好再进行后续的操作。

上代码查询一下试一试

from selenium import webdriver
import time
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
browser = webdriver.Chrome(options=options)
browser.get(f"https://www.iwencai.com/unifiedwap/result?w=%20%E7%AB%9E%E4%BB%B7%E6%B6%A8%E5%B9%85%20%E6%B5%81%E5%8A%A8%E5%B8%82%E5%80%BC%20%E6%A6%82%E5%BF%B5%20%E9%BE%99%E5%A4%B4%20%E4%BA%8C%E7%BA%A7%E8%A1%8C%E4%B8%9A&querytype=stock")
browser.quit();

查询效果如下:

导出到excel,增加三行代码

from selenium import webdriver
import time
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
browser = webdriver.Chrome(options=options)
browser.get(f"https://www.iwencai.com/unifiedwap/result?w=%20%E7%AB%9E%E4%BB%B7%E6%B6%A8%E5%B9%85%20%E6%B5%81%E5%8A%A8%E5%B8%82%E5%80%BC%20%E6%A6%82%E5%BF%B5%20%E9%BE%99%E5%A4%B4%20%E4%BA%8C%E7%BA%A7%E8%A1%8C%E4%B8%9A&querytype=stock”)

time.sleep(2)
button = browser.find_element(By.CLASS_NAME,'table-export');
button.click();

browser.quit();

效果如下:

安装openpyxl 用于操作Excel

pip install openpyxl

安装mysql库

pip install pymysql

my.ini 或 my.cnf 增加一个配置项,可以批量插入大批量数据到sql

max_allowed_packet=100M

代码如下:

from selenium import webdriver
import time
from selenium.webdriver.common.by import By
import os
import pymysql
import decimal
import sys
import openpyxl
import uuid

def down(date):
    
    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
    browser = webdriver.Chrome(options=options)
    browser.get(f"https://www.iwencai.com/unifiedwap/result?w={date}%20%E7%AB%9E%E4%BB%B7%E6%B6%A8%E5%B9%85%20%E6%B5%81%E5%8A%A8%E5%B8%82%E5%80%BC%20%E6%A6%82%E5%BF%B5%20{date}%E9%BE%99%E5%A4%B4%20%E4%BA%8C%E7%BA%A7%E8%A1%8C%E4%B8%9A&querytype=stock")
    time.sleep(2)
    button = browser.find_element(By.CLASS_NAME,'table-export');
    button.click();

    browser.quit();
    pass

def objToStr(value):
    if value is None:
        return ""
    else:
        return str(value)
    
def strToDecimal(str):
    str = objToStr(str)
    if str == "":
        str = "0"
    return decimal.Decimal(str)    

def getDate():
    return time.strftime('%Y-%m-%d', time.localtime(time.time()));

def getConn1():
    return pymysql.connect(host="127.0.0.1",port=3306,user="root",passwd="123456",db="wc_data" )
 
def importAll(date):    
    path='D:\\Users\Administrator\\下载'
    for file_name in os.listdir(path):
        if "竞价涨幅" in file_name:
            print(file_name)
            importFile(path + "\\" + file_name,date);
            time.sleep(3);
            os.remove(path + "\\" + file_name);
            

def importFile(filename,date):
    conn1 = getConn1();
    cursor1 = conn1.cursor();

    inwb = openpyxl.load_workbook(filename);

    sheetsname = inwb.sheetnames;
    ws = inwb[sheetsname[0]];

    rows = ws.max_row -1;
    cols = ws.max_column;
    print(rows);
    print(cols);

    i = 2
    value = ""; 
    while i <= rows:
        code = ws.cell(i,1).value;
        name = ws.cell(i,2).value;
        zf = ws.cell(i,5).value;
        type_name = ws.cell(i,9).value;
        notions = ws.cell(i,7).value;
        sz = ws.cell(i,18).value;
        amount = ws.cell(i,15).value;

        if zf == "--":
            zf = "0";
            
        if sz == "--":
            sz = "0";

        if amount == "--":
            amount = "0";              

            
        value += f"('{date}','{code}', '{name}','{zf}','{type_name}','{notions}','{sz}','{amount}'),";
        i = i + 1;
        pass

    value = value[:-1];

    sql = f"delete from k_data_jj ";
    cursor1.execute(sql)
    conn1.commit()

    if value != "":
        sql = f'INSERT into k_data_jj(date,code,name,zf,type,notions,sz,amount) VALUES {value}';
        count = cursor1.execute(sql)
        conn1.commit()

    inwb.close();
                

def job():
    date = getDate();
    down(date);
    time.sleep(20);
    importAll(date);

job();

5000多个股数据,10秒内导入完成,接下来就可以在数据库进行数据统计分析了

Demo下载https://download.csdn.net/download/gdgztt/87271607


本文转载自: https://blog.csdn.net/gdgztt/article/details/128293418
版权归原作者 我是唐赢 所有, 如有侵权,请联系我们删除。

“python + Selenium 实现自动化查询问财数据及下载Excel数据并存入到mysql”的评论:

还没有评论