

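Reverse-Engineering Microsoft's New Bing (ChatGPT): A Hands-On Scraping Walkthrough

About gospider

gospider is a Go web-scraping toolkit with several built-in modules for working around anti-scraping measures, which makes it a handy package to have for scraping in Go.

Installation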

go get -u gitee.com/baixudong/gospider

Gitee repository

https://gitee.com/baixudong/gospider

GitHub repository

https://github.com/baixudong007/gospider

Reverse engineering New Bing

Finding the WebSocket address by capturing traffic

Analyzing the messages sent over the WebSocket

As soon as the connection opens, the client sends three text messages in a row. The first is:

{"protocol":"json","version":1}

The second is:

{"type":6}

The third is:

{"arguments":[{"source":"cib","optionsSets":["nlu_direct_response_filter","deepleo","disable_emoji_spoken_text","responsible_ai_policy_235","enablemm","dlislog","dloffstream","dv3sugg","harmonyv3"],"allowedMessageTypes":["Chat","InternalSearchQuery","InternalSearchResult","InternalLoaderMessage","RenderCardRequest","AdsQuery","SemanticSerp"],"sliceIds":["0113dllog","216dloffstream"],"traceId":"63f8b16700104c9db4609875735e3f12","isStartOfSession":true,"message":{"locale":"ru-RU","market":"ru-RU","region":"US","location":"lat:47.639557;long:-122.128159;re=1000m;","locationHints":[{"country":"United States","state":"Pennsylvania","city":"Pittsburgh","zipcode":"15211","timezoneoffset":-5,"dma":508,"countryConfidence":9,"cityConfidence":5,"Center":{"Latitude":40.4393,"Longitude":-80.0213},"RegionType":2,"SourceType":1}],"timestamp":"2023-02-24T20:45:32+08:00","author":"user","inputMethod":"Keyboard","text":"你好","messageType":"Chat"},"conversationSignature":"Qf1G4jxAl5c2h1frcuYraT8+4L4f9IfjvARi+xT+IoI=","participant":{"id":"844428589040485"},"conversationId":"51D|BingProd|36AD9D4018B442F39EC6E4055099E8E728B019942E0F02A1C99016EA127A3E0E"}],"invocationId":"0","target":"chat","type":4}

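The first and second messages are self-explanatory. The third carries many parameters that are not needed; with those removed, it reduces to: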

{"arguments":[{"conversationId":conversationId,"source":"cib","isStartOfSession":isStartOfSession,"message":{"text":text,"messageType":"Chat"},"conversationSignature":conversationSignature,"participant":{"id":clientId}}],"invocationId":"1","target":"chat","type":4}

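The third message has five variables:

  1. conversationId: obtained from /turing/conversation/create
  2. conversationSignature: obtained from /turing/conversation/create
  3. clientId: obtained from /turing/conversation/create
  4. isStartOfSession: whether this is the first question of the conversation
  5. text: the question to ask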
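
The trimmed message can also be built from typed structs instead of nested maps, so that a misspelled field is caught at compile time. The following is a minimal sketch using only the standard library's encoding/json. The type names and `buildChatRequest` are hypothetical, and the values passed in `main` are placeholders, not real credentials.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatText is the "message" object of the trimmed request.
type chatText struct {
	Text        string `json:"text"`
	MessageType string `json:"messageType"`
}

// participant carries the clientId.
type participant struct {
	ID string `json:"id"`
}

// chatArgument is the single element of the "arguments" array.
type chatArgument struct {
	ConversationID        string      `json:"conversationId"`
	Source                string      `json:"source"`
	IsStartOfSession      bool        `json:"isStartOfSession"`
	Message               chatText    `json:"message"`
	ConversationSignature string      `json:"conversationSignature"`
	Participant           participant `json:"participant"`
}

// chatRequest mirrors the trimmed third message.
type chatRequest struct {
	Arguments    []chatArgument `json:"arguments"`
	InvocationID string         `json:"invocationId"`
	Target       string         `json:"target"`
	Type         int            `json:"type"`
}

// buildChatRequest fills in the five variables and returns the JSON bytes
// (without the trailing 0x1e, which is added when the message is sent).
func buildChatRequest(conversationID, conversationSignature, clientID, text string, isStart bool) ([]byte, error) {
	req := chatRequest{
		Arguments: []chatArgument{{
			ConversationID:        conversationID,
			Source:                "cib",
			IsStartOfSession:      isStart,
			Message:               chatText{Text: text, MessageType: "Chat"},
			ConversationSignature: conversationSignature,
			Participant:           participant{ID: clientID},
		}},
		InvocationID: "1",
		Target:       "chat",
		Type:         4,
	}
	return json.Marshal(req)
}

func main() {
	// Placeholder values; real ones come from /turing/conversation/create.
	raw, err := buildChatRequest("demo-conversation", "demo-signature", "demo-client", "你好", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```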

Getting conversationId, conversationSignature, and clientId from the captured traffic

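Send a GET request to /turing/conversation/create with your login cookie attached, and the response contains all three values. This step is straightforward. The login cookie is named _U, and _U is the only cookie needed.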
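
As a rough illustration of this step without gospider, the sketch below builds the GET request with the _U cookie using net/http and decodes the three fields from a response body. `newCreateRequest` and `parseConversation` are hypothetical helpers. The response shape is assumed from the three field names listed above, the sample body in `main` is made up, and the real endpoint may require additional headers that are not shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// conversation holds the three values needed for the WebSocket message.
type conversation struct {
	ConversationID        string `json:"conversationId"`
	ClientID              string `json:"clientId"`
	ConversationSignature string `json:"conversationSignature"`
}

// newCreateRequest builds the GET request and attaches the _U login cookie.
func newCreateRequest(url, uCookie string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.AddCookie(&http.Cookie{Name: "_U", Value: uCookie})
	return req, nil
}

// parseConversation decodes the response body and rejects incomplete replies.
func parseConversation(body []byte) (conversation, error) {
	var c conversation
	if err := json.Unmarshal(body, &c); err != nil {
		return c, err
	}
	if c.ConversationID == "" || c.ClientID == "" || c.ConversationSignature == "" {
		return c, errors.New("response is missing one of the three values")
	}
	return c, nil
}

func main() {
	req, err := newCreateRequest("https://www.bing.com/turing/conversation/create", "PLACEHOLDER_U_VALUE")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Cookie"))

	// Made-up body standing in for the real response.
	sample := []byte(`{"conversationId":"demo-conversation","clientId":"demo-client","conversationSignature":"demo-signature"}`)
	c, err := parseConversation(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", c)
}
```

To send the request for real, pass `req` to `http.DefaultClient.Do` and read the response body before calling `parseConversation`.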

With all the parameters ready, let's walk through the code.

First, get conversationId, conversationSignature, and clientId

    // Create the gospider HTTP client.
    reqCli, err := requests.NewClient(nil)
    if err != nil {
        log.Panic(err)
    }
    // Create a conversation; the login cookie (_U) must be attached.
    response, err := reqCli.Request(nil, "get", "https://www.bing.com/turing/conversation/create", requests.RequestOption{
        Cookies: "your login cookies",
    })
    if err != nil {
        log.Panic(err)
    }
    // Read the three values from the JSON response.
    jsonData := response.Json()
    conversationId := jsonData.Get("conversationId").String()
    clientId := jsonData.Get("clientId").String()
    conversationSignature := jsonData.Get("conversationSignature").String()

Send the WebSocket request

    // Open the WebSocket connection to the chat hub.
    wsResp, err := reqCli.Request(nil, "get", "wss://sydney.bing.com/sydney/ChatHub")
    if err != nil {
        log.Panic(err)
    }
    wsCli := wsResp.WebSocket()
    // Every text message ends with the 0x1e byte.
    // First message: protocol and version.
    if err = wsCli.Send(context.TODO(), websocket.MessageText, append(tools.StringToBytes(`{"protocol":"json","version":1}`), 0x1e)); err != nil {
        log.Panic(err)
    }
    // Second message: {"type":6}.
    if err = wsCli.Send(context.TODO(), websocket.MessageText, append(tools.StringToBytes(`{"type":6}`), 0x1e)); err != nil {
        log.Panic(err)
    }
    // Third message: the trimmed chat request with the five variables filled in.
    data := map[string]any{
        "arguments": []map[string]any{
            {
                "source":           "cib",
                "isStartOfSession": isStartOfSession,
                "message": map[string]any{
                    "text":        text,
                    "messageType": "Chat",
                },
                "conversationSignature": conversationSignature,
                "participant": map[string]any{
                    "id": clientId,
                },
                "conversationId": conversationId,
            },
        },
        "invocationId": "1",
        "target":       "chat",
        "type":         4,
    }
    if err = wsCli.Send(context.TODO(), websocket.MessageText, append(tools.StringToBytes(tools.Any2json(data).Raw), 0x1e)); err != nil {
        log.Panic(err)
    }
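
Each Send call above appends the byte 0x1e to the message. That terminator can be handled by a pair of small helpers, sketched below with only the standard library. `frame` and `splitFrames` are hypothetical names. The idea that one received payload may carry several 0x1e-terminated records is an assumption: it matches the record-separator convention of the ASP.NET Core SignalR JSON hub protocol, which these messages appear to follow, but that is an inference and not something the capture states.

```go
package main

import (
	"bytes"
	"fmt"
)

// recordSeparator is the 0x1e byte that terminates every text message.
const recordSeparator = 0x1e

// frame returns msg followed by the 0x1e terminator, in a fresh slice.
func frame(msg string) []byte {
	out := make([]byte, 0, len(msg)+1)
	out = append(out, msg...)
	return append(out, recordSeparator)
}

// splitFrames splits a received payload on 0x1e and drops empty pieces,
// so a payload carrying several records yields one entry per record.
func splitFrames(payload []byte) [][]byte {
	var records [][]byte
	for _, part := range bytes.Split(payload, []byte{recordSeparator}) {
		if len(part) > 0 {
			records = append(records, part)
		}
	}
	return records
}

func main() {
	// Outgoing: terminate the first message of the session.
	fmt.Printf("%q\n", frame(`{"protocol":"json","version":1}`))

	// Incoming: a payload that happens to contain two records.
	for _, rec := range splitFrames([]byte("{\"type\":6}\x1e{\"type\":3}\x1e")) {
		fmt.Println(string(rec))
	}
}
```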

Receive the reply from Microsoft

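    var offset int // number of runes already printed
    run := true
    for run {
        msgType, msgCon, err := wsCli.Recv(context.TODO())
        if err != nil {
            log.Panic(err)
        }
        if msgType == websocket.MessageText {
            msgData := tools.Any2json(msgCon)
            switch msgData.Get("type").Int() {
            case 1:
                // The text is cumulative, so print only the runes after offset.
                txt := msgData.Get("arguments.0.messages.0.text").String()
                lls := []rune(txt)
                if offset > len(lls) {
                    // Guard against a text shorter than what was already printed.
                    offset = len(lls)
                }
                fmt.Print(string(lls[offset:]))
                offset = len(lls)
            case 2:
                // Type 2 ends the answer: log it and stop reading.
                log.Print(msgData)
                run = false
            }
        }
    }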
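
The per-message logic of the receive loop can be pulled into a function that is easy to test without a live connection. This is a minimal sketch using encoding/json in place of gospider's JSON helper. `handleRecord` and `serverRecord` are hypothetical names, and the sample records in `main` are invented to match the `arguments.0.messages.0.text` path read above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverRecord covers only the fields the receive loop reads.
type serverRecord struct {
	Type      int `json:"type"`
	Arguments []struct {
		Messages []struct {
			Text string `json:"text"`
		} `json:"messages"`
	} `json:"arguments"`
}

// handleRecord processes one JSON record. For type 1 it returns the part of
// the cumulative text that comes after offset (counted in runes) and the new
// offset. For type 2 it reports done. Other types are ignored.
func handleRecord(rec []byte, offset int) (delta string, newOffset int, done bool, err error) {
	var r serverRecord
	if err = json.Unmarshal(rec, &r); err != nil {
		return "", offset, false, err
	}
	switch r.Type {
	case 1:
		if len(r.Arguments) == 0 || len(r.Arguments[0].Messages) == 0 {
			return "", offset, false, nil
		}
		runes := []rune(r.Arguments[0].Messages[0].Text)
		if offset > len(runes) {
			offset = len(runes) // text shrank; nothing new to print
		}
		return string(runes[offset:]), len(runes), false, nil
	case 2:
		return "", offset, true, nil
	}
	return "", offset, false, nil
}

func main() {
	// Invented records: two cumulative updates, then the end marker.
	records := []string{
		`{"type":1,"arguments":[{"messages":[{"text":"你好"}]}]}`,
		`{"type":1,"arguments":[{"messages":[{"text":"你好世界"}]}]}`,
		`{"type":2}`,
	}
	offset := 0
	for _, rec := range records {
		delta, next, done, err := handleRecord([]byte(rec), offset)
		if err != nil {
			panic(err)
		}
		fmt.Print(delta)
		offset = next
		if done {
			fmt.Println()
			break
		}
	}
}
```

Counting the offset in runes rather than bytes keeps multi-byte characters such as Chinese text from being cut in the middle.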

That is all it takes to start chatting with New Bing.

Notes

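  1. isStartOfSession is true for the first question of a conversation and false for every question after it.
  2. When sending a text message, append the byte 0x1e to the end of the message.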
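
The first note can be enforced with a small piece of state, so the caller never has to remember which question came first. A minimal sketch; `session` and `nextIsStart` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// session remembers whether a question has already been asked
// in the current conversation.
type session struct {
	asked bool
}

// nextIsStart returns the isStartOfSession value for the next question:
// true on the first call, false on every later call.
func (s *session) nextIsStart() bool {
	first := !s.asked
	s.asked = true
	return first
}

func main() {
	var s session
	for _, q := range []string{"你好", "What can you do?", "Thanks"} {
		fmt.Printf("isStartOfSession=%v text=%q\n", s.nextIsStart(), q)
	}
}
```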

Reposted from: https://blog.csdn.net/Mr_bai_404/article/details/129471872
Copyright belongs to the original author, Mr_Bai_404.
