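Web Spider: XHR Breakpoints, Qianqian XX Song Download
Disclaimer: this case study is for learning and discussion only. Do not use it for any illegal purpose.
Note: the site URL and API URLs are Base64-encoded; decode them yourself with base64.b64decode.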
Preface
Target site address: aHR0cHM6Ly9tdXNpYy45MXEuY29tLw==
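As a quick illustration of the decoding step, the snippet below decodes the target address with base64.b64decode. The helper name decode_url is made up for this sketch, and the encoding parameter is an assumption added because the keyword-search URL given later in this article ends in bytes that are not valid UTF-8 (they appear to be GBK).

```python
import base64


def decode_url(encoded, encoding="utf-8"):
    """Decode a Base64-encoded URL string (hypothetical helper for this article)."""
    return base64.b64decode(encoded).decode(encoding)


# Target site address from the preface
print(decode_url("aHR0cHM6Ly9tdXNpYy45MXEuY29tLw=="))

# The keyword-search URL carries a non-UTF-8 keyword, so pass encoding="gbk" for it
search_url = decode_url(
    "aHR0cHM6Ly9tdXNpYy45MXEuY29tL3NlYXJjaD93b3JkPcfgu6i0yQ==", encoding="gbk"
)
print(search_url)
```

The same helper works for the other encoded endpoint URLs in this article with the default encoding.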
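I. Recommended Resources
JS reverse-engineering encryption/decryption tool (download and usage): https://blog.csdn.net/EXIxiaozhou/article/details/128034062
II. Task Description
1. Enter a keyword such as an artist name or song title to search for matching songs;
2. Enter the platform's song ID to download that song.
Final result (screenshot omitted)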
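III. Site Analysis
Implementation approach
- 1. Search by keyword to get the song information;
- 2. Request an API endpoint with the song ID to get the song's playback link;
- 3. Download the song from that playback link.
1. Open the site, type a keyword into the search box on the home page, and press Enter. The results are rendered directly in the page HTML, so a search only requires substituting the keyword in the URL.
Keyword search URL: aHR0cHM6Ly9tdXNpYy45MXEuY29tL3NlYXJjaD93b3JkPcfgu6i0yQ==
Because the data is present in the response HTML, a parsing module can extract the required fields directly.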
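A minimal sketch of that extraction, using the standard-library html.parser rather than the lxml module that the full code in this article relies on. The sample markup is a simplified assumption modeled on the /song/<ID> links that the article's XPath expressions target; the real page has more nesting and attributes, and the song IDs shown are invented.

```python
from html.parser import HTMLParser


class SongLinkParser(HTMLParser):
    """Collect (song_id, title) pairs from <a href="/song/ID">title</a> links."""

    def __init__(self):
        super().__init__()
        self.songs = []
        self._current_id = None  # set while we are inside a song link
        self._text = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.startswith("/song/"):
            self._current_id = href[len("/song/"):]
            self._text = []

    def handle_data(self, data):
        if self._current_id is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            self.songs.append((self._current_id, "".join(self._text).strip()))
            self._current_id = None


# Hypothetical, simplified stand-in for the search result page
SAMPLE_HTML = """
<ul>
  <li><div class="song-box"><a href="/song/T10000000001">Song A</a></div></li>
  <li><div class="song-box"><a href="/song/T10000000002">Song B</a></div></li>
</ul>
"""


def extract_songs(html):
    parser = SongLinkParser()
    parser.feed(html)
    return parser.songs


print(extract_songs(SAMPLE_HTML))
```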
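2. Analyze the XHR endpoint that returns the song's download link. On the song playback page, the download link is returned by the following endpoint:
XHR endpoint URL for the download link: aHR0cHM6Ly9tdXNpYy45MXEuY29tL3YxL3NvbmcvdHJhY2tsaW5r
The request is a GET with four parameters: sign is an MD5 hash string, appid is a version number that can be hard-coded, TSID is the song ID, and timestamp is a 10-digit Unix timestamp (in seconds).
The plaintext that gets hashed is the string below, where TSID is the song ID and timestamp is the timestamp; only these two values change between requests.
f'TSID={TSID}&appid=16073360&timestamp={timestamp}0b50b02fd0d73a9c4c8c3a781c30845f'
3. Opening the returned download link plays the song directly.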
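The signing rule from step 2 can be sketched in pure Python with hashlib, assuming (as the CryptoJS-based code in this article suggests) that sign is simply the hex MD5 digest of the UTF-8 plaintext. The function names build_sign and build_params and the song ID in the usage line are illustrative, not taken from the site.

```python
import hashlib
import time

APPID = 16073360
SECRET = "0b50b02fd0d73a9c4c8c3a781c30845f"


def build_sign(tsid, timestamp):
    """Hash the query string with the secret appended directly after it."""
    plaintext = f"TSID={tsid}&appid={APPID}&timestamp={timestamp}{SECRET}"
    return hashlib.md5(plaintext.encode("utf-8")).hexdigest()


def build_params(tsid, timestamp=None):
    """Build the GET parameters for the download-link endpoint."""
    if timestamp is None:
        timestamp = int(time.time())  # 10-digit Unix timestamp in seconds
    return {
        "sign": build_sign(tsid, timestamp),
        "appid": APPID,
        "TSID": tsid,
        "timestamp": timestamp,
    }


# "T10000000000" is a made-up song ID used only to show the call shape
print(build_params("T10000000000", timestamp=1700000000))
```

Note that the same timestamp must be used both inside the hashed plaintext and in the request parameters; otherwise the server-side check would presumably fail.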
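IV. XHR Breakpoint Debugging and Rebuilding the JS Hashing Code with WT-JS
Reverse-engineering approach
- 1. In the browser's Sources panel, add an XHR breakpoint so that execution pauses before the request is sent and the request parameters can be inspected;
- 2. Use the call stack in the Sources panel to trace the values back to the code that computes the signature;
- 3. Once the plaintext is known, use WT-JS to regenerate the JS hashing code;
- 4. Call the JS code from Python to complete the whole task.
1. Set an XHR breakpoint based on the endpoint URL; the XHR endpoint that returns the download link is used as the example here.
Whenever the browser sends a request to that endpoint, execution pauses just before the request goes out.
2. Refresh the song playback page to trigger the XHR breakpoint. At this point the parameters have already been generated, so trace back through the call stack one frame at a time to find where the signature is computed. The usual routine is: each time you move from the stack into a new function, remove the previous breakpoint, set a new breakpoint in the new function, and refresh the page.
Tracing the call stack leads to the function that hashes the plaintext; set a new breakpoint inside the createSign() method.
For the breakpoint at the return statement, pause before it executes rather than after (so that the value is not computed twice). The plaintext being hashed is the value of r after r += secret.
3. Rebuild the JS hashing code: click the button that generates the JS code, then paste the result into the PyCharm editor for debugging.
Open PyCharm and debug the JS hashing code.
V. Code Implementation
1. JS hashing code: encode.js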
var CryptoJS = CryptoJS ||(function(Math, undefined){
var crypto;if(typeof window !=='undefined'&& window.crypto){
crypto = window.crypto;}if(typeof self !=='undefined'&& self.crypto){
crypto = self.crypto;}if(typeof globalThis !=='undefined'&& globalThis.crypto){
crypto = globalThis.crypto;}if(!crypto &&typeof window !=='undefined'&& window.msCrypto){
crypto = window.msCrypto;}if(!crypto &&typeof global !=='undefined'&& global.crypto){
crypto = global.crypto;}if(!crypto &&typeof require ==='function'){
try {
crypto =require('crypto');}catch(err){}}
var cryptoSecureRandomInt =function(){if(crypto){if(typeof crypto.getRandomValues ==='function'){
try {return crypto.getRandomValues(new Uint32Array(1))[0];}catch(err){}}if(typeof crypto.randomBytes ==='function'){
try {return crypto.randomBytes(4).readInt32LE();}catch(err){}}}
throw new Error('Native crypto module could not be used to get secure random number.');};
var create = Object.create ||(function(){
function F(){}return function(obj){
var subtype;
F.prototype = obj;
subtype = new F();
F.prototype = null;return subtype;};}());
var C ={};
var C_lib = C.lib ={};
var Base = C_lib.Base =(function(){return{
extend:function(overrides){
var subtype =create(this);if(overrides){
subtype.mixIn(overrides);}if(!subtype.hasOwnProperty('init')|| this.init === subtype.init){
subtype.init =function(){
subtype.$super.init.apply(this, arguments);};}
subtype.init.prototype = subtype;
subtype.$super = this;return subtype;}, create:function(){
var instance = this.extend();
instance.init.apply(instance, arguments);return instance;}, init:function(){}, mixIn:function(properties){for(var propertyName in properties){if(properties.hasOwnProperty(propertyName)){
this[propertyName]= properties[propertyName];}}if(properties.hasOwnProperty('toString')){
this.toString = properties.toString;}}, clone:function(){return this.init.prototype.extend(this);}};}());
var WordArray = C_lib.WordArray = Base.extend({
init:function(words, sigBytes){
words = this.words = words ||[];if(sigBytes != undefined){
this.sigBytes = sigBytes;}else{
this.sigBytes = words.length *4;}}, toString:function(encoder){return(encoder || Hex).stringify(this);}, concat:function(wordArray){
var thisWords = this.words;
var thatWords = wordArray.words;
var thisSigBytes = this.sigBytes;
var thatSigBytes = wordArray.sigBytes;
this.clamp();if(thisSigBytes %4){for(var i =0; i < thatSigBytes; i++){
var thatByte =(thatWords[i >>>2]>>>(24-(i %4)*8))&0xff;
thisWords[(thisSigBytes + i)>>>2]|= thatByte <<(24-((thisSigBytes + i)%4)*8);}}else{for(var j =0; j < thatSigBytes; j +=4){
thisWords[(thisSigBytes + j)>>>2]= thatWords[j >>>2];}}
this.sigBytes += thatSigBytes;return this;}, clamp:function(){
var words = this.words;
var sigBytes = this.sigBytes;
words[sigBytes >>>2]&=0xffffffff<<(32-(sigBytes %4)*8);
words.length = Math.ceil(sigBytes /4);}, clone:function(){
var clone = Base.clone.call(this);
clone.words = this.words.slice(0);return clone;}, random:function(nBytes){
var words =[];
var r =(function(m_w){
var m_w = m_w;
var m_z =0x3ade68b1;
var mask =0xffffffff;return function(){
m_z =(0x9069*(m_z &0xFFFF)+(m_z >>0x10))& mask;
m_w =(0x4650*(m_w &0xFFFF)+(m_w >>0x10))& mask;
var result =((m_z <<0x10)+ m_w)& mask;
result /=0x100000000;
result +=0.5;return result *(Math.random()>.5?1:-1);}});
var RANDOM = false, _r;
try {cryptoSecureRandomInt();
RANDOM = true;}catch(err){}for(var i =0, rcache; i < nBytes; i +=4){if(!RANDOM){
_r =r((rcache || Math.random())*0x100000000);
rcache =_r()*0x3ade67b7;
words.push((_r()*0x100000000)|0);continue;}
words.push(cryptoSecureRandomInt());}return new WordArray.init(words, nBytes);}});
var C_enc = C.enc ={};
var Hex = C_enc.Hex ={
stringify:function(wordArray){
var words = wordArray.words;
var sigBytes = wordArray.sigBytes;
var hexChars =[];for(var i =0; i < sigBytes; i++){
var bite =(words[i >>>2]>>>(24-(i %4)*8))&0xff;
hexChars.push((bite >>>4).toString(16));
hexChars.push((bite &0x0f).toString(16));}return hexChars.join('');}, parse:function(hexStr){
var hexStrLength = hexStr.length;
var words =[];for(var i =0; i < hexStrLength; i +=2){
words[i >>>3]|=parseInt(hexStr.substr(i,2),16)<<(24-(i %8)*4);}return new WordArray.init(words, hexStrLength /2);}};
var Latin1 = C_enc.Latin1 ={
stringify:function(wordArray){
var words = wordArray.words;
var sigBytes = wordArray.sigBytes;
var latin1Chars =[];for(var i =0; i < sigBytes; i++){
var bite =(words[i >>>2]>>>(24-(i %4)*8))&0xff;
latin1Chars.push(String.fromCharCode(bite));}return latin1Chars.join('');}, parse:function(latin1Str){
var latin1StrLength = latin1Str.length;
var words =[];for(var i =0; i < latin1StrLength; i++){
words[i >>>2]|=(latin1Str.charCodeAt(i)&0xff)<<(24-(i %4)*8);}return new WordArray.init(words, latin1StrLength);}};
var Utf8 = C_enc.Utf8 ={
stringify:function(wordArray){
try {return decodeURIComponent(escape(Latin1.stringify(wordArray)));}catch(e){
throw new Error('Malformed UTF-8 data');}}, parse:function(utf8Str){return Latin1.parse(unescape(encodeURIComponent(utf8Str)));}};
var BufferedBlockAlgorithm = C_lib.BufferedBlockAlgorithm = Base.extend({
reset:function(){
this._data = new WordArray.init();
this._nDataBytes =0;}, _append:function(data){if(typeof data =='string'){
data = Utf8.parse(data);}
this._data.concat(data);
this._nDataBytes += data.sigBytes;}, _process:function(doFlush){
var processedWords;
var data = this._data;
var dataWords = data.words;
var dataSigBytes = data.sigBytes;
var blockSize = this.blockSize;
var blockSizeBytes = blockSize *4;
var nBlocksReady = dataSigBytes / blockSizeBytes;if(doFlush){
nBlocksReady = Math.ceil(nBlocksReady);}else{
nBlocksReady = Math.max((nBlocksReady |0)- this._minBufferSize,0);}
var nWordsReady = nBlocksReady * blockSize;
var nBytesReady = Math.min(nWordsReady *4, dataSigBytes);if(nWordsReady){for(var offset =0; offset < nWordsReady; offset += blockSize){
this._doProcessBlock(dataWords, offset);}
processedWords = dataWords.splice(0, nWordsReady);
data.sigBytes -= nBytesReady;}return new WordArray.init(processedWords, nBytesReady);}, clone:function(){
var clone = Base.clone.call(this);
clone._data = this._data.clone();return clone;}, _minBufferSize:0});
var Hasher = C_lib.Hasher = BufferedBlockAlgorithm.extend({
cfg: Base.extend(),
init:function(cfg){
this.cfg = this.cfg.extend(cfg);
this.reset();}, reset:function(){
BufferedBlockAlgorithm.reset.call(this);
this._doReset();}, update:function(messageUpdate){
this._append(messageUpdate);
this._process();return this;}, finalize:function(messageUpdate){if(messageUpdate){
this._append(messageUpdate);}
var hash = this._doFinalize();return hash;}, blockSize:512/32,
_createHelper:function(hasher){return function(message, cfg){return new hasher.init(cfg).finalize(message);};}, _createHmacHelper:function(hasher){return function(message, key){return new C_algo.HMAC.init(hasher, key).finalize(message);};}});
var C_algo = C.algo ={};return C;}(Math));(function(Math){
var C = CryptoJS;
var C_lib = C.lib;
var WordArray = C_lib.WordArray;
var Hasher = C_lib.Hasher;
var C_algo = C.algo;
var T =[];(function(){for(var i =0; i <64; i++){
T[i]=(Math.abs(Math.sin(i +1))*0x100000000)|0;}}());
var MD5 = C_algo.MD5 = Hasher.extend({
_doReset:function(){
this._hash = new WordArray.init([0x67452301,0xefcdab89,0x98badcfe,0x10325476]);}, _doProcessBlock:function(M, offset){for(var i =0; i <16; i++){
var offset_i = offset + i;
var M_offset_i = M[offset_i];
M[offset_i]=((((M_offset_i <<8)|(M_offset_i >>>24))&0x00ff00ff)|(((M_offset_i <<24)|(M_offset_i >>>8))&0xff00ff00));}
var H = this._hash.words;
var M_offset_0 = M[offset +0];
var M_offset_1 = M[offset +1];
var M_offset_2 = M[offset +2];
var M_offset_3 = M[offset +3];
var M_offset_4 = M[offset +4];
var M_offset_5 = M[offset +5];
var M_offset_6 = M[offset +6];
var M_offset_7 = M[offset +7];
var M_offset_8 = M[offset +8];
var M_offset_9 = M[offset +9];
var M_offset_10 = M[offset +10];
var M_offset_11 = M[offset +11];
var M_offset_12 = M[offset +12];
var M_offset_13 = M[offset +13];
var M_offset_14 = M[offset +14];
var M_offset_15 = M[offset +15];
var a = H[0];
var b = H[1];
var c = H[2];
var d = H[3];
a =FF(a, b, c, d, M_offset_0,7, T[0]);
d =FF(d, a, b, c, M_offset_1,12, T[1]);
c =FF(c, d, a, b, M_offset_2,17, T[2]);
b =FF(b, c, d, a, M_offset_3,22, T[3]);
a =FF(a, b, c, d, M_offset_4,7, T[4]);
d =FF(d, a, b, c, M_offset_5,12, T[5]);
c =FF(c, d, a, b, M_offset_6,17, T[6]);
b =FF(b, c, d, a, M_offset_7,22, T[7]);
a =FF(a, b, c, d, M_offset_8,7, T[8]);
d =FF(d, a, b, c, M_offset_9,12, T[9]);
c =FF(c, d, a, b, M_offset_10,17, T[10]);
b =FF(b, c, d, a, M_offset_11,22, T[11]);
a =FF(a, b, c, d, M_offset_12,7, T[12]);
d =FF(d, a, b, c, M_offset_13,12, T[13]);
c =FF(c, d, a, b, M_offset_14,17, T[14]);
b =FF(b, c, d, a, M_offset_15,22, T[15]);
a =GG(a, b, c, d, M_offset_1,5, T[16]);
d =GG(d, a, b, c, M_offset_6,9, T[17]);
c =GG(c, d, a, b, M_offset_11,14, T[18]);
b =GG(b, c, d, a, M_offset_0,20, T[19]);
a =GG(a, b, c, d, M_offset_5,5, T[20]);
d =GG(d, a, b, c, M_offset_10,9, T[21]);
c =GG(c, d, a, b, M_offset_15,14, T[22]);
b =GG(b, c, d, a, M_offset_4,20, T[23]);
a =GG(a, b, c, d, M_offset_9,5, T[24]);
d =GG(d, a, b, c, M_offset_14,9, T[25]);
c =GG(c, d, a, b, M_offset_3,14, T[26]);
b =GG(b, c, d, a, M_offset_8,20, T[27]);
a =GG(a, b, c, d, M_offset_13,5, T[28]);
d =GG(d, a, b, c, M_offset_2,9, T[29]);
c =GG(c, d, a, b, M_offset_7,14, T[30]);
b =GG(b, c, d, a, M_offset_12,20, T[31]);
a =HH(a, b, c, d, M_offset_5,4, T[32]);
d =HH(d, a, b, c, M_offset_8,11, T[33]);
c =HH(c, d, a, b, M_offset_11,16, T[34]);
b =HH(b, c, d, a, M_offset_14,23, T[35]);
a =HH(a, b, c, d, M_offset_1,4, T[36]);
d =HH(d, a, b, c, M_offset_4,11, T[37]);
c =HH(c, d, a, b, M_offset_7,16, T[38]);
b =HH(b, c, d, a, M_offset_10,23, T[39]);
a =HH(a, b, c, d, M_offset_13,4, T[40]);
d =HH(d, a, b, c, M_offset_0,11, T[41]);
c =HH(c, d, a, b, M_offset_3,16, T[42]);
b =HH(b, c, d, a, M_offset_6,23, T[43]);
a =HH(a, b, c, d, M_offset_9,4, T[44]);
d =HH(d, a, b, c, M_offset_12,11, T[45]);
c =HH(c, d, a, b, M_offset_15,16, T[46]);
b =HH(b, c, d, a, M_offset_2,23, T[47]);
a =II(a, b, c, d, M_offset_0,6, T[48]);
d =II(d, a, b, c, M_offset_7,10, T[49]);
c =II(c, d, a, b, M_offset_14,15, T[50]);
b =II(b, c, d, a, M_offset_5,21, T[51]);
a =II(a, b, c, d, M_offset_12,6, T[52]);
d =II(d, a, b, c, M_offset_3,10, T[53]);
c =II(c, d, a, b, M_offset_10,15, T[54]);
b =II(b, c, d, a, M_offset_1,21, T[55]);
a =II(a, b, c, d, M_offset_8,6, T[56]);
d =II(d, a, b, c, M_offset_15,10, T[57]);
c =II(c, d, a, b, M_offset_6,15, T[58]);
b =II(b, c, d, a, M_offset_13,21, T[59]);
a =II(a, b, c, d, M_offset_4,6, T[60]);
d =II(d, a, b, c, M_offset_11,10, T[61]);
c =II(c, d, a, b, M_offset_2,15, T[62]);
b =II(b, c, d, a, M_offset_9,21, T[63]);
H[0]=(H[0]+ a)|0;
H[1]=(H[1]+ b)|0;
H[2]=(H[2]+ c)|0;
H[3]=(H[3]+ d)|0;}, _doFinalize:function(){
var data = this._data;
var dataWords = data.words;
var nBitsTotal = this._nDataBytes *8;
var nBitsLeft = data.sigBytes *8;
dataWords[nBitsLeft >>>5]|=0x80<<(24- nBitsLeft %32);
var nBitsTotalH = Math.floor(nBitsTotal /0x100000000);
var nBitsTotalL = nBitsTotal;
dataWords[(((nBitsLeft +64)>>>9)<<4)+15]=((((nBitsTotalH <<8)|(nBitsTotalH >>>24))&0x00ff00ff)|(((nBitsTotalH <<24)|(nBitsTotalH >>>8))&0xff00ff00));
dataWords[(((nBitsLeft +64)>>>9)<<4)+14]=((((nBitsTotalL <<8)|(nBitsTotalL >>>24))&0x00ff00ff)|(((nBitsTotalL <<24)|(nBitsTotalL >>>8))&0xff00ff00));
data.sigBytes =(dataWords.length +1)*4;
this._process();
var hash = this._hash;
var H = hash.words;for(var i =0; i <4; i++){
var H_i = H[i];
H[i]=(((H_i <<8)|(H_i >>>24))&0x00ff00ff)|(((H_i <<24)|(H_i >>>8))&0xff00ff00);}return hash;}, clone:function(){
var clone = Hasher.clone.call(this);
clone._hash = this._hash.clone();return clone;}});
function FF(a, b, c, d, x, s, t){
var n = a +((b & c)|(~b & d))+ x + t;return((n << s)|(n >>>(32- s)))+ b;}
function GG(a, b, c, d, x, s, t){
var n = a +((b & d)|(c &~d))+ x + t;return((n << s)|(n >>>(32- s)))+ b;}
function HH(a, b, c, d, x, s, t){
var n = a +(b ^ c ^ d)+ x + t;return((n << s)|(n >>>(32- s)))+ b;}
function II(a, b, c, d, x, s, t){
var n = a +(c ^(b |~d))+ x + t;return((n << s)|(n >>>(32- s)))+ b;}
C.MD5 = Hasher._createHelper(MD5);
C.HmacMD5 = Hasher._createHmacHelper(MD5);}(Math));
function MD5_Encrypt(word){
return CryptoJS.MD5(word).toString();
// Reversed variant:
// return CryptoJS.MD5(word).toString().split("").reverse().join("");
}
2. Calling the JS code with the execjs module
import execjs

with open(file='encode.js', mode='r', encoding='utf-8') as fis:
    js_code = fis.read()  # read the JS source file
js_obj = execjs.compile(js_code)  # compile the JS source
js_obj.call('function', 'params')  # call a JS function: argument 1 is the function name, argument 2 is the parameter it needs
3. Python code
import os
import time

import execjs
import requests
from lxml import etree


class Spider(object):
    def __init__(self):
        self.headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/95.0.4638.69 Safari/537.36',
            'authority': 'music.91q.com',
        }
        # Load and compile the JS hashing code once
        with open(file='encode.js', mode='r', encoding='utf-8') as fis:
            js_code = fis.read()
        self.js_obj = execjs.compile(js_code)
        self.timestamp = int(time.time())  # 10-digit Unix timestamp (seconds)
        self.appid = 16073360
        self.secret = '0b50b02fd0d73a9c4c8c3a781c30845f'
        self.resource_path = 'music_download'

    def create_directory(self):
        # Create the download directory if it does not exist yet
        if os.path.exists(self.resource_path) is False:
            os.mkdir(self.resource_path)

    def song_download(self, song_name, music_url):
        file_path = f"{self.resource_path}/{song_name}.mp3"
        response = requests.get(url=music_url, headers=self.headers)
        with open(file_path, 'wb') as fis:
            for chunk in response.iter_content(chunk_size=1000):
                fis.write(chunk)
                fis.flush()
        print(music_url)
        print(f"{file_path} - download complete!")

    def get_params(self, song_id):
        # Plaintext to hash: the query string with the secret appended directly
        encode_string = f'TSID={song_id}&appid={self.appid}&timestamp={self.timestamp}{self.secret}'
        sign = self.js_obj.call('MD5_Encrypt', encode_string)
        params = {'sign': sign, 'appid': self.appid, 'TSID': song_id, 'timestamp': self.timestamp}
        print(params)
        return params

    def parse(self, song_id):
        url = 'https://music.91q.com/v1/song/tracklink'
        params = self.get_params(song_id=song_id)
        response = requests.get(url=url, headers=self.headers, params=params)
        result = {'status': False}
        if response.status_code == 200 and response.json()['data']['path'] is not None:
            json_data = response.json()
            result['song_name'] = json_data['data']['title']
            result['singer'] = json_data['data']['artist'][0]['name']
            result['music_url'] = json_data['data']['path']
            result['status'] = True
        return result

    def runs(self, keyword):
        self.create_directory()
        keyword_search_url = f'https://music.91q.com/search?word={keyword}'
        response = requests.get(url=keyword_search_url, headers=self.headers)
        selects = etree.HTML(response.text)
        li_list = selects.xpath('//ul[@data-v-e9211496=""]/li')
        song_info_item = dict()
        for li in li_list[1:]:  # the first <li> is skipped (presumably a header row)
            song_name = li.xpath('div[@class="song ellipsis clearfix"]/div[@class="song-box"]/a/text()')[0]
            singer = li.xpath('div[@class="artist ellipsis"]/a/text()')[0]
            song_id = li.xpath('div[@class="song ellipsis clearfix"]/div[@class="song-box"]/a/@href')[0]
            song_id = song_id.replace('/song/', '')
            song_info_item[song_id] = song_name
            print(f"Song: {song_name},\t\tSong ID: {song_id},\t\tArtist: {singer}")
        song_id = input("Enter the song ID: ").strip()
        result = self.parse(song_id)
        if result['status'] is False:
            print(f"{song_id} - VIP songs cannot be downloaded")
            return False
        self.song_download(song_name=result['song_name'], music_url=result['music_url'])


if __name__ == '__main__':
    ky = input("Enter a search keyword: ").strip()
    Spider().runs(keyword=ky)
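VI. Installing Modules with pip
Mirror addresses
- Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple
- Alibaba Cloud: http://mirrors.aliyun.com/pypi/simple/
- University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
- Huazhong University of Science and Technology: http://pypi.hustunique.com/
- Shandong University of Technology: http://pypi.sdutlinux.org/
- Douban: http://pypi.douban.com/simple/
Third-party modules used in this case study and their versions
- requests==2.27.0
- PyExecJS==1.5.1
- lxml==4.9.1
Install a single module from a mirror: pip install <module-name> -i https://pypi.tuna.tsinghua.edu.cn/simple
Install from a requirements.txt file using a mirror: pip install -i https://pypi.doubanio.com/simple/ -r requirements.txt
Summary
That covers today's material. This case study is for learning and discussion only; do not use it for any illegal purpose. Decode the site URL and API URLs yourself with base64.b64decode.
Copyright belongs to the original author, EXI-小洲. In case of infringement, please contact us for removal.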