Hadoop: Review-Profile Analysis of the Huawei P30 Phone -- Crawler and Java Code

Source code of the Java project: https://pan.baidu.com/s/1Z_JS0op3ZkjMLUdq4ZNsvg?pwd=0rtl
Extraction code: 0rtl

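1. Data acquisition (crawler)

The data is scraped from JD.com (京东商城). Search JD for the Huawei P30, find the product, and open its page.

Right-click and choose Inspect, open the Network tab, and refresh the page. In the request list you will find a request whose name starts with ?appid.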

d66f86b2aafd4ce39e2648540f66a486.jpeg

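Click it and you will see that the response holding the reviews is a JSON string. That is convenient, because the data can then be processed with JSON functions.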
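
As a minimal sketch of what processing such a response could look like, the snippet below parses a small hand-written JSON string. The key names (`comments`, `id`, `nickname`, `score`, `content`) are the ones the crawler in this article reads; the sample values and the helper name `extract_comments` are made up for illustration, and the real response contains many more fields.

```python
import json

# Hand-written sample that mimics the shape of the comment response.
# The values are invented; only the key names come from the crawler code.
sample_text = '''
{
  "comments": [
    {"id": 1, "nickname": "user_a", "score": 5, "content": "Great phone"},
    {"id": 2, "nickname": "user_b", "score": 3, "content": "Average battery"}
  ]
}
'''

def extract_comments(text):
    """Parse the JSON text and return (nickname, score, content) tuples."""
    payload = json.loads(text)
    return [(c["nickname"], c["score"], c["content"]) for c in payload["comments"]]

rows = extract_comments(sample_text)
for row in rows:
    print(row)
```

With `requests`, calling `.json()` on the response performs the same parsing step.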

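Open the Headers tab to find the URL of this request. This is the URL the Python crawler needs. The cookie and User-Agent that the crawler also needs are listed further down in the same tab.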

f9c1b74a5759482b8e58644c858d0c7e.jpeg
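
The request URL carries its paging information in a URL-encoded `body` query parameter. The sketch below decodes that parameter from a shortened copy of one request URL (the `h5st` signature, token, and `uuid` parameters are omitted here for readability), so you can see which page a copied URL refers to. The helper name `decode_body` is hypothetical. Note that each captured URL also carries its own `h5st` value, so it should not be assumed that changing only `page` yields a request the server accepts.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Shortened copy of a request URL: signature, token and uuid parameters
# are left out to keep the example readable.
url = (
    "https://api.m.jd.com/?appid=item-v3"
    "&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0"
    "&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0"
    "%2C%22sortType%22%3A5%2C%22page%22%3A2%2C%22pageSize%22%3A10%7D"
)

def decode_body(request_url):
    """Return the JSON object carried in the URL's `body` query parameter."""
    query = parse_qs(urlsplit(request_url).query)  # parse_qs also URL-decodes
    return json.loads(query["body"][0])

body = decode_body(url)
print(body["productId"], body["page"], body["pageSize"])
```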

Below is the crawler code in Python. Replace the url, cookie, and User-Agent values with the ones you found yourself.

import requests
import json
import pandas as pd

def find_data():
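    ids = []  # review id
    nicknames = []  # member name
    plusAvailables = []  # membership level
    scores = []  # star rating
    contents = []  # review text
    creationTimes = []  # time the review was posted
    usefulVoteCounts = []  # number of likes
    productColors = []  # product attribute
    referenceNames = []  # page title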
    url1 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715764989106&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A0%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172309115%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3B64b0943133c7a8b99ad7261cfd7deef74571e61f771e2c990a30c92dd91657cd%3B4.7%3B1715764989115%3BVWcaqHrMOSNEv3BCIYlC5aa1MEZlXmiG8DwHcdRFEwB98LOD8iXup3iqf0ThRv4gF52hOhpQyl7pw1LLb8zvAWRVKNN3dDWvNmgEe_5UkmohEendmsOzPF9l_RKf-e8Zo7GMRD61fq3-ZBK6gUkNI0KpdDiclYxRoeb_f2EJSknLQRqRPF75zDcoUdYSwhBJSgYpJlQ3nH9kffsF1qEfmPDjlnMHJ4jCfjioFkHufbPlGos2pWYqXq-e5CIv0FzmPK_tUsfqmTmJzkqmKYz34e7Xr0-OfFOCBjQDFgvsIAB-4Ucm2qFghNvBIT62uiWtR_5VeHrnekhEOIw_e6zdBpGRFSMhVyv2XhwttHj73PP1z9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url2 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765065646&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A2%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172425656%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Bec64cccb6e9461b123314e46dea2aeb0d4d40db804b8976d7303983f62330210%3B4.7%3B1715765065656%3BVmcfNRnfG0qvXxu6FEz4uaZPMEnh3DoZCP8E99QfR8uBesz4gj-UiFQXHTm4LgwapPPpblOCIogMU6jpiqkfHELZimAKO0ik2AUlmyW_TT7Bqu9C0ijt7FBK64arDRbizsgGawqSQ7hazCLpMzA7j0AdfkJZpdFU8jpYaJwXZxAgXJL9eKa9oBx6II4qYbzgXUrHNNz8Nl8j1Hl7wze8QxuGzLbC9oei0KkNMQiHXEdPtqyk1GmNB5vRZyHQCaT99n5bsTeXWyJM4bycWbFKpOrUnW9tHtRqUr8pjqPL3qoyK5OpT6wg1VkHUHrUW5pG4qz3-29g6-TknRvZF9Fi9kvGs1d40HHRXWX3hue8mAufz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url3 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765094422&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A3%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172454431%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Bf8086784e4ccfc76823620bf537bad11f52122e1537a1d5ae05e5233bf9ae210%3B4.7%3B1715765094431%3BVikj-B4uOlcVFjeEuVzEYxlMibV4FZhZGRJK2ZGrIUJM-XfzpK5pe0VKAv-NHYFGVQTgVSp53O3TE5iuHBMg18MFl_95CFiWDOWPLUVpuoNRwglDpZOdoZsHwR9V_8CbMs8EdgnFKOcsz91IMQd6GmW5KzD30uSE3ET5HNjPFc2KZSBLNmI0lyaW0_zPTEPpeWzvTPOyghgElHLelKZEhdc3wcmKwSqluyHZmEF7A11tbQBgu68I7kXRb0vk-cWVKjTliLji-wfyo1hB_lSAMh_DyFcWfFihExbq4SWEY0Demu4tReyJDjUiXxPE9_qyD3DMPi1L0eDBT68MuxBrNxeY2n_xxDmiR-VRL6FAqTcVz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url4 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765135107&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A4%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172535115%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Bb2158e0e13f27bc65cb09698b5183504007e2bd44afd74aa356eed853cf92958%3B4.7%3B1715765135115%3BVqoiFrzAw6WxbRUhKZmT7aUTscqES3Itce9ClqAwMNJMWvb2uV7W4RyZ48GI-vQYPPgWkKRhNLyyPcDtJR3050eRS2mPUAzKCx2FfHtv7GRFOxZaH7ZRvoil1g16Etp_sZk8RX4l6FWiwR9GgSOfS-6PnL7P4LTN8SeNmM6nDWRB3N0xj8-nRgu0VJt7fzDjg0nzWG1VlAmLC6OjaDDIJ6NnlRCZnBO0QNfkfP_ZavYk0duYWOPpMOVZpDxldwlREvwBBCwF3A2zz51x3A70ZJ6Xr0-OfFOCBjQDFgvsIAB-4Ucm2qFghNvBIT62uiWtR_5VeHrnekhEOIw_e6zdBpGRFSMhVyv2XhwttHj73PP1z9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url5 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765167370&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A5%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172607381%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Be0b80118cc7579c2ec324d43f7ba3081189b020020c981c444d7090b3f540144%3B4.7%3B1715765167381%3BV2UxUyHedpZUnF8FcCrURNy0BzgMcyFfQiT4HX8uYLae1rauRS_LxkZkEndByW_mv5MxQRe3udmev3ziYMfBUJ_AftvISh_jxXSO-uJ2gLx8yAFX4YVZ8rzB-cyw9gaC5VFieg83-w4ouLO-vYPh3Z4czsuj0SrqT9JjcAMyIjJ0_TIomWWw2j1OiouxMgEjE7i3vwMmuHQRIZQuphWy3wrNAgR9ef6mOxcSX87Ur-4-k6rt1tx19aySIWdht9N5NVlx7b62QDW5pLfT0K10hv-DyFcWfFihExbq4SWEY0Demu4tReyJDjUiXxPE9_qyD3DMPi1L0eDBT68MuxBrNxeY2n_xxDmiR-VRL6FAqTcVz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url6 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765200749&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A6%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172640761%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Ba306fc071ff6d1b73a905fbee6c5935ccc1b46b044b92f3ee5f709d44a2f8f4f%3B4.7%3B1715765200761%3BVyA5JSwCoi1CI-mcuU5IuF25i6l7UwDD8kQH79eyMC-0RApptJzya_VQUBj6jZy2YSgRJclGnwFBKSNiPAYLPtZDgYm0cVXTYzreAs93-xYK6pRqaMFV4g9KXpCIm9rQjVI76FWUPx3oIU-q4hGYFcGQQP4jiEcihoVEGvQqBmSp37HiZYmw29PCCp3MtTqCY0JIMSkHR2Si7Zm7krhBDvgzFN7oMWCqMLqVndEuOHzYXlxMrzDqjFkUj16AtoHI3soYABox93W2ygXw5Ymof4GDyFcWfFihExbq4SWEY0Demu4tReyJDjUiXxPE9_qyD3DMPi1L0eDBT68MuxBrNxeY2n_xxDmiR-VRL6FAqTcVz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url7 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765215481&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A7%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172655492%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3Bf6787f3d99499e35f5e08cad10f02a92cbc2eb23cd21b8c6dff2892c9ea49788%3B4.7%3B1715765215492%3BVKIwdFe3d631S0oIj34BVZHROut7JspBrSWfF3igHy9mBvqEFmZX9TzRP5NENjtOanYkDwC6DDKZP_HYahbkjfjxgQLnUm6TKtMsQTK1q-h3v0arrVkN3NRyuT5yaMxS4ALVYIuYHRhpfkAI0xVqjs5tYA06QEmD9y0ZFI27--TCMRfLyTN88xSjrmKUOixNjp_2UuJxmcQQyPkHR4ZFa58orOiV1don4UIyJDmOerLXo6ffHcZJ9EL3OzjHm4a8FnCyJQ4-Obo231rbH9tn5dDXr0-OfFOCBjQDFgvsIAB-4Ucm2qFghNvBIT62uiWtR_5VeHrnekhEOIw_e6zdBpGRFSMhVyv2XhwttHj73PP1z9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url8 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765234747&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A8%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172714752%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3B18450902170483f1b5cd4c409dac47892eaeba0b8a8a3fbd58acf22ef9bc1040%3B4.7%3B1715765234752%3BVCv1tZaZBmhfSVlg7CyKUVYI-1akDgm97H70nHVAaIRCg9YJm2yFJMzzTnouCSkYbDxWv06jArq-SwGeJYllfBP2rN5gF_Pyx5vc433Lp0Orx7PBIrEF8PL5OOFQ9d3KToEZPLRM2T5UZtfmdgPhC4yF8xsanFdWEtUQmPMKrOJNjxFDBTrDy9F6Lgvt-03RZyPHqZKWZF2QLxP55y921wgRY5ekJRtmDuYrMmOgFlr3tQJV1hmPfxIZ7j_oIJAGO8vh0WlY6o4PVteLMyRgkD_Xr0-OfFOCBjQDFgvsIAB-4Ucm2qFghNvBIT62uiWtR_5VeHrnekhEOIw_e6zdBpGRFSMhVyv2XhwttHj73PP1z9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url9 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765249080&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A9%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172729093%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3B9fb430c125d0f376526aba776becf1eb71fe87ffec2dcb519b7c0dcc5b42f73c%3B4.7%3B1715765249093%3BV_kpgGa8f2ZcC5N1c5npv7k26RI3bubSJPRoheIRpoEnO_LwwQAAGh5amsjJI6HaDjICFlzeXQBx_D6oRAayUbxiR5UACQh1baSz24Ixg8O2ntyfp5qIeD6wEWA2a7deV4yemV6IZAzr_y6muH6Gab6gDVuX61PDSEpwpa-C00q0E7HcpP4_-0H9336359BBakV9k6WyTWnZ19nrT2_YQPRSh9FKq6BSp0ZK4XZKCa4hGnYoqXl5RGdM_mvVXPXriCMHxMEPJlEPGp4NqqcpSm4TV2rS3q_oiDYW-sRaDqzHuOLjnO0mzVnPgQ2TX7wfLQxjrlD-VGJ5497cTQKld3u40Ac4pPhQGq1lBuP-7JYKz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    url10 = 'https://api.m.jd.com/?appid=item-v3&functionId=pc_club_productPageComments&client=pc&clientVersion=1.0.0&t=1715765263811&body=%7B%22productId%22%3A10097671340655%2C%22score%22%3A0%2C%22sortType%22%3A5%2C%22page%22%3A10%2C%22pageSize%22%3A10%2C%22isShadowSku%22%3A0%2C%22rid%22%3A0%2C%22fold%22%3A1%2C%22bbtf%22%3A%22%22%2C%22shield%22%3A%22%22%7D&h5st=20240515172743822%3Bzyyn6zn66m9gtiz1%3Bfb5df%3Btk03w91fb1c0e18nAjo8it8i5ld3LCgGnQYpVO85Y36Dy7UUx5LpSOICuLSu6rIy5c-pd9gwrHnL2S1ItvaBp0wkzLMo%3B809b236c7f3751e81e26ead244afb2b2d60ec68707625a54424b6490cf3b6e0c%3B4.7%3B1715765263822%3BVuSfAQnUawC3oFQqnSCPj22-pQW9NgaI2_owyI191O0-_DF6nk_o3gVvGiPqIHLBhgEkzO05ZbK6fOwojvY5khYBjSjTP8Ey3Sj7jWV43ifswjUoeSL_I1aOIiMokVX08BhPH4rPZ3N8RLZrzc9Q-vsWG9AWyFEh48Zymz1PB0GF-yc2PmxyH9ezqZm8_WdvX2lCjcO2J7Dy44iLSzx8KRvpyjs5PCrS0h-PpYEX0TPfPu3X4zohi4vYBZHSIxbqpArafHcPf6Ga4Bui1P7ckDFDyFcWfFihExbq4SWEY0Demu4tReyJDjUiXxPE9_qyD3DMPi1L0eDBT68MuxBrNxeY2n_xxDmiR-VRL6FAqTcVz9bL563BIHIyE7o1-KIR7E5Khd-pRm8Hhv9A_sYy-jzun-iNF5XwmfupA2hgQFl4umOO2B_hq1a6o_TYRNI4o4rEL4N-9ux-vp1Tu6gTiVemfEG0Y1V-7m5iwNFR9RANlKoejtmPLbQLhdDfjf3UtUpVN9F5Vk2iRATHjqtoo6O8fY_p5xIpUsrVxOLCu7nZggE7nDk8PeheJO0dl8zjLad9Prk3hGJ0DQIeqffFGvzEemLTD52YgeDqWQHLXbk3&x-api-eid-token=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX&loginType=3&uuid=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4'
    urls = [url1, url2, url3, url4, url5, url6, url7, url8, url9, url10]
    headers = {
        "cookie":"__jdu=1707228359583795780053; shshshfpa=d1361ffe-167f-ec74-20d1-e42fe03107a7-1707228363; shshshfpx=d1361ffe-167f-ec74-20d1-e42fe03107a7-1707228363; __jdv=76161171|cn.bing.com|-|referral|-|1715693984210; areaId=7; PCSYCityID=CN_410000_410800_0; _pst=jd_513d621bae3e2; unick=jd_183378soh; pin=jd_513d621bae3e2; _tp=aHNrLjOiSYXV7jWIeUqQ4sbp5qlaOCZtxwKpkoxnwaE%3D; ipLoc-djd=7-446-452-37446; 3AB9D23F7A4B3CSS=jdd03FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4QAAAAMPPOB26HQAAAAACNKMOZOI6P4CZAX; mba_muid=1707228359583795780053; mba_sid=17157642121944040777400342219.1; wlfstk_smdl=iog1cj3owe7sz06qic6txutoo8sor2h9; logintype=wx; npin=jd_513d621bae3e2; thor=6C9544F65CC953AB8FA5E4807D15EFA824BD8D7AFE06018A0113949A7EFBD938EADBFA23AEE2C43A1AD1820A5EFE5912FFAEF4AA07F5D0903E1A864D7F46B0CD1F3FA8D188E510C38BBC8825DCA2B0A721442713343D63167ACEEB246F5A1EDBBCEFF5F22351CCF753141DE154889AD4E17C4ABD218B553E2B1093C45815CFB9E55759AAE6A713AA739685A2091733BB94473047F99038AEF898D0C0B8130286; pinId=481nurn_kyW_8Z2JMkoorbV9-x-f3wj7; jsavif=1; token=e341fa134925c90714faee5fd65f9269,3,953202; __tk=9a419db1fe3a7304be0a640a1e9cbadc,3,953202; __jda=181111935.1707228359583795780053.1707228359.1715734775.1715764177.4; __jdc=181111935; 3AB9D23F7A4B3C9B=FXD3524HGJXHERKIKYAKEHKEOM4XU6URCC6D77X3HDOTRBB2VI5YORUN6VAFNRF3ECO3DIG63Z7W7J6XNPLQMFLP4Q; __jdb=181111935.8.1707228359583795780053|4.1715764177; shshshfpb=BApXcmHSMeOpAwLHu5DgCwJIYjVhb97Z9BkVjhExq9xJ1MncH34O2; flash=2_qwWYiitMosJEh6yiIwi8UE5m7pwbbs6CXSfxJtwevoM36LbZ4Pf40tGJrKLyWR9NjChkpCx2Z1mkq40NBx2NwQ4X5y6XKjln5g6KI-u2otqBz45OMZ1jioTc3jxyfYOf4W3yweZUdOFHG_swy36pFd-yqtVUO829v9SXja3tp65*",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0"}
    for url in urls:
        # Each response is a JSON object; the reviews sit under the 'comments' key.
        # The timeout keeps the script from hanging on a stalled request.
        r = requests.get(url, headers=headers, timeout=10).json()
        data = r['comments']
        for i in data:
            ids.append(i['id'])
            nicknames.append(i['nickname'])
            # Label PLUS members ('PLUS会员'); everyone else gets an empty string
            if i['plusAvailable']>0:
                plusAvailables.append('PLUS会员')
            else:
                plusAvailables.append("")
            scores.append(i['score'])
            contents.append(i['content'])
            creationTimes.append(i['creationTime'])
            usefulVoteCounts.append(i['usefulVoteCount'])
            productColors.append(i['productColor'])
            referenceNames.append(i['referenceName'])
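    # The column headers of the output file are kept in Chinese
    datadf={
        'id':ids,
        '会员':nicknames,  # member
        '级别':plusAvailables,  # level
        '评价星级':scores,  # star rating
        '评价内容':contents,  # review text
        '时间':creationTimes,  # time
        '点赞数':usefulVoteCounts,  # likes
        '商品属性':productColors,  # product attribute
        '页面标题':referenceNames  # page title
    }
    df = pd.DataFrame(datadf)
    # Write all collected reviews to one Excel file (pandas needs an Excel writer such as openpyxl)
    df.to_excel('jdp30.xlsx',index=False)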

if __name__ == '__main__':
    find_data()
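
The article does not show how the review timestamps were turned into the `weekday`, `st`, and `quanters` columns that the Java code in section 2 reads. Purely as an illustration, the sketch below derives labels with the same spellings from a timestamp. Everything else is an assumption: the timestamp format `YYYY-MM-DD HH:MM:SS`, the month-to-season mapping, returning `None` for times outside the four slots, and all function names.

```python
from datetime import datetime

def parse_time(text):
    """Parse a timestamp; the format is an assumption about creationTime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def weekday_label(ts):
    """'Working days' for Monday to Friday, 'weekend' otherwise."""
    return "Working days" if ts.weekday() < 5 else "weekend"

def time_slot_label(ts):
    """Map the time of day onto the four slots counted in section 2, else None."""
    minutes = ts.hour * 60 + ts.minute
    if minutes < 8 * 60:
        return "0:00~8:00"
    if 11 * 60 + 30 <= minutes < 13 * 60 + 30:
        return "11:30~13:30"
    if 13 * 60 + 30 <= minutes < 17 * 60 + 30:
        return "13:30~17:30"
    if minutes >= 17 * 60 + 30:
        return "17:30~24:00"
    return None  # 8:00~11:30 is not one of the four labels

def season_label(ts):
    """Assumed mapping: Mar-May spring, Jun-Aug summer, Sep-Nov autumn, else winter."""
    if 3 <= ts.month <= 5:
        return "spring"
    if 6 <= ts.month <= 8:
        return "summer"
    if 9 <= ts.month <= 11:
        return "autumn"
    return "winter"

ts = parse_time("2024-05-15 12:00:00")
print(weekday_label(ts), time_slot_label(ts), season_label(ts))
```

In the actual project this step may well have been done in Hive rather than Python; the sketch only makes the meaning of the labels concrete.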

2. Java code and visualization

package com.hw.cn.hwp30;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

@Controller
public class ShowController {
    public  static  final  String URL = "jdbc:hive2://192.168.18.130:10001/default";
    public  static  final  String USER = "hive";
    public  static  final  String PASS = "Yh183360@";

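    // Counts the rows by the "weekday" column: index 0 = working days, index 1 = weekends.
    public ArrayList<Integer> finda(String sql) throws SQLException {
        ArrayList<Integer> list = new ArrayList<>();
        int a=0;
        int b=0;
        // try-with-resources closes the connection, statement and result set
        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            System.out.println("Start");
            while (rs.next()){
                String aa=rs.getString("weekday");
                // constant-first equals() is safe when the column is NULL
                if ("Working days".equals(aa))a++;
                if ("weekend".equals(aa))b++;
            }
        }
        list.add(a);
        list.add(b);
        return list;
    }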

    // Returns [working-day count, weekend count] as JSON for the first pie chart.
    @RequestMapping("/tb_a1")
    public @ResponseBody List<Integer> tb_a() throws SQLException {
        String sql ="select * from tb_a";
        ArrayList<Integer> integers = finda(sql);
        for (Integer integer : integers) {
            System.out.println(integer);
        }
        return integers;
    }

    // Counts the rows by the "st" column (time-of-day slot), one counter per slot.
    public ArrayList<Integer> findb(String sql) throws SQLException {
        ArrayList<Integer> list = new ArrayList<>();
        int a=0;
        int b=0;
        int c=0;
        int d=0;
        // try-with-resources closes the connection, statement and result set
        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()){
                String aa=rs.getString("st");
                if (aa != null){
                    if (aa.equals("0:00~8:00"))a++;
                    if (aa.equals("11:30~13:30"))b++;
                    if (aa.equals("13:30~17:30"))c++;
                    if (aa.equals("17:30~24:00"))d++;
                }
            }
        }
        list.add(a);
        list.add(b);
        list.add(c);
        list.add(d);
        return list;
    }

    // Returns the four time-slot counts as JSON for the second pie chart.
    @RequestMapping("/tb_b1")
    public @ResponseBody List<Integer> tb_b() throws SQLException {
        String sql ="select * from tb_b";
        ArrayList<Integer> integers = findb(sql);
        for (Integer integer : integers) {
            System.out.println(integer);
        }
        return integers;
    }

    // Counts the rows by the "quanters" column (season), one counter per season.
    public ArrayList<Integer> findc(String sql) throws SQLException {
        ArrayList<Integer> list = new ArrayList<>();
        int a=0;
        int b=0;
        int c=0;
        int d=0;
        // try-with-resources closes the connection, statement and result set
        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()){
                String aa=rs.getString("quanters");
                if (aa != null){
                    if (aa.equals("spring"))a++;
                    if (aa.equals("summer"))b++;
                    if (aa.equals("autumn"))c++;
                    if (aa.equals("winter"))d++;
                }
            }
        }
        list.add(a);
        list.add(b);
        list.add(c);
        list.add(d);
        return list;
    }

    // Returns the four season counts as JSON for the third pie chart.
    @RequestMapping("/tb_c1")
    public @ResponseBody List<Integer> tb_c() throws SQLException {
        String sql ="select * from tb_c";
        ArrayList<Integer> integers = findc(sql);
        for (Integer integer : integers) {
            System.out.println(integer);
        }
        return integers;
    }
}
<%--
  Created by IntelliJ IDEA.
  User: 86183
  Date: 2024/5/23
  Time: 13:38
  To change this template use File | Settings | File Templates.
--%>
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<script src="../js/vue.js"></script>
<script src="../plugins/elementui/index.js"></script>
<script type="text/javascript" src="../js/jquery.min.js"></script>
<script src="../js/axios-0.18.0.js"></script>
<script src="../echarts-5.5.0/dist/echarts.min.js"></script>
<html>
<head>
    <title>Huawei P30 Phone Review Profile Analysis</title>
</head>
<body>
  <div id="app">

      <%--<template>--%>
          <%--<button @click="tb_a()">tb_a图</button>--%>
      <%--</template>--%>

  </div>
  <div id="b1" style="width: 600px;height:400px;"></div>
  <div id="b2" style="width: 600px;height:400px;"></div>
  <div id="b3" style="width: 600px;height:400px;"></div>
  <div id="b4" style="width: 600px;height:400px;"></div>

</body>
<script>
    var vue = new Vue({
        el:'#app',
        data:{
          tb_alist:null,
          tb_blist:null,
          tb_clist:null,
          tb_dlist:null
        },
        methods:{
            tb_a(){
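                $.ajax( {
                    url:'/tb_a1',        // request URL
                    async: false,            // synchronous request, so the data is ready before the chart is drawn
                    type:'post',            // request method
                    data: {

                    },
                    dataType : "json", // one of "xml", "html", "script", "json", "jsonp", "text"
                    success : function(data) {
                        tb_alist=data; // kept in a global variable that the chart option reads
                    },
                    error : function() {
                        alert("Request failed!");

                    }
                });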
                var myCharts = echarts.init(document.querySelector('#b1'))
                option = {
                    title: {
                        text: 'Reviews by day type',
                        subtext: 'Distribution of reviews across working days and weekends',
                        left: 'center'
                    },
                    tooltip: {
                        trigger: 'item'
                    },
                    legend: {
                        orient: 'vertical',
                        left: 'left'
                    },
                    series: [{
                        name: '评论数量',
                        type: 'pie',
                        radius: '50%',
                        data: [{
                                value: tb_alist[0],
                                name: 'Reviews on working days'
                            },
                            {
                                value: tb_alist[1],
                                name: 'Reviews on weekends'
                            }],
                        center: ['50%', '50%'], // position of the pie; these values center it
                        label: {
                            normal: {
                                show: true,
                                position: 'inner', // show the value inside the slice
                                formatter: '{d}%', // print the value as a percentage
                            },
                        },
                        emphasis: {
                            itemStyle: {
                                shadowBlur: 10,
                                shadowOffsetX: 0,
                                shadowColor: 'rgba(0, 0, 0, 0.5)'
                            }
                        }
                    }]
                };
                myCharts.setOption(option)
            },

            tb_b(){
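                $.ajax( {
                    url:'/tb_b1',        // request URL
                    async: false,            // synchronous request, so the data is ready before the chart is drawn
                    type:'post',            // request method
                    data: {

                    },
                    dataType : "json", // one of "xml", "html", "script", "json", "jsonp", "text"
                    success : function(data) {
                        tb_blist=data; // kept in a global variable that the chart option reads
                    },
                    error : function() {
                        alert("Request failed!");

                    }
                });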
                var myCharts = echarts.init(document.querySelector('#b2'))
                option = {
                    title: {
                        text: 'Reviews by time of day',
                        subtext: 'Distribution of reviews across the time slots of a day',
                        left: 'center'
                    },
                    tooltip: {
                        trigger: 'item'
                    },
                    legend: {
                        orient: 'vertical',
                        left: 'left'
                    },
                    series: [{
                        name: '评论数量',
                        type: 'pie',
                        radius: '50%',
                        data: [{
                                value: tb_blist[0],
                                name: 'Reviews 0:00~8:00'
                            },
                            {
                                value: tb_blist[1],
                                name: 'Reviews 11:30~13:30'
                            },
                            {
                                value: tb_blist[2],
                                name: 'Reviews 13:30~17:30'
                            },{
                                value: tb_blist[3],
                                name: 'Reviews 17:30~24:00'
                            }],
                        center: ['50%', '50%'], // position of the pie; these values center it
                        label: {
                            normal: {
                                show: true,
                                position: 'inner', // show the value inside the slice
                                formatter: '{d}%', // print the value as a percentage
                            },
                        },
                        emphasis: {
                            itemStyle: {
                                shadowBlur: 10,
                                shadowOffsetX: 0,
                                shadowColor: 'rgba(0, 0, 0, 0.5)'
                            }
                        }
                    }]
                };
                myCharts.setOption(option)
            },
            tb_c(){
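                $.ajax( {
                    url:'/tb_c1',        // request URL
                    async: false,            // synchronous request, so the data is ready before the chart is drawn
                    type:'post',            // request method
                    data: {

                    },
                    dataType : "json", // one of "xml", "html", "script", "json", "jsonp", "text"
                    success : function(data) {
                        tb_clist=data; // kept in a global variable that the chart option reads
                    },
                    error : function() {
                        alert("Request failed!");

                    }
                });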
                var myCharts = echarts.init(document.querySelector('#b3'))
                option = {
                    title: {
                        text: 'Reviews by season',
                        subtext: 'Distribution of reviews across the seasons of a year',
                        left: 'center'
                    },
                    tooltip: {
                        trigger: 'item'
                    },
                    legend: {
                        orient: 'vertical',
                        left: 'left'
                    },
                    series: [{
                        name: '评论数量',
                        type: 'pie',
                        radius: '50%',
                        data: [{
                                value: tb_clist[0],
                                name: 'Reviews in spring'
                            },
                            {
                                value: tb_clist[1],
                                name: 'Reviews in summer'
                            },
                            {
                                value: tb_clist[2],
                                name: 'Reviews in autumn'
                            },{
                                value: tb_clist[3],
                                name: 'Reviews in winter'
                            }],
                        center: ['50%', '50%'], // position of the pie; these values center it
                        label: {
                            normal: {
                                show: true,
                                position: 'inner', // show the value inside the slice
                                formatter: '{d}%', // print the value as a percentage
                            },
                        },
                        emphasis: {
                            itemStyle: {
                                shadowBlur: 10,
                                shadowOffsetX: 0,
                                shadowColor: 'rgba(0, 0, 0, 0.5)'
                            }
                        }
                    }]
                };
                myCharts.setOption(option)
            },
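            // Note: tb_d() is not called from created(), the controller shown in this
            // article has no /tb_d1 mapping, and the chart option still reads tb_blist.
            tb_d(){
                $.ajax( {
                    url:'/tb_d1',        // request URL
                    async: false,            // synchronous request, so the data is ready before the chart is drawn
                    type:'post',            // request method
                    data: {

                    },
                    dataType : "json", // one of "xml", "html", "script", "json", "jsonp", "text"
                    success : function(data) {
                        tb_dlist=data; // kept in a global variable
                    },
                    error : function() {
                        alert("Request failed!");

                    }
                });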
                var myCharts = echarts.init(document.querySelector('#b4'))
                option = {
                    title: {
                        text: 'Reviews by time of day',
                        subtext: 'Distribution of reviews across the time slots of a day',
                        left: 'center'
                    },
                    tooltip: {
                        trigger: 'item'
                    },
                    legend: {
                        orient: 'vertical',
                        left: 'left'
                    },
                    series: [{
                        name: '评论数量',
                        type: 'pie',
                        radius: '50%',
                        data: [{
                                value: tb_blist[0],
                                name: 'Reviews 0:00~8:00'
                            },
                            {
                                value: tb_blist[1],
                                name: 'Reviews 11:30~13:30'
                            },
                            {
                                value: tb_blist[2],
                                name: 'Reviews 13:30~17:30'
                            },{
                                value: tb_blist[3],
                                name: 'Reviews 17:30~24:00'
                            }],
                        center: ['50%', '50%'], // position of the pie; these values center it
                        label: {
                            normal: {
                                show: true,
                                position: 'inner', // show the value inside the slice
                                formatter: '{d}%', // print the value as a percentage
                            },
                        },
                        emphasis: {
                            itemStyle: {
                                shadowBlur: 10,
                                shadowOffsetX: 0,
                                shadowColor: 'rgba(0, 0, 0, 0.5)'
                            }
                        }
                    }]
                };
                myCharts.setOption(option)
            }
        },
        created(){
            this.tb_a();
            this.tb_b();
            this.tb_c();
        }

    })

</script>
</html>
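
To make the data flow concrete: each endpoint returns a plain list of counts, and the pie chart's `{d}` formatter shows each count as a percentage of the total. The sketch below reproduces both steps on made-up rows. It is only an illustration in Python, not part of the project; the function names `tally` and `percentages` and the sample rows are hypothetical.

```python
from collections import Counter

def tally(values, labels):
    """Count how often each label occurs; None and unknown values are ignored."""
    counts = Counter(v for v in values if v is not None)
    return [counts[label] for label in labels]

def percentages(counts):
    """Share of each count in the total, in percent, rounded to two decimals."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [round(100 * c / total, 2) for c in counts]

# Made-up sample of the "weekday" column, including one NULL value
rows = ["Working days", "weekend", "Working days", None, "Working days"]
counts = tally(rows, ["Working days", "weekend"])
shares = percentages(counts)
print(counts, shares)
```

The order of the labels passed to `tally` matters for the same reason the order of `list.add(...)` calls matters in the controller: the page reads the counts by index.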

3. Final result

233b494c5287408dbe6d31754c9b32ea.jpeg

Tags: hadoop

This article is reposted from: https://blog.csdn.net/yh209/article/details/139149655
Copyright belongs to the original author, 云海cv. In case of infringement, please contact us for removal.
