Python COVID-19 Data Visualization (Web Scraping + Data Visualization) (Jupyter Environment)

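1 Project Background

In late 2019, an outbreak of pneumonia (COVID-19) began and went on to spread worldwide; it was later confirmed to be caused by a novel coronavirus (SARS-CoV-2).

2 Project Goals

Working from publicly available data that we scraped, we built a few visualizations in the hope of helping readers better understand how the epidemic is developing, and of strengthening confidence that we can overcome the virus together.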

3 Project Analysis

3.1 Data Acquisition

3.1.1 Analyzing the Website

First, find the page that holds the data we want to scrape:

https://news.qq.com/zt2020/page/feiyan.htm#/

3.1.2 Finding the URL That Serves the Data

Open the following URL to inspect the data:

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'

3.1.3 Fetching the Data

Fetch the JSON data with a scraper:

    import requests

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url, verify=False)
    json_data = response.json()['data']
    china_data = json_data['diseaseh5Shelf']['areaTree'][0]['children']  # a list
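The request above assumes that the endpoint answers promptly and that the JSON always has the expected layout. A minimal sketch of a more defensive version follows. The helper names `fetch_payload` and `extract_china_children` and the 10-second timeout are assumptions made for illustration, not part of the original project. Unlike the original call, this sketch leaves certificate verification at its default (enabled). `requests` is imported inside the function so that the extraction helper can be tried offline on a hand-written payload.

```python
API_URL = (
    "https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list"
    "?modules=statisGradeCityDetail,diseaseh5Shelf"
)


def fetch_payload(url=API_URL, timeout=10):
    """Download the JSON document and return its 'data' member."""
    import requests  # third-party; imported lazily so the helper below works offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()["data"]


def extract_china_children(data):
    """Return the list of per-province records, or raise a clear error."""
    try:
        children = data["diseaseh5Shelf"]["areaTree"][0]["children"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("unexpected response layout") from exc
    if not isinstance(children, list):
        raise ValueError("'children' is not a list")
    return children
```

With these helpers, `china_data = extract_china_children(fetch_payload())` would stand in for the last three lines of the original snippet.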

3.1.4 Parsing the Data

A for loop reads the values from each item in the list and stores them in a dictionary, which is then appended to data_set:

    data_set = []
    for i in china_data:
        data_dict = {}
        # Region name
        data_dict['province'] = i['name']
        # Currently confirmed cases
        data_dict['nowConfirm'] = i['total']['nowConfirm']
        # Deaths
        data_dict['dead'] = i['total']['dead']
        # Recoveries
        data_dict['heal'] = i['total']['heal']
        data_set.append(data_dict)
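The loop above can also be wrapped in a small function, which makes it easy to try on a hand-written record before pointing it at live data. The sketch below is one way to do that, assuming each record has the `name` and `total` keys used in the loop. The function name `flatten_province`, the default of `0` for a missing count, and the sample records are invented for illustration; the numbers are not real figures.

```python
def flatten_province(record):
    """Turn one nested province record into a flat row."""
    total = record.get("total", {})
    return {
        "province": record["name"],
        "nowConfirm": total.get("nowConfirm", 0),  # currently confirmed cases
        "dead": total.get("dead", 0),              # deaths
        "heal": total.get("heal", 0),              # recoveries
    }


# Made-up sample in the same shape as the API records (not real data).
sample = [
    {"name": "Province A", "total": {"nowConfirm": 12, "dead": 1, "heal": 30}},
    {"name": "Province B", "total": {"nowConfirm": 5, "heal": 8}},  # 'dead' missing
]
data_set = [flatten_province(r) for r in sample]
print(data_set)
```

Whether a missing count should become `0` or raise an error is a design choice; silently defaulting keeps the pipeline running but can hide a change in the API.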

3.1.5 Saving the Data

    import pandas as pd

    df = pd.DataFrame(data_set)
    df.to_csv('yiqing_data.csv')
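By default, `df.to_csv('yiqing_data.csv')` also writes the DataFrame index as an unnamed first column. To show what a file with only the four data columns looks like, here is a rough standard-library equivalent. It is a sketch, not the article's method: the helper names `save_rows` and `load_rows`, the choice of `utf-8-sig` (UTF-8 with a byte-order mark, which helps some spreadsheet programs detect the encoding), and the placeholder row are all assumptions.

```python
import csv
import os
import tempfile

FIELDS = ["province", "nowConfirm", "dead", "heal"]


def save_rows(rows, path):
    """Write a list of row dictionaries to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read the file back; the csv module returns every value as a string."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


# Placeholder row, not real data.
rows = [{"province": "Province A", "nowConfirm": 12, "dead": 1, "heal": 30}]
path = os.path.join(tempfile.mkdtemp(), "yiqing_data.csv")
save_rows(rows, path)
loaded = load_rows(path)
```

In pandas, `df.to_csv('yiqing_data.csv', index=False, encoding='utf-8-sig')` has a similar effect.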

3.2 Data Visualization

3.2.1 Reading the Data

    # Top nine regions by currently confirmed cases
    df2 = df.sort_values(by=['nowConfirm'], ascending=False)[:9]
    df2

3.2.2 Bar Chart of Deaths and Recoveries by Region

    from pyecharts import options as opts
    from pyecharts.charts import Bar

    bar = (
        Bar()
        .add_xaxis(list(df['province'].values)[:6])
        .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
        .add_yaxis("Recovered", df['heal'].values.tolist()[:6])
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Deaths and Recoveries by Region"),
            datazoom_opts=[opts.DataZoomOpts()],
        )
    )
    bar.render_notebook()
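The chart above plots the first six rows of `df` in whatever order the API returned them. If the intent is instead to show the six regions with the highest value in some column, the rows can be ranked first. A minimal sketch in plain Python follows; `top_n` is a hypothetical helper and the rows are placeholders, not real figures.

```python
def top_n(rows, key, n):
    """Return the n rows with the largest value under `key`, largest first."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:n]


# Placeholder rows in the same shape as the parsed data.
rows = [
    {"province": "A", "dead": 3, "heal": 40},
    {"province": "B", "dead": 9, "heal": 15},
    {"province": "C", "dead": 1, "heal": 70},
]
top = top_n(rows, "dead", 2)

# Three parallel lists: one for the x axis and one per y-axis series.
x_labels = [row["province"] for row in top]
dead_values = [row["dead"] for row in top]
heal_values = [row["heal"] for row in top]
```

With the DataFrame from this article, the equivalent would be `df.sort_values(by='dead', ascending=False)[:6]`, the same pattern used to build `df2`.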

3.2.3 Map of Currently Confirmed Cases by Region

    from pyecharts import options as opts
    from pyecharts.charts import Map

    china_map = (
        Map()
        .add("Currently confirmed", [list(i) for i in zip(df['province'].values.tolist(), df['nowConfirm'].values.tolist())], "china")
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Confirmed Cases by Region"),
            visualmap_opts=opts.VisualMapOpts(max_=600, is_piecewise=True),
        )
    )
    china_map.render_notebook()
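`max_=600` fixes the upper end of the colour scale, so any value above 600 lies outside the configured range. One way to avoid a hard-coded limit is to derive it from the data. The helper `visual_max` below is a hypothetical sketch that rounds the largest value up to a multiple of a step; the step of 100 is an arbitrary choice.

```python
import math


def visual_max(values, step=100):
    """Round the largest value up to the next multiple of `step` (at least `step`)."""
    largest = max(values, default=0)
    return max(step, math.ceil(largest / step) * step)
```

The result could then be passed to the chart as `opts.VisualMapOpts(max_=visual_max(df['nowConfirm'].tolist()), is_piecewise=True)`.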

3.2.4 Donut Chart of Currently Confirmed Cases by Region

    from pyecharts import options as opts
    from pyecharts.charts import Pie

    pie = (
        Pie()
        .add(
            "",
            [list(i) for i in zip(df2['province'].values.tolist(), df2['nowConfirm'].values.tolist())],
            radius=["10%", "30%"],
        )
        .set_global_opts(
            legend_opts=opts.LegendOpts(orient="vertical", pos_top="70%", pos_left="70%"),
        )
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    )
    pie.render_notebook()
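The label formatter `{b}: {c}` prints each slice's name and raw value. To see what share of the plotted total each region holds, percentages can be computed from the same name/value pairs. The sketch below uses a hypothetical helper, `with_share`, and placeholder data.

```python
def with_share(pairs):
    """Given [name, value] pairs, return (name, value, percent of total) tuples."""
    total = sum(value for _, value in pairs)
    if total == 0:
        # Avoid dividing by zero when every value is 0.
        return [(name, value, 0.0) for name, value in pairs]
    return [(name, value, round(100 * value / total, 1)) for name, value in pairs]


# Placeholder pairs in the same shape as the list passed to Pie().add().
pairs = [["A", 30], ["B", 50], ["C", 20]]
shares = with_share(pairs)
```

ECharts can also compute this itself: in a pie label formatter the `{d}` placeholder stands for the percentage, for example `formatter="{b}: {d}%"`.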

3.2.5 Line Chart of Deaths and Recoveries by Region

    from pyecharts import options as opts
    from pyecharts.charts import Line

    line = (
        Line()
        .add_xaxis(list(df['province'].values))
        .add_yaxis("Recovered", df['heal'].values.tolist())
        .add_yaxis("Deaths", df['dead'].values.tolist())
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Deaths and Recoveries"),
        )
    )
    line.render_notebook()

Full source code:

    import requests  # module for sending HTTP requests
    import json
    import pprint  # pretty-printing module
    import pandas as pd  # widely used data analysis library
    from pyecharts import options as opts
    from pyecharts.charts import Bar, Line, Pie, Map, Grid
    import urllib3
    from pyecharts.globals import CurrentConfig, NotebookType

    # Configure the notebook environment type
    CurrentConfig.NOTEBOOK_TYPE = NotebookType.JUPYTER_NOTEBOOK
    CurrentConfig.ONLINE_HOST = 'https://assets.pyecharts.org/assets/'
    # Suppress "InsecureRequestWarning: Unverified HTTPS request is being made
    # to host 'api.inews.qq.com'", which is triggered by verify=False
    urllib3.disable_warnings()

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url, verify=False)
    json_data = response.json()['data']
    china_data = json_data['diseaseh5Shelf']['areaTree'][0]['children']  # a list

    data_set = []
    for i in china_data:
        data_dict = {}
        # Region name
        data_dict['province'] = i['name']
        # Currently confirmed cases
        data_dict['nowConfirm'] = i['total']['nowConfirm']
        # Deaths
        data_dict['dead'] = i['total']['dead']
        # Recoveries
        data_dict['heal'] = i['total']['heal']
        data_set.append(data_dict)

    df = pd.DataFrame(data_set)
    df.to_csv('yiqing_data.csv')
    df2 = df.sort_values(by=['nowConfirm'], ascending=False)[:9]
    df2

    # bar = (
    #     Bar()
    #     .add_xaxis(list(df['province'].values)[:6])
    #     .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
    #     .add_yaxis("Recovered", df['heal'].values.tolist()[:6])
    #     .set_global_opts(
    #         title_opts=opts.TitleOpts(title="Deaths and Recoveries by Region"),
    #         datazoom_opts=[opts.DataZoomOpts()],
    #     )
    # )
    # bar.render_notebook()

    # china_map = (
    #     Map()
    #     .add("Currently confirmed", [list(i) for i in zip(df['province'].values.tolist(), df['nowConfirm'].values.tolist())], "china")
    #     .set_global_opts(
    #         title_opts=opts.TitleOpts(title="Confirmed Cases by Region"),
    #         visualmap_opts=opts.VisualMapOpts(max_=600, is_piecewise=True),
    #     )
    # )
    # china_map.render_notebook()

    # pie = (
    #     Pie()
    #     .add(
    #         "",
    #         [list(i) for i in zip(df2['province'].values.tolist(), df2['nowConfirm'].values.tolist())],
    #         radius=["10%", "30%"],
    #     )
    #     .set_global_opts(
    #         legend_opts=opts.LegendOpts(orient="vertical", pos_top="70%", pos_left="70%"),
    #     )
    #     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # )
    # pie.render_notebook()

    line = (
        Line()
        .add_xaxis(list(df['province'].values))
        .add_yaxis("Recovered", df['heal'].values.tolist())
        .add_yaxis("Deaths", df['dead'].values.tolist())
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Deaths and Recoveries"),
        )
    )
    line.render_notebook()

Tags: python, web scraping, jupyter

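Reposted from: https://blog.csdn.net/weixin_45987577/article/details/124685869
Copyright belongs to the original author, 城南望余雪. If this repost infringes your rights, please contact us and we will remove it.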
