0


hive大作业-餐饮外卖平台数据分析

1、背景

W餐饮外卖平台向广大用户提供网上订餐服务,其市场占有量在近年不断增加。当用户在W平台订餐完成后,平台会引导用户对于品尝过的菜品进行评价打分,最高为5分,最低为1分。

通过用户的评分数据,可以分析外卖平台的受欢迎度、客户的体验度。

数据说明

  • 用户评分数据(mealrating.txt)

属性名称

属性说明

UserID

用户ID

MealID

菜品ID

Rating

评分

ReviewTime

评分的时间戳

Review

评价内容

  • 菜品数据集(meal_list.txt)

2、任务

  • 将用户评分数据和菜品数据导入Hive

  • 根据用户评分数据统计日销量和日用户量

select count(1) from mealrating where ReviewTime between 1496100000 and 1496200000;

2017年五月三十一日日销量

select count(DISTINCT UserID) from mealrating where ReviewTime between 1497100000 and 1497200000;

2017年六月十一号日用户量

  • 统计同时有评分和评分内容的记录

select count(1) from mealrating where Rating is not null and Review is not null;

  • 分析用户的评分分布情况

select

*,

cast(Rating/(sum(Rating) over () ) as DECIMAL (8,2)) as rat_percent

from (

select

Rating ,

count(1) rat_num ,

CAST(sum(Rating)/count(1) as DECIMAL(8,2)) avg_rat

from mealrating group by Rating

) as p ORDER BY rat_percent desc;

可以看到5分占比33%、4分27%、3分20%、2分13%、1分7%

  • 统计10大热销菜品

select name,count(name) as frequency from mealrating join meal_list on mealrating.MealID=meal_list.MealID

group by name order by frequency desc limit 10;

  • 统计10大评分为5的热销菜品

select name,Rating,count(name) as frequency from mealrating join meal_list on mealrating.MealID=meal_list.MealID

where Rating=5 group by name,Rating order by frequency desc limit 10;

  • 统计单日评分超过2次的用户数

select count() from (select ReviewTime,UserID,count() from mealrating group by ReviewTime,UserID having count(*)>2) as tmp;

  • 找出评分次数超过两次的用户中,每个用户评分最高的记录

select UserID,max(Rating) from mealrating group by UserID having UserID in (select UserID from mealrating group by UserID HAVING count(MealID)>2);


本文转载自: https://blog.csdn.net/m0_62004321/article/details/124912140
版权归原作者 听书人123 所有, 如有侵权,请联系我们删除。

“hive大作业-餐饮外卖平台数据分析”的评论:

还没有评论