Environment
Scala - 2.13.8
Spark - 3.5.0
Hadoop - 3.3.6
Installing Hadoop
- Download the matching version of Hadoop from here
- After downloading, unpack it and configure the system environment variables
> sudo vim /etc/profile
Add the following two lines:
export HADOOP_HOME=/Users/collinsliu/hadoop-3.3.6/
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Replace the path with your own install location.
Then reload the file so the variables take effect:
> source /etc/profile
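As a quick sanity check, the hadoop command should now resolve from any directory:
> hadoop version
The first line of output should report Hadoop 3.3.6.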
Installing Spark
- Download the matching version of Spark from here
- After downloading, unpack it and, as with Hadoop, configure the system environment variables
> sudo vim /etc/profile
Add the following two lines:
export SPARK_HOME=/Users/collinsliu/spark-3.5.0
export PATH=$PATH:$SPARK_HOME/bin
Replace the path with your own install location.
Then reload the file so the variables take effect:
> source /etc/profile
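You can confirm the PATH entry works before moving on:
> spark-submit --version
This should print the Spark banner with version 3.5.0.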
- Then configure Spark to connect to Hadoop, forming local mode:
a. First, enter the conf folder
> cd /Users/collinsliu/spark-3.5.0/conf
b. Next, create the config file from its template
> cp spark-env.sh.template spark-env.sh
> vim spark-env.sh
c. Add the following three lines so that Spark can find Java, the Hadoop configuration, and the Hadoop jars
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_311.jdk/Contents/Home
export HADOOP_CONF_DIR=/Users/collinsliu/hadoop-3.3.6/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/Users/collinsliu/hadoop-3.3.6/bin/hadoop classpath)
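The third line executes hadoop classpath each time Spark starts and appends the resulting list of Hadoop jars to Spark's classpath; this is what lets Spark pick up the Hadoop installation. You can preview what it expands to:
> /Users/collinsliu/hadoop-3.3.6/bin/hadoop classpath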
Testing the Installation
1. Test with the built-in example
>cd /Users/collinsliu/spark-3.5.0/
> ./bin/run-example SparkPi
A lot of output scrolls by; near the end, look for:
...
24/02/07 00:31:33 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks resource profile 0
24/02/07 00:31:33 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (192.168.0.100, executor driver, partition 0, PROCESS_LOCAL, 8263 bytes)
24/02/07 00:31:33 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1) (192.168.0.100, executor driver, partition 1, PROCESS_LOCAL, 8263 bytes)
24/02/07 00:31:33 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
24/02/07 00:31:33 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
24/02/07 00:31:34 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1101 bytes result sent to driver
24/02/07 00:31:34 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1101 bytes result sent to driver
24/02/07 00:31:34 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1120 ms on 192.168.0.100 (executor driver) (1/2)
24/02/07 00:31:34 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 923 ms on 192.168.0.100 (executor driver) (2/2)
24/02/07 00:31:34 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
24/02/07 00:31:34 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 1.737 s
24/02/07 00:31:34 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
24/02/07 00:31:34 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
24/02/07 00:31:34 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.807145 s
Pi is roughly 3.1405357026785135
This means the installation succeeded.
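The SparkPi example also accepts an optional argument for the number of partitions to sample over, which spreads the work across more tasks, e.g.:
> ./bin/run-example SparkPi 100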
2. Open spark-shell
> spark-shell
The following appears:
24/02/07 00:48:12 WARN Utils: Your hostname, Collinss-MacBook-Air.local resolves to a loopback address: 127.0.0.1; using 192.168.0.100 instead (on interface en0)
24/02/07 00:48:12 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.5.0
      /_/
Using Scala version 2.13.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_311)
Type in expressions to have them evaluated.
Type :help for more information.
24/02/07 00:48:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.0.100:4040
Spark context available as 'sc' (master = local[*], app id = local-1707238103536).
Spark session available as 'spark'.
scala>
This means the installation succeeded.
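As a final check, the sc handle from the startup message can run a small job right in the REPL (a minimal sketch; any small computation works): distribute a range across the local cores and sum it with a reduce.
scala> sc.parallelize(1 to 100).reduce(_ + _)
res0: Int = 5050
scala> spark.range(5).count()
res1: Long = 5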
Reposted from: https://blog.csdn.net/weixin_41429931/article/details/136064197
Copyright belongs to the original author, SparklingTheo. If there is any infringement, please contact us for removal.