Introduction to Tez and Its Installation and Configuration


Introduction to Tez

Tez is an execution engine for Hive. Because it avoids persisting intermediate results to disk, it outperforms MapReduce (MR). Tez can combine multiple dependent jobs into a single job, so data is written to HDFS only once and there are fewer intermediate stages, which improves overall computation performance.
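
For example, a query that joins two tables and then aggregates would be compiled into several chained jobs under MR, each persisting its intermediate output to HDFS, while under Tez the same query runs as a single DAG. A minimal illustration (the orders and users tables are hypothetical, for demonstration only):

set hive.execution.engine=tez;    -- switch the current session to the Tez engine
-- Join followed by aggregation: several chained MR jobs under MR, one DAG under Tez
select u.name, count(*) as cnt
from orders o
join users u on o.user_id = u.id
group by u.name;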

Tez Installation Steps

1) Download the installation package to server 66, where Hive is deployed, then extract it.

wget https://mirrors.bfsu.edu.cn/apache/tez/0.10.0/apache-tez-0.10.0-bin.tar.gz
tar -xzvf apache-tez-0.10.0-bin.tar.gz
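
Note that the archive extracts to a directory named apache-tez-0.10.0-bin, while the configuration below references /home/user/apache-tez; assuming that layout, rename (or symlink) the extracted directory first:

mv apache-tez-0.10.0-bin /home/user/apache-tez    # target path assumed from TEZ_HOME in hive-env.sh below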

2) Configure Tez in Hive

2.1) Edit the hive-env.sh file and add the following configuration, then copy the Tez jars into Hive's lib directory.

[user@NewBieMaster conf]$ cat hive-env.sh

......
export TEZ_HOME=/home/user/apache-tez    # Tez extraction directory
export TEZ_JARS=""
for jar in `ls $TEZ_HOME |grep jar`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
export HIVE_AUX_JARS_PATH=/home/user/hadoop-3.2.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar$TEZ_JARS
[user@NewBieSlave1 conf]$ 

[user@NewBieMaster apache-tez]$ cp -ri /home/user/apache-tez/*.jar /home/user/apache-hive/lib/
[user@NewBieMaster apache-tez]$ cp -ri /home/user/apache-tez/lib/*.jar /home/user/apache-hive/lib/

2.2) Edit the hive-default.xml file and change the execution engine to Tez.
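
The relevant property is hive.execution.engine; a minimal sketch of the entry to modify (the file layout of the source environment is assumed):

<property>
        <name>hive.execution.engine</name>
        <value>tez</value>
</property>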

2.3) Create the tez-site.xml file and add the following content.

[user@NewBieSlave1 conf]$ cat tez-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
        <name>tez.lib.uris</name>
        <value>${fs.defaultFS}/tez/apache-tez-0.10.0-bin.tar.gz</value>
</property>
<!-- Whether Tez may use the cluster's Hadoop jars -->
<property>
        <name>tez.use.cluster.hadoop-libs</name>
        <value>true</value>
</property>
<property>
        <name>tez.history.logging.service.class</name>
        <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
</configuration>
[user@NewBieSlave1 conf]$ 
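
Note that ATSHistoryLoggingService publishes DAG history to the YARN Timeline Service, so that service needs to be enabled and running on the cluster. A minimal yarn-site.xml sketch, assuming the Timeline Server runs on the master node (the hostname here is an assumption):

<property>
        <name>yarn.timeline-service.enabled</name>
        <value>true</value>
</property>
<property>
        <name>yarn.timeline-service.hostname</name>
        <value>NewBieMaster</value>
</property>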

3) Upload the Tez package from /home/user/apache-tez to the /tez path on HDFS, then test whether the configuration works.

[user@NewBieMaster conf]$ hadoop fs -mkdir /tez
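
Since tez.lib.uris in tez-site.xml points at the tarball itself, the file placed under /tez should be the original archive rather than the extracted directory; a plausible upload command, assuming the tarball is still in the directory it was downloaded to:

[user@NewBieMaster conf]$ hadoop fs -put apache-tez-0.10.0-bin.tar.gz /tez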

[user@NewBieSlave1 lib]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/user/apache-tez/lib/slf4j-log4j12-1.7.10.jar.bkp!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/apache-tez/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/user/.local/bin:/home/user/bin:/home/user/jdk1.8.0_202/jre/bin:/home/user/jdk1.8.0_202/bin:/home/user/hadoop-3.2.2/bin:/home/user/hadoop-3.2.2/sbin:/home/user/apache-hive/bin:/home/user/apache-hive/sbin:/home/user/apache-zookeeper/bin:/home/user/spark-3.1.2-bin-hadoop3.2/bin:/home/user/spark-3.1.2-bin-hadoop3.2/sbin:/home/user/apache-maven-3.8.1/bin:/home/user/kafka/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/user/apache-tez/lib/slf4j-log4j12-1.7.10.jar.bkp!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/apache-tez/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/apache-hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/apache-hive/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Hive Session ID = 4da8b861-e768-4159-8676-3aa0a241d191

Logging initialized using configuration in jar:file:/home/user/apache-hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = 9945ba2b-bb79-41d5-8f81-204a8344c40f
hive> show databases;
OK
default
Time taken: 1.501 seconds, Fetched: 1 row(s)
hive> create table student( id int, name string)
    > ;
OK
Time taken: 0.916 seconds
hive> select * from student;
OK
Time taken: 1.877 seconds
hive> set hive.tez.container.size=1024;
hive> set hive.tez.java.opts=-Xmx500m;
hive> insert into table student(id,name) values(1,'test');
Query ID = user_20210708122909_fd4262de-bf1f-496d-9470-3ae789364d34
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1625717375401_0007)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 9.94 s     
----------------------------------------------------------------------------------------------
Loading data to table default.student
OK
Time taken: 13.999 seconds
hive> select * from student;
OK
1	test
Time taken: 0.239 seconds, Fetched: 1 row(s)
hive> 
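
The two set commands used above cap the Tez container at 1024 MB with a roughly 500 MB JVM heap, which suits a small test cluster; to make them permanent rather than per-session, the same values can be placed in hive-site.xml (a sketch using the values from the session above):

<property>
        <name>hive.tez.container.size</name>
        <value>1024</value>
</property>
<property>
        <name>hive.tez.java.opts</name>
        <value>-Xmx500m</value>
</property>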

