Hadoop大数据应用:HDFS 集群节点扩容

news/2024/5/20 2:57:45 标签: hadoop, 大数据, hdfs, linux

目录

 一、实验

1.环境

2.HDFS 集群节点扩容

二、问题

1.rsync 同步报错


 一、实验

1.环境

(1)主机

表1  主机

主机架构软件版本IP备注
hadoop

NameNode (已部署)

SecondaryNameNode (已部署)

ResourceManager(已部署)

hadoop

2.7.7192.168.204.50

node01

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.51
node02

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.52
node03

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.53
node04

DataNode

hadoop

2.7.7192.168.204.54

(2)查看jps

hadoop节点

[root@hadoop hadoop]# jps

node01节点

node02节点

node03节点

(3) 查看节点

[root@hadoop hadoop]# ./bin/yarn node -list
24/03/14 13:40:21 INFO client.RMProxy: Connecting to ResourceManager at hadoop/192.168.204.50:8032
Total Nodes:3
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
    node01:40551                RUNNING       node01:8042                                  0
    node02:46073                RUNNING       node02:8042                                  0
    node03:40601                RUNNING       node03:8042                                  0

2.HDFS 集群节点扩容

(1)查看IP

地址为192.168.204.54

[root@localhost ~]# ip addr

 (2)安全机制

查看

[root@localhost ~]# sestatus

关闭

[root@localhost ~]# vim /etc/selinux/config
……
SELINUX=disabled
……

再次查看(需要reboot重启)

[root@localhost ~]# sestatus

(3)防火墙

关闭

[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl mask firewalld

(4)安装java

[root@localhost ~]# yum install -y java-1.8.0-openjdk-devel.x86_64

查看

[root@localhost ~]# jps

 (5)修改主机名

[root@localhost ~]# hostnamectl set-hostname node04
[root@localhost ~]# bash

(6)添加免密登录

[root@hadoop ~]# cd /root/.ssh/
[root@hadoop .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[root@hadoop .ssh]# ssh-copy-id -i id_rsa.pub 192.168.204.54

验证:

[root@hadoop .ssh]# ssh 192.168.204.54

 (7)域名主机名(hadoop节点)

[root@hadoop ~]# vim /etc/hosts
……
192.168.205.50 hadoop
192.168.205.51 node01
192.168.205.52 node02
192.168.205.53 node03
192.168.204.54 node04

(8)同步域名配置文件

[root@hadoop ~]# rsync -av /etc/hosts node01:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  266.67 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node02:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  800.00 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node03:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  800.00 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node04:/etc/
Warning: Permanently added 'node04' (ECDSA) to the list of known hosts.
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  266.67 bytes/sec
total size is 269  speedup is 0.67

(9)同步Hadoop文件

[root@hadoop ~]# rsync -aXSH --delete /usr/local/hadoop node04:/usr/local/

(10) 清除日志(node04节点)

[root@node04 ~]# cd /usr/local/hadoop/
[root@node04 hadoop]# ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[root@node04 hadoop]# cd logs/
[root@node04 logs]# ls
hadoop-root-namenode-hadoop.log    hadoop-root-secondarynamenode-hadoop.log    SecurityAuth-root.audit
hadoop-root-namenode-hadoop.out    hadoop-root-secondarynamenode-hadoop.out    yarn-root-resourcemanager-hadoop.log
hadoop-root-namenode-hadoop.out.1  hadoop-root-secondarynamenode-hadoop.out.1  yarn-root-resourcemanager-hadoop.out
[root@node04 logs]# rm -f *
[root@node04 logs]# ls

(11)查看slaves  (hadoop节点)

[root@hadoop ~]# cd /usr/local/hadoop/etc/hadoop/
[root@hadoop hadoop]# cat slaves

(12)添加slaves

 [root@hadoop hadoop]# vim slaves
  node01
  node02
  node03
  node04

(13)同步配置到所有主机

[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node01:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node02:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node03:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node04:/usr/local/hadoop/

(14)启动服务 (node04节点)

[root@node04 hadoop]# ./sbin/hadoop-daemon.sh start datanode

查看jps

(15) 验证 (hadoop节点)

查看报告,Live datanodes 显示节点为4个。

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -report
Configured Capacity: 822126559232 (765.67 GB)
Present Capacity: 798787727360 (743.93 GB)
DFS Remaining: 798786990080 (743.93 GB)
DFS Used: 737280 (720 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.204.54:50010 (node04)
Hostname: node04
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 5658746880 (5.27 GB)
DFS Remaining: 199872888832 (186.15 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:23 CST 2024


Name: 192.168.204.53:50010 (node03)
Hostname: node03
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 266240 (260 KB)
Non DFS Used: 5621547008 (5.24 GB)
DFS Remaining: 199909826560 (186.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.26%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:24 CST 2024


Name: 192.168.204.51:50010 (node01)
Hostname: node01
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 180224 (176 KB)
Non DFS Used: 6029209600 (5.62 GB)
DFS Remaining: 199502249984 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:22 CST 2024


Name: 192.168.204.52:50010 (node02)
Hostname: node02
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 286720 (280 KB)
Non DFS Used: 6029328384 (5.62 GB)
DFS Remaining: 199502024704 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:25 CST 2024

(16)查看命令

设置带宽命令为 -setBalancerBandwidth

[root@hadoop hadoop]# ./bin/hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
        [-report [-live] [-dead] [-decommissioning]]
        [-safemode <enter | leave | get | wait>]
        [-saveNamespace]
        [-rollEdits]
        [-restoreFailedStorage true|false|check]
        [-refreshNodes]
        [-setQuota <quota> <dirname>...<dirname>]
        [-clrQuota <dirname>...<dirname>]
        [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
        [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
        [-finalizeUpgrade]
        [-rollingUpgrade [<query|prepare|finalize>]]
        [-refreshServiceAcl]
        [-refreshUserToGroupsMappings]
        [-refreshSuperUserGroupsConfiguration]
        [-refreshCallQueue]
        [-refresh <host:ipc_port> <key> [arg1..argn]
        [-reconfig <datanode|...> <host:ipc_port> <start|status>]
        [-printTopology]
        [-refreshNamenodes datanode_host:ipc_port]
        [-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
        [-setBalancerBandwidth <bandwidth in bytes per second>]
        [-fetchImage <local directory>]
        [-allowSnapshot <snapshotDir>]
        [-disallowSnapshot <snapshotDir>]
        [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
        [-getDatanodeInfo <datanode_host:ipc_port>]
        [-metasave filename]
        [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
        [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

(17)设置带宽平衡数据

000为KB,000000为MB,

500+000000 为500MB

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -setBalancerBandwidth 500000000

执行脚本

[root@hadoop hadoop]# ./sbin/start-balancer.sh

(18)查看状态

DFS Used 为使用情况

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -report
Configured Capacity: 822126559232 (765.67 GB)
Present Capacity: 798788423680 (743.93 GB)
DFS Remaining: 798787682304 (743.93 GB)
DFS Used: 741376 (724 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.204.54:50010 (node04)
Hostname: node04
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 5658730496 (5.27 GB)
DFS Remaining: 199872901120 (186.15 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:33 CST 2024


Name: 192.168.204.53:50010 (node03)
Hostname: node03
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 266240 (260 KB)
Non DFS Used: 5620936704 (5.23 GB)
DFS Remaining: 199910436864 (186.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:33 CST 2024


Name: 192.168.204.51:50010 (node01)
Hostname: node01
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 180224 (176 KB)
Non DFS Used: 6029176832 (5.62 GB)
DFS Remaining: 199502282752 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:34 CST 2024


Name: 192.168.204.52:50010 (node02)
Hostname: node02
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 286720 (280 KB)
Non DFS Used: 6029291520 (5.62 GB)
DFS Remaining: 199502061568 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:34 CST 2024

二、问题

1.rsync 同步报错

(1)报错

(2)原因分析

同步主机名称错误。

(3)解决方法

修改同步主机名称。

[root@hadoop ~]# rsync -av /etc/hosts node01:/etc/


http://www.niftyadmin.cn/n/5430477.html

相关文章

空间计算综合指南

空间计算&#xff08;spatial computing&#xff09;是指使人类能够在三维空间中与计算机交互的一组技术。 该保护伞下的技术包括增强现实&#xff08;AR&#xff09;和虚拟现实&#xff08;VR&#xff09;。 这本综合指南将介绍有关空间计算所需了解的一切。 你将了解 AR、VR…

安装linux_centos7虚拟机_开启网络_ssh_防火墙

文章目录 安装linux_centos7虚拟机_开启网络_ssh_防火墙安装centos7虚拟机1. 进入VMware --> 点击文件 --> 新建虚拟机2. 选择典型 --> 选择下一步3. 选择--> 稍后安装操作系统4. 选择--> Linux --> CentOS 7 64位5. 在虚拟机名称输入(虚拟机名) --> 选择…

面试官:简单讲一下Spring Boot事务的使用

该文章专注于面试,面试只要回答关键点即可,不需要对框架有非常深入的回答,如果你想应付面试,是足够了,抓住关键点 面试官:简单讲一下SpringBoot事务的使用 在Spring Boot中使用事务可以确保一系列操作要么全部成功要么全部失败,保持数据的一致性。Spring Boot提供了多…

Java八股文(Element Plus)

Java八股文のElement Plus Element Plus Element Plus 什么是Element UI 和 Element Plus&#xff1f; Element UI 和 Element Plus 是基于 Vue.js 的一套非常受欢迎的开源 UI 组件库&#xff0c;用于快速构建具有现代化设计和丰富交互效果的前端界面。 Element UI 和 Element…

服务雪崩,熔断,降级,限流之理解

服务雪崩是现状。 通过限流&#xff0c;熔断&#xff0c;降级等方式可以处理雪崩的问题。 服务限流&#xff0c;主要是为了保护服务的正常运行&#xff0c;大量请求过来&#xff0c;忙不过来&#xff0c;起码服务还能用。 服务熔断&#xff0c;是因为大量请求大多数失败或者…

【详解】JAVA 一个或多个空格分割字符串配详细过程

Java分割字符串的应用 项目中&#xff0c;我们经常会遇到对分割字符串的处理情景&#xff0c;在此记录&#xff0c;同时也能帮助到需要的小伙伴们。 一、问题场景 我们需要在项目中截取获得不同的字符串信息&#xff0c;有时候可能出现被空格分开的情况&#xff0c;例如“Th…

Spring Cloud Gateway基础内容(一)

文章目录 参考文章一、Gateway概述1、工作原理概述2、gateway特点 &#xff08;官方描述&#xff09;3、网关的重要性&#xff08;来自尚硅谷&#xff09;4、基础配置 二、简单实现SpringCloudnacos1、新建Spring项目2、添加基础的配置文件3、添加gateway配置断言规则 三、网关…

栈与队列|232.用栈实现队列

力扣题目链接 class MyQueue { public:stack<int> stIn;stack<int> stOut;/** Initialize your data structure here. */MyQueue() {}/** Push element x to the back of queue. */void push(int x) {stIn.push(x);}/** Removes the element from in front of que…