jstorm integration with hbase and rocketmq

Notes on problems encountered while integrating jstorm with hbase.

Environment

Server environment

jstorm server: CentOS 7.2 64-bit
hbase server: CentOS 7.2 64-bit

IP addresses

hbase server: 192.168.1.180
jstorm server: 192.168.1.186

Software versions

jstorm: 2.2.1 (alternate download link)
jdk: 1.8.0 (alternate download link)
hbase: 1.2.2
rocketmq: 3.5.8
zookeeper (used by jstorm): 3.4.6
zookeeper (hbase uses its bundled instance)

hbase standalone installation

Since the server specs are fairly low, hbase is installed in standalone mode here; a real production deployment should run an hbase cluster.
1. Download the hbase 1.2.2 tarball, upload it to the /opt directory on the server, and extract it with tar zxvf hbase-1.2.2-bin.tar.gz.
2. Add the environment variable HBASE_HOME=/opt/hbase-1.2.2.
3. Edit $HBASE_HOME/conf/hbase-site.xml with the following content:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///opt/hbase-1.2.2/data/hbase</value>
  </property>

  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/hbase-1.2.2/data/zookeeper</value>
  </property>
</configuration>

4. Edit $HBASE_HOME/conf/hbase-env.sh and set the JAVA_HOME variable:

export JAVA_HOME=/opt/jdk1.8.0_101

5. Start the hbase service:

[root@server01 bin]# $HBASE_HOME/bin/start-hbase.sh
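
To confirm the standalone instance actually came up, the bundled hbase shell can be used for a quick check (a minimal sketch; the status and list output will vary with your setup):

[root@server01 bin]# $HBASE_HOME/bin/hbase shell
hbase(main):001:0> status
hbase(main):002:0> list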

6. Disable the firewall. Since this is a LAN-only deployment, the firewall is simply turned off:

[root@txbdserver01 bin]# service iptables stop
Redirecting to /bin/systemctl stop iptables.service
Failed to stop iptables.service: Unit iptables.service not loaded.
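
The output above shows the iptables service is not loaded: on CentOS 7 the default firewall is firewalld, so stopping (and disabling) that service is what actually takes effect here:

[root@server01 bin]# systemctl stop firewalld
[root@server01 bin]# systemctl disable firewalld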

7. Check the zookeeper listening port on the hbase server:

[root@server01 bin]# netstat -an | grep 2181
tcp6 0 0 :::2181 :::* LISTEN
tcp6 0 0 ::1:37474 ::1:2181 ESTABLISHED
tcp6 0 0 192.168.1.180:2181 192.168.1.186:25465 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37466 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37476 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37474 ESTABLISHED
tcp6 0 0 ::1:37468 ::1:2181 ESTABLISHED
tcp6 0 0 ::1:37470 ::1:2181 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37472 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37470 ESTABLISHED
tcp6 0 0 ::1:37472 ::1:2181 ESTABLISHED
tcp6 0 0 ::1:37466 ::1:2181 ESTABLISHED
tcp6 0 0 ::1:37476 ::1:2181 ESTABLISHED
tcp6 0 0 ::1:2181 ::1:37468 ESTABLISHED

At this point, the hbase service is installed. Client applications connect through this zookeeper (port 2181 above) via the hbase client API to perform operations on hbase.
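
As a quick smoke test of that connection path, the sketch below uses the standard hbase 1.2.x client API to write one row. It is independent of the project linked at the end of this article; the table name, column family, row key, and values are made-up placeholders, and the only real settings it relies on are the zookeeper quorum at 192.168.1.180 and client port 2181.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseSmokeTest {
    public static void main(String[] args) throws Exception {
        // point the client at the hbase server's zookeeper
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.1.180");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             // "test_table" / "cf" are placeholders; create them first in the hbase shell
             Table table = connection.getTable(TableName.valueOf("test_table"))) {
            Put put = new Put(Bytes.toBytes("row-1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value-1"));
            table.put(put);
        }
    }
}

If the hostname resolution issue described in the Problems section below is present, this is exactly the kind of call that will hang while waiting for zookeeper.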

zookeeper installation for jstorm

Deploy zookeeper on the jstorm server for jstorm's own use.
1. Download zookeeper-3.4.6.tar.gz, upload it to the /opt directory on the server, and extract it with tar zxvf zookeeper-3.4.6.tar.gz.
2. Add the environment variable ZOOKEEPER_HOME=/opt/zookeeper-3.4.6.
3. Copy the sample configuration with cp $ZOOKEEPER_HOME/conf/zoo_sample.cfg $ZOOKEEPER_HOME/conf/zoo.cfg and edit $ZOOKEEPER_HOME/conf/zoo.cfg. The content used here is the sample default (note that dataDir=/tmp/zookeeper is fine for a test setup, but should point to a persistent directory in production):

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

4. Start the zookeeper service:

[root@server02 bin]# $ZOOKEEPER_HOME/bin/zkServer.sh start

At this point, zookeeper for jstorm is up and running.
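
To confirm it is actually serving, the status subcommand of the same script can be used; for this single-node setup it should report Mode: standalone:

[root@server02 bin]# $ZOOKEEPER_HOME/bin/zkServer.sh status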

jstorm installation

Here zookeeper, nimbus, and supervisor for jstorm are all deployed on a single server.
1. Download jstorm-2.2.1.zip, upload it to the /opt directory on the server, and extract it with unzip jstorm-2.2.1.zip -d /opt/jstorm-2.2.1.
2. Add the environment variables JSTORM_HOME=/opt/jstorm-2.2.1 and PATH=$PATH:$JSTORM_HOME/bin.
3. Edit $JSTORM_HOME/conf/storm.yaml with the following content:

########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "localhost"

storm.zookeeper.root: "/jstorm"

# cluster.name: "default"

#nimbus.host/nimbus.host.start.supervisor is being used by $JSTORM_HOME/bin/start.sh
#it only support IP, please don't set hostname
# For example
# nimbus.host: "10.132.168.10, 10.132.168.45"
#nimbus.host.start.supervisor: false

# %JSTORM_HOME% is the jstorm home directory
storm.local.dir: "/opt/jstorm-2.2.1/data"
# please set absolute path, default path is JSTORM_HOME/logs
# jstorm.log.dir: "absolute path"

# java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib"


# if supervisor.slots.ports is null,
# the port list will be generated by cpu cores and system memory size
# for example,
# there are cpu_num = system_physical_cpu_num/supervisor.slots.port.cpu.weight
# there are mem_num = system_physical_memory_size/(worker.memory.size * supervisor.slots.port.mem.weight)
# The final port number is min(cpu_num, mem_num)
# supervisor.slots.ports.base: 6800
# supervisor.slots.port.cpu.weight: 1.2
# supervisor.slots.port.mem.weight: 0.7
# supervisor.slots.ports: null
# supervisor.slots.ports:
# - 6800
# - 6801
# - 6802
# - 6803

# Default disable user-define classloader
# If there are jar conflict between jstorm and application,
# please enable it
# topology.enable.classloader: false

# enable supervisor use cgroup to make resource isolation
# Before enable it, you should make sure:
# 1. Linux version (>= 2.6.18)
# 2. Have installed cgroup (check the file's existence:/proc/cgroups)
# 3. You should start your supervisor on root
# You can get more about cgroup:
# http://t.cn/8s7nexU
# supervisor.enable.cgroup: false


### Netty will send multiple messages in one batch
### Setting true will improve throughput, but more latency
# storm.messaging.netty.transfer.async.batch: true

### default worker memory size, unit is byte
# worker.memory.size: 2147483648

# Metrics Monitor
# topology.performance.metrics: it is the switch flag for performance
# purpose. When it is disabled, the data of timer and histogram metrics
# will not be collected.
# topology.alimonitor.metrics.post: If it is disable, metrics data
# will only be printed to log. If it is enabled, the metrics data will be
# posted to alimonitor besides printing to log.
# topology.performance.metrics: true
# topology.alimonitor.metrics.post: false

# UI MultiCluster
# Following is an example of multicluster UI configuration
# ui.clusters:
# - {
# name: "jstorm",
# zkRoot: "/jstorm",
# zkServers:
# [ "localhost"],
# zkPort: 2181,
# }

The main change is storm.local.dir; since this is a single-node deployment, the remaining settings can be left at their defaults.

4. Start the nimbus and supervisor services:

# start the nimbus service
[root@server02 bin]# nohup $JSTORM_HOME/bin/jstorm nimbus >/dev/null 2>&1 &
# start the supervisor service
[root@server02 bin]# nohup $JSTORM_HOME/bin/jstorm supervisor >/dev/null 2>&1 &

At this point, jstorm is up and running.
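
A quick way to check that both daemons survived startup is to look at the Java processes and ask the cluster for its topology list (the exact process names shown by jps may differ slightly between jstorm versions):

[root@server02 bin]# jps
[root@server02 bin]# $JSTORM_HOME/bin/jstorm list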

Problems

Bolt connection to zookeeper times out

2017-06-13 09:37:05,582 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x15c8aeb3371001d, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-06-13 09:37:05,582 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: Closed socket connection for client /192.168.1.186:25387 which had sessionid 0x15c8aeb3371001d

After some digging, the cause turned out to be that the hostname of the hbase zookeeper node had no entry in /etc/hosts on the jstorm server.
Adding a hostname-to-IP mapping for the hbase zookeeper host to /etc/hosts on the jstorm server resolved the issue.
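
For reference, the entry added to /etc/hosts on the jstorm server looks like the line below; server01 here stands in for whatever hostname the hbase server actually reports (check with hostname on 192.168.1.180):

192.168.1.180   server01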

Project code download

github address
