Migrating Data from a Single Redis Instance to Redis Cluster: A Hands-On Walkthrough


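    Most applications start out with a single Redis instance. As data volume and traffic grow, that single instance starts to struggle, and it is time to consider Redis Cluster. This article describes one way to migrate all the data from a single instance into a cluster.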

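    Steps

    1) Obtain the AOF persistence file from the original single-instance node D

    2) Prepare three new nodes A, B and C and form a cluster; at this point the cluster is empty

    3) Reassign all slots on nodes B and C to A

    4) scp the AOF file obtained in 1) to A

    5) Restart node A so that it loads all the data into memory

    6) Redistribute the slots on node A evenly to B and C

    7) Prepare A1, B1 and C1 and add them to the cluster as slaves of A, B and C respectively

    8) Verify data integrity and cluster state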


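    Walkthrough

    In our setup, the single node is 10.10.10.118:6379 and holds a little over 5 million keys.

    The cluster consists of 3 masters and 3 slaves. A, B and C first form an empty cluster; A1, B1 and C1 join the cluster only after the data has been distributed.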

                      A 10.10.10.126:7000 -> A1 10.10.10.126:7003

                      B 10.10.10.126:7001 -> B1 10.10.10.126:7004

                      C 10.10.10.126:7002 -> C1 10.10.10.126:7005

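    To manage the cluster we still use the official redis-trib.rb tool. For details on how to use redis-trib.rb, see the earlier article on Cluster in practice.

    1) Persistence file

    Even if both RDB and AOF are enabled on the single instance, the AOF file alone is enough: when both are enabled, Redis loads the AOF file at startup.

    On 10.10.10.118:6379, run the following commands: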

    [root@TEST01 bin]# ./redis-cli 
    127.0.0.1:6379>

    127.0.0.1:6379> dbsize
    (integer) 5160702 
    127.0.0.1:6379> BGREWRITEAOF
    Background append only file rewriting started

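    Then go to the directory configured for the AOF file to pick up the freshly rewritten file. BGREWRITEAOF runs in the background, so wait until the rewrite has finished before copying the file. Note the key count reported by dbsize above (5160702): once all the data has been imported into the cluster, verify that the total matches this number.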
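
    Because the rewrite is asynchronous, it can help to check its status before copying the file. The following is a minimal sketch, not part of the original procedure: it assumes the `aof_rewrite_in_progress` field of `INFO persistence` is used as the signal, and the helper name `aof_rewrite_finished` is made up for illustration.

```shell
# Sketch: decide from `INFO persistence` output whether an AOF rewrite is done.
# Reads the INFO text on stdin; returns 0 when aof_rewrite_in_progress is 0.
# The function name is hypothetical; only the INFO field comes from Redis.
aof_rewrite_finished() {
    # INFO lines captured from redis-cli end in \r\n, so strip \r first.
    tr -d '\r' | grep -q '^aof_rewrite_in_progress:0$'
}

# Possible usage on the source instance (not executed here):
#   until ./redis-cli -p 6379 info persistence | aof_rewrite_finished; do
#       sleep 1
#   done
```

    The helper only inspects text, so it can be tried against saved INFO output without touching a live server.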


    2) Create the cluster

    2.1) Start nodes A, B and C

    /data/apps/redis-cluster/7000/bin/redis-server /data/apps/redis-cluster/7000/redis.conf
    /data/apps/redis-cluster/7001/bin/redis-server /data/apps/redis-cluster/7001/redis.conf
    /data/apps/redis-cluster/7002/bin/redis-server /data/apps/redis-cluster/7002/redis.conf
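
    The article does not show the redis.conf used by each node. As a rough sketch only, a cluster node's config might contain directives like the ones printed below. The values are assumptions (in particular the `dir` path and the timeout), and `print_node_conf` is a hypothetical helper. `appendonly yes` matters for node A, since it has to read an AOF file in step 5.

```shell
# Sketch: print a minimal cluster-node config for one port.
# All values are illustrative assumptions, not the author's actual config.
print_node_conf() {
    local port="$1"
    cat <<EOF
port ${port}
cluster-enabled yes
cluster-config-file nodes-${port}.conf
cluster-node-timeout 5000
appendonly yes
appendfilename "appendonly.aof"
dir /data/apps/redis-cluster/${port}/data
EOF
}

# Possible usage (not executed here):
#   print_node_conf 7000 > /data/apps/redis-cluster/7000/redis.conf
```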

    2.2) Form a cluster from the 3 master nodes

    [root@test1 bin]# ./redis-trib.rb create 10.10.10.126:7000 10.10.10.126:7001 10.10.10.126:7002
    >>> Creating cluster
    >>> Performing hash slots allocation on 3 nodes...
    Using 3 masters:
    10.10.10.126:7000
    10.10.10.126:7001
    10.10.10.126:7002
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:0-5460 (5461 slots) master
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:5461-10922 (5462 slots) master
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:10923-16383 (5461 slots) master
    Can I set the above configuration? (type 'yes' to accept): yes
    >>> Nodes configuration updated
    >>> Assign a different config epoch to each node
    >>> Sending CLUSTER MEET messages to join the cluster
    Waiting for the cluster to join.
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:0-5460 (5461 slots) master
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:5461-10922 (5462 slots) master
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:10923-16383 (5461 slots) master
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    [root@test1 bin]# 

    2.3) Check the cluster state and the slot distribution

    [root@test1 bin]# ./redis-cli -c -p 7000
    127.0.0.1:7000> cluster nodes
    bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001 master - 0 1461378773614 2 connected 5461-10922
    e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002 master - 0 1461378772614 3 connected 10923-16383
    6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000 myself,master - 0 0 1 connected 0-5460
    127.0.0.1:7000> 


    3) Move the slots on B and C to node A

    [root@test1 bin]# ./redis-trib.rb check 10.10.10.126:7000
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:0-5460 (5461 slots) master
       0 additional replica(s)
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:5461-10922 (5462 slots) master
       0 additional replica(s)
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:10923-16383 (5461 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.

    From the cluster state above:

    Node A, 10.10.10.126:7000, has node ID 6a85d385b2720fd463eccaf720dc12f495a1baa3 and owns 5461 slots

    Node B, 10.10.10.126:7001, has node ID bbb2b1b060b440a56d07a16ee7f87f9379767d61 and owns 5462 slots

    Node C, 10.10.10.126:7002, has node ID e7005711bc55315caaecbac2774f3c7d87a13c7a and owns 5461 slots


    Move the 5462 slots on node B to node A:

    ./redis-trib.rb reshard --from bbb2b1b060b440a56d07a16ee7f87f9379767d61 --to 6a85d385b2720fd463eccaf720dc12f495a1baa3 --slots 5462 --yes 10.10.10.126:7000


    Move the 5461 slots on node C to node A:

    ./redis-trib.rb reshard --from e7005711bc55315caaecbac2774f3c7d87a13c7a --to 6a85d385b2720fd463eccaf720dc12f495a1baa3 --slots 5461 --yes 10.10.10.126:7000

    Node A now owns all 16384 slots, and nodes B and C no longer hold any:

    [root@test1 bin]#  ./redis-trib.rb check 10.10.10.126:7000
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:0-16383 (16384 slots) master
       0 additional replica(s)
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots: (0 slots) master
       0 additional replica(s)
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots: (0 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
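
    The same ownership check can be scripted against `cluster nodes` output. This is a minimal sketch that assumes the line layout shown in 2.3), where slot ranges start at the ninth field; `slots_owned` is a hypothetical helper name.

```shell
# Sketch: count the slots owned by one node, given `cluster nodes` text on stdin.
# Assumes slot ranges start at field 9, as in the output shown in this article.
slots_owned() {
    local node_id="$1"
    awk -v id="$node_id" '
        $1 == id {
            total = 0
            for (i = 9; i <= NF; i++) {
                if ($i ~ /^\[/) continue            # skip migrating/importing markers
                n = split($i, r, "-")
                total += (n == 2) ? r[2] - r[1] + 1 : 1   # range or single slot
            }
            print total
            found = 1
        }
        END { if (!found) print 0 }                 # unknown node ID: report 0
    '
}

# Possible usage (not executed here):
#   ./redis-cli -p 7000 cluster nodes | slots_owned 6a85d385b2720fd463eccaf720dc12f495a1baa3
```

    After step 3), the expectation is 16384 for node A and 0 for B and C.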


    4) scp the AOF file from 1) to A

    On the single-instance node, copy appendonly.aof to node A:

    scp -P 22 -rp appendonly.aof root@10.10.10.126:/tmp


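    5) Restart node A to load the AOF file

    Move appendonly.aof into the persistence directory configured for node A, then shut node A down and start it again. Node A must have AOF enabled (appendonly yes); otherwise it will not read the file at startup.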

    [root@test1 data]# redis-cli -c -p 7000 shutdown
    [root@test1 data]# 
    [root@test1 data]# ps -ef|grep redis
    root     11553     1  0 10:26 ?        00:00:08 /data/apps/redis-cluster/7001/bin/redis-server *:7001 [cluster]                        
    root     11557     1  0 10:26 ?        00:00:09 /data/apps/redis-cluster/7002/bin/redis-server *:7002 [cluster]                        
    root     12597 10641  0 11:03 pts/0    00:00:00 grep redis


    Restart node A: /data/apps/redis-cluster/7000/bin/redis-server /data/apps/redis-cluster/7000/redis.conf

    Once the data has been loaded into memory, check the number of keys on node A:

    127.0.0.1:7000> dbsize
    (error) LOADING Redis is loading the dataset in memory   -- the data is still being loaded into memory
    127.0.0.1:7000> dbsize
    (integer) 5160702  -- this count matches the source instance
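
    Instead of retrying dbsize by hand, the wait can be scripted. The sketch below is an assumption-laden illustration: it polls the `loading` field of `INFO persistence`, expects redis-cli on the PATH, and uses the hypothetical helper names `wait_until` and `node_loaded`.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# WAIT_INTERVAL (seconds between attempts) is a made-up knob, default 1.
wait_until() {
    local attempts="$1"
    shift
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        sleep "${WAIT_INTERVAL:-1}"
        i=$(( i + 1 ))
    done
    return 1
}

# Returns 0 once the node on the given port reports loading:0.
node_loaded() {
    redis-cli -p "$1" info persistence | tr -d '\r' | grep -q '^loading:0$'
}

# Possible usage (not executed here):
#   wait_until 600 node_loaded 7000 && redis-cli -p 7000 dbsize
```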


    6) Redistribute the slots on node A evenly to B and C


    Move 5462 slots from node A to node B:

    ./redis-trib.rb reshard --from 6a85d385b2720fd463eccaf720dc12f495a1baa3 --to bbb2b1b060b440a56d07a16ee7f87f9379767d61 --slots 5462 --yes 10.10.10.126:7000


    Move 5461 slots from node A to node C:

    ./redis-trib.rb reshard --from 6a85d385b2720fd463eccaf720dc12f495a1baa3 --to e7005711bc55315caaecbac2774f3c7d87a13c7a --slots 5461 --yes 10.10.10.126:7000


    The slots have all been moved successfully:

    127.0.0.1:7000>  cluster nodes
    6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000 myself,master - 0 0 4 connected 10923-16383
    bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001 master - 0 1461381943938 5 connected 0-5461
    e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002 master - 0 1461381944938 6 connected 5462-10922
    127.0.0.1:7000> 

    Note: in a real migration, take the data volume into account. If it is large, do not move all the slots in one go; move them in batches.
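
    One way to batch the move is to generate several smaller reshard commands. The sketch below only prints the commands (a dry run) so they can be reviewed first; the batch size is an arbitrary assumption and `plan_reshard_batches` is a hypothetical helper, not part of redis-trib.rb.

```shell
# Sketch: print reshard commands that move `total` slots in chunks of `batch`.
# Nothing is executed; the output is meant to be reviewed and then run by hand.
plan_reshard_batches() {
    local from="$1" to="$2" total="$3" batch="$4" endpoint="$5"
    local remaining="$total" n
    if [ "$batch" -le 0 ]; then
        echo "batch size must be positive" >&2
        return 1
    fi
    while [ "$remaining" -gt 0 ]; do
        if [ "$remaining" -lt "$batch" ]; then n="$remaining"; else n="$batch"; fi
        echo "./redis-trib.rb reshard --from $from --to $to --slots $n --yes $endpoint"
        remaining=$(( remaining - n ))
    done
}

# Possible usage: move 5462 slots from A to B in batches of 1000.
#   plan_reshard_batches 6a85d385b2720fd463eccaf720dc12f495a1baa3 \
#       bbb2b1b060b440a56d07a16ee7f87f9379767d61 5462 1000 10.10.10.126:7000
```

    With these numbers the plan is five commands of 1000 slots followed by one of 462.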


    7) Add slave nodes for A, B and C

    7.1) Start node A1 and add it to the cluster as a slave of node A

    Start: /data/apps/redis-cluster/7003/bin/redis-server /data/apps/redis-cluster/7003/redis.conf

    Add A1 to the cluster:

    [root@test1 bin]#  ./redis-trib.rb add-node --slave --master-id 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7003 10.10.10.126:7000
    >>> Adding node 10.10.10.126:7003 to cluster 10.10.10.126:7000
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:10923-16383 (5461 slots) master
       0 additional replica(s)
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:0-5461 (5462 slots) master
       0 additional replica(s)
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:5462-10922 (5461 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    >>> Send CLUSTER MEET to node 10.10.10.126:7003 to make it join the cluster.
    Waiting for the cluster to join.
    >>> Configure node as replica of 10.10.10.126:7000.
    [OK] New node added correctly.
    [root@test1 bin]# 

    Check the cluster node information:

    127.0.0.1:7000> cluster nodes
    afbe63bcf2f3418db48ea9a2749dd0b1bf24f0f3 10.10.10.126:7003 slave 6a85d385b2720fd463eccaf720dc12f495a1baa3 0 1461387526639 4 connected
    6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000 myself,master - 0 0 4 connected 10923-16383
    bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001 master - 0 1461387527640 5 connected 0-5461
    e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002 master - 0 1461387526239 6 connected 5462-10922
    127.0.0.1:7000> 

    7.2) Start node B1 and add it to the cluster as a slave of node B

    Start: /data/apps/redis-cluster/7004/bin/redis-server /data/apps/redis-cluster/7004/redis.conf

    Add B1 to the cluster:

    [root@test1 bin]# ./redis-trib.rb add-node --slave --master-id bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7004 10.10.10.126:7000
    >>> Adding node 10.10.10.126:7004 to cluster 10.10.10.126:7000
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:10923-16383 (5461 slots) master
       1 additional replica(s)
    S: afbe63bcf2f3418db48ea9a2749dd0b1bf24f0f3 10.10.10.126:7003
       slots: (0 slots) slave
       replicates 6a85d385b2720fd463eccaf720dc12f495a1baa3
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:0-5461 (5462 slots) master
       0 additional replica(s)
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:5462-10922 (5461 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    >>> Send CLUSTER MEET to node 10.10.10.126:7004 to make it join the cluster.
    Waiting for the cluster to join.
    >>> Configure node as replica of 10.10.10.126:7001.
    [OK] New node added correctly.
    [root@test1 bin]# 

    Check the cluster node information:

    127.0.0.1:7000> cluster nodes
    afbe63bcf2f3418db48ea9a2749dd0b1bf24f0f3 10.10.10.126:7003 slave 6a85d385b2720fd463eccaf720dc12f495a1baa3 0 1461387826504 4 connected
    6fbc4a2a0239bc876bed4cf854846717d9543477 10.10.10.126:7004 slave bbb2b1b060b440a56d07a16ee7f87f9379767d61 0 1461387827004 5 connected
    e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002 master - 0 1461387826004 6 connected 5462-10922
    6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000 myself,master - 0 0 4 connected 10923-16383
    bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001 master - 0 1461387827504 5 connected 0-5461
    127.0.0.1:7000> 

    7.3) Start node C1 and add it to the cluster as a slave of node C

    Start: /data/apps/redis-cluster/7005/bin/redis-server /data/apps/redis-cluster/7005/redis.conf

    Add C1 to the cluster:

    [root@test1 bin]# ./redis-trib.rb add-node --slave --master-id e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7005 10.10.10.126:7000
    >>> Adding node 10.10.10.126:7005 to cluster 10.10.10.126:7000
    >>> Performing Cluster Check (using node 10.10.10.126:7000)
    M: 6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000
       slots:10923-16383 (5461 slots) master
       1 additional replica(s)
    S: afbe63bcf2f3418db48ea9a2749dd0b1bf24f0f3 10.10.10.126:7003
       slots: (0 slots) slave
       replicates 6a85d385b2720fd463eccaf720dc12f495a1baa3
    S: 6fbc4a2a0239bc876bed4cf854846717d9543477 10.10.10.126:7004
       slots: (0 slots) slave
       replicates bbb2b1b060b440a56d07a16ee7f87f9379767d61
    M: e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002
       slots:5462-10922 (5461 slots) master
       0 additional replica(s)
    M: bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001
       slots:0-5461 (5462 slots) master
       1 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    >>> Send CLUSTER MEET to node 10.10.10.126:7005 to make it join the cluster.
    Waiting for the cluster to join.
    >>> Configure node as replica of 10.10.10.126:7002.
    [OK] New node added correctly.
    [root@test1 bin]# 

    Check the cluster node information:

    127.0.0.1:7000>  cluster nodes
    afbe63bcf2f3418db48ea9a2749dd0b1bf24f0f3 10.10.10.126:7003 slave 6a85d385b2720fd463eccaf720dc12f495a1baa3 0 1461388245893 4 connected
    6fbc4a2a0239bc876bed4cf854846717d9543477 10.10.10.126:7004 slave bbb2b1b060b440a56d07a16ee7f87f9379767d61 0 1461388247693 5 connected
    e7005711bc55315caaecbac2774f3c7d87a13c7a 10.10.10.126:7002 master - 0 1461388246893 6 connected 5462-10922
    6a85d385b2720fd463eccaf720dc12f495a1baa3 10.10.10.126:7000 myself,master - 0 0 4 connected 10923-16383
    bbb2b1b060b440a56d07a16ee7f87f9379767d61 10.10.10.126:7001 master - 0 1461388247893 5 connected 0-5461
    356df8094ad9906911d2ab6313cdc882a495b4eb 10.10.10.126:7005 slave e7005711bc55315caaecbac2774f3c7d87a13c7a 0 1461388246493 6 connected
    127.0.0.1:7000> 
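
    Reading the listing by eye works for six nodes; for larger clusters a small filter can flag masters that still lack a slave. This is a sketch under the assumption that the `cluster nodes` layout matches the output above (flags in field 3, master ID in field 4); `masters_without_replica` is a hypothetical name.

```shell
# Sketch: list the address of every master that no slave replicates.
# Reads `cluster nodes` text on stdin; prints nothing when all masters are covered.
masters_without_replica() {
    awk '
        $3 ~ /master/ { masters[$1] = $2 }      # node ID -> address
        $3 ~ /slave/  { has_replica[$4] = 1 }   # field 4 is the master node ID
        END {
            for (id in masters)
                if (!(id in has_replica)) print masters[id]
        }
    '
}

# Possible usage (not executed here):
#   ./redis-cli -p 7000 cluster nodes | masters_without_replica
```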

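    8) Verify data integrity

    As the output in 7.3) already shows, the cluster now has 3 masters and 3 slaves, the slots are assigned correctly, and the cluster state is OK: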

    127.0.0.1:7000> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:4
    cluster_stats_messages_sent:18863
    cluster_stats_messages_received:18863
    127.0.0.1:7000> 

    To verify the key count, add up the number of keys on nodes A, B and C and compare the total with the count on the single instance.

    [root@test1 redis-cluster]# redis-cli -c -p 7000 dbsize
    (integer) 1720056
    [root@test1 redis-cluster]# redis-cli -c -p 7001 dbsize
    (integer) 1720529
    [root@test1 redis-cluster]# redis-cli -c -p 7002 dbsize
    (integer) 1720117
    [root@test1 redis-cluster]# 

    1720056+1720529+1720117=5160702

    The total number of keys in the cluster matches the count on the single instance exactly.
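
    The addition above can also be done in a script. A minimal sketch, with `sum_keys` as a hypothetical helper that simply adds the per-node counts passed as arguments:

```shell
# Sketch: add up per-node key counts given as arguments.
sum_keys() {
    local total=0 n
    for n in "$@"; do
        total=$(( total + n ))
    done
    echo "$total"
}

# With the counts from this article:
#   sum_keys 1720056 1720529 1720117
# Possible live usage, assuming redis-cli is on the PATH (not executed here):
#   sum_keys $(for p in 7000 7001 7002; do redis-cli -p "$p" dbsize; done)
```

    The result is then compared with the 5160702 keys recorded on the source instance in step 1).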


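    Reminder: when Redis creates a cluster, every node must be empty. This article only offers one possible approach; if you know a better one, please let me know.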
