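Sharding is a method of distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.
In other words, sharding is the process of splitting data and spreading it across different machines; the term partitioning is sometimes used for the same idea. By spreading data over several machines, you can store more data and handle more load without needing one large, powerful computer.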
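Database systems with large data sets or high-throughput applications can challenge the capacity of a single server. For example, a high query rate can exhaust the server's CPU capacity, and a working set larger than the system's RAM puts pressure on the I/O capacity of the disk drives.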
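There are two ways to address system growth: vertical scaling and horizontal scaling.
Vertical scaling means increasing the capacity of a single server, for example by using a more powerful CPU, adding more RAM, or increasing the amount of storage. Limitations in available technology may prevent a single machine from being powerful enough for a given workload. In addition, cloud providers impose hard ceilings based on the hardware configurations they offer. As a result, vertical scaling has a practical maximum.
Horizontal scaling means dividing the system's data set and load across multiple servers, adding servers as needed to increase capacity. Although the overall speed or capacity of a single machine may not be high, each machine handles only a subset of the overall workload, which can be more efficient than a single high-speed, high-capacity server. Expanding capacity only requires adding servers as needed, which can cost less overall than high-end hardware for a single machine. The trade-off is greater complexity in infrastructure and deployment maintenance.
MongoDB supports horizontal scaling through sharding.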
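A MongoDB sharded cluster consists of the following components:
shard (storage): each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
mongos (routing): mongos acts as a query router, providing an interface between client applications and the sharded cluster.
config servers (configuration for "scheduling"): config servers store the cluster's metadata and configuration settings. Starting with MongoDB 3.4, config servers must be deployed as a replica set (CSRS).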
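The following diagram shows how the components of a sharded cluster interact:
MongoDB shards data at the collection level, distributing a collection's data across the shards in the cluster.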
Default mongod ports: 27018 if mongod is a shard member; 27019 if mongod is a config server member.
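Two shard replica sets (3 + 3 nodes), one config server replica set (3 nodes), and two router nodes (2), for a total of 11 service nodes.
All configuration files are placed directly in the corresponding subdirectory of sharded_cluster. The default configuration file name is mongod.conf.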
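The directory layout implied by this topology can be sketched as a small shell loop. This is only an illustration: the `ROOT` variable, the `LAYOUT` array, and the `node_dirs` helper are names made up for this sketch, and `ROOT` defaults to a scratch directory so the snippet can be tried without touching `/mongodb`. The replica set names and ports are the ones used throughout this walkthrough, and only the first router directory is included.

```shell
#!/usr/bin/env bash
# Sketch: derive every node directory of the cluster from "name:ports" pairs.
# ROOT defaults to a scratch directory (an assumption for safe experimentation);
# set ROOT=/mongodb/sharded_cluster to reproduce the layout used in this article.
ROOT="${ROOT:-$(mktemp -d)/sharded_cluster}"

# Replica set name, followed by the ports of its three members.
LAYOUT=(
  "myshardrs01:27018 27118 27218"
  "myshardrs02:27318 27418 27518"
  "myconfigrs:27019 27119 27219"
)

# Print one directory per mongod node, e.g. .../myshardrs01_27018
node_dirs() {
  local entry name port
  for entry in "${LAYOUT[@]}"; do
    name="${entry%%:*}"
    for port in ${entry#*:}; do
      echo "${ROOT}/${name}_${port}"
    done
  done
}

# Create the log and data directories that each mongod.conf will point at.
for dir in $(node_dirs); do
  mkdir -p "${dir}/log" "${dir}/data/db"
done

# The mongos router stores no data, so it only needs a log directory.
mkdir -p "${ROOT}/mymongos_27017/log"

# Show where each mongod.conf is expected to live.
for dir in $(node_dirs); do
  echo "${dir}/mongod.conf"
done
```

Keeping the names and ports in one list makes it harder for a directory name and the port inside its configuration file to drift apart.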
Prepare the directories for data and logs:
#-----------myshardrs01
mkdir -p /mongodb/sharded_cluster/myshardrs01_27018/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs01_27018/data/db \
  && mkdir -p /mongodb/sharded_cluster/myshardrs01_27118/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs01_27118/data/db \
  && mkdir -p /mongodb/sharded_cluster/myshardrs01_27218/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs01_27218/data/db
Create or modify the configuration file:
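[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs01_27018/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs01_27018/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27018/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27018/log/mongod.pid"
net:
  # Bind to all IPs. Side effect: when the replica set is initialized, the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  #bindIp
  # Port to bind to
  port: 27018
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role in the sharded cluster
  clusterRole: shardsvr
[root@localhost ~]#
Note: setting sharding.clusterRole requires the mongod instance to run with replication. To deploy the instance as a replica set member, use the replSetName setting and specify the name of the replica set.
Create or modify the configuration files: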
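[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs01_27118/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs01_27118/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27118/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27118/log/mongod.pid"
net:
  # Bind to all IPs. Side effect: when the replica set is initialized, the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  #bindIp
  # Port to bind to
  port: 27118
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role in the sharded cluster
  clusterRole: shardsvr
[root@localhost ~]#
[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs01_27218/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs01_27218/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27218/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27218/log/mongod.pid"
net:
  # Bind to all IPs. Side effect: when the replica set is initialized, the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  #bindIp
  # Port to bind to
  port: 27218
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role in the sharded cluster
  clusterRole: shardsvr
[root@localhost ~]#
Start the first replica set: one primary, one secondary, one arbiter.
Start the three mongod services in turn: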
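The start commands themselves are not listed here, so what follows is a minimal sketch rather than the exact invocation used in the original setup. It assumes that `mongod` sits next to the `mongo` shell under `/usr/local/mongodb/bin` (the path that appears in the shell sessions of this article) and that each node is started with `-f <config file>`. The `start_cmds` helper and the `DRY_RUN` switch are hypothetical names; by default the script only prints what it would run.

```shell
#!/usr/bin/env bash
# Assumed location of the mongod binary (same directory as the mongo shell used in this article).
MONGOD="${MONGOD:-/usr/local/mongodb/bin/mongod}"
ROOT="${ROOT:-/mongodb/sharded_cluster}"
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 actually runs them.
DRY_RUN="${DRY_RUN:-1}"

# Print the start command for each member of the first shard replica set.
start_cmds() {
  local port
  for port in 27018 27118 27218; do
    echo "${MONGOD} -f ${ROOT}/myshardrs01_${port}/mongod.conf"
  done
}

start_cmds | while read -r cmd; do
  if [ "${DRY_RUN}" = "1" ]; then
    echo "would run: ${cmd}"
  else
    ${cmd}   # fork: true in mongod.conf sends each process to the background
  fi
done
```

Once the services are started for real, `ps -ef | grep mongod` is a quick way to confirm that three processes are up.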
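(1) Initialize the replica set to create the primary, then add the secondary and the arbiter.
Connect to any node with the client, preferably the node that is to become the primary: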
[root@localhost sharded_cluster]# /usr/local/mongodb/bin/mongo --port 27018
MongoDB shell version v4.0.10
connecting to: mongodb://127.0.0.1:27018/?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("eae41d99-aa8b-4352-ac3c-9c2cff535d3c") }
MongoDB server version: 4.0.10
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see http://docs.mongodb.org/
Questions? Try the support group http://groups.google.com/group/mongodb-user
Server has startup warnings:
2020-10-21T10:05:05.266+0800 I CONTROL [initandlisten]
2020-10-21T10:05:05.266+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2020-10-21T10:05:05.266+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2020-10-21T10:05:05.266+0800 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2020-10-21T10:05:05.266+0800 I CONTROL [initandlisten]
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten]
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten]
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:05:05.267+0800 I CONTROL [initandlisten]
> rs.initiate()
{ "info2" : "no configuration specified. Using a default configuration for the set", "me" : "192.168.1.171:27018", "ok" : 1, "operationTime" : Timestamp(1603246083, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246083, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs01:SECONDARY> rs.status()
{ "set" : "myshardrs01", "date" : ISODate("2020-10-21T02:08:47.567Z"), "myState" : 1, "term" : NumberLong(1), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1603246125, 1), "t" : NumberLong(1) }, "readConcernMajorityOpTime" : { "ts" : Timestamp(1603246125, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1603246125, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1603246125, 1), "t" : NumberLong(1) } }, "lastStableCheckpointTimestamp" : Timestamp(1603246085, 2), "members" : [ { "_id" : 0, "name" : "192.168.1.171:27018", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 223, "optime" : { "ts" : Timestamp(1603246125, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:08:45Z"), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1603246083, 2), "electionDate" : ISODate("2020-10-21T02:08:03Z"), "configVersion" : 1, "self" : true, "lastHeartbeatMessage" : "" } ], "ok" : 1, "operationTime" : Timestamp(1603246125, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246125, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs01:PRIMARY> rs.conf()
{ "_id" : "myshardrs01", "version" : 1, "protocolVersion" : NumberLong(1), "writeConcernMajorityJournalDefault" : true, "members" : [ { "_id" : 0, "host" : "192.168.1.171:27018", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : -1, "catchUpTakeoverDelayMillis" : 30000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("5f8f9803deaac6064bd46afa") } }
myshardrs01:PRIMARY> rs.add("192.168.1.171:27118")
{ "ok" : 1, "operationTime" : Timestamp(1603246209, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246209, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs01:PRIMARY> rs.addArb("192.168.1.171:27218")
{ "ok" : 1, "operationTime" : Timestamp(1603246253, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246253, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs01:PRIMARY> rs.conf()
{ "_id" : "myshardrs01", "version" : 3, "protocolVersion" : NumberLong(1), "writeConcernMajorityJournalDefault" : true, "members" : [ { "_id" : 0, "host" : "192.168.1.171:27018", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "192.168.1.171:27118", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "192.168.1.171:27218", "arbiterOnly" : true, "buildIndexes" : true, "hidden" : false, "priority" : 0, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : -1, "catchUpTakeoverDelayMillis" : 30000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("5f8f9803deaac6064bd46afa") } }
Prepare the directories for data and logs:
#-----------myshardrs02
mkdir -p /mongodb/sharded_cluster/myshardrs02_27318/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs02_27318/data/db \
  && mkdir -p /mongodb/sharded_cluster/myshardrs02_27418/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs02_27418/data/db \
  && mkdir -p /mongodb/sharded_cluster/myshardrs02_27518/log \
  && mkdir -p /mongodb/sharded_cluster/myshardrs02_27518/data/db
Create or modify the configuration files:
[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs02_27318/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs02_27318/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs02_27318/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs02_27318/log/mongod.pid"
net:
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27318
replication:
  replSetName: myshardrs02
sharding:
  clusterRole: shardsvr
[root@localhost ~]#
[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs02_27418/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs02_27418/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs02_27418/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs02_27418/log/mongod.pid"
net:
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27418
replication:
  replSetName: myshardrs02
sharding:
  clusterRole: shardsvr
[root@localhost ~]#
[root@localhost ~]# cat /mongodb/sharded_cluster/myshardrs02_27518/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myshardrs02_27518/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myshardrs02_27518/data/db"
  journal:
    # Enable or disable the durability journal, which keeps data files valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs02_27518/log/mongod.pid"
net:
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27518
replication:
  replSetName: myshardrs02
sharding:
  clusterRole: shardsvr
[root@localhost ~]#
Start the second replica set: one primary, one secondary, one arbiter.
Start the three mongod services in turn:
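As the start commands are not shown at this step, here is one possible way to generate them, offered as a sketch only. The `replica_set_cmds` helper is a hypothetical name, and the `mongod` path is assumed to match the directory of the `mongo` shell used elsewhere in this article. The function only prints commands, so nothing is started until its output is executed.

```shell
#!/usr/bin/env bash
MONGOD="${MONGOD:-/usr/local/mongodb/bin/mongod}"   # assumed binary location
ROOT="${ROOT:-/mongodb/sharded_cluster}"

# replica_set_cmds <set-name> <port>...
# Prints one "mongod -f <config>" command per member of the replica set.
replica_set_cmds() {
  local name="$1" port
  shift
  for port in "$@"; do
    echo "${MONGOD} -f ${ROOT}/${name}_${port}/mongod.conf"
  done
}

# Members of the second shard replica set, as configured in the files shown here.
replica_set_cmds myshardrs02 27318 27418 27518
```

Piping the output to `sh` would run the three commands in order.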
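(1) Initialize the replica set to create the primary, then add the secondary and the arbiter.
Connect to any node with the client, preferably the node that is to become the primary: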
[root@localhost sharded_cluster]# /usr/local/mongodb/bin/mongo --port 27318
MongoDB shell version v4.0.10
connecting to: mongodb://127.0.0.1:27318/?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("1093ae85-a816-47ad-b1eb-991081b6754a") }
MongoDB server version: 4.0.10
Server has startup warnings:
2020-10-21T10:18:12.598+0800 I CONTROL [initandlisten]
2020-10-21T10:18:12.598+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2020-10-21T10:18:12.598+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2020-10-21T10:18:12.598+0800 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2020-10-21T10:18:12.598+0800 I CONTROL [initandlisten]
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten]
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten]
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:18:12.599+0800 I CONTROL [initandlisten]
> rs.initiate()
{ "info2" : "no configuration specified. Using a default configuration for the set", "me" : "192.168.1.171:27318", "ok" : 1, "operationTime" : Timestamp(1603246780, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246780, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs02:SECONDARY>
myshardrs02:PRIMARY>
myshardrs02:PRIMARY> rs.status()
{ "set" : "myshardrs02", "date" : ISODate("2020-10-21T02:19:54.761Z"), "myState" : 1, "term" : NumberLong(1), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1603246792, 1), "t" : NumberLong(1) }, "readConcernMajorityOpTime" : { "ts" : Timestamp(1603246792, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1603246792, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1603246792, 1), "t" : NumberLong(1) } }, "lastStableCheckpointTimestamp" : Timestamp(1603246782, 2), "members" : [ { "_id" : 0, "name" : "192.168.1.171:27318", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 104, "optime" : { "ts" : Timestamp(1603246792, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:19:52Z"), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1603246780, 2), "electionDate" : ISODate("2020-10-21T02:19:40Z"), "configVersion" : 1, "self" : true, "lastHeartbeatMessage" : "" } ], "ok" : 1, "operationTime" : Timestamp(1603246792, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246792, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs02:PRIMARY> rs.add("192.168.1.171:27418")
{ "ok" : 1, "operationTime" : Timestamp(1603246816, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246816, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs02:PRIMARY> rs.addArb("192.168.1.171:27518")
{ "ok" : 1, "operationTime" : Timestamp(1603246843, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246843, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs02:PRIMARY> rs.status()
{ "set" : "myshardrs02", "date" : ISODate("2020-10-21T02:20:59.407Z"), "myState" : 1, "term" : NumberLong(1), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "readConcernMajorityOpTime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) } }, "lastStableCheckpointTimestamp" : Timestamp(1603246842, 1), "members" : [ { "_id" : 0, "name" : "192.168.1.171:27318", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 169, "optime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:20:43Z"), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1603246780, 2), "electionDate" : ISODate("2020-10-21T02:19:40Z"), "configVersion" : 3, "self" : true, "lastHeartbeatMessage" : "" }, { "_id" : 1, "name" : "192.168.1.171:27418", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 42, "optime" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1603246843, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:20:43Z"), "optimeDurableDate" : ISODate("2020-10-21T02:20:43Z"), "lastHeartbeat" : ISODate("2020-10-21T02:20:57.781Z"), "lastHeartbeatRecv" : ISODate("2020-10-21T02:20:59.311Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "configVersion" : 3 }, { "_id" : 2, "name" : "192.168.1.171:27518", "health" : 1, "state" : 7, "stateStr" : "ARBITER", "uptime" : 15, "lastHeartbeat" : ISODate("2020-10-21T02:20:57.780Z"), "lastHeartbeatRecv" : ISODate("2020-10-21T02:20:57.798Z"), "pingMs" : NumberLong(1), "lastHeartbeatMessage" : "", "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "configVersion" : 3 } ], "ok" : 1, "operationTime" : Timestamp(1603246843, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603246843, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myshardrs02:PRIMARY> exit
bye
Step 1: prepare the directories for data and logs:
#-----------configrs
# Create the data and log directories for the config nodes
mkdir -p /mongodb/sharded_cluster/myconfigrs_27019/log \
  && mkdir -p /mongodb/sharded_cluster/myconfigrs_27019/data/db \
  && mkdir -p /mongodb/sharded_cluster/myconfigrs_27119/log \
  && mkdir -p /mongodb/sharded_cluster/myconfigrs_27119/data/db \
  && mkdir -p /mongodb/sharded_cluster/myconfigrs_27219/log \
  && mkdir -p /mongodb/sharded_cluster/myconfigrs_27219/data/db
Create or modify the configuration files:
[root@localhost ~]# cat /mongodb/sharded_cluster/myconfigrs_27019/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myconfigrs_27019/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27019/data/db"
  journal:
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27019/log/mongod.pid"
net:
  # Bind to all IPs
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27019
replication:
  replSetName: myconfigrs
sharding:
  clusterRole: configsvr
[root@localhost ~]#
[root@localhost ~]# cat /mongodb/sharded_cluster/myconfigrs_27119/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myconfigrs_27119/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27119/data/db"
  journal:
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27119/log/mongod.pid"
net:
  # Bind to all IPs
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27119
replication:
  replSetName: myconfigrs
sharding:
  clusterRole: configsvr
[root@localhost ~]#
[root@localhost ~]# cat /mongodb/sharded_cluster/myconfigrs_27219/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/myconfigrs_27219/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data; storage.dbPath applies only to mongod
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27219/data/db"
  journal:
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27219/log/mongod.pid"
net:
  # Bind to all IPs
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  # Port to bind to
  port: 27219
replication:
  replSetName: myconfigrs
sharding:
  clusterRole: configsvr
[root@localhost ~]#
Start the config replica set: one primary and two secondaries.
Start the three mongod services in turn:
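The commands for this step are not listed either. One way to approach it, sketched under assumptions, is to check each node's PID file first and print a start command only for nodes that do not appear to be running. The `is_running` helper is a hypothetical name, the `mongod` path is assumed from the `mongo` path used in this article, and the PID file locations follow the `pidFilePath` values in the configuration files.

```shell
#!/usr/bin/env bash
MONGOD="${MONGOD:-/usr/local/mongodb/bin/mongod}"   # assumed binary location
ROOT="${ROOT:-/mongodb/sharded_cluster}"

# is_running <pid-file>: succeeds if the file exists and the process it names is alive.
# Note: kill -0 also fails when the process belongs to a user we may not signal.
is_running() {
  [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

for port in 27019 27119 27219; do
  dir="${ROOT}/myconfigrs_${port}"
  if is_running "${dir}/log/mongod.pid"; then
    echo "myconfigrs_${port}: already running"
  else
    echo "start with: ${MONGOD} -f ${dir}/mongod.conf"
  fi
done
```

The same check can be rerun after starting the nodes to see whether all three config servers report as running.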
(1) Initialize the replica set to create the primary.
Connect to any node with the client, preferably the node that is to become the primary:
[root@localhost sharded_cluster]# /usr/local/mongodb/bin/mongo --port 27019
MongoDB shell version v4.0.10
connecting to: mongodb://127.0.0.1:27019/?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("372532cf-d08f-48fa-b86f-d1b169893097") }
MongoDB server version: 4.0.10
Server has startup warnings:
2020-10-21T10:30:47.051+0800 I CONTROL [initandlisten]
2020-10-21T10:30:47.051+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2020-10-21T10:30:47.051+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2020-10-21T10:30:47.051+0800 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2020-10-21T10:30:47.051+0800 I CONTROL [initandlisten]
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten]
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten]
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2020-10-21T10:30:47.052+0800 I CONTROL [initandlisten]
> rs.initiate()
{ "info2" : "no configuration specified. Using a default configuration for the set", "me" : "192.168.1.171:27019", "ok" : 1, "operationTime" : Timestamp(1603247561, 1), "$gleStats" : { "lastOpTime" : Timestamp(1603247561, 1), "electionId" : ObjectId("000000000000000000000000") }, "lastCommittedOpTime" : Timestamp(0, 0), "$clusterTime" : { "clusterTime" : Timestamp(1603247561, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myconfigrs:SECONDARY> rs.add("192.168.1.171:27119")
{ "ok" : 1, "operationTime" : Timestamp(1603247583, 1), "$gleStats" : { "lastOpTime" : { "ts" : Timestamp(1603247583, 1), "t" : NumberLong(1) }, "electionId" : ObjectId("7fffffff0000000000000001") }, "lastCommittedOpTime" : Timestamp(1603247567, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603247583, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myconfigrs:PRIMARY> rs.add("192.168.1.171:27219")
{ "ok" : 1, "operationTime" : Timestamp(1603247594, 1), "$gleStats" : { "lastOpTime" : { "ts" : Timestamp(1603247594, 1), "t" : NumberLong(1) }, "electionId" : ObjectId("7fffffff0000000000000001") }, "lastCommittedOpTime" : Timestamp(1603247583, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603247594, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myconfigrs:PRIMARY> rs.status()
{ "set" : "myconfigrs", "date" : ISODate("2020-10-21T02:33:23.751Z"), "myState" : 1, "term" : NumberLong(1), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "configsvr" : true, "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "readConcernMajorityOpTime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) } }, "lastStableCheckpointTimestamp" : Timestamp(1603247563, 19), "members" : [ { "_id" : 0, "name" : "192.168.1.171:27019", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 158, "optime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:33:17Z"), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1603247561, 2), "electionDate" : ISODate("2020-10-21T02:32:41Z"), "configVersion" : 3, "self" : true, "lastHeartbeatMessage" : "" }, { "_id" : 1, "name" : "192.168.1.171:27119", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 20, "optime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:33:17Z"), "optimeDurableDate" : ISODate("2020-10-21T02:33:17Z"), "lastHeartbeat" : ISODate("2020-10-21T02:33:22.478Z"), "lastHeartbeatRecv" : ISODate("2020-10-21T02:33:23.484Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "192.168.1.171:27019", "syncSourceHost" : "192.168.1.171:27019", "syncSourceId" : 0, "infoMessage" : "", "configVersion" : 3 }, { "_id" : 2, "name" : "192.168.1.171:27219", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 9, "optime" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1603247597, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2020-10-21T02:33:17Z"), "optimeDurableDate" : ISODate("2020-10-21T02:33:17Z"), "lastHeartbeat" : ISODate("2020-10-21T02:33:22.478Z"), "lastHeartbeatRecv" : ISODate("2020-10-21T02:33:22.999Z"), "pingMs" : NumberLong(1), "lastHeartbeatMessage" : "", "syncingTo" : "192.168.1.171:27019", "syncSourceHost" : "192.168.1.171:27019", "syncSourceId" : 0, "infoMessage" : "", "configVersion" : 3 } ], "ok" : 1, "operationTime" : Timestamp(1603247597, 1), "$gleStats" : { "lastOpTime" : { "ts" : Timestamp(1603247594, 1), "t" : NumberLong(1) }, "electionId" : ObjectId("7fffffff0000000000000001") }, "lastCommittedOpTime" : Timestamp(1603247597, 1), "$clusterTime" : { "clusterTime" : Timestamp(1603247597, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
myconfigrs:PRIMARY> exit
bye
Step 1: prepare the directories for data and logs:
#-----------mongos01
mkdir -p /mongodb/sharded_cluster/mymongos_27017/log
Configuration file:
[root@localhost ~]# cat /mongodb/sharded_cluster/mymongos_27017/mongos.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos writes all diagnostic logging information
  path: "/mongodb/sharded_cluster/mymongos_27017/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/mymongos_27017/log/mongod.pid"
net:
  # Bind to all IPs. Side effect: when the replica set is initialized, the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  #bindIp
  # Port to bind to
  port: 27017
sharding:
  # Config server replica set
  configDB: myconfigrs/192.168.1.171:27019,192.168.1.171:27119,192.168.1.171:27219
[root@localhost ~]#
Start mongos:
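The start command is not shown at this step. A router is started with the `mongos` binary rather than `mongod`, so a plausible invocation looks like the sketch below. The binary paths are assumptions based on the `mongo` path used in this article, and `mongos_cmds` is a hypothetical helper that only prints the commands.

```shell
#!/usr/bin/env bash
# Assumed binary locations (same directory as the mongo shell used in this article).
MONGOS="${MONGOS:-/usr/local/mongodb/bin/mongos}"
MONGO="${MONGO:-/usr/local/mongodb/bin/mongo}"
CONF="/mongodb/sharded_cluster/mymongos_27017/mongos.conf"

# Print the command that starts the router, then the one that connects to it.
mongos_cmds() {
  echo "${MONGOS} -f ${CONF}"
  echo "${MONGO} --port 27017"
}

mongos_cmds
```

The second command opens a shell against the router on port 27017, which is where the `mongos>` prompt in the following steps comes from.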
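At this point data cannot be written; an attempt to write returns an error:
Reason: operations go through the router, which so far is connected only to the config servers. No shard data nodes have been attached yet, so application data cannot be written.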
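To see this state for yourself, you can ask the router which shards it knows about. The sketch below only builds the command; it assumes the `mongo` shell path used in this article and a router on port 27017, and `list_shards_cmd` is a hypothetical helper name. `listShards` is a standard admin command, and before any shard has been added its `shards` array should be empty.

```shell
#!/usr/bin/env bash
MONGO="${MONGO:-/usr/local/mongodb/bin/mongo}"   # assumed shell location

# Print a command that lists the shards currently registered with the router.
list_shards_cmd() {
  echo "${MONGO} --port 27017 --quiet --eval 'printjson(db.adminCommand({ listShards: 1 }).shards)'"
}

list_shards_cmd
```

Running the printed command again after the `sh.addShard` calls should show the added replica sets.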
Add the shards with the following command.
(1) Add a shard. Syntax:
sh.addShard("IP:Port")
Add the first shard replica set:
mongos> sh.addShard("myshardrs01/192.168.1.171:27018,192.168.1.171:27118,192.168.1.171:27218")
{ "shardAdded" : "myshardrs01", "ok" : 1, "operationTime" : Timestamp(1603252911, 5), "$clusterTime" : { "clusterTime" : Timestamp(1603252911, 5), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
Check the sharding status:
mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5f8f9dcb0113a7896f493070") }
  shards:
    { "_id" : "myshardrs01", "host" : "myshardrs01/192.168.1.171:27018,192.168.1.171:27118", "state" : 1 }
  active mongoses:
    "4.0.10" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "config", "primary" : "config", "partitioned" : true }
Next, add the second shard replica set:
mongos> sh.addShard("myshardrs02/192.168.1.171:27318,192.168.1.171:27418,192.168.1.171:27518")
{ "shardAdded" : "myshardrs02", "ok" : 1, "operationTime" : Timestamp(1603252980, 3), "$clusterTime" : { "clusterTime" : Timestamp(1603252980, 3), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5f8f9dcb0113a7896f493070") }
  shards:
    { "_id" : "myshardrs01", "host" : "myshardrs01/192.168.1.171:27018,192.168.1.171:27118", "state" : 1 }
    { "_id" : "myshardrs02", "host" : "myshardrs02/192.168.1.171:27318,192.168.1.171:27418", "state" : 1 }
  active mongoses:
    "4.0.10" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "config", "primary" : "config", "partitioned" : true }
(2) Enable sharding: sh.enableSharding("database name"), sh.shardCollection("database.collection", {"key":1})
Configure sharding for the articledb database on mongos:
mongos> sh.enableSharding("articledb")
{ "ok" : 1, "operationTime" : Timestamp(1603253008, 5), "$clusterTime" : { "clusterTime" : Timestamp(1603253008, 5), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } }
mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5f8f9dcb0113a7896f493070") }
  shards:
    { "_id" : "myshardrs01", "host" : "myshardrs01/192.168.1.171:27018,192.168.1.171:27118", "state" : 1 }
    { "_id" : "myshardrs02", "host" : "myshardrs02/192.168.1.171:27318,192.168.1.171:27418", "state" : 1 }
  active mongoses:
    "4.0.10" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "articledb", "primary" : "myshardrs02", "partitioned" : true, "version" : { "uuid" : UUID("ab71d9d4-4762-41ba-a803-5d205b0a9c5c"), "lastMod" : 1 } }
    { "_id" : "config", "primary" : "config", "partitioned" : true }
(3) Shard a collection
To shard a collection, you must use the sh.shardCollection() method to specify the collection and the shard key. Syntax:
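sh.shardCollection(namespace, key, unique)

Parameters: namespace is a string naming the collection to shard, in the form "<database>.<collection>". key is a document that names the shard key field or fields, with the value 1 for range-based sharding or "hashed" for hash-based sharding. unique is an optional boolean; when true, the underlying index enforces uniqueness. It defaults to false and cannot be used with a hashed shard key.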
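When you shard a collection, you need to choose a shard key. The shard key is a single field or a compound of several fields that every document must contain and that is covered by an index. MongoDB partitions the data into chunks according to the shard key and distributes those chunks evenly across all shards. To partition the data by shard key, MongoDB uses either hash-based sharding (random, even distribution) or range-based sharding (distribution by value order). Any field can serve as the shard key, for example nickname, as long as it is a required field.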
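The two key shapes described above can be sketched as follows. This is only an illustration: the helper names shardKeySpec and hasShardKey are hypothetical (they are not part of the mongo shell), and hasShardKey only checks top-level field names.

```javascript
// Hypothetical helper: build the key document passed to sh.shardCollection().
// strategy is "hashed" (hash-based) or "ranged" (range-based, ascending).
function shardKeySpec(field, strategy) {
  if (strategy !== "hashed" && strategy !== "ranged") {
    throw new Error("strategy must be 'hashed' or 'ranged'");
  }
  var spec = {};
  spec[field] = strategy === "hashed" ? "hashed" : 1;
  return spec;
}

// Hypothetical helper: true when the document carries every top-level field
// named in the shard key spec (dotted field paths are not handled here).
function hasShardKey(doc, spec) {
  return Object.keys(spec).every(function (field) {
    return Object.prototype.hasOwnProperty.call(doc, field);
  });
}
```

In the mongo shell the result could be passed straight through, for example sh.shardCollection("articledb.comment", shardKeySpec("nickname", "hashed")).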
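Sharding rule 1: hashed strategy

With hash-based sharding, MongoDB computes the hash of a field's value and uses that hash to create chunks. In a system that uses hash-based sharding, documents with "close" shard key values are very unlikely to be stored in the same chunk, so the data is dispersed more evenly.

Use nickname as the shard key and shard the data by the hash of its value: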
mongos> sh.shardCollection("articledb.comment",{"nickname":"hashed"})
{
  "collectionsharded" : "articledb.comment",
  "collectionUUID" : UUID("dafde8ff-581e-4afe-a25e-aa215d06ed9a"),
  "ok" : 1,
  "operationTime" : Timestamp(1603253181, 30),
  "$clusterTime" : {
    "clusterTime" : Timestamp(1603253181, 30),
    "signature" : {
      "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId" : NumberLong(0)
    }
  }
}
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5f8f9dcb0113a7896f493070")
  }
  shards:
    { "_id" : "myshardrs01", "host" : "myshardrs01/192.168.1.171:27018,192.168.1.171:27118", "state" : 1 }
    { "_id" : "myshardrs02", "host" : "myshardrs02/192.168.1.171:27318,192.168.1.171:27418", "state" : 1 }
  active mongoses:
    "4.0.10" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "articledb", "primary" : "myshardrs02", "partitioned" : true, "version" : { "uuid" : UUID("ab71d9d4-4762-41ba-a803-5d205b0a9c5c"), "lastMod" : 1 } }
      articledb.comment
        shard key: { "nickname" : "hashed" }
        unique: false
        balancing: true
        chunks:
          myshardrs01 2
          myshardrs02 2
        { "nickname" : { "$minKey" : 1 } } -->> { "nickname" : NumberLong("-4611686018427387902") } on : myshardrs01 Timestamp(1, 0)
        { "nickname" : NumberLong("-4611686018427387902") } -->> { "nickname" : NumberLong(0) } on : myshardrs01 Timestamp(1, 1)
        { "nickname" : NumberLong(0) } -->> { "nickname" : NumberLong("4611686018427387902") } on : myshardrs02 Timestamp(1, 2)
        { "nickname" : NumberLong("4611686018427387902") } -->> { "nickname" : { "$maxKey" : 1 } } on : myshardrs02 Timestamp(1, 3)
    { "_id" : "config", "primary" : "config", "partitioned" : true }
      config.system.sessions
        shard key: { "_id" : 1 }
        unique: false
        balancing: true
        chunks:
          myshardrs01 1
        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : myshardrs01 Timestamp(1, 0)

Sharding rule 2: range strategy

With range-based sharding, MongoDB divides the data into parts according to ranges of the shard key. Suppose the shard key is numeric: imagine a line running from negative infinity to positive infinity, with every shard key value marked as a point on it. MongoDB cuts this line into shorter, non-overlapping segments called chunks, and each chunk holds the data whose shard key falls within a certain range. In a system that partitions by shard key range, documents with "close" shard key values are likely to be stored in the same chunk, and therefore on the same shard.

For example, use the author's age field as the shard key and shard by the value of age:
mongos> sh.shardCollection("articledb.author",{"age":1})
{
  "collectionsharded" : "articledb.author",
  "collectionUUID" : UUID("d352ef1c-1e1d-4e8c-a251-3d4e13f5b7ee"),
  "ok" : 1,
  "operationTime" : Timestamp(1603253278, 13),
  "$clusterTime" : {
    "clusterTime" : Timestamp(1603253278, 13),
    "signature" : {
      "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId" : NumberLong(0)
    }
  }
}
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5f8f9dcb0113a7896f493070")
  }
  shards:
    { "_id" : "myshardrs01", "host" : "myshardrs01/192.168.1.171:27018,192.168.1.171:27118", "state" : 1 }
    { "_id" : "myshardrs02", "host" : "myshardrs02/192.168.1.171:27318,192.168.1.171:27418", "state" : 1 }
  active mongoses:
    "4.0.10" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "articledb", "primary" : "myshardrs02", "partitioned" : true, "version" : { "uuid" : UUID("ab71d9d4-4762-41ba-a803-5d205b0a9c5c"), "lastMod" : 1 } }
      articledb.author
        shard key: { "age" : 1 }
        unique: false
        balancing: true
        chunks:
          myshardrs02 1
        { "age" : { "$minKey" : 1 } } -->> { "age" : { "$maxKey" : 1 } } on : myshardrs02 Timestamp(1, 0)
      articledb.comment
        shard key: { "nickname" : "hashed" }
        unique: false
        balancing: true
        chunks:
          myshardrs01 2
          myshardrs02 2
        { "nickname" : { "$minKey" : 1 } } -->> { "nickname" : NumberLong("-4611686018427387902") } on : myshardrs01 Timestamp(1, 0)
        { "nickname" : NumberLong("-4611686018427387902") } -->> { "nickname" : NumberLong(0) } on : myshardrs01 Timestamp(1, 1)
        { "nickname" : NumberLong(0) } -->> { "nickname" : NumberLong("4611686018427387902") } on : myshardrs02 Timestamp(1, 2)
        { "nickname" : NumberLong("4611686018427387902") } -->> { "nickname" : { "$maxKey" : 1 } } on : myshardrs02 Timestamp(1, 3)
    { "_id" : "config", "primary" : "config", "partitioned" : true }
      config.system.sessions
        shard key: { "_id" : 1 }
        unique: false
        balancing: true
        chunks:
          myshardrs01 1
        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : myshardrs01 Timestamp(1, 0)

Note:
1) A collection can have only one shard key; trying to shard it a second time raises an error.
2) In the MongoDB 4.0 release used here, once a collection is sharded, neither the shard key nor the shard key values can be changed. For example, you cannot choose a different shard key for the collection, and you cannot update the shard key value of a document. Later releases relax both restrictions.
3) The data is distributed according to the index on age.
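Performance of range-based sharding compared with hash-based sharding:

Range-based sharding makes range queries more efficient. Given a range of shard key values, the router can easily determine which chunks hold the requested data and forward the request only to the corresponding shards. However, range-based sharding can leave data unevenly distributed across shards, and sometimes that cost outweighs the gain in query performance. For example, if the shard key field grows linearly, all requests within a given period land on one fixed chunk and therefore on a single shard. In that case a small subset of the shards carries most of the cluster's data, and the system does not scale well.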
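To make that trade-off concrete, here is a minimal sketch of how a router could map a shard key value, or a range of values, onto range-based chunks. The chunk boundaries, the shard names shardA and shardB, and all function names are invented for illustration; this is not how mongos is actually implemented.

```javascript
// Hypothetical chunk table for a range-sharded numeric key such as "age".
// Each chunk covers [min, max); -Infinity and Infinity stand in for
// $minKey and $maxKey.
var chunks = [
  { min: -Infinity, max: 30, shard: "shardA" },
  { min: 30, max: 60, shard: "shardB" },
  { min: 60, max: Infinity, shard: "shardA" }
];

// Find the single chunk that owns one shard key value.
function chunkFor(value) {
  for (var i = 0; i < chunks.length; i++) {
    if (value >= chunks[i].min && value < chunks[i].max) return chunks[i];
  }
  throw new Error("no chunk covers " + value);
}

// Shards that a range query lo <= key <= hi must visit: only those owning
// a chunk that overlaps the requested range.
function shardsForRange(lo, hi) {
  var seen = {};
  chunks.forEach(function (c) {
    if (c.min <= hi && c.max > lo) seen[c.shard] = true;
  });
  return Object.keys(seen).sort();
}

// Count how many of `count` steadily increasing values, starting at `start`,
// land in the last chunk. This models the hot spot of a linearly growing key.
function hotChunkHits(start, count) {
  var last = chunks[chunks.length - 1];
  var hits = 0;
  for (var v = start; v < start + count; v++) {
    if (chunkFor(v) === last) hits++;
  }
  return hits;
}
```

With this table, a query for ages 35 to 50 only needs shardB, while 50 consecutive new values starting at 100 all fall into the final chunk.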
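By contrast, hash-based sharding keeps the data in the cluster balanced at the cost of range query performance. Because hash values are effectively random, documents are spread randomly across chunks and therefore across shards. For the same reason, a range query can rarely be narrowed to particular shards; it usually has to ask every shard in order to return the full result. Unless you have special requirements, hash sharding is generally the recommended choice.

Using _id as the shard key is a good option because every document has one: you can use the hash of the document's _id as the shard key. This spreads both reads and writes evenly, and because every document has a different shard key value, chunks can be split very finely. It is still not perfect, since a query for multiple documents will generally hit every shard. Even so, it is a fairly good scheme. An ideal shard key lets documents be distributed evenly across the cluster.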
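The scattering effect of a hashed key can be sketched as below. Note the assumptions: the hash is 32-bit FNV-1a, used here purely as a stand-in (MongoDB's hashed index uses its own 64-bit hash), and the chunk boundaries are a scaled-down imitation of the four hashed chunks shown in the sh.status() output, two per shard. The function names are hypothetical.

```javascript
// Stand-in hash: 32-bit FNV-1a, returned as a signed 32-bit integer.
// This is NOT the hash MongoDB uses; it only illustrates the idea.
function fnv1a32(str) {
  var h = 0x811c9dc5 | 0;
  for (var i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h;
}

// Three split points give four equal hash ranges over the signed 32-bit
// space, owned by the shards in the same order as in the tutorial's output.
var boundaries = [-1073741824, 0, 1073741824];
var chunkOwners = ["myshardrs01", "myshardrs01", "myshardrs02", "myshardrs02"];

// Index of the chunk whose hash range contains the given hash value.
function chunkIndex(hash) {
  var i = 0;
  while (i < boundaries.length && hash >= boundaries[i]) i++;
  return i;
}

// Shard that would own a document with this nickname.
function shardFor(nickname) {
  return chunkOwners[chunkIndex(fnv1a32(nickname))];
}

// Count how the nicknames "BoBo1" .. "BoBo<n>" fall across the two shards.
function distribute(n) {
  var counts = { myshardrs01: 0, myshardrs02: 0 };
  for (var i = 1; i <= n; i++) counts[shardFor("BoBo" + i)]++;
  return counts;
}
```

Calling distribute(1000) returns the per-shard totals for this stand-in hash. The exact split will differ from a real cluster, but the routing logic (hash the key, then look up the owning range) is the same idea.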
Show detailed information about the cluster:
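mongos> db.printShardingStatus()

Check whether the balancer is currently running (it starts automatically only when rebalancing is needed, so you can leave it alone):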
mongos> sh.isBalancerRunning()
false

Check the current balancer state:
mongos> sh.getBalancerState()
true

Test 1 (hashed rule): log in to mongos and insert 1000 documents into comment in a loop:
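mongos> use articledb
switched to db articledb
mongos> for(var i=1;i<=1000;i++){db.comment.insert({_id:i+"",nickname:"BoBo"+i})}
WriteResult({ "nInserted" : 1 })

Tip: this is JavaScript syntax, because the mongo shell is a JavaScript shell.

Note: data inserted through the router must contain the shard key; otherwise the insert fails.

Log in to the primary node of each of the two shards and count the documents.

First shard replica set: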
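As you can see, the 1000 documents are distributed roughly evenly across the 2 shards, assigned according to the hash of the shard key. This kind of distribution makes horizontal scaling easy: when the data needs more storage space, you can simply add another shard, which also improves performance.

Use db.comment.stats() to view the full details of a single collection; when run on mongos, it shows how that collection's data is split across shards.

Use sh.status() to view the sharding information for all collections in the database.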
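As a small convenience, the per-shard counts can be pulled out of the stats document in one step. The sketch below assumes that db.collection.stats() on a sharded collection returns a shards sub-document keyed by shard name, each entry carrying a count field; the helper names perShardCounts and sharePercent are hypothetical.

```javascript
// Hypothetical helper: extract { shardName: documentCount } from a stats
// document. Assumes the shape { shards: { <name>: { count: N, ... } } }.
function perShardCounts(stats) {
  var out = {};
  if (!stats || !stats.shards) return out; // unsharded or unexpected shape
  Object.keys(stats.shards).forEach(function (name) {
    out[name] = stats.shards[name].count;
  });
  return out;
}

// Turn absolute counts into each shard's percentage of the total.
function sharePercent(counts) {
  var names = Object.keys(counts);
  var total = names.reduce(function (sum, n) { return sum + counts[n]; }, 0);
  var out = {};
  names.forEach(function (n) {
    out[n] = total === 0 ? 0 : (counts[n] / total) * 100;
  });
  return out;
}
```

On mongos this could be used as sharePercent(perShardCounts(db.comment.stats())) to see at a glance how balanced the collection is.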
Test 2 (range rule): log in to mongos and insert 20000 documents into author in a loop:
mongos> use articledb
switched to db articledb
mongos> for(var i=1;i<=20000;i++) {db.author.save({"name":"BoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBo"+i,"age":NumberInt(i%120)})}
WriteResult({ "nInserted" : 1 })

After the inserts succeed, again check the data on each of the two shard replica sets.

Sharding result:
Create the directory:
#-----------mongos02
mkdir -p /mongodb/sharded_cluster/mymongos_27117/log

Create or modify the configuration file:
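systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file to which mongod or mongos sends all diagnostic logging information
  path: "/mongodb/sharded_cluster/mymongos_27117/log/mongod.log"
  # When the mongos or mongod instance restarts, append new entries to the end of the existing log file
  logAppend: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its PID
  pidFilePath: "/mongodb/sharded_cluster/mymongos_27117/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized, the member name is set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the instance binds to
  bindIp: localhost,192.168.1.171
  #bindIp
  # Port to bind to
  port: 27117
sharding:
  # Replica set of the config servers
  configDB: myconfigrs/192.168.1.171:27019,192.168.1.171:27119,192.168.1.171:27219

Log in to port 27117 with the mongo client. You will find that the second router needs no further configuration, because the sharding configuration is already stored on the config servers.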
Connecting with Compass:
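The most common Java client is SpringDataMongoDB. It connects to the mongos routers, and its configuration is the same as for a standalone mongod. With multiple routers, the SpringDataMongoDB client configuration looks like this: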
uri: mongodb://192.168.1.144:27017,192.168.1.144:27117/articledb
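The URI simply lists every mongos as a comma-separated host:port pair before the database name. A minimal sketch of assembling such a string from a list of routers follows; buildMongosUri is a hypothetical helper, and it deliberately ignores credentials and connection options.

```javascript
// Hypothetical helper: join several mongos "host:port" entries into one
// mongodb:// connection string for the given database.
function buildMongosUri(hosts, dbName) {
  if (!hosts || hosts.length === 0) {
    throw new Error("at least one mongos host is required");
  }
  return "mongodb://" + hosts.join(",") + "/" + dbName;
}

// The two routers from this walkthrough.
var uri = buildMongosUri(
  ["192.168.1.144:27017", "192.168.1.144:27117"],
  "articledb"
);
```

Listing more than one router lets the driver fall back to another mongos when one of them is unreachable.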