小内存虚拟机中使用MongoDB数据库
目录
在1核1GB的阿里云云虚拟机中使用MongoDB数据库
问题
最近我在阿里云虚拟机上运行的MongoDB服务器经常崩溃。从 systemctl status 的输出看,mongod 进程被 kill 掉。
# systemctl status mongod.service
● mongod.service - SYSV: Mongo is a scalable, document-oriented database.
Loaded: loaded (/etc/rc.d/init.d/mongod)
Active: failed (Result: signal) since Thu 2016-07-28 15:56:58 CST; 29min ago
Docs: man:systemd-sysv-generator(8)
Process: 2545 ExecStop=/etc/rc.d/init.d/mongod stop (code=exited, status=0/SUCCESS)
Process: 2322 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=0/SUCCESS)
Main PID: 2335 (code=killed, signal=KILL)
Jul 28 13:59:45 iZ25qo7yf0oZ systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database....
Jul 28 13:59:45 iZ25qo7yf0oZ runuser[2331]: pam_unix(runuser:session): session opened for user mongod by (uid=0)
Jul 28 13:59:45 iZ25qo7yf0oZ mongod[2322]: Starting mongod: [ OK ]
Jul 28 13:59:45 iZ25qo7yf0oZ systemd[1]: Started SYSV: Mongo is a scalable, document-oriented database..
Jul 28 15:56:57 iZ25qo7yf0oZ systemd[1]: mongod.service: main process exited, code=killed, status=9/KILL
Jul 28 15:56:58 iZ25qo7yf0oZ mongod[2545]: Stopping mongod: [ OK ]
Jul 28 15:56:58 iZ25qo7yf0oZ systemd[1]: Unit mongod.service entered failed state.
Jul 28 15:56:58 iZ25qo7yf0oZ systemd[1]: mongod.service failed.mongod.service 的日志显示进程的详细情况。
Jul 27 16:40:47 iZ25qo7yf0oZ systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database.... Jul 27 16:40:47 iZ25qo7yf0oZ runuser[31290]: pam_unix(runuser:session): session opened for user mongod by (uid=0) Jul 27 16:40:49 iZ25qo7yf0oZ runuser[31290]: pam_unix(runuser:session): session closed for user mongod Jul 27 16:40:49 iZ25qo7yf0oZ mongod[31283]: Starting mongod: [ OK ] Jul 27 16:40:49 iZ25qo7yf0oZ systemd[1]: Started SYSV: Mongo is a scalable, document-oriented database.. Jul 28 06:49:40 iZ25qo7yf0oZ systemd[1]: Started SYSV: Mongo is a scalable, document-oriented database.. Jul 28 06:50:21 iZ25qo7yf0oZ systemd[1]: mongod.service: main process exited, code=killed, status=9/KILL Jul 28 06:50:21 iZ25qo7yf0oZ mongod[32606]: Stopping mongod: [ OK ] Jul 28 06:50:21 iZ25qo7yf0oZ systemd[1]: Unit mongod.service entered failed state. Jul 28 06:50:21 iZ25qo7yf0oZ systemd[1]: mongod.service failed.
可以看到 mongod 运行不到一天就被 kill 掉了。
再查看 MongoDB 的日志 mongod.log,省略部分信息。
2016-07-27T16:40:48.373+0800 I CONTROL [main] ***** SERVER RESTARTED *****
...
2016-07-27T16:40:49.733+0800 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2016-07-27T16:40:49.761+0800 I NETWORK [initandlisten] waiting for connections on port 27017
...
2016-07-27T17:33:55.250+0800 I COMMAND [ftdc] serverStatus was very slow: { after basic: 2760, after asserts: 3520, after connections: 5380, after extra_info: 24790, after globalLock: 29120, after locks: 40880, after network: 43390, after opcounters: 45600, after opcountersRepl: 47070, after storageEngine: 55320, after tcmalloc: 117790, after wiredTiger: 170830, at end: 171030 }
...
2016-07-28T03:53:19.741+0800 I COMMAND [ftdc] serverStatus was very slow: { after basic: 3260, after asserts: 5580, after connections: 9750, after extra_info: 35280, after globalLock: 41220, after locks: 50910, after network: 56970, after opcounters: 69950, after opcountersRepl: 75230, after storageEngine: 111160, after tcmalloc: 161620, after wiredTiger: 162220, at end: 162450 }
2016-07-28T03:53:23.442+0800 I NETWORK [initandlisten] connection accepted from 127.0.0.1:38308 #5 (1 connection now open)上面日志没有除了服务响应慢外,没有出错信息。找不到mongod服务为啥会崩溃。
日志中都没记录,说明可能是系统将 MongDB 杀掉了。
查看 systemd 的日志 /var/log/message,找到下面的条目
Jul 28 06:50:20 iZ25qo7yf0oZ kernel: mongod invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0 ... Jul 28 06:50:20 iZ25qo7yf0oZ kernel: Out of memory: Kill process 31294 (mongod) score 518 or sacrifice child Jul 28 06:50:20 iZ25qo7yf0oZ kernel: Killed process 31294 (mongod) total-vm:838876kB, anon-rss:525268kB, file-rss:0kB Jul 28 06:50:21 iZ25qo7yf0oZ systemd: mongod.service: main process exited, code=killed, status=9/KILL Jul 28 06:50:21 iZ25qo7yf0oZ mongod: Stopping mongod: [ OK ] Jul 28 06:50:21 iZ25qo7yf0oZ systemd: Unit mongod.service entered failed state. Jul 28 06:50:21 iZ25qo7yf0oZ systemd: mongod.service failed.
可以确定,因为占用内存过高,系统将 MongoDB 服务杀掉(oom-killer)。
原因探究
不知道为什么 MongoDB 会占用这么高的内存,mongostat 的输出
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0.0 47.7 0 879.0M 509.0M 0|0 0|0 79b 18k 2 2016-07-29T16:40:22+08:00top 中 mongod 的输出显示,mongod 占了近一半的内存。
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3127 mongod 20 0 901024 521996 2516 S 0.3 51.4 4:23.15 mongod
需要增加内存容量。但阿里云虚拟机的内存太贵,可以尝试增加交换空间。
解决办法
阿里云虚拟机默认没有设置交换空间。
# free -h
total used free shared buff/cache available
Mem: 991M 701M 76M 6.9M 213M 134M最简便的方式之一就是创建一个交换文件,将其设为swap。
创建512MB的空白文件
dd if=/dev/zero of=/swapfile1 bs=1024k count=512
设置该文件为交换文件
mkswap /swapfile1
启动交换分区
swapon /swapfile1
在系统启动时自动加载,在/etc/fstab中加入
/swapfile1 swap swap defaults 0 0
这样就加入一个交换分区。
total used free shared buff/cache available Mem: 991M 701M 76M 6.9M 213M 134M Swap: 511M 179M 332M
后续
后面将持续观察 mongod.service 的运行情况,验证该方法是否有效。
2016.11.10:经过3个月实际运行测试,上述方法有效。添加交换分区后,mongod.service 运行正常,不再异常退出。
