Peter Zaitsev: MongoDB Troubleshooting: My Top 5

In this blog post, I’ll discuss my top five go-to tips for MongoDB troubleshooting.

Every DBA has a war chest of go-to solutions for the support issues they run into with a specific technology. MongoDB is no different. Even if you picked it because it's a good fit and it runs well for you, things will change. When they do, whether through a new version of your application or a new version of the database itself, you need a solid starting place.

To help new DBAs, I like to point out my top five things, which cover the bulk of the requests a DBA might need to work on.

Table of Contents

Did any elections happen? Why did they happen?
Is replication lagged, do I have enough oplog?
CurrentOp and killOp Explained
Common greps to use

This section covers some ways to pare down the error log and make it a bit more manageable. The error log holds a slew of information, and without grep it is sometimes challenging to correlate events.

Is an index being built?

As a DBA you will often get a call saying the database has "stopped." The developer might say, "I didn't change anything." Looking at the error log is a great first port of call. Grepping the log for "index" shows whether all index builds finished, whether a new index was built and is still building, or whether an index was removed. This will help catch all of the cases in question.
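When the log is large, it can help to narrow the search to the lines that mark the start and the end of a build. The following is a minimal sketch, not part of the original post: the function names are hypothetical, and the patterns assume the MongoDB 3.2-era message text ("build index on:" and "build index done") seen in this post's log output. Other versions may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post). The patterns assume
# the log wording shown in this article; adjust them for your version.

# Print only the lines that mark the start or the end of an index build.
index_build_events() {
    grep -E 'build index (on:|done)' "$1"
}

# Count builds that started versus builds that finished. A started count
# higher than the finished count suggests a build is still in progress
# (or that the log rotated in the middle of a build).
index_build_counts() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    started=$(grep -c 'build index on:' "$1" || true)
    finished=$(grep -c 'build index done' "$1" || true)
    echo "started=$started finished=$finished"
}

# Usage (assuming the log file is named mongod.log, as in this post):
#   index_build_events mongod.log
#   index_build_counts mongod.log
```

The unfiltered search below shows every index-related line, including the progress counters, for comparison.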

>grep -i index mongod.log
2016-11-11T17:08:53.731+0000 I INDEX [conn458] build index on: samples.col1 properties: { v: 1, key: { friends: 1.0 }, name: "friends_1", ns: "samples.col1" }
2016-11-11T17:08:53.733+0000 I INDEX [conn458] building index using bulk method
2016-11-11T17:08:56.045+0000 I - [conn458] Index Build: 24700/1000000 2%
2016-11-11T17:08:59.004+0000 I - [conn458] Index Build: 61000/1000000 6%
2016-11-11T17:09:02.001+0000 I - [conn458] Index Build: 103200/1000000 10%
2016-11-11T17:09:05.013+0000 I - [conn458] Index Build: 130800/1000000 13%
2016-11-11T17:09:08.013+0000 I - [conn458] Index Build: 160300/1000000 16%
2016-11-11T17:09:11.039+0000 I - [conn458] Index Build: 183100/1000000 18%
2016-11-11T17:09:14.009+0000 I - [conn458] Index Build: 209400/1000000 20%
2016-11-11T17:09:17.007+0000 I - [conn458] Index Build: 239400/1000000 23%
2016-11-11T17:09:20.010+0000 I - [conn458] Index Build: 264100/1000000 26%
2016-11-11T17:09:23.001+0000 I - [conn458] Index Build: 286800/1000000 28%
2016-11-11T17:09:30.783+0000 I - [conn458] Index Build: 298900/1000000 29%
2016-11-11T17:09:33.015+0000 I - [conn458] Index Build: 323900/1000000 32%
2016-11-11T17:09:36.000+0000 I - [conn458] Index Build: 336600/1000000 33%
2016-11-11T17:09:39.000+0000 I - [conn458] Index Build: 397000/1000000 39%
2016-11-11T17:09:42.000+0000 I - [conn458] Index Build: 431900/1000000 43%
2016-11-11T17:09:45.002+0000 I - [conn458] Index Build: 489100/1000000 48%
2016-11-11T17:09:48.003+0000 I - [conn458] Index Build: 551200/1000000 55%
2016-11-11T17:09:51.004+0000 I - [conn458] Index Build: 567700/1000000 56%
2016-11-11T17:09:54.004+0000 I - [conn458] Index Build: 589600/1000000 58%
2016-11-11T17:10:00.929+0000 I - [conn458] Index Build: 597800/1000000 59%
2016-11-11T17:10:03.008+0000 I - [conn458] Index Build: 633100/1000000 63%
2016-11-11T17:10:06.001+0000 I - [conn458] Index Build: 647200/1000000 64%
2016-11-11T17:10:09.008+0000 I - [conn458] Index Build: 660000/1000000 66%
2016-11-11T17:10:12.001+0000 I - [conn458] Index Build: 672300/1000000 67%
2016-11-11T17:10:15.009+0000 I - [conn458] Index Build: 686000/1000000 68%
2016-11-11T17:10:18.001+0000 I - [conn458] Index Build: 706100/1000000 70%
2016-11-11T17:10:21.006+0000 I - [conn458] Index Build: 731400/1000000 73%
2016-11-11T17:10:24.006+0000 I - [conn458] Index Build: 750900/1000000 75%
2016-11-11T17:10:27.000+0000 I - [conn458] Index Build: 773900/1000000 77%
2016-11-11T17:10:30.000+0000 I - [conn458] Index Build: 821800/1000000 82%
2016-11-11T17:10:33.026+0000 I - [conn458] Index Build: 843800/1000000 84%
2016-11-11T17:10:36.008+0000 I - [conn458] Index Build: 874000/1000000 87%
2016-11-11T17:10:43.854+0000 I - [conn458] Index Build: 896600/1000000 89%
2016-11-11T17:10:46.009+0000 I - [conn458] Index Build: 921800/1000000 92%
2016-11-11T17:10:49.000+0000 I - [conn458] Index Build: 941600/1000000 94%
2016-11-11T17:10:52.011+0000 I - [conn458] Index Build: 955700/1000000 95%
2016-11-11T17:10:55.007+0000 I - [conn458] Index Build: 965500/1000000 96%
2016-11-11T17:10:58.046+0000 I - [conn458] Index Build: 985200/1000000 98%
2016-11-11T17:11:01.002+0000 I - [conn458] Index Build: 995000/1000000 99%
2016-11-11T17:11:13.000+0000 I - [conn458] Index: (2/3) BTree Bottom Up Progress: 8216900/8996322 91%
2016-11-11T17:11:14.021+0000 I INDEX [conn458] done building bottom layer, going to commit
2016-11-11T17:11:14.023+0000 I INDEX [conn458] build index done. scanned 1000000 total records. 140 secs
2016-11-11T17:11:14.035+0000 I COMMAND [conn458] command samples.$cmd command: createIndexes { createIndexes: "col1", indexes: [ { ns: "samples.col1", key: { friends: 1.0 }, name: "friends_1" } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:173 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, MMAPV1Journal: { acquireCount: { w: 9996326 }, acquireWaitCount: { w: 1054 }, timeAcquiringMicros: { w: 811319 } }, Database: { acquireCount: { w: 1, W: 1 } }, Collection: { acquireCount: { W: 1 } }, Metadata: { acquireCount: { W: 12 } }, oplog: { acquireCount: { w: 1 } } } 140306ms

What’s happening right now?

Like with the above index example, this helps you remove many of the messages you might not care about or want to block out. MongoDB does have some useful sub-component tags in the logs, such as “ReplicationExecutor” and “connXXX”, that can be helpful, but I find it more helpful to remove the noisy lines than to filter by log facility type. In this example, I opted not to include “| grep -v connection”. Typically I look at the log with connections first to see if they are acting funny, then filter those out to see the core data of what is happening. If you only want to see the long queries and commands, replace “ms” with “connection” in the filter to make them easier to find.

>grep -v conn mongod.log | grep -v auth | grep -vi health | grep -v ms
2016-11-11T14:41:06.376+0000 I REPL [ReplicationExecutor] This node is localhost:28001 in the config
2016-11-11T14:41:06.377+0000 I REPL [ReplicationExecutor] transition to STARTUP2
2016-11-11T14:41:06.379+0000 I REPL [ReplicationExecutor] Member localhost:28003 is now in state STARTUP
2016-11-11T14:41:06.383+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state STARTUP
2016-11-11T14:41:06.385+0000 I STORAGE [FileAllocator] allocating new datafile /Users/dmurphy/Github/dbmurphy/MongoDB32Labs/labs/rs2-1/local.1, filling with zeroes...
2016-11-11T14:41:06.586+0000 I STORAGE [FileAllocator] done allocating datafile /Users/dmurphy/Github/dbmurphy/MongoDB32Labs/labs/rs2-1/local.1, size: 256MB, took 0.196 secs
2016-11-11T14:41:06.610+0000 I REPL [ReplicationExecutor] transition to RECOVERING
2016-11-11T14:41:06.614+0000 I REPL [ReplicationExecutor] transition to SECONDARY
2016-11-11T14:41:08.384+0000 I REPL [ReplicationExecutor] Member localhost:28003 is now in state STARTUP2
2016-11-11T14:41:08.386+0000 I REPL [ReplicationExecutor] Standing for election
2016-11-11T14:41:08.388+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state STARTUP2
2016-11-11T14:41:08.390+0000 I REPL [ReplicationExecutor] not electing self, localhost:28002 would veto with 'I don't think localhost:28001 is electable because the member is not currently a secondary (mask 0x8)'
2016-11-11T14:41:08.391+0000 I REPL [ReplicationExecutor] not electing self, we are not freshest
2016-11-11T14:41:10.387+0000 I REPL [ReplicationExecutor] Standing for election
2016-11-11T14:41:10.389+0000 I REPL [ReplicationExecutor] replSet info electSelf
2016-11-11T14:41:10.393+0000 I REPL [ReplicationExe
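
The chained filter above can also be wrapped in small shell functions so it is easy to rerun. This is a rough sketch rather than anything from the original post: the function names `quiet_log` and `slow_ops` are hypothetical, and `slow_ops` assumes that logged operations end with a duration such as "140306ms", as the createIndexes line earlier in this post does.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post).

# Same exclusions as the chained greps above: the case-sensitive patterns
# are folded into one grep call with several -e options, and the
# case-insensitive "health" filter stays separate.
quiet_log() {
    grep -v -e conn -e auth -e ms "$1" | grep -vi health
}

# Opposite view: keep only the lines that end in a millisecond duration,
# which is how logged queries and commands end in the sample output.
slow_ops() {
    grep -E '[0-9]+ms$' "$1"
}

# Usage (assuming the log file is named mongod.log, as in this post):
#   quiet_log mongod.log
#   slow_ops mongod.log
```

Folding the patterns into one grep call changes nothing about which lines survive; it only saves a few processes, which matters little unless the log is very large.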
