
Using Percona Toolkit pt-mongodb-query-digest

In this blog post, we’ll look at how to use the pt-mongodb-query-digest tool in Percona Toolkit 3.0.

Percona’s pt-query-digest is one of our most popular Percona Toolkit MySQL tools. It is used on a daily basis by DBAs and developers to help identify the queries consuming the most resources. It helps in finding bottlenecks and optimizing database usage. The pt-mongodb-query-digest is a similar tool for MongoDB.

About the Profiler

Before we start, remember that the MongoDB database profiler is disabled by default, and should be enabled. It can be enabled server-wide, but the full mode that logs all queries is not recommended in production unless you are using Percona Server for MongoDB 3.2 or higher. We added a feature there to allow sampling of non-slow queries at a configurable rate (like in MySQL) to limit the overhead this causes.
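If full profiling is too heavy for your environment, a lighter option is profiling level 1, which only records operations slower than a given threshold. A minimal sketch, reusing the samples database and the localhost:17001 instance from the examples below, with an assumed 100ms threshold:

# Record only operations slower than 100ms (profiling level 1)
echo "db.setProfilingLevel(1, 100);" | mongo localhost:17001/samples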

Additionally, by default, the profiler collection is only 1MB per database. You may want to drop and re-create it with a size sufficient to make the results useful. To do this, use:

org_prof_level = db.getProfilingLevel();
// Disable the profiler
db.setProfilingLevel(0);
db.system.profile.drop();
// Set up a 100MB profiler: 1*Math.pow(1024,2) == 1MB
profiler_size = 100 * Math.pow(1024,2);
db.runCommand( { create: "system.profile", capped: true, size: profiler_size } );
db.setProfilingLevel(org_prof_level);
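To confirm the collection was recreated with the new size, you can check the capped collection’s stats. A quick sketch against the samples database on localhost:17001; the maxSize field is what MongoDB reports for capped collections, though the exact stats output can vary between versions:

# Verify system.profile is capped at the new 100MB size
echo "db.system.profile.stats().maxSize;" | mongo localhost:17001/samples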

According to the documentation, to check if the profiler is enabled for the samples database, run:

echo "db.getProfilingStatus();" | mongolocalhost:17001/samples`

Remember, you need to connect to a MongoDB instance, not a mongos. The output will be something like this:

MongoDB shell version: 3.2.12
connecting to: localhost:17001/samples
{ "was" : 0, "slowms" : 100 }
bye

The value for the field “was” is 0, which means profiling is disabled. Let’s enable the profiler for the samples database.

You must enable the profiler on all MongoDB instances that could be related to a shard of our database. To check on which instances we should enable the profiler, I am going to use the pt-mongodb-summary tool. It shows us the information we need about our cluster:

./pt-mongodb-summary

# Instances ##############################################################################################
  PID    Host               Type                 ReplSet   Engine
11037    localhost:17001    SHARDSVR/PRIMARY     r1        wiredTiger
11065    localhost:17002    SHARDSVR/SECONDARY   r1        wiredTiger
11136    localhost:17003    SHARDSVR/SECONDARY   r1        wiredTiger
11256    localhost:17004    SHARDSVR/ARBITER     r1        wiredTiger
11291    localhost:18001    SHARDSVR/PRIMARY     r2        wiredTiger
11362    localhost:18002    SHARDSVR/SECONDARY   r2        wiredTiger
11435    localhost:18003    SHARDSVR/SECONDARY   r2        wiredTiger
11513    localhost:18004    SHARDSVR/ARBITER     r2        wiredTiger
11548    localhost:19001    CONFIGSVR            -         wiredTiger
11571    localhost:19002    CONFIGSVR            -         wiredTiger
11592    localhost:19003    CONFIGSVR            -         wiredTiger

We have mongod services running on localhost on ports 17001~17003 and 18001~18003.

Now, let’s enable the profiler for the samples database on those instances. For this example, I am going to set the profile level to “2”, to collect information about all queries.

for port in 17001 17002 17003 18001 18002 18003; do
  echo "db.setProfilingLevel(2);" | mongo localhost:${port}/samples;
done

Running pt-mongodb-query-digest

Now we are ready to get statistics about our queries. To run pt-mongodb-query-digest, we need to specify at least “host:port/database”, like:

./pt-mongodb-query-digest localhost:27017/samples

The output will be something like this (I am showing a section for only one query):

# Query 0:  0.27 QPS, ID 2c0e2f94937d6660f510adeea98618f3
# Ratio    1.00  (docs scanned/returned)
# Time range: 2017-02-22 12:27:21.004 -0300 ART to 2017-02-22 12:28:00.867 -0300 ART
# Attribute         pct    total     min       max       avg       95%      stddev    median
# ============================================================================
# Count (docs)             845
# Exec Time ms       99    1206      0         697       1         0        29        0
# Docs Scanned        7    594.00    0.00      75.00     0.70      0.00     7.19      0.00
# Docs Returned       7    594.00    0.00      75.00     0.70      0.00     7.19      0.00
# Bytes recv          0    8.60M     215.00    1.06M     10.17K    215.00   101.86K   215.00
# String:
# Namespaces         samples.col1
# Operation          query
# Fingerprint        user_id
# Query              {"user_id":{"$gte":3506196834,"$lt":3206379780}}

From the output, we can see that this query was seen 97 times, and the tool provides statistics for the number of documents scanned and returned by the server, the execution time, and the size of the results. It also provides information regarding the operation type, the fingerprint and a query example to help identify the source.
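If you want to see the raw documents behind a given fingerprint, remember that the profiler writes to the ordinary system.profile collection in each database, so you can query it directly. A minimal sketch, assuming the samples.col1 namespace from the digest above and the shard primary on localhost:17001:

# Fetch the most recent raw profile document for the namespace reported above
echo 'db.system.profile.find({ ns: "samples.col1" }).sort({ ts: -1 }).limit(1).pretty();' | mongo localhost:17001/samples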

By default, the results are sorted by query count. It can be changed by setting the --order-by parameter to: count, ratio, query-time, docs-scanned or docs-returned.

A “-” in front of the field name denotes the reverse order. Example:

--order-by=-ratio

When considering which ordering to use, you need to know whether you are looking for the most common queries (-count), the most cache-abusive queries (-docs-scanned), or the worst ratio of documents scanned to returned (-ratio). You may be tempted to use -query-time; however, you will find this almost always surfaces queries that are affected by issues rather than causing them.
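For example, to focus on the queries scanning the most documents, you could re-run the earlier command with the ordering reversed on docs-scanned (a sketch, assuming the same localhost:27017/samples endpoint as above):

./pt-mongodb-query-digest --order-by=-docs-scanned localhost:27017/samples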

Conclusion

This is a new tool in the Percona Toolkit. We hope in the future we can make it grow like its big brother for MySQL (pt-query-digest). This tool helps DBAs and developers identify and solve bottlenecks, and keep servers running at top performance.

