MyProfi 0.18 Beta

A command line tool that parses mysql query log and outputs statistics.
MyProfi project is a command line tool that parses mysql query log and outputs statistics of most frequently used queries, sorting them by number of times they appear in a log file. The similar queries are groupped together.

Thus, select id from user where login="admin" and select id from user where login='root' become select id from user where login={} - the variable parts of queries are replaced with {} markers, this make statistics analysis as easy as revewing the top of most frequently occured queries' patterns.

When developing an application one can hardly predict, which of the sql queries will run the most often, and which less often. The most popular queries are to be the first candidates for optimisation, fine tuning and polishing. By optimizing the most frequenly run queries the developer may achieve significant performance gain for overall system. By reviewing the statistics of MyProfi output it's becomes too easy to extract those queries to start optimizing db by adding indexes, reconsidering database structure, etc.

Main features:

  • Removing unsignificant variable data:
  • Usually applications run similar queries against database, but using different variable parameters in them. For example, fetch some data by its ID number in a table. What we want is to know, how frequently this query was executed at all. By removing the variable data from all queries we can then group them and count how many times they appear in the log file. Thus, from two queries SELECT NAME FROM USER WHERE ID=223 and SELECT NAME FROM USER WHERE ID=223 we will get one pattern SELECT NAME FROM USER WHERE ID={} two times.
  • As there may be variable data of different types we also wish to remove quoted strings (considering all possible escaping techniques), multiline and single line comments, unsignificant whitespaces, siquential variables like ID IN (1 ,2 ,3), etc.
  • Queries ordered by popularity:
  • That's exactly the reason, why this utility has been developed - to know, which queries are the most frequently run on database server. The more popular the query is, the more performance may be gained by optimizing it.
  • Additional statistics:
  • For slow query formats there are available additional options to use for sorting by a different then number of queries criteria. As slow query log provides the information for each query about the time it was running, the time it was locked, the number of rows totally examined and finally returned. MyProfi allowes to sort the statistics by any of these field, and even by maximum, total or average numbers for each pattern.
  • Statistics by query type:
  • Query type statistics is sometimes useful for appropriate database server configuration and tuning. For example, if UPDATE queries are the most frequently run against database, a developer may consider restructurizing tables to reduce the use of indexes, as too many indexes in tables make its update slower. Or otherwise, if SELECT queries are popular, it may be a reason to allocate more memory buffer space, etc.
  • Different input formas support:
  • MyProfi understands several input file formats. Beside general query log format, slow query log format is also supported.
  • As of version 5.1.x of MySQL server, both the general and slow query logs may be configured to be writen to mysql system table. By setting an appropriate option in MyProfi or providing input file with .csv extension we can tell the parser to treat input files as csv formatted.
  • Moreover, for slow query log you can output an additional statistics.
  • Stream parsing:
  • No matter how huge your input file is, the parser will never load whole content into memory, but reads the file chunk by chunk (though, you might be going to wait for "some time" while this huge file is being processed).
  • Custom queries filtering:
  • You can choose to output statistics only for certain type of queries: SELECT, UPDATE, DELETE, etc. This may be useful, for instance, when you are going to check the most popular queries execution plan using EXPLAIN. In this case non-SELECT queries will be just useless.
  • Support for top N queries:
  • Output only first N patters in sorted statistics.

last updated on:
April 21st, 2009, 19:34 GMT
developed by:
license type:
GPL (GNU General Public License) 
ROOT \ Database \ Database APIs
Download Button

In a hurry? Add it to your Download Basket!

user rating



Rate it!
What's New in This Release:
  • Added: HTML output option
read full changelog

Add your review!