Quantify Cyber Risk Now

header ads

MySQL Query Optimisation for Database Developers | Lucideus Research

Profile Your Workload:
The best way to understand how your server spends its time is to profile the server’s workload. You can profile your workload by using tools like MySQL Enterprise Monitor’s query analyser or the pt-query-digest from the Percona Toolkit. These tools return a table of queries sorted by response time in descending order. By profiling workload, you can expose the most time consuming queries for tuning.Workload-profiling tools group similar queries together, allowing you to see the queries that are slow, as well as the queries that are fast but executed more then 1 time.

CPU, Memory, Disk & Network:
A database server needs four resources – CPU, memory, disk and network. If any of this is overloaded or have poor configuration, then database server is very likely to perform poorly. You should select server with fast CPU & Disk along with good memory (RAM).
Less memory always cause the poor performance for a database server.

Avoid Using Select * Clauses:
Do not use the * because it is very inefficient. The * has to be converted to each column in turn. The SQL parser handles all the field references by obtaining the names of valid columns from the data dictionary and substitutes them on the command line, which is time consuming.

Exists vs. In:
The EXISTS function searches for the presence of a single row that meets the stated criteria, as opposed to the IN statement that looks for all occurrences.

PRODUCT – 1000 rows
ITEMS – 1000 rows

A. SELECT p.product_id FROM products p
WHERE p.item_no IN
(SELECT i.item_no FROM items i);

B. SELECT p.product_id FROM products p
(SELECT ‘1’ FROM items i WHERE i.item_no = p.item_no) ;

For query A, all rows in ITEMS will be read for every row in PRODUCTS. The effect will be 1,000,000 rows read from ITEMS. In the case of query B, a maximum of 1 row from ITEMS will be read for each row of PRODUCTS, thus reducing the processing overhead of the statement.

Not Exists vs. Not In:
In sub query statements such as the following, the NOT IN clause causes an internal sort/ merge:
SELECT * FROM student
WHERE student_num NOT IN
(SELECT student_num FROM class) ;

Instead, use:
SELECT * FROM student c
(SELECT 1 FROM class a WHERE a.student_num = c.student_num) ;

Use Union instead of OR:
In general, always consider the UNION verb instead of OR verb in the WHERE clauses. Using OR on an indexed column causes the optimiser to perform a full table scan rather than an indexed retrieval.

Use Union Instead of Outer Joins:
Rewrite the UNION using a FULL OUTER JOIN with the NVL function: it is suggested that this has faster performance than the UNION operator.
SELECT empno, ename, nvl(dept.deptno,emp.deptno) deptno, dname
FROM emp
ON (emp.deptno = dept.deptno)
ORDER BY empno, ename,deptno,dname ;

Using Indexes to Improve Performance:
Avoid using functions like "UPPER" or "LOWER" on the column that has an index. In case there is no way that the function can be avoided, use Functional Indexes.

Avoid Transformed Columns in the WHERE Clause:
When you need to use SQL functions on filters or join predicates, do not use them on the columns on which you want to have an index, as in the following statement:
varcol = TO_CHAR(numcol)
This prevents MySQL from use index for the column.

Instead, use them on the opposite side of the predicate, as in the following statement:
TO_CHAR(numcol) = varcol

Understand Your Engines:
MySQL has two primary storage engines: MyISAM and InnoDB. Each has its own performance characteristics and considerations. In the broadest sense MyISAM is good for read-heavy data and InnoDB is good for write-heavy data, though there are cases where the opposite is true. The biggest gotcha is how the two differ with respect to the COUNT function.

MyISAM keeps an internal cache of table meta-data like the number of rows. This means that, COUNT(*) incurs no additional cost for a well-structured query. InnoDB, however, has no such cache. For a concrete example, let's say we're trying to paginate a query. If you have a query SELECT * FROM users LIMIT 5,10, let's say, running SELECT COUNT(*) FROM users LIMIT 5,10 is essentially free with MyISAM but takes the same amount of time as the first query with InnoDB.

Post a comment