Basic Debugging
Contents
Designing Queries
Following a few conventions makes finding errors in queries a lot easier, especially when you ask for help from people who might know SQL, but know nothing about your particular schema. A query easy to read is a query easy to debug. Use whitespace to group clauses within the query. Choose good table and field aliases to add clarity, not confusion. Choose the syntax that supports the query's meaning.
Using Whitespace
http://farm1.static.flickr.com/231/495887500_5b1732dfb3_m.jpg
A query hard to read is a query hard to debug. White space is free. New lines and indentation make queries easy to read, particularly when constructing a query inside a scripting language, where variables are interspersed throughout the query.
There is a syntax error in the following. How fast can you find it?
SELECT u.id, u.name, alliance.ally FROM users u JOIN alliance ON (u.id=alliance.userId) JOIN team ON (alliance.teamId=team.teamId WHERE team.teamName='Legionnaires' AND u.online=1 AND ((u.subscription='paid' AND u.paymentStatus='current') OR u.subscription='free') ORDER BY u.name;
Here's the same query, with correct use of whitespace. Can you find the error faster?
SELECT u.id , u.name , alliance.ally FROM users u JOIN alliance ON (u.id = alliance.userId) JOIN team ON (alliance.teamId = team.teamId WHERE team.teamName = 'Legionnaires' AND u.online = 1 AND ( (u.subscription = 'paid' AND u.paymentStatus = 'current') OR u.subscription = 'free' ) ORDER BY u.name;
Even if you don't know SQL, you might still have caught the missing ')' following team.teamId.
The exact formatting style you use isn't so important. You might like commas in the select list to follow expressions, rather than precede them. You might indent with tabs or with spaces. Adherence to some particular form is not important. Legibility is the only goal.
Table and Field Aliases
http://farm1.static.flickr.com/214/458505169_e0b18e0492_m.jpg
Aliases allow you to rename tables and fields for use within a query. This can be handy when the original names are very long, and is required for self joins and certain subqueries. However, poorly chosen aliases can make a query harder to debug, rather than easier. Aliases should reflect the original table name, not an arbitrary string.
Bad:
SELECT * FROM financial_reportQ_1 AS a JOIN sales_renderings AS b ON (a.salesGroup = b.groupId) JOIN sales_agents AS c ON (b.groupId = c.group) WHERE b.totalSales > 10000 AND c.id != a.clientId
As the list of joined tables and the WHERE clause grow, it becomes necessary to repeatedly look back to the top of the query to see to which table any given alias refers.
Better:
SELECT * FROM financial_report_Q_1 AS frq1 JOIN sales_renderings AS sr ON (frq1.salesGroup = sr.groupId) JOIN sales_agents AS sa ON (sr.groupId = sa.group) WHERE sr.totalSales > 10000 AND sa.id != frq1.clientId
Each alias is just a little longer, but the table initials give enough clues that anyone familiar with the database only need see the full table name once, and can generally remember which table goes with which alias while reading the rest of the query.
Placing JOIN conditions
The MySQL manual warns against using the JOIN condition (that is, the ON clause) for restricting rows. Some queries, particularly those using implicit joins, take the opposite extreme - all join conditions are moved to the WHERE clause. In consequence, the table relationships are mixed with the business logic.
Bad:
SELECT * FROM family, relationships WHERE family.personId = relationships.personId AND relationships.relation = 'father'
Without digging through the WHERE clause, it is impossible to say what links the two tables.
Better:
SELECT * FROM family JOIN relationships ON (family.personId = relationships.personId) WHERE relationships.relation = 'father'
The relation between the tables is immediately obvious. The WHERE clause is left to limit rows in the result set.
Compliance with such a restriction negates the use of the comma operator to join tables. It is a small price to pay. Queries should be written using the explicit JOIN keyword anyway, and the two should never be mixed (unless you like rewriting all your queries every time a new version changes operator precedence).
Finding Syntax Errors
Syntax errors are among the easiest problems to solve. MySQL provides an error message showing the exact point where the parser became confused. Check the query, including a few words before the phrase shown in the error message. Most syntax and parsing errors are obvious after a second look, but some are more elusive, especially when the error text seems empty, points to a valid keyword, or seems to error on syntax that appears exactly correct.
Interpreting the Empty Error
Most syntax errors are easy to interpret. The error generally details the exact source of the trouble. A careful look at the query, with the error message in mind, often reveals an obvious mistake, such as mispelled field names, a missing 'AND', or an extra closing parenthesis. Sometimes the error is a little less helpful. A frequent, less-than-helpful message:
ERROR 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' ' at line 1
The empty ' ' can be disheartening. Clearly there is an error, but where? A good place to look is at the end of the query. The ' ' suggests that the parser reached the end of the statement while still expecting some syntax token to appear.
Check for missing closers, such as ' and ):
SELECT * FROM someTable WHERE field = 'value
Look for incomplete clauses, often indicated by an exposed comma:
SELECT * FROM someTable WHERE field = 1 GROUP BY id,
Checking for keywords
MySQL allows table and field names and aliases that are also reserved words. To prevent ambiguity, such names must be enclosed in backticks (`):
SELECT * FROM actionTable WHERE `DELETE` = 1;
If the syntax error is shown near one of your identifiers, check if it appears on the reserved word list. http://dev.mysql.com/doc/refman/5.1/en/reserved-words.html
A text editor with color highlighting for SQL syntax helps to find these errors. When you enter a field name, and it shows up in the same color as the SELECT keyword, you know something is amiss. Some common culprits:
- DESC is a common abbreviation for "description" fields. It means "descending" in a MySQL ORDER clause.
- DATE, TIME, TIMESTAMP are all common field names. They are also field types.
- ORDER appears in sales applications. MySQL uses it to specify sorting for results.
Some keywords are so common that MySQL makes a special allowance to use them unquoted. My advice: don't. If it's a keyword, quote it.
Version specific syntax
As MySQL adds new features, the syntax must change to support them. Most of the time, old syntax will work in newer versions of MySQL. One notable exception is the change in precedence of the comma operator relative to the JOIN keyword in version 5.0. A query that used to work, such as
SELECT * FROM a, b JOIN c ON a.x = c.x;
will now fail.
More common, however, is an attempt to use new syntax in an old version. Web hosting companies are notoriously slow to upgrade MySQL, and you may find yourself using a version several years out of date. The result can be very frustrating when a query that executes flawlessly on your own workstation, running a recent installation, fails completely in your production environment.
This query fails in any version prior to 4.1, when subqueries were added to the MySQL server:
SELECT * FROM someTable WHERE someId IN (SELECT id FROM someLookupTable);
This query fails in some early versions of MySQL, because the JOIN syntax did not originally allow an ON clause:
SELECT * FROM tableA JOIN tableB ON tableA.x = tableB.y;
Always check the installed version of MySQL, and read the section of the manual relevant for that version. The manual usually indicates exactly when particular syntax became available for use.