I was asked the following question recently: “Does the use of table aliases affect performance?” To which the best answer is probably “Yes, though in general you probably won’t notice the difference and there are reasons more important than performance for using table aliases.”
Consider the following two variants of a simple SQL statement:
select s.name, c.title from du_course c, du_student s where c.start_year = 2009 and s.student_id = c.student_id ; -- -- select du_student.name, du_course.title from du_course, du_student where du_course.start_year = 2009 and du_student.student_id = du_course.student_id ;
The first thing to note is that we need to qualify at least some column names in this statement because of the appearance in two different tables of the student_id. The problem of several tables using the same column name is one of the really important reasons for qualifying columns names in all the SQL you write – if you don’t make a habit of doing it you will eventually walk into the “column capture” trap where (for example) a table in a subquery includes a column from a table in the main query and the optimizer interprets that column name in a way that you weren’t expecting.
But if you qualify the “at risk” column names does it matter if you don’t qualify the rest? When analysing any SQL statement one of the first things that Oracle has to do is check whether the list of columns referenced is legal, so it has to say: “is columnX in tableA, is columnX in tableB” and so on. If you don’t qualify columns every column has to be checked against every table – if you qualify them then each column only has to be checked against the table identified by the qualifier. In most cases, therefore, you will save a little CPU by qualifying columns … and that’s the only performance benefit you’re likely to get from qualifying column names. It’s highly likely, though, that the amount of CPU saved will be tiny compared to the rest of the work caused by the statement – so if you were only concerned about performance you probably wouldn’t bother.
But there’s another benefit to qualifying all column names – not just the “at risk” ones. It makes the SQL easier for the next person to understand and debug. In just the same way that the absence of qualifiers slows down the optimizer (slightly) it can slow down the human reader (significantly). Imagine you’re trying to debug a piece of SQL that joins seven tables, and includes references to 45 column names – how much effort is it going to take you to interpret a predicate like: “and cancellation_date > delivery_date” and work out whether or not it’s a predicate that suggests a change to an index definition, or prompts the addition of a hint to the SQL? You might even notice that there was only one column from a particular table in the select list and consider adding that column to an index used for a join. Wouldn’t things be much easier if you were immediately aware of which table each column belonged to because its name was qualified ?
So do qualify your column names – it helps to make the code safer, and easier to read, even though you probably won’t notice any performance benefit. The only significant question really is: “What’s the best strategy for qualifying column names ?”
I’ve shown you two options – neither of which I like. A single letter alias will use marginally less CPU than the full table name – but again it’s a saving that’s not worth having, particularly since a single letter alias doesn’t really help you remember table names when the statement starts to get complex. On the other hand, using the full table name can make the SQL very hard to read – especially if your standards include the schema name, and put everything into capitals.
(Worst case scenario – a site where every table_name included part of the schema name, and every column name included part of the table_name, and all the code was in capitals. Roughly 80% of any SQL statements was repetitions of schema and table_name; with fully-qualifed columns looking like: DW_STAGING.DW_STG_TRANSACTIONS.DW_STG_TRANS_TRANSACTION_ID).
My guideline – give every table in the query a three (or four – but be consistent) letter alias that hints at the full table-name. If you use the same table more than once in the same query append a number to the alias in every case. So table orders might be ord, and order_lines becomes orl. The idea is to have just enough text in the qualifier to remind your readers of the source table name, but not so much text that you swamp the statement with the pure volume of qualifiers.
There’s not a lot of change in my tiny sample when following the guideline, but this is the sort of thing I’d choose to write:
select stu.name, crs.title from du_course crs, du_student stu where crs.start_year = 2009 and stu.student_id = crs.student_id