Oracle Scratchpad

January 26, 2010

Aliases

Filed under: Infrastructure — Jonathan Lewis @ 6:45 pm BST Jan 26,2010

I was asked the following question recently: “Does the use of table aliases affect performance?” To which the best answer is probably “Yes, though in general you probably won’t notice the difference and there are reasons more important than performance for using table aliases.”

Consider the following two variants of a simple SQL statement:

select
	s.name,
	c.title
from
	du_course    c,
	du_student   s
where
	c.start_year = 2009
and	s.student_id = c.student_id
;
--
--
select
	du_student.name,
	du_course.title
from
	du_course,
	du_student
where
	du_course.start_year = 2009
and	du_student.student_id = du_course.student_id
;

The first thing to note is that we need to qualify at least some column names in this statement because of the appearance in two different tables of the student_id. The problem of several tables using the same column name is one of the really important reasons for qualifying columns names in all the SQL you write – if you don’t make a habit of doing it you will eventually walk into the “column capture” trap where (for example) a table in a subquery includes a column from a table in the main query and the optimizer interprets that column name in a way that you weren’t expecting.

But if you qualify the “at risk” column names does it matter if you don’t qualify the rest? When analysing any SQL statement one of the first things that Oracle has to do is check whether the list of columns referenced is legal, so it has to say: “is columnX in tableA, is columnX in tableB” and so on. If you don’t qualify columns every column has to be checked against every table – if you qualify them then each column only has to be checked against the table identified by the qualifier. In most cases, therefore, you will save a little CPU by qualifying columns … and that’s the only performance benefit you’re likely to get from qualifying column names. It’s highly likely, though, that the amount of CPU saved will be tiny compared to the rest of the work caused by the statement – so if you were only concerned about performance you probably wouldn’t bother.

But there’s another benefit to qualifying all column names – not just the “at risk” ones. It makes the SQL easier for the next person to understand and debug. In just the same way that the absence of qualifiers slows down the optimizer (slightly) it can slow down the human reader (significantly). Imagine you’re trying to debug a piece of SQL that joins seven tables, and includes references to 45 column names – how much effort is it going to take you to interpret a predicate like: “and cancellation_date > delivery_date” and work out whether or not it’s a predicate that suggests a change to an index definition, or prompts the addition of a hint to the SQL? You might even notice that there was only one column from a particular table in the select list and consider adding that column to an index used for a join. Wouldn’t things be much easier if you were immediately aware of which table each column belonged to because its name was qualified ?

So do qualify your column names – it helps to make the code safer, and easier to read, even though you probably won’t notice any performance benefit. The only significant question really is: “What’s the best strategy for qualifying column names ?”

I’ve shown you two options - neither of which I like. A single letter alias will use marginally less CPU than the full table name – but again it’s a saving that’s not worth having, particularly since a single letter alias doesn’t really help you remember table names when the statement starts to get complex. On the other hand, using the full table name can make the SQL very hard to read – especially if your standards include the schema name, and put everything into capitals. 

(Worst case scenario – a site where every table_name included part of the schema name, and every column name included part of the table_name, and all the code was in capitals.  Roughly 80% of any SQL statements was repetitions of schema and table_name; with fully-qualifed columns looking like: DW_STAGING.DW_STG_TRANSACTIONS.DW_STG_TRANS_TRANSACTION_ID).

My guideline – give every table in the query a three (or four – but be consistent) letter alias that hints at the full table-name. If you use the same table more than once in the same query append a number to the alias in every case. So table orders might be ord, and order_lines becomes orl. The idea is to have just enough text in the qualifier to remind your readers of the source table name, but not so much text that you swamp the statement with the pure volume of qualifiers.

There’s not a lot of change in my tiny sample when following the guideline, but this is the sort of thing I’d choose to write:


select
	stu.name,
	crs.title
from
	du_course    crs,
	du_student   stu
where
	crs.start_year = 2009
and	stu.student_id = crs.student_id

21 Comments »

  1. I agree with everything you say.

    For Alias naming I like to use the first letter of each word in the table name.

    orders o
    order_lines ol
    du_course dc
    du_student ds

    It is an easier system to standardize on.

    Comment by Sean Molloy — January 26, 2010 @ 7:17 pm BST Jan 26,2010 | Reply

  2. Sean: One nice thing about standardizing the size is the SQL reads a little easier, like

    AND abc.col1 = 2009
    AND def.col2 = 'Banana'
    AND ghi.col3 = 'Blah'
    

    scans better than

    AND abc.col1 = 2009
    AND d.col2   = 'Banana'
    AND ef.col3  = 'Blah'
    

    Of course, it’s all down to taste. :)

    Cheers

    Tim…

    Comment by Tim Hall — January 26, 2010 @ 7:54 pm BST Jan 26,2010 | Reply

  3. At least it would in a monospaced font. :)

    Comment by Tim Hall — January 26, 2010 @ 7:55 pm BST Jan 26,2010 | Reply

    • Fixed – using the WordPress “sourcecode” tag (see “How to Format”) in the right-hand panel.

      I’m a big fan of vertical alignment as well. Another of my guidelines is to align things, for example operators in the WHERE clause (not necessarily every single one, but possibly in groups) so that the text has a more block-like appearance rather than straggling, wavy lines. It’s just easier for the eye to scan.

      Comment by Jonathan Lewis — January 27, 2010 @ 6:43 pm BST Jan 27,2010 | Reply

  4. But avoid the use of just a, b, c…. especially if you call an in-line view ‘A’ and then use another ‘A’ inside the view referring to another object… life’s too short to unravel other people’s code.

    Comment by Pete Scott — January 26, 2010 @ 9:15 pm BST Jan 26,2010 | Reply

  5. The standard on the application I work on has the primary key of a table be mainly some abbreviation of the table name, and the alias for a table in a query be the meaningful part of the primary key name. So if we have a table called “ORDERS”, with a column called ORDER_KEY and a primary key of “PK_ORD”, orders should always be aliased as “select ord.order_key from orders ord”. While this can’t be enforced by the database, it does tend to lead to standardized aliases, which further improves readability over just using any old aliases. It also tends to reduce the impulse for developers to just alias using “a,b,c” etc

    Comment by Joseph Charpak — January 26, 2010 @ 9:35 pm BST Jan 26,2010 | Reply

    • By primary key of “PK_ORD” I assume you mean that’s the name of the constraint. That’s another reason I like the three (or 4) letter alias – I like to give every table a fixed short alias and then include that alias in constraint names.

      My guideline would be ORD_PK for the primary key constraint though – start with the alias; this then leads to ORL_FK_ORD as “the foreign key on order_lines that references orders”.

      Comment by Jonathan Lewis — January 27, 2010 @ 6:52 pm BST Jan 27,2010 | Reply

  6. Hi Jonathan,

    I’m curious about your mentioning of ‘column capture trap’. Is this a known Oracle behaviour? I raised an SR on this a few weeks ago, and didn’t suspect it being ‘normal’.

    Comment by Andy — January 27, 2010 @ 10:30 am BST Jan 27,2010 | Reply

    • First it’s important to check that we mean the same thing by column capture. An example:

      select  ...
      from    tab1
      where   columnX = (
                      select  max(column1)
                      from    tab2
                      where   columnY = 99
              )
      ;
      

      If columnY does not exist in tab2, then the optimizer will move outwards to check if it exists in tab1.

      I’ve seen complaints from time to time along the lines of “Oracle didn’t reject this query, even though columnY doesn’t exist in tab2.

      But it’s the correct, and expected, behaviour.

      Comment by Jonathan Lewis — January 27, 2010 @ 6:55 pm BST Jan 27,2010 | Reply

      • Thanks for clarifying. The one I found is almost similar:

        select ... 
        from tab1
        where exists (
                  select c1 
                  from tab2
                  );
        

        In this case, c1 does not belong to tab2, but exists in tab1. The query runs without error.

        Comment by Andy — January 28, 2010 @ 4:52 am BST Jan 28,2010 | Reply

  7. And never ever use x or y as an alias !!

    (ORA-00904 is not raised when typo, instead ORA-06553: PLS-306):
    
    sql > select z.abc from dual z;
    select z.abc from dual z
           *
    ERROR at line 1:
    ORA-00904: "Z"."ABC": invalid identifier
    
    
    sql > REM expected
    sql > select w.abc from dual w;
    select w.abc from dual w
           *
    ERROR at line 1:
    ORA-00904: "W"."ABC": invalid identifier
    
    
    sql > REM also expected
    sql > select x.abc from dual x;
    select x.abc from dual x
           *
    ERROR at line 1:
    ORA-06553: PLS-306: wrong number or types of arguments in call to 'OGC_X'
    
    
    sql > REM oops ?
    sql > select y.abc from dual y;
    select y.abc from dual y
           *
    ERROR at line 1:
    ORA-06553: PLS-306: wrong number or types of arguments in call to 'OGC_Y'
    
    
    sql > REM oops ?
    sql >
    

    (As far as I remember, there is some metalink note on this, it says “expected behaviour”)

    Comment by Sokrates — January 27, 2010 @ 2:36 pm BST Jan 27,2010 | Reply

    • That’s right – it’s a variant of the “column capture” problem.

      Since x.abc isn’t {alias.column} Oracle tries to interpret it as a call to a function or operator.

      Unfortunately, if you’ve installed Spatial, there are two functions (called ogc_x and ogc_y) which has been given public synonyms X and Y respectively that move Oracle into the function-call validity checks – which is where the pls error comes from.

      There are a couple more public synonyms which might cause the same (or similar) errors – for example:

      select path.abc from dual;
      ERROR at line 1:
      ORA-29900: operator binding does not exist
      ORA-06553: PLS-306: wrong number or types of arguments in call to 'PATH'
      

      (PATH is a public synonym for an operator owned by XDB called PATH – and since it’s an operator the leading ORA-29900)

      You have to be a little unlucky to get these unexpected errors, though. You’ve got to use an alias that happens to be a synonym for a function or operator on which you have execute privileges, and then you have to reference a column that doesn’t exist.

      Comment by Jonathan Lewis — January 27, 2010 @ 7:04 pm BST Jan 27,2010 | Reply

  8. In most temporary situations, I like to use “a,b,c” as alias.

    Comment by jametong — January 27, 2010 @ 3:07 pm BST Jan 27,2010 | Reply

    • I’m relly hoping the comment by jametong to use “a,b,c” is ironic as I wince when I see the use of these as alisas in queries as the aliases are meaningless – you have to keep scanning back to the FROM list with it’s tables and aliases to make sense of the rest of the query.
      I was forced, many many years back, to use “four” letter aliases. A single-word table, you take four letters, so table PERSON would have the alias PERS. For two-word table names it was two letters from each. PERSON_NAME became PENA. For three-word and above tsbles it was one letter per word. So PERSON_NAME_HISTORY was PNH, PATIENT_CLINCAL_NOTES was PCN and PATIENT_PRIMARY_CONTACT_DETAILS would be PPCD. Within a couple of weeks of someone new starting on the project they were using the aliases to refer to the tables. Naming key was simple. The PERSON_ADDRESS table had a primary key called PEAD_PK. Consultant to Consultant Speciality foreign key was CONS_COSP_FK, etc, etc, etc. You no longer had to think about what to call constraints, sequences, etc, it all flowed from the table alias names.
      Does anyone still use this sort of thing? It works for me but seems rare on customer sites.

      Comment by mwidlake — January 28, 2010 @ 12:17 am BST Jan 28,2010 | Reply

  9. On one project I worked on when you created a new table you also were required to add a comment on the table with name to be used as the alias, something like “COMMENT ‘ALIAS=ORD”. This enabled all developers to look up the alias names through the data dictionary. I always thought it would be nice to have an additional attribute on a table that was specificly for the alias.

    Comment by JohnK — January 30, 2010 @ 11:06 am BST Jan 30,2010 | Reply

  10. [...] so looking at the Predicate Information section of a DBMS_XPLAN output could be confusing.  A recent article on another blog offers suggestions for improving the readability of table [...]

    Pingback by Slow Query with an Interesting Execution Plan « Charles Hooper's Oracle Notes — February 8, 2010 @ 6:03 am BST Feb 8,2010 | Reply

  11. [...] 7-How to use column aliases with SQL queries? (comments) Jonathan Lewis-Aliases [...]

    Pingback by Blogroll Report 22/01/2009 – 29/01/2010 « Coskan’s Approach to Oracle — February 21, 2010 @ 3:25 am BST Feb 21,2010 | Reply

  12. [...] Lewis shares an explication of aliases: “I was asked the following question recently: ‘Does the use of table aliases affect [...]

    Pingback by Log Buffer #180: a Carnival of the Vanities for DBAs | The Pythian Blog — February 26, 2010 @ 6:04 pm BST Feb 26,2010 | Reply

  13. [...] the code so hard to read, especially when it’s all in upper case. It’s important to use aliases, of course, but 3 or 4 letters is a sensible [...]

    Pingback by Index Join – 3 « Oracle Scratchpad — December 7, 2010 @ 6:02 pm BST Dec 7,2010 | Reply

  14. [...] http://jonathanlewis.wordpress.com/2010/01/26/aliases/ Unfortunately, if you’ve installed Spatial, there are two functions (called ogc_x and ogc_y) which has been given public synonyms X and Y respectively that move Oracle into the function-call validity checks – which is where the pls error comes from. [...]

    Pingback by ORA-06553: PLS-306: wrong number or types of arguments in call to ‘OGC_Y’ while exporting table « Anand's Blog — June 19, 2012 @ 6:22 pm BST Jun 19,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,529 other followers