Oracle Scratchpad

February 4, 2010

SQL Server

Filed under: CBO,Execution plans,Hints,Infrastructure,Oracle,SQL Server,Statistics,Wishlist — Jonathan Lewis @ 7:07 pm BST Feb 4,2010

A few days ago I did a presentation on SQL Server. This probably sounds a little strange given my status as an Oracle specialist – but the nice people at Microsoft asked me if I would contribute to one of their seminars so I downloaded and installed the 180 day free trial copy of the Enterprise version, then downloaded the “Books Online” manuals and started to play.

It was an interesting experience – and I think the audience (and organisers) found my presentation useful. The title was “What the Enterprise needs in an RDBMS” – and that’s something I do know about – and the presentation was about whether or not you could find everything you needed in SQL Server 2008, where you’d have to look in the manuals, and supplementary questions you’d have to ask.

The biggest problem I found was searching the manuals for things I knew would be there. For example, SQL Server 2008 does read-consistency so it has to generate the equivalent of undo – but could I find that as a keyword in the manuals ? (The correct search term is “version store”.)

Anyway – while working through SQL Server 2008 I came across a few really cute things that I’d like to see in Oracle, so I thought I’d list them here:

Filtered indexes – I know it’s something I can do with function-based indexes and decode() or case, but in SQL Server 2008 you can create an index with a predicate, e.g.  create an  index which references only the rows where status = ‘A’. Which means you can created an exquisitely tailored, minimalist index without changing the supplied code.

Filtered statistics – even if I don’t want to create an index there may be some very popular predicates that frequently appear along with a number of other predicates – rather than creating lots of “extended statistics” (11g – and limited to equality predicates anyway) it would be nice to be able to create statistics for just those rows where, say, status = ‘A’. This could produce a lot of extra stats entries for columns, of course, but perhaps these could be limited to the columns which appeared in sys.col_usage$.

“Included Columns”  - you can add non-key columns indexes to allow the optimizer to resolve queries within the index. (If you have enough columns to do this, the index is often called a “covering” index. ) Of course, you can do this with Oracle – but there’s an important difference. In Oracle the extra columns are treated as part of the key and this is likely to affect the clustering_factor of the index, possibly making the index seem undesirable for other queries where the included columns were irrelevant or insufficient.

“Optimize for” hint – there is a hint that overrides bind peeking (or parameter sniffing, as I think SQL Server calls it). You can say option (optimize for (@bind1 = ‘LONDON’ @bind2 unknown)) to tell the optimizer to optimize as if bind variable (parameter) bind1 was set to ‘LONDON’ and bind2 was not peekable. What a great way to deal with the problems of instability caused by bind variable peeking.

sp_create_plan_guide – I know you can now create profiles by hand (and could hack stored outlines long ago) in Oracle, but in SQL Server 2008 it’s easy, definitely legal, and approved of. Update: I had forgotten that the Optimizer Development Group had published a note about SQL Plan Management in 11g showing how you could create an SQL Plan  Baseline for one piece of SQL text by pulling the execution plan of another piece of SQL text from memory.

date_correlation_optimization – there’s an option to optimize for correlation between date columns in different tables. The tables have to be joined by a single column foreign key – and there are some other restrictions as well, but you can appreciate that there are some cases where assuming correlation between dates (joining orders to deliveries, for example) is a very good idea. It could be quite hard to implement this one, and could be quite costly to collect the statistics – but it’s a really nice idea. Update: I’ve had a look at how this is done in SQL Server and it’s not nice: essentially it uses something that an Oracle user would recognise as an ‘on commit refresh’ materialized view; the overheads are likely to be significant and there will be concurrency issues in busy systems.

I haven’t dumped my SQL Server virtual machine yet and I’ve already been asked by two different people if I’d be prepared to repeat the presentation, so there may be more wish-list items to come.

22 Comments »

  1. I really enjoyed this post. It’s interesting to see how other database vendors tackle the same problems (or don’t).

    Comment by Steve C — February 4, 2010 @ 9:28 pm BST Feb 4,2010 | Reply

  2. I’ve just added another item to the list: “included columns”.

    Comment by Jonathan Lewis — February 4, 2010 @ 10:09 pm BST Feb 4,2010 | Reply

  3. Now that you have reviewed SQL Server 2008, under what conditions if any would you recommend a client use it instead of Oracle?

    Comment by Philip Papadopoulos — February 4, 2010 @ 10:28 pm BST Feb 4,2010 | Reply

    • Philip,

      My experience with Oracle is something over 21 years; my experience with SQL Server is closer to 21 hours. I wouldn’t dream of advising anyone about SQL Server based on that little experience.

      I would tell them whether Oracle could do the job for them; I might tell them of possible threats (if I could think of any) that SQL Server might not be able to handle. I’d probably suggest that their choice should be biased by current levels of in-house experience.

      As far as choosing SQL Server though, I’d tell them to call in a SQL Server expert to give an opinion, but give them a list of questions that they need to ask.

      Comment by Jonathan Lewis — February 6, 2010 @ 6:01 pm BST Feb 6,2010 | Reply

  4. A couple of things I’ve always envied in SQL Server:
    1. A TOP keyword (as in SELECT TOP 10 (or TOP 10 PERCENT) …)
    2. A SQL “unloader”

    I really appreciated how SQL Server’s tools are designed to work with any OLE DB connection. So you can use Integration Services to do ETL to/from Oracle if you want. Or how Reporting Services is included, not an extra charge.

    I used to teach a seminar for Microsoft on how to work with Oracle databases using SQL Server Tools.

    But I just wish Microsofties would stop using the term “SQL” to refer to Microsoft SQL Server (the product) instead of SQL (the language). Like there is no other!

    Comment by Bob Watkins — February 4, 2010 @ 11:37 pm BST Feb 4,2010 | Reply

    • Yes,

      Top N was very useful. You can emulate it in Oracle, of course, but the code is a little clumsy by comparison. Naturally it prompted a couple of questions about how clever SQL Server could be about using Top N with an order by clause.

      I used it a lot to start with – I did most of my experimentation using sqlcmd and couldn’t find the equivalent of the “describe” command so kept on doing “select top 1 * from … “.

      I could really do with a version of Tom Kyte’s “print_table()” procedure for T-SQL.

      Comment by Jonathan Lewis — February 6, 2010 @ 6:03 pm BST Feb 6,2010 | Reply

  5. [...] now here is Jonathan Lewis with a post on SQL Server. Yes, Jonathan Lewis, famous Oracle guy. “A few days ago,” Jonathan writes, “I [...]

    Pingback by Log Buffer #177: a Carnival of the Vanities for DBAs | The Pythian Blog — February 5, 2010 @ 6:24 pm BST Feb 5,2010 | Reply

  6. The “Filtered Indexes”, “Filtered Statistics” and “Optimize for” seem potentially very powerful advantages of SQLServer over Oracle !
    And you studied SQLServer well enough to identify these major differences in only a few days ? That’s really skillful of you !
    Do you have test cases to compare the advantages of these features ? (Not that I have SQLServer !)

    Hemant

    Comment by Hemant K Chitale — February 6, 2010 @ 3:53 pm BST Feb 6,2010 | Reply

    • Hemant,

      Remember that I was looking at SQL Server from the position of expertise in Oracle and database principles in general. So I had a list of things that I knew ought to exist, and then some of the other bits were “accidental” finds around the edges of my searches.

      For example: I knew that there would have to be a mechanism for hinting, so I did a search in the manuals for “hints”; this gave me several links for hints, including a page listing the available hints – and one of those was “optimize for()”. Following the link for that one I got an example of usage.

      No test cases – the three examples you’ve cited are all cases where my knowledge of the Oracle allows me to work out how effective the feature could be in the right circumstances.

      Comment by Jonathan Lewis — February 6, 2010 @ 6:11 pm BST Feb 6,2010 | Reply

  7. “couldn’t find the equivalent of the “describe” command”

    The closest equivalent would be the sp_help procedure, but it gives a lot more information than just column names and data type. Often select top 1 * is quicker because it doesn’t overload you with informaiton.

    Comment by Shannon Severance — February 9, 2010 @ 7:22 pm BST Feb 9,2010 | Reply

  8. Noticed an obscure thing that MS can do but 11g cannot, at least according to Monty: http://askmonty.org/wiki/index.php/Manual:Table_Elimination (near the bottom).

    Comment by joel garry — February 12, 2010 @ 12:51 am BST Feb 12,2010 | Reply

    • Joel,

      Oracle has been able to do table elimination (called join elimination in Oracle) since 10gR2 provided the join is to a single column primary key; but I’m sure there are lots of little details (like the one posted in AskMonty) where one of the optimizer’s will get smarter than the other.

      His example (where the id column in both cases is a single column primary key) that doesn’t behave in Oracle:

      select  A.colA
      from
              tableA A
      left outer join
              tableB B
      on      B.id = A.id
      and     B.fromDate = (
                      select
                              max(sub.fromDate)
                      from    tableB sub
                      where   sub.id = A.id
               )
      ;
      

      It would be interesting to know whether SQL Server recognises all subsequent predicates on B as redundant, or whether it folds the subquery first to produce a tautology that can be eliminated. (I’d guess the former.)

      Oracle is probably hampered by the way in which it’s first transformation step is to translate the second table in the left outer join into a lateral view – and by the time it’s done that the subquery is stuck.

      Comment by Jonathan Lewis — February 12, 2010 @ 9:06 am BST Feb 12,2010 | Reply

  9. Hi Jonathan,

    I just saw your house on Google. I like your front gate:

    http://gallery.nen.gov.uk/gallery_images/0609/0000/0082/gate_mid.jpg

    Comment by Nigel — February 12, 2010 @ 10:18 am BST Feb 12,2010 | Reply

  10. Jonathan ,

    Wouldn’t “Adaptive Cursor Sharing ” in 11g minimize the diadvantages of bind variable peeking ?
    Would you consider “Optimize for” hint be close to Adaptive Cursor sharing ?

    Thanks

    Comment by Zahir Mohideen — February 21, 2010 @ 1:36 am BST Feb 21,2010 | Reply

    • Zahir,

      Sorry, I missed this comment when it first came up.

      The problem with adaptive cursor sharing is that it has to go wrong before it goes right – and then it carries on trying to do better, so you can get constantly changing plans.

      Also I don’t think adaptive cursor sharing is “persistent”, i.e. the plans and their bind variable ranges don’t get stored in the database (unlike sql base lines), so if a plan gets flushed, or when you restart the instance, you have to start making mistakes and generating new plans all over again.

      Comment by Jonathan Lewis — March 3, 2010 @ 9:54 pm BST Mar 3,2010 | Reply

  11. [...] Following on from my posting about the presentation (about Enterprise databases) that I did at a Microsoft event on SQL Server 2008, I’ve just had an article on how to design efficient SQL published on a website that’s [...]

    Pingback by SQL Server 2 « Oracle Scratchpad — March 5, 2010 @ 9:48 am BST Mar 5,2010 | Reply

  12. [...] I had still had the presentation I did for Microsoft/Unisys back in February and enough time to remind myself of what was on the slides. On this occasion, though, I also had a [...]

    Pingback by Back from MOW « Oracle Scratchpad — April 30, 2010 @ 8:12 am BST Apr 30,2010 | Reply

  13. [...] Server — Jonathan Lewis @ 7:45 pm UTC Jun 1,2010 I pointed out some time ago a few of the things in SQL Server that I would like to see in Oracle. Here’s a couple [...]

    Pingback by I wish .. (1) « Oracle Scratchpad — June 1, 2010 @ 7:46 pm BST Jun 1,2010 | Reply

  14. One of the nice things that helps TOP is that SQL Server has a Hash Match/Flow Distinct query plan node which is used to eliminate duplicates and which pipelines. (The downside is that the optimizer doesn’t always select that node when it should.) We have a query that checks to see if the number of rows where a person’s last name starts with XXX exceeds a threshold (301 by default). If the threshold is exceeded, then the user is prompted to make the search more restrictive. There’s no way to prevent a user from searching on ‘Smith’. In SQL Server, we see the optimizer generate a Hash Match/Flow Distinct for duplicate elimination. SQL Server does not have the equivalent of STATISTICS_LEVEL = ALL, but we can see by the estimated row counts and the actual # of logical reads that the scan on ‘Smith’ stops when enough rows have been returned by the Hash Match/Flow Distinct. In Oracle, we restrict on rownum we see a HASH UNIQUE for the same query and the row counts from STATISTICS_LEVEL = ALL show that HASH UNIQUE is not pipelining. (All of the rows that start with ‘Smith’ are flowing into the HASH UNIQUE.)

    So, do you know of any way to get the desired pipelining in Oracle?

    Comment by Jerry B — June 2, 2010 @ 6:55 pm BST Jun 2,2010 | Reply

    • Jerry,
      You’ve already discovered that you can play around with rownum to some effect, but it’s much more complicated than SQL Server. It sounds like SQL Server can work along the lines of: “I am aggregating and now I am stopping because I have aggregated too much” whereas Oracle is generally limited by rownum to “I am going to get the stuff I need to aggregate and I will stop if I find too much”. The nature of the Oracle code is restricted to the following type of thing:

      select 
      	/*+ gather_plan_statistics */
      	count(n1)
      from
      	(select
      		n1
      	from
      		t1
      	where
      		n1 = 5
      	and	rownum <= 300
      	)
      ;
      
      

      In this case there are 500 rows that match the “n1 = 5″ predicate in the table, but I can stop the query from working once it have found the first 300.

      Comment by Jonathan Lewis — June 4, 2010 @ 8:28 am BST Jun 4,2010 | Reply

  15. One think I do like about SQL Server is the Profiler tool for getting traces and seeing what exactly is going on.

    Comment by Ronnie Doggart — August 5, 2010 @ 9:34 am BST Aug 5,2010 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,505 other followers