Oracle Scratchpad

February 6, 2013

Delphix

Filed under: Delphix — Jonathan Lewis @ 7:19 pm GMT Feb 6,2013

If you’re a regular follower of my blog you may recall Kyle Hailey and the joint webinar we did nearly two years ago on “Visual SQL Tuning”, covering an approach I’ve written about in the past and a product he developed at Embarcadero to automate the work that I’d been doing by hand and eye.

Kyle has now moved on to Delphix, and has become involved with another really interesting piece of technology – database virtualization. How do you supply a terabyte sized database to five different development teams without using up 5TB of disc space ? Create an operating environment that keeps one master copy of the database while maintaining a set of (small) private files for each team that hold private copies of the blocks that have been changed by that team – and that’s just one feature of the product.
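
To make the idea concrete, here’s a toy sketch in Python of the copy-on-write principle (my sketch, not Delphix’s implementation): each virtual database keeps a private map of the blocks it has changed and falls back to the shared master copy for everything else.

    # Toy sketch of copy-on-write block sharing - illustrative only,
    # not Delphix's code. One read-only master copy is shared; each
    # team's "virtual database" stores just the blocks it has changed.

    class VirtualDatabase:
        def __init__(self, master):
            self.master = master     # shared, read-only block list
            self.changed = {}        # block number -> private copy

        def read_block(self, n):
            # private copy if this team has written the block,
            # otherwise fall through to the shared master copy
            return self.changed.get(n, self.master[n])

        def write_block(self, n, data):
            # writes never touch the master; they go to the private store
            self.changed[n] = data

    master = ["block %d" % n for n in range(1000)]   # stands in for 1TB of datafiles
    teams = [VirtualDatabase(master) for _ in range(5)]
    teams[0].write_block(42, "team 0 version")
    print(teams[0].read_block(42))   # team 0 version
    print(teams[1].read_block(42))   # block 42 - still the master copy

Five teams share one set of master blocks, so the space cost per team is proportional only to the blocks that team changes.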

The product is sufficiently interesting (plus I have a healthy regard for Kyle’s opinions) that I’ve accepted an invitation to go over to California for a few days next month to experiment with it, see what it can do, try to stress it a bit, and so on. The people at Delphix are so confident that I’ll be impressed that they’re going to let me do this and then write up a blog post telling you how things went.

Have a browse around their documentation and, if you’re interested, add a suggestion in the comments telling me what you’d like me to test, and how, and I’ll see if I can fit it into my timetable (no promises – but if you come up with interesting ideas I’ll see what I can do).

17 Comments »

  1. Hi Jonathan,

    Great news that you will be testing Delphix, since you’re someone with deep knowledge of how the Oracle engine works and how to bring a system to its knees.

    I personally would like to see

    1- Impact on compressed data and its manipulation
    2- Impact on reading changed data (is it suitable for regression testing in batch processing environments where the volume of processed data is huge?)
    3- Any side effects with parallel processing or direct path reads on changed data
    4- I know it has its own versioning, but is there any extra impact when working with guaranteed restore points?
    5- How does updating the golden copy while keeping the test data work? (I probably need to check the webinar recording for the answer to this)
    6- As far as I understand everything is virtual, so I wonder how the log writer performs, and whether there are any log file sync issues under load
    7- Any impact on crash recovery times (“When will it be up?” is the first question I’m asked when I request permission to shut down a database where active sessions are ongoing – I have a habit of using shutdown abort :) )
    8- Is there a knee in the curve for the number of database copies?

    Looking forward to seeing your review of Delphix database virtualization.

    Comment by coskan — February 6, 2013 @ 11:50 pm GMT Feb 6,2013

  2. #1, 2, and 5 above are of most interest to me, along with general insight into the types of waits it incurs compared to those of a standard physical system. I’m also curious to see verification of, and thoughts on, the advertised compression ratios and capabilities, to get an overall picture of storage requirements including the clone, transaction history, and write data on target systems. And of course, any personal ideas on how to mitigate potential performance pitfalls (if any) using good ol’ Oracle tuning knowledge would be interesting.

    Comment by Steve Karam (@OracleAlchemist) — February 7, 2013 @ 1:42 am GMT Feb 7,2013

  3. Hi Jonathan,

    We were doing this back in 2006 using NetApp’s FlexClone technology.

    Gaius

    Comment by Gaius — February 7, 2013 @ 9:31 am GMT Feb 7,2013

    • Gaius,

      Was this a “productised” approach, or was this like me doing rman standby databases on Oracle 6 with a few cunning shell scripts ?

      Comment by Jonathan Lewis — February 13, 2013 @ 4:50 pm GMT Feb 13,2013

      • RMAN on Oracle 6 :-p

        FlexClone is the product, it just needed wrapping in some simple scripts to make it play nicely with Oracle, allocate an IP address for a listener, and so on. The question “How do you supply a terabyte sized database to five different development teams without using up 5TB of disc space ?” was solved a very long time ago. Nowadays there are loads of ways to do it, VMware and LVM being two others that spring immediately to mind.

        Comment by Gaius — February 19, 2013 @ 1:59 pm GMT Feb 19,2013

        • Not to mention CloneDb – I’ll have to make sure I ask about the USPs that Delphix has to offer.

          Comment by Jonathan Lewis — February 19, 2013 @ 11:02 pm GMT Feb 19,2013

        • Cool, looking forward to hearing about it!

          Comment by Gaius — February 25, 2013 @ 11:53 am GMT Feb 25,2013

          • @Gaius: VMware: “Linked clone technology for Oracle databases currently is not supported” (http://pubs.vmware.com/datadirector-25/topic/com.vmware.ICbase/PDF/vfabric-data-director-25-administration-guide.pdf, p. 107). Even if you did set up a linked clone (i.e. a thin provisioned clone) you’d be stuck on x86 database options, your source databases would have to be on VMware and, most importantly, they would suffer major performance issues, which in my opinion is why VMware doesn’t support this configuration.

          Yes, there are a number of ways to thin provision a clone database. One of the most interesting is clonedb, because it runs with a default install of Oracle 11.2.0.2+, with no extra license or specialized storage. With clonedb you are stuck with the initial copy of the database: if you want a clone from a different time you have to create a new copy of production, which sort of defeats the purpose. There is some flexibility in that one could apply redo to the base copy to get clones at different points in time, but of course the farther away one is from the base copy the longer and harder it is going to be to stand up the clone.
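
          A back-of-the-envelope illustration in Python (all numbers invented, just to show the shape of the trade-off):

              # Invented numbers, purely to illustrate the clonedb trade-off:
              # the further the requested point in time is from the single
              # base copy, the more redo must be applied to stand the clone up.

              BASE_COPY_SCN  = 1000000   # SCN of the base copy (hypothetical)
              REDO_PER_DAY   = 1200000   # redo generated per day, in SCNs (hypothetical)
              APPLY_PER_HOUR = 200000    # recovery apply rate (hypothetical)

              def hours_to_provision(target_scn):
                  return (target_scn - BASE_COPY_SCN) / float(APPLY_PER_HOUR)

              for days in (1, 7, 30):
                  target = BASE_COPY_SCN + days * REDO_PER_DAY
                  print("%2d days past the base copy: %5.1f hours of recovery"
                        % (days, hours_to_provision(target)))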

          On the other hand, there are a limited number of methods that allow one to roll the base copy forward, freeing one from dragging around the original copy and/or creating full new copies for new clones. All these methods require specialized hardware. There is of course Netapp, there is EMC with SRDF, there is the Oracle ZFS storage appliance. All require expert storage skills, including knowledge of how to mount file systems on target machines, how to snapshot, what files to track and how to track them (datafiles, controlfiles, redo logs, archive logs), how to lay files out across LUNs and snapshot the correct LUNs. Automating all of this so that end users can provision their own clones is a tremendous amount of work. CERN gave a presentation in January at a UKOUG conference and said they had written over 25,000 lines of code to try to make a Netapp solution capable of giving end users self-service provisioning of clones (i.e. taking the storage admins, sysadmins and DBAs out of the loop).

          Unlike all of these, Delphix is a software stack that installs on commodity x86 hardware and can use any storage you give it. It’s fully automated. It takes a one-time full copy of the source database and then does incremental-forever collection from the source databases. From there, end users can provision clones onto any registered host with the correct Oracle executable, choosing the point in time to recover to, down to the SCN. All recovery, mounting of file systems, setting up the init.ora etc. is done automatically. The new virtual database is also tracked by Delphix, and a clone of the virtual database can be forked off from it. Forking can be done as many times, and as deep, as one wants. This is great for passing copies of developer databases to QA groups, for example.

          As I like to ask: if thin provisioning is so great, saving massive amounts of storage and enabling quick provisioning of database copies, why is it not seeing more adoption? The reason is the barrier to entry: the specialized storage requirements, expert storage admin skills and major amounts of scripting needed to make it turnkey have kept most of us out of the technology. Delphix breaks down all those barriers by making it turnkey simple and automated, running on any x86 hardware and using any storage. Now we are seeing huge adoption rates.

          It’s like the internet before browsers: tools like bulletin boards, telnet and ftp were there, but it wasn’t until web browsers came out that internet usage exploded.

          Comment by Kyle Hailey — April 4, 2013 @ 5:35 am BST Apr 4,2013

  4. Hello Jonathan,
    I’d like you to test thin provisioning using automation, and VDB auto-refreshes using scripting. Since you don’t want to do the same thing over and over, why not automate it so that a single script, given a parameter, creates what you want?
    Good luck!

    Comment by NConstant — February 8, 2013 @ 4:13 pm GMT Feb 8,2013

  5. Dear Jonathan,
    Please, ask Delphix ZFS folks like Matt Ahrens and Eric Schrock how they are dealing with ZFS’s fixed record size and Oracle wanting to do both single-block reads (e.g. 8KB IOs) and multi-block reads (e.g. 1MB IOs). In my experience, choosing 8KB for the ZFS record size would absolutely make things miserable for 1MB Oracle IOs, which would get broken down into 128 x 8KB IOs to a physical disk subsystem. 100MB/s of direct reads translates into a whopping 12800 IOPS on the backend.
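
    Spelled out in a couple of lines (assuming an 8KB record size and, worst case, no aggregation by the filesystem):

        RECORD = 8 * 1024             # 8KB ZFS recordsize
        MBR    = 1024 * 1024          # 1MB Oracle multi-block read
        RATE   = 100 * 1024 * 1024    # 100MB/s of direct path reads

        print(MBR // RECORD)          # 128 records per 1MB read
        print(RATE // RECORD)         # 12800 IOPS at the backend
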
    Also, it would be interesting to hear their opinion about the effect of Copy-on-Write (COW) mechanism that ZFS employs on Oracle on-disk structures getting fragmented mercilessly over time.
    Regards,
    Leonid

    Comment by Leonid Roodnitsky — February 8, 2013 @ 11:03 pm GMT Feb 8,2013

    • Hey Leonid,

      ZFS — on which DxFS is based — would indeed break up the 1MB read in your example into smaller reads, but that doesn’t mean that each of those smaller chunks would become its own I/O operation. ZFS aggregates adjacent operations to minimize the number of I/Os it issues.
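
      The idea in a toy Python sketch (illustrative only, not DxFS/ZFS code): sorted, adjacent read requests are merged before being issued, so 128 logical 8KB records need not mean 128 physical disk operations.

          def aggregate(requests):
              # merge sorted, adjacent (offset, size) read requests
              merged = []
              for offset, size in sorted(requests):
                  if merged and merged[-1][0] + merged[-1][1] == offset:
                      merged[-1][1] += size          # extend the previous I/O
                  else:
                      merged.append([offset, size])  # start a new physical I/O
              return merged

          # 128 contiguous 8KB records collapse into a single 1MB I/O:
          reads = [(n * 8192, 8192) for n in range(128)]
          print(len(aggregate(reads)))   # 1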

      You’re right that for COW filesystems like ZFS, WAFL, and others, a 1MB contiguous read might require many random reads, depending on how the file was written. At Delphix we address that with a combination of prefetching and caching. Users of Delphix often see a storage reduction of 10:1, and a similar gain in the efficiency of DRAM caching, since the virtual databases share blocks. Several customers use Delphix with large memory configurations — not that expensive today — so, say, 1TB of memory can effectively keep all the virtual databases cached.

      Comment by Adam Leventhal — April 5, 2013 @ 4:24 pm BST Apr 5,2013

  6. Hi Jonathan,
    It will be interesting to check restoring an Exadata database to their NFS file system. It’s clear that if you have an HCC compressed table, fetching from it will fail, since HCC isn’t supported there.
    Also, after decompressing the HCC data, the restored data should grow in size (approx 1:8).
    How does Delphix handle such a case?
    Regards,
    Yoav

    Comment by Yoav — February 11, 2013 @ 7:11 am GMT Feb 11,2013

  7. Thanks for the suggestions so far.

    I’ve got access to some documents on “use cases” to get me started with some of the answers, but I’ll wait until I’ve had hands-on experience before I start writing replies. Apart from the obvious restrictions imposed by EHCC, a similar type of problem that crossed my mind was the need for a development “gold copy” that didn’t contain sensitive data such as credit card numbers; I think there’s a use case on that one in the library as well.

    Comment by Jonathan Lewis — February 13, 2013 @ 4:54 pm GMT Feb 13,2013

  8. […] heading off to Heathrow airport later on today to fly out to San Francisco for my week of experimenting with Delphix. I’ve done a little preparation work, of course, including browsing around the Internet to […]

    Pingback by Virtual DB | Oracle Scratchpad — March 9, 2013 @ 9:52 am GMT Mar 9,2013

