Oracle Scratchpad

December 2, 2011

to_char()

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 7:48 am BST Dec 2,2011

Here’s an odd little detail about the to_char() function that happened to show up in a little demonstration code I used to create some data for last Friday’s quiz night.

When generating test data I often rely on doing things with rownum, and one of the thngs I do quite frequently is turn it into a character string. Nowadays I usually use the lpad() function to do this conversion because that lets me specify the defined length of the resulting column. But last Friday, rather than starting from scratch with my standard template script, I created my test case by cloning a script that I’d written in 2003 and the script had used the to_char() function.

So here’s a simple script to create a little table of character data, creating each column in a different way:

create table t as
select
	rownum				id,
	lpad(rownum,5)			lp5,
	rpad(rownum,5)			rp5,
	cast(rownum as varchar2(5))	cvc5,
	cast(rownum as char(5))		cc5,
	to_char(rownum)			tc
from
	all_objects
where
	rownum <= 12
;

And here are the results of a describe followed by a select:

SQL> desc t
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                                     NUMBER
 LP5                                    VARCHAR2(5)
 RP5                                    VARCHAR2(5)
 CVC5                                   VARCHAR2(5)
 CC5                                    CHAR(5)
 TC                                     VARCHAR2(40)

SQL> select * from t order by id;

        ID LP5   RP5   CVC5  CC5   TC
---------- ----- ----- ----- ----- ----------------------------------------
         1     1 1     1     1     1
         2     2 2     2     2     2
         3     3 3     3     3     3
         4     4 4     4     4     4
         5     5 5     5     5     5
         6     6 6     6     6     6
         7     7 7     7     7     7
         8     8 8     8     8     8
         9     9 9     9     9     9
        10    10 10    10    10    10
        11    11 11    11    11    11
        12    12 12    12    12    12

12 rows selected.

As you can see, lpad(), rpad() and cast() allow you to specify the column length you want in the table definition. Cast() pads on the right (where necessary) with spaces, as does rpad() – although you can specify a padding character with rpad() and lpad(). Lpad() pads on the left (and it’s the one I use most frequently because it can be convenient to see the character representations sort in numeric order – sometimes I use ‘0’ as the padding character).

The “trap” comes from to_char() which results in a column declaration of varchar2(40) but doesn’t do any padding, of course. It’s not much of a trap, really; but when you’ve got into the habit of expecting your character strings to take up some space and see the varchar2(40) in the describe and assume that every string has been padded to 40 characters, it’s easy (and very annoying) to waste time trying to work out why the arithmetic is wrong. It was interesting to note that one of the people offering an answer to the question did exactly what I had originally intended to do, viz: sest up my varchar2(40) to hold 40 characters.

Footnote: it’s crossed my mind that the size used for the column declaration might depend on the NLS settings, and could be affected by the default character set, locale, and use of separators – but I’ll leave that investigation to any readers who have a little spare time.

10 Comments »

  1. great great woow its a great programming skill

    Comment by lkafle — December 2, 2011 @ 9:43 am BST Dec 2,2011 | Reply

  2. Purely as a guess, but perhaps the 40 characters come from assuming NUMBER(38) with an extra 2 characters for a sign and a leading space?

    Comment by Reinhard — December 2, 2011 @ 2:49 pm BST Dec 2,2011 | Reply

  3. I got varchar2(20) instead of varchar2(5) for both lp5 and rp5 Did I miss something?

    Comment by Junping Zhang — December 2, 2011 @ 4:42 pm BST Dec 2,2011 | Reply

    • Junping,

      If it happened, it happened.

      As Richard Feynmann once said It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong (This does assume that the experiment was implemented correctly, of course.) If you’re getting varchar2(20) then my “theory” is wrong.

      My first thought in response would be that the result is affected in some way by the database character set, or the NLS character set – which I’d check.
      In your case, as a quick mechanical check I’d see if lpad(N) produce varchar2(4N).

      Comment by Jonathan Lewis — December 2, 2011 @ 5:45 pm BST Dec 2,2011 | Reply

  4. Jonathan,

    Thank you!, Yes, you pointed to the right direction. I am able to get varchar2(5) for US7ASCII while It was varchar2(20) for AL32UTF8 initially.

    Comment by Junping Zhang — December 2, 2011 @ 6:38 pm BST Dec 2,2011 | Reply

    • I got it to switch from varchar2(20) to varchar2(5) with an “alter session set nls_length_semantics=char;”. Though, of course, that isn’t really any different except in appearance. The TC appears at VARCHAR2(40) in both though.

      Comment by Gary — December 2, 2011 @ 10:19 pm BST Dec 2,2011 | Reply

  5. You can add the format to to_char and get the length you need (plus 1 character it seems)

    create table t as
    select to_char(99,’99999′) tc from dual
    /

    table T created.

    desc t

    Name Null Type
    —- —- ———–
    TC VARCHAR2(6)

    create table t2 as
    select to_char(rownum,’999999′) tc from dba_triggers
    /
    table T2 created.

    desc t2

    Name Null Type
    —- —- ———–
    TC VARCHAR2(7)

    Comment by Aldo Bravo — December 3, 2011 @ 12:04 am BST Dec 3,2011 | Reply

    • Aldo,

      Thanks for that extra bit of information.

      This suggests that the varchar2(40) is connected to Oracle using a default format for conversion, perhaps assuming that a generic number is number(38) which then needs a character for the sign and a character for the decimal point.

      You can get rid of the “extra” one character in your case with the format ‘FM999999′.

      Comment by Jonathan Lewis — December 3, 2011 @ 8:32 am BST Dec 3,2011 | Reply

      • Jonathan, thanks for confirming my “guess” (post #2)! I forgot about the decimal point…

        Comment by Reinhard — December 5, 2011 @ 4:10 am BST Dec 5,2011 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,015 other followers