SQL DATA TYPE -- string
SQL99
SQL99 has two major character sets: CHARACTER and CHARACTER VARYING. In addition, there are also NATIONAL CHARACTER and NATIONAL CHARACTER VARYING.
CHARACTER can also be abbreviated with CHAR. The size can optionally be specified in the form CHARACTER(n). For example, CHARACTER(15) can hold character strings up to 15 characters long. If size is omitted, the default is 1. An error occurs if one tries to store a string that is bigger than the size declared.
CHARACTER VARYING can be abbreviated with CHAR VARYING or VARCHAR. You have to specify the maximum size for strings to be stored, for example, CHARACTER VARYING(15) holds 15-character strings, or smaller.
NATIONAL CHARACTER (NATIONAL CHAR, NCHAR, CHARACTER CHARACTER SET <char_set_name>) specifies the default data type for the country of implementation. This is a fixed-length character string data type.
NATIONAL CHARACTER VARYING (NATIONAL CHAR VARYING, NCHAR VARYING, CHARACTER VARYING CHARACTER SET < char_set_name>, CHAR VARYING CHARACTER SET < char_set_name>) is a varying-length country-specific character string data type.
CLOB is a new SQL99 data type to store large nondatabase-structured text objects of varying size and complexity, such as employees' resumes, collections of papers, books, and other similar data.
Oracle 9i
Oracle is fully compliant with SQL99 standards for character strings.
CHAR is used for fixed-length strings. The default length for a CHAR column is 1 byte with a maximum of 2,000 bytes.
VARCHAR2 is an Oracle data type to store varying-length character strings. It does not have the default length, so you have to specify a value from 1 to 4,000 (maximum number of bytes for VARCHAR2).
NCHAR and NVARCHAR2 are used to store fixed-length and varying-length national character strings. Beginning with Oracle9i, they were redefined to be Unicode-only data types and can hold up to 2,000 and 4,000 characters (not bytes!) correspondingly. That means if you declare a column to be CHAR(100) it will allocate 100 bytes per column, but NCHAR(100) Unicode-based column requires 200 bytes.
Note The VARCHAR data type in Oracle is currently a synonym to VARCHAR2. If you declare a column as VARCHAR(30), it will be converted it to VARCHAR2(30) automatically. Oracle does not recommend the use of VARCHAR as a data type, but rather recommends VARCHAR2 instead because keyword VARCHAR may be later used in some different way.
CLOB and NCLOB can store up to four gigabytes of data in Oracle. Both fixed-length and variable-length character sets are supported. CLOB uses the CHAR database character set, and NCLOB stores Unicode data using the national character set.
LONG is an old Oracle data type to store variable-length character strings containing up to two gigabytes. It is similar to VARCHAR2, but has many limitations. For example, you cannot use LONG in the WHERE clause of a SELECT statement , a table can't have more than one LONG column, it can't be indexed, and so on. Oracle strongly recommends to discontinue the use of the LONG data type and use CLOB instead.
Note Oracle has synonyms for SQL99 compatibility. For example, you can use CHARACTER(100) rather than CHAR(100) or CHARACTER VARYING rather than VARCHAR2 to attain the same results.
DB2 UDB 8.1
DB2 has following character string data types:
CHARACTER is compliant with SQL99 standards. The maximum length is 254 characters. The default length is 1.
VARCHAR is used for varying-length strings and has a maximum of 32,672 characters.
LONG VARCHAR is virtually same as VARCHAR, but can hold larger values (up to 32,700) and can't be limited to a certain number of characters.
CLOB types are SQL99 compliant varying-length strings of up to two gigabytes. An optional maximum length can be supplied in kilobytes (K|k), megabytes (M|m), or gigabytes (G|g). For example, CLOB (10M) would allow maximum of 10,048,576 characters.
GRAPHIC is a rough DB2 equivalent to NATIONAL CHARACTER. It is a double-byte data type, which may range from 1 to 127 characters. If the length specification is omitted, a length of 1 is assumed.
VARGRAPHIC is a varying-length double-byte character string data type, comparable to SQL99 NATIONAL CHARACTER VARYING. The range is from 1 to 16,336.
LONG VARGRAPHIC is similar to VARGRAPHIC with a maximum length of 16,350. It does not have an optional length limit to be supplied by user.
DBCLOB is a double-byte equivalent to CLOB. Maximum storage is one gigabyte of character data. DBCLOB accepts a maximum length in the same way as CLOB.
Note GRAPHIC, VARGRAPHIC, and DBCLOB data types are not supported in the Personal Edition of DB2 supplied with your book.
MS SQL Server 2000
The following character string data types are supported by MS SQL Server:
CHAR and VARCHAR are used for fixed-length and variable-length character data correspondingly. The maximum length is 8,000 characters. Unlike Oracle, you don't have to specify length for VARCHAR — it defaults to 1 like CHAR.
TEXT is similar to VARCHAR, but can hold much larger values. Its maximum length is two gigabytes or 231 – 1 (2,147,483,647) characters.
NCHAR and NVARCHAR, and NTEXT are Unicode equivalents to CHAR, VARCHAR, and TEXT. NCHAR and NVARCHAR can hold up to 4,000 characters; NTEXT is much larger — one gigabyte or 230 – 1 (1,073,741,823) characters.
If one data type has more than one name (or synonym) the most widely used name is given in italics.
Character string literals
The terms literal and constant refer to a fixed data value, for instance
'Frozen Margarita' 'ALEX'
'2003/08/07' '10101101'
are all character literals. Character literals are enclosed in single quotes. To represent one single quotation mark within a literal, you can enter two single quotation marks:
'O''Neil'
Character literals are surprisingly consistent between all our three major vendors, with only slight variations. For example, MS SQL Server allows double quotes for character literals instead of single ones if the option QUOTED_IDENTIFIER is set off for a connection. To represent a national character set literal, it has to be preceded by capital letter N (DB2 understands G in addition to N):
N'Jack Smith' N'Boris M.
Trukhnov' N'123 OAK ST.'
Text entered using this notation is translated into the national character set.
Binary strings
A binary string is a sequence of bytes in the same way that a character string is, but unlike character strings that usually contain information in the form of text, a binary string is used to hold nontraditional data such as images, audio and video files, program executables, and so on. Binary strings may be used for purposes similar to those of character strings (e.g., to store documents in MS Word format), but the two data types are not compatible; the difference being like text and a photo of the same text. Binary string data types are summarized in Table 3-2.
Table 3-2: Binary String Data Types SQL99
Oracle 9i DB2 UDB 8.1 MS SQL SERVER 2000
BIT BINARY[(n)] BIT VARYING
RAW(n) VARBINARY[(n)] LONG RAW
BLOB BLOB BLOB(n) IMAGE
Character vs. Special Files
It might sound a little bit confusing — why plain text documents can be stored as character strings, and a Word document has to be treated as a binary string. The thing is, a Word file is a text document from user's point of view, but from computer storage perspective it is not. In addition to plain text characters it contains many special signs and instructions that only MS Word software can interpret. The same is true for any other special files — bitmaps, spreadsheets, audio and video files, and so forth. You can think of it in this way: a special file (e.g., of the DOC, XLS, BMP, or AVI type) is like a tape for VCR, whereas a program (MS Word, Excel, Paint, QuickTime Player) is like a VCR. You have to have a VCR to play a tape, and it has to be the right VCR — if you try to play a standard US VHS tape in NTSC format on a European video recorder (PAL format), it's not going to work. You might see some blinking on your screen, you will hear some noise, but you will definitely not be able to watch the movie. Just try to open a Word file with, say, Notepad and you will see what we are talking about.
SQL99
SQL99 has following data types to store binary strings: BIT, BIT VARYING, and BLOB.
BIT is a fixed-length binary string somewhat similar to CHAR. If you declare a column to be BIT(100), 100 bytes will be allocated in memory/disk, and if the object you store is just 60 bytes, it's still going to occupy all 100 bytes.
BIT VARYING is similar to VARCHAR — even if you specify BIT VARYING(100) to be the data type lasting the previous example, it will only take 60 bytes to store the object.
BLOB is a binary equivalent to CLOB.
Oracle 9i
Oracle doesn't have an equivalent to SQL99 BIT, but has two data types that correspond to BIT VARYING — RAW and LONG RAW. BLOB data type is also supported.
RAW can hold a maximum of 2,000 bytes. The size has to be specified.
LONG RAW can accumulate up to two gigabytes of data. This data type is obsolete, and Oracle strongly recommends converting it to BLOB.
BLOB can store up to four gigabytes of binary data in Oracle.
DB2 UDB 8.1
The only data type for binary strings in DB2 is BLOB, which can be up to 2 gigabytes long.
MS SQL Server 2000
MS SQL Server has three different data types for binary strings: BINARY, VARBINARY, and IMAGE.
BINARY is a fixed-length data type to store binary data. The size can be specified from 1 to 8,000; the actual storage volume is size + 4 bytes.
VARBINARY can hold variable-length binary data. The size is from 1 through 8,000. Storage size is the actual length of the data entered + 4 bytes. The data entered can be 0 bytes in length.
IMAGE is a variable-length binary data type that can hold from 0 through 2,147,483,647 bytes (two gigabytes) of data.
Binary string literals
MS SQL Server allows literals for binary string fields (BINARY, VARBINARY, IMAGE) either in the form of hexadecimal numbers prefixed with 0x or as binary strings. The value has to be unquoted:
0xAE
0101010010100110
MS SQL Server implicitly converts these literals into appropriate binary format. Oracle and DB2 don't have binary string literals; the values have to be converted into proper format using special functions .
Note Literals are barely needed for large objects that can store gigabytes of data. In most cases LOBs are not manipulated by traditional SQL statements, but rather accessed by special programs and interfaces that know how to handle such objects without reading them directly into memory.
char=> oracle:2000;db2:254; sqlserver:8000
varchar=>oracle:4000;db2:32672; sqlserver:8000
nchar=>oracle:2000;db2.GRAPHIC:177; sqlserver:4000
nvarchar=>oracle:4000;db2.VARGRAPHIC:16336; sqlserver:4000
CLOB=>oracle:4G;db2:2G; sqlserver.TEXT:2G
NCLOB=>oracle:4G;db2.DBCLOB:1G; sqlserver.NTEXT:1G