Saturday, February 6, 2010

NEW DATA TYPES IN SQL SERVER 2008

We will take
a look at the following new data types, each of which is available in all editions
of SQL Server 2008:Date and Time: Four new date and time data types
have been added, making working with time much easier than it ever has in the past.
They include: DATE, TIME, DATETIME2, and DATETIMEOFFSET.

Spatial: Two new spatial data types have been added--GEOMETRY and GEOGRAPHY--
which you can use to natively store and manipulate location-based information, such
as Global Positioning System (GPS) data.

HIERARCHYID:
The HIERARCHYID data type is used to enable database applications
to model hierarchical tree structures, such as the organization chart of a business.

FILESTREAM: FILESTREAM is not a data type as
such, but is a variation of the VARBINARY(MAX) data type that allows unstructured
data to be stored in the file system instead of inside the SQL Server database.
Because this option requires a lot of involvement from both the DBA administration
and development side, I will spend more time on this topic than the rest.

Date and Time
In SQL Server 2005 and earlier, SQL Server only offered two date and time data types:
DATETIME and SMALLDATETIME. While they were useful in many cases, they had a lot
of limitations, including:


  • Both the date value and the
    time value are part of both of these data types, and you can’t choose to store one
    or the other. This can cause several problems:
  • It often causes a lot of wasted
    storage because you store data you don’t need or want.
  • It adds unwanted complexity
    to many queries because the data types often have to be converted to a different
    form to be useful.
  • It often reduces performance
    because WHERE clauses with these data and time data types  often have to include
    functions to convert them to a more useful form, preventing these queries from using
    indexes.
  • They are not time-zone aware,
    which requires extra coding for time-aware applications.
  • Precision is only .333 seconds,
    which is not granular enough for some applications.
  • The range of supported dates
    is not adequate for some applications, and the range does not match the range of
    the .NET CLR DATETIME data type, which requires additional conversion code.

To overcome these problems, SQL Server 2008 introduces four new date and time data
types,
described in the
following sections. All of these new date and time data types work with SQL Server
2008 date and time functions,
which have been enhanced in order to properly understand the new

In addition, some new date and time functions have been added to take advantage
of the
capabilities of
these new data types. The new functions include SYSDATETIME,

TODATETIMEOFFSET, SYSUTCDATETIME,
and
DATE.

As you can imagine, the DATE data type only stores a date in the format of YYYY-MM-DD.
It has


a range of 0001-01-01 through
9999-12-32, which should be adequate for most business and


scientific applications. The
accuracy is 1 day, and it only takes 3 bytes to store the date.






        --Sample DATE output
        DECLARE @datevariable as DATE            
        SET @datevariable = getdate()            
        PRINT @datevariable
        Result: 2008-08-15        
        
TIME 


 


TIME is stored in the format:
hh:mm:ss.nnnnnnn, with a range of 00:00:00.0000000 through


23:59:59:9999999 and is accurate
to 100 nanoseconds. Storage depends on the precision and scale

selected, and runs from 3 to 5 bytes.



                    --Sample TIME output                        
                    DECLARE @timevariable as TIME
                    SET @timevariable = getdate()                       
                    PRINT @timevariable                        
                    Result: 14:26:52.3100000
                    
DATETIME2




DATETIME2 is very similar to
the older DATETIME data type, but has a greater range and


precision. The format is YYYY-MM-DD
hh:mm:ss:nnnnnnnm with a range of 0001-01-01


00:00:00.0000000 through 9999-12-31
23:59:59.9999999, with an accuracy of 100 nanoseconds.

depends on the precision and scale selected, and runs from 6 to 8 bytes.




        --Sample DATETIME2 output with a precision of 7
        DECLARE @datetime2variable datetime2(7)
        SET @datetime2variable = Getdate()
        PRINT @datetime2variable
        Result: 2008-08-15 14:27:51.5300000
        


DATETIMEOFFSET




DATETIMEOFFSET is similar to
DATETIME2, but includes additional information to track the


time zone. The format is YYYY-MM-DD
hh:mm:ss[.nnnnnnn] [+|-]hh:mm with a range of 0001-


01-01 00:00:00.0000000 through
0001-01-01 00:00:00.0000000 through 9999-12-31 23:59:59.9999999.

Universal Time (UTC), with an accuracy of 100 nanoseconds. Storage depends on the
and scale selected, and
runs from 8 to 10 bytes.

zone aware means a time zone identifier is stored as a part of DATETIMEOFFSET column.
time zone identification
is represented by a [-|+] hh:mm designation. A valid time zone falls in

range of -14:00 to +14:00, and
this value is added or subtracted from UTC to obtain the local 


 
--Sample DATETIMEOFFSET
output with a precision of 0



        --Specify a date, time, and time zone
        DECLARE @datetimeoffsetvariable DATETIMEOFFSET(0)
        SET @datetimeoffsetvariable ='2008-10-03 09:00:00 -10:00'
        --Specify a different date, time and time zone
        
        DECLARE @datetimeoffsetvariable1 DATETIMEOFFSET(0)
        SET @datetimeoffsetvariable1= '2008-10-04 18:00:00 +0:00'
        
        --Find the difference in hours between the above dates, times,and timezones
        SELECT DATEDIFF(hh,@datetimeoffsetvariable,@datetimeoffsetvariable1)                    
        
        Result: 23
        





Spatial

While spatial data has been stored
in many SQL Server databases for many years (using conventional

data types), SQL Server 2008 introduces two specific spatial data types that can
make it easier for
developers
to integrate spatial data in their SQL Server-based applications. In addition, by
storing
spatial data in
relational tables, it becomes much easier to combine spatial data with other kinds
of
business data. For
example, by combining spatial data (such as longitude and latitude) with the

physical address of a
business, applications can be created to map business locations on a map.


The two new spatial data types
in SQL 2008 are:





GEOMETRY: Used to store
planar (flat-earth) data. It is generally used to store XY


coordinates that represent points,
lines, and polygons in a two-dimensional space. For example


storing XY coordinates in the
GEOMETRY data type can be used to map the exterior of a


building.

GEOGRAPHY: Used to store
ellipsoidal (round-earth) data. It is used to store latitude and


longitude coordinates that represent
points, lines, and polygons on the earth’s surface. For


example, GPS data that represents
the lay of the land is one example of data that can be stored


in the GEOGRAPHY data type.



GEOMETRY and GEOGRAPHY
data types are implemented as .NET CLR data types. This means

that they can support various properties and methods specific to the data. For example,
a method
can be used to
calculate the distance between two GEOMETRY XY coordinates, or the distance

between two GEOGRAPHY latitude
and longitude coordinates. Another example is a method to see

if two spatial objects intersect or not. Methods defined by the Open Geospatial
Consortium
standard, and
Microsoft extensions to that standard, can be used. To take full advantage of these

methods, you will have to be
an expert in spatial data.
Another
feature of spatial data types is that they support special spatial indexes. Unlike
conventional
indexes,
spatial indexes consist of a grid-based hierarchy in which each level of the index
subdivides


the grid sector that is defined
in the level above. But like conventional indexes, the SQL Server query

optimizer can use spatial indexes
to speed up the performance of queries that return spatial data.
Spatial
data is an area unfamiliar to many DBAs. If this is a topic you want to learn more
about, you
will need a
good math background, otherwise you will get lost very quickly.







HIERARCHYID


While hierarchical tree structures
are commonly used in many applications, SQL Server has, up to

not made it easy to represent and store them in relational tables. In SQL Server
2008, the


HIERARCHYID data type
has been added to help resolve this problem. It is designed to store

that represent the position of
nodes in a hierarchal tree structure.

For example, the HIERARCHYID data type makes it easier to express the following
types of
relationships
without requiring multiple parent/child tables and complex joins:


  • Organizational structures

  • A set of tasks that make up a larger projects (like a GANTT chart)

  • File systems (folders and their sub-folders)

  • A classification of language terms

  • A bill of materials to assemble or build a product

  • A graphical representation of links between web pages

Unlike standard data types, the HIERARCHYID data type is a CLR user-defined
type, and it exposes
many
methods that allow you to manipulate the date stored within it. For example, there
are methods
to get the
current hierarchy level, get the previous level, get the next level, and many more.
In fact,
the HIERARCHYID
data type is only used to store hierarchical data; it does not automatically
represent a hierarchical
structure. It is the responsibility of the application to create and assign

HIERARCHYID values in a way that
represents the desired relationship. Think of a

HIERARCHYID data type as a place to store positional nodes of a tree structure,
not as a way to
create
the tree structure.






FILESTREAM


SQL Server is great for storing
relational data in a highly structured format, but it has never been

particularly good at storing
unstructured data, such as videos, graphic files, Word documents, Excel

spreadsheets, and so on. In the
past, when developers wanted to use SQL Server to manage such

unstructured data, they essentially had two choices:


  • Store it in VARBINARY(MAX) columns inside the database

  • Store the data outside of the database as part of the file system, and include pointers
    inside
    a column that pointed
    to the file’s location. This allowed an application that needed access

    to the file to find it by looking
    up the file’s location from inside a SQL Server table.Neither of these options was
    perfect. Storing unstructured data in VARBINARY(MAX) columns

    offers less than ideal performance, has a 2 GB size limit, and can dramatically
    increase the size of a

    database. Likewise, storing unstructured data in the file system requires the DBA
    to overcome several
    difficulties. 

For example:


  • Files have a unique naming system that allows hundreds, if not thousands of files
    to be keep
    track of and
    requires very careful management of the folders to store the data.

  • Security is a problem and often requires using NTFS permissions to keep people from
    accessing the files inappropriately.

  • The DBA has to perform separate backups of the database and the files

  • Problems can occur when outside files are modified or moved and the database is
    not updated
    to reflect
    this.



To help resolve these problems,
SQL Server 2008 has introduced what is called FILESTREAM

storage, essentially a hybrid approach that combines the best features of the previous
two options.





Benefits of FILESTREAM

FILESTREAM storage is
implemented in SQL Server 2008 by storing VARBINARY(MAX) binary

large objects (BLOBs) outside of the database and in the NTFS file system. While
this sounds very
similar
to the older method of storing unstructured data in the file system and pointing
to it from a
column, it
is much more sophisticated. Instead of a simple link from a column to an outside
file, the
SQL Server Database
Engine has been integrated with the NTFS file system for optimum
performance and ease of
administration. For example, FILESTREAM data uses the Windows OS

system cache for caching data
instead of the SQL Server buffer pool. This allows SQL Server to do
what it does best: manage structured
data, and allows the Windows OS to do what is does best:

manage large files. In addition, SQL Server handles all of the links between database
columns and
the files,
so we don’t have to.
In
addition, FILESTREAM storage offers these additional benefits:


  • Transact-SQL can be used to SELECT, INSERT, UPDATE, DELETE FILESTREAM data.

  • By default, FILESTREAM data is backed up and restored as part of the database file.
    If you want, there is an option available so you can backup a database without the
    FILESTREAM data.

  • The size of the stored data is only limited by the available space of the file system.
    Standard
    VARBINARY(MAX)
    data is limited to 2 GB.

Limitations of FILESTREAM


As you might expect, using FILESTREAM
storage is not right for every situation. For example, it is

best used under the following conditions:


  • When the BLOB file sizes average 1MB or higher.

  • When fast read access is important to your application.

  • When applications are being built that use a middle layer for application logic.

  • When encryption is not required, as it is not supported for FILESTREAM data.

    If your application doesn’t meet
    the above conditions, then using the standard VARBINARY(MAX) data type might be
    your best option.
    If you
    are used to storing binary data inside your database, or outside your database (but
    using
    pointers inside
    the database that point to the binary files), then you will find using FILESTREAM

    storage to be substantially different.
    You will want to thoroughly test your options before

    implementing one option or the other, in any new applications you build. 

How to Implement FILESTREAM
Storage
Enabling SQL Server
to use FILESTREAM data is a multiple-step process, which includes:


  • Enabling the SQL Server instance to use FILESTREAM data

  • Enabling a SQL Server database to use FILESTREAM data

  • Creating FILESTREAM-enabled columns in a table, by specifying the "VARBINARY(MAX)
    FILESTREAM" data type.




No comments:

Post a Comment