Showing posts with label Sql Server. Show all posts
Showing posts with label Sql Server. Show all posts

Tuesday, May 10, 2011

Capitalize the First Character of Each Word using queries

I have found t-sql challenges question where they were asked to provide a solution of oracle INITCAP function in sql.

I was started to find the solution of INITCAP function & found lots of user defined function.

But here I have shared a solution where queries doesn't use any user defined function as well as any variable.

--First of all create one temp table    
DECLARE @t TABLE(ID INT IDENTITY, Data VARCHAR(1000))

--insert the sample data
INSERT INTO @t 
SELECT 'yogesh bhadauriya' 


--use the Common table expression
;with CTE AS
(

    --anchor query where we are finding the first character for replacing with uppercase
    SELECT Cast(STUFF(LOWER(Data), 1, 1, UPPER(LEFT(Data, 1))) As varchar(max)) As Data,
    PATINDEX('%[^a-zA-Z][a-z]%', Data COLLATE Latin1_General_Bin) As Position,ID,Data As OldData
    From @t
    
    Union All
    
   --recursive query
    SELECT Cast(STUFF(CTE.Data, Position, 2, UPPER(SUBSTRING(CTE.Data, CTE.Position, 2))) As Varchar(max)) as Data,
    Cast(PATINDEX('%[^a-zA-Z][a-z]%', CTE.Data COLLATE Latin1_General_Bin) As Int) As Position,CTE.ID,T.Data As OldData
    From @t As T
    Inner Join CTE On CTE.ID=CTE.ID
    where CTE.Data is not null
)
Select ID,OldData As Data,Data As Results
From CTE
where Data is not null
and Position=0


---output
--Id Data    Results
--1 yogesh bhadauriya Yogesh Bhadauriya
    

Friday, April 29, 2011

User defined function to convert the number into charcter

How to convert the numeric value into its specific character? If this is your question, than create the following User Defined Function into your database.


--create the following function 
CREATE FUNCTION dbo.udf_Num_ToWords 
(
 @Number Numeric (38, 0) -- Input number with as many as 18 digits

) RETURNS VARCHAR(8000) 
/*
* Converts a integer number as large as 34 digits into the 
* equivalent words.  The first letter is capitalized.
*
* Attribution: Based on NumberToWords by Srinivas Sampath
*        as revised by Nick Barclay
*
* Example:
select dbo.udf_Num_ToWords (1234567890) + CHAR(10)
      +  dbo.udf_Num_ToWords (0) + CHAR(10)
      +  dbo.udf_Num_ToWords (123) + CHAR(10)
select dbo.udf_Num_ToWords(76543210987654321098765432109876543210)
 
DECLARE @i numeric (38,0)
SET @i = 0
WHILE @I <= 1000 BEGIN 
    PRINT convert (char(5), @i)  
            + convert(varchar(255), dbo.udf_Num_ToWords(@i)) 
    SET @I  = @i + 1 
END
*
* Published as the T-SQL UDF of the Week Vol 2 #9 2/17/03
****************************************************************/
AS BEGIN

DECLARE @inputNumber VARCHAR(38)
DECLARE @NumbersTable TABLE (number CHAR(2), word VARCHAR(10))
DECLARE @outputString VARCHAR(8000)
DECLARE @length INT
DECLARE @counter INT
DECLARE @loops INT
DECLARE @position INT
DECLARE @chunk CHAR(3) -- for chunks of 3 numbers
DECLARE @tensones CHAR(2)
DECLARE @hundreds CHAR(1)
DECLARE @tens CHAR(1)
DECLARE @ones CHAR(1)

IF @Number = 0 Return 'Zero'

-- initialize the variables
SELECT @inputNumber = CONVERT(varchar(38), @Number)
     , @outputString = ''
     , @counter = 1
SELECT @length   = LEN(@inputNumber)
     , @position = LEN(@inputNumber) - 2
     , @loops    = LEN(@inputNumber)/3

-- make sure there is an extra loop added for the remaining numbers
IF LEN(@inputNumber) % 3 <> 0 SET @loops = @loops + 1

-- insert data for the numbers and words
INSERT INTO @NumbersTable   SELECT '00', ''
    UNION ALL SELECT '01', 'one'      UNION ALL SELECT '02', 'two'
    UNION ALL SELECT '03', 'three'    UNION ALL SELECT '04', 'four'
    UNION ALL SELECT '05', 'five'     UNION ALL SELECT '06', 'six'
    UNION ALL SELECT '07', 'seven'    UNION ALL SELECT '08', 'eight'
    UNION ALL SELECT '09', 'nine'     UNION ALL SELECT '10', 'ten'
    UNION ALL SELECT '11', 'eleven'   UNION ALL SELECT '12', 'twelve'
    UNION ALL SELECT '13', 'thirteen' UNION ALL SELECT '14', 'fourteen'
    UNION ALL SELECT '15', 'fifteen'  UNION ALL SELECT '16', 'sixteen'
    UNION ALL SELECT '17', 'seventeen' UNION ALL SELECT '18', 'eighteen'
    UNION ALL SELECT '19', 'nineteen' UNION ALL SELECT '20', 'twenty'
    UNION ALL SELECT '30', 'thirty'   UNION ALL SELECT '40', 'forty'
    UNION ALL SELECT '50', 'fifty'    UNION ALL SELECT '60', 'sixty'
    UNION ALL SELECT '70', 'seventy'  UNION ALL SELECT '80', 'eighty'
    UNION ALL SELECT '90', 'ninety'   

WHILE @counter <= @loops BEGIN

 -- get chunks of 3 numbers at a time, padded with leading zeros
 SET @chunk = RIGHT('000' + SUBSTRING(@inputNumber, @position, 3), 3)

 IF @chunk <> '000' BEGIN
  SELECT @tensones = SUBSTRING(@chunk, 2, 2)
       , @hundreds = SUBSTRING(@chunk, 1, 1)
       , @tens = SUBSTRING(@chunk, 2, 1)
       , @ones = SUBSTRING(@chunk, 3, 1)

  -- If twenty or less, use the word directly from @NumbersTable
  IF CONVERT(INT, @tensones) <= 20 OR @Ones='0' BEGIN
   SET @outputString = (SELECT word 
                                      FROM @NumbersTable 
                                      WHERE @tensones = number)
                   + CASE @counter WHEN 1 THEN '' -- No name
                       WHEN 2 THEN ' thousand ' WHEN 3 THEN ' thousand '
                       WHEN 4 THEN ' billion '  WHEN 5 THEN ' trillion '
                       WHEN 6 THEN ' quadrillion ' WHEN 7 THEN ' quintillion '
                       WHEN 8 THEN ' sextillion '  WHEN 9 THEN ' septillion '
                       WHEN 10 THEN ' octillion '  WHEN 11 THEN ' nonillion '
                       WHEN 12 THEN ' decillion '  WHEN 13 THEN ' undecillion '
                       ELSE '' END
                               + @outputString
      END
   ELSE BEGIN -- break down the ones and the tens separately

             SET @outputString = ' ' 
                            + (SELECT word 
                                    FROM @NumbersTable 
                                    WHERE @tens + '0' = number)
              + '-'
                             + (SELECT word 
                                    FROM @NumbersTable 
                                    WHERE '0'+ @ones = number)
                   + CASE @counter WHEN 1 THEN '' -- No name
                       WHEN 2 THEN ' thousand ' WHEN 3 THEN ' million '
                       WHEN 4 THEN ' billion '  WHEN 5 THEN ' trillion '
                       WHEN 6 THEN ' quadrillion ' WHEN 7 THEN ' quintillion '
                       WHEN 8 THEN ' sextillion '  WHEN 9 THEN ' septillion '
                       WHEN 10 THEN ' octillion '  WHEN 11 THEN ' nonillion '
                       WHEN 12 THEN ' decillion '   WHEN 13 THEN ' undecillion '
                       ELSE '' END
                            + @outputString
  END

  -- now get the hundreds
  IF @hundreds <> '0' BEGIN
   SET @outputString  = (SELECT word 
                                      FROM @NumbersTable 
                                      WHERE '0' + @hundreds = number)
                 + ' hundred ' 
                                + @outputString
  END
 END

 SELECT @counter = @counter + 1
      , @position = @position - 3

END

-- Remove any double spaces
SET @outputString = LTRIM(RTRIM(REPLACE(@outputString, '  ', ' ')))
SET @outputstring = UPPER(LEFT(@outputstring, 1)) + SUBSTRING(@outputstring, 2, 8000)


RETURN @outputString -- return the result
END





To check the output of the above User Defined Function, execute the below query into your database.

--now run the following query to get the result

select dbo.udf_Num_ToWords(5800000)

--output
--Five thousand eight hundred thousand


--
Happy Coding to All

Recursive Query To Find All Child Node in SQL server 2005+

How to find the all child of particular parent using the Recursive Query
--here we are creating the table variable so that we can insert some dummy record
Declare @Table Table
(
 TId int,
 ParentId int,
 Name varchar(10)
)
--inserting some records
--using the union all to insert more than one records
insert into @Table
Select 1,NULL,'ICT'
Union All
Select 2,1,'ICT-M1'
Union All
Select 4,1,'ICT-M2'
Union All
Select 7,2,'ICT-M1U1'
Union All
Select 8,2,'ICT-M1U2'
Union All
Select 9,4,'ICT-M2U1'
Union All
Select 10,4,'ICT-M2U2'
Union All
Select 11,7,'ICT-M1U1P1'
Union All
Select 12,7,'ICT-M1U1P2'
Union All
Select 13,8,'ICT-M1U2P1'
Union All
Select 14,8,'ICT-M1U2P2'
Union All
Select 15,9,'ICT-M2U1P1'
Union All
Select 16,9,'ICT-M2U1P2'
Union All
Select 17,10,'ICT-M2U2P1'
Union All
Select 18,10,'ICT-M2U2P2'

--variable to hold data
Declare @ChildNode varchar(1000)
Set @ChildNode='';

 
--use the standard recursive query
;with [CTE] as 
(
 --anchor query where we are finding the all parents
    select TId,ParentId,Name,CAST(ISNULL(CAST(ParentId as varchar(10)),'0') As Varchar(100)) As ChildNode
    from @Table c where c.[ParentId] is null

    union all

 --recursive query where we are finding the all child according to anchor query parent
    select c.TId,c.ParentId,c.Name,
    CAST( p.ChildNode +','+cast(c.TId as varchar(10) ) As Varchar(100)) As ChildNode
    from [CTE] p, @Table c where c.[ParentId] = p.[TId]    
)
--select the child node as per the id
--Assigin the all Ids into  one variable
select @ChildNode=@ChildNode+','+Cast(TId as varchar(10))
from [CTE]
Cross Apply
dbo.Split(ChildNode,',')
where items=9
order by TId

select SUBSTRING(@ChildNode,2,LEN(@ChildNode))

--outpu
--2,7,8,11,12,13,14


----
--create the tabular function to split the multi valued into table
Create FUNCTION [dbo].[Split](@String varchar(8000), @Delimiter char(1))     
returns @temptable TABLE (items varchar(8000))     
as     
begin     
 declare @idx int     
 declare @slice varchar(8000)     
    
 select @idx = 1     
  if len(@String)<1 or @String is null  return     
    
 while @idx!= 0     
 begin     
  set @idx = charindex(@Delimiter,@String)     
  if @idx!=0     
   set @slice = left(@String,@idx - 1)     
  else     
   set @slice = @String     
  
  if(len(@slice)>0)
   insert into @temptable(Items) values(@slice)     

  set @String = right(@String,len(@String) - @idx)     
  if len(@String) = 0 break     
 end 
return     
end


--
Happy Coding

Wednesday, May 19, 2010

Age calculation with SQL Server

There seem to be many different methods being suggested to calculate an age in SQLServer.  Some are quite complex but most are simply wrong.  This is by far the simplest and accurate method that I know.

Declare @Date1 datetime
Declare @Date2 datetime


Select @Date1 = '15Feb1971'Select @Date2 = '08Dec2009'select CASE
WHEN dateadd(year, datediff (year, @Date1, @Date2), @Date1) > @Date2
THEN datediff (year, @Date1, @Date2) - 1
ELSE datediff (year, @Date1, @Date2)END as Age
 
--
happy coding 

Wednesday, February 24, 2010

Reset Identity Column Value in SQL Server

If you are using an identity column on your SQL Server tables, you can set the next insert value to whatever value you want. An example is if you wanted to start numbering your ID column at 1000 instead of 1.


It would be wise to first check what the current identify value is. We can use this command to do so:


DBCC CHECKIDENT ('tablename', NORESEED) 
 
For instance, if I wanted to check the next ID value of my orders table, I could use this command:


DBCC CHECKIDENT (orders, NORESEED) 
 
To set the value of the next ID to be 1000, I can use this command:
DBCC CHECKIDENT (orders, RESEED, 999)

Note that the next value will be whatever you reseed with + 1, so in this case I set it to 999 so that the next value will be 1000.


Another thing to note is that you may need to enclose the table name in single quotes or square brackets if you are referencing by a full path, or if your table name has spaces in it. (which it really shouldn’t)


DBCC CHECKIDENT ('databasename.dbo.orders',RESEED, 999)

Saturday, February 6, 2010

NEW DATA TYPES IN SQL SERVER 2008

We will take
a look at the following new data types, each of which is available in all editions
of SQL Server 2008:Date and Time: Four new date and time data types
have been added, making working with time much easier than it ever has in the past.
They include: DATE, TIME, DATETIME2, and DATETIMEOFFSET.

Spatial: Two new spatial data types have been added--GEOMETRY and GEOGRAPHY--
which you can use to natively store and manipulate location-based information, such
as Global Positioning System (GPS) data.

HIERARCHYID:
The HIERARCHYID data type is used to enable database applications
to model hierarchical tree structures, such as the organization chart of a business.

FILESTREAM: FILESTREAM is not a data type as
such, but is a variation of the VARBINARY(MAX) data type that allows unstructured
data to be stored in the file system instead of inside the SQL Server database.
Because this option requires a lot of involvement from both the DBA administration
and development side, I will spend more time on this topic than the rest.

Date and Time
In SQL Server 2005 and earlier, SQL Server only offered two date and time data types:
DATETIME and SMALLDATETIME. While they were useful in many cases, they had a lot
of limitations, including:


  • Both the date value and the
    time value are part of both of these data types, and you can’t choose to store one
    or the other. This can cause several problems:
  • It often causes a lot of wasted
    storage because you store data you don’t need or want.
  • It adds unwanted complexity
    to many queries because the data types often have to be converted to a different
    form to be useful.
  • It often reduces performance
    because WHERE clauses with these data and time data types  often have to include
    functions to convert them to a more useful form, preventing these queries from using
    indexes.
  • They are not time-zone aware,
    which requires extra coding for time-aware applications.
  • Precision is only .333 seconds,
    which is not granular enough for some applications.
  • The range of supported dates
    is not adequate for some applications, and the range does not match the range of
    the .NET CLR DATETIME data type, which requires additional conversion code.

To overcome these problems, SQL Server 2008 introduces four new date and time data
types,
described in the
following sections. All of these new date and time data types work with SQL Server
2008 date and time functions,
which have been enhanced in order to properly understand the new

In addition, some new date and time functions have been added to take advantage
of the
capabilities of
these new data types. The new functions include SYSDATETIME,

TODATETIMEOFFSET, SYSUTCDATETIME,
and
DATE.

As you can imagine, the DATE data type only stores a date in the format of YYYY-MM-DD.
It has


a range of 0001-01-01 through
9999-12-32, which should be adequate for most business and


scientific applications. The
accuracy is 1 day, and it only takes 3 bytes to store the date.






        --Sample DATE output
        DECLARE @datevariable as DATE            
        SET @datevariable = getdate()            
        PRINT @datevariable
        Result: 2008-08-15        
        
TIME 


 


TIME is stored in the format:
hh:mm:ss.nnnnnnn, with a range of 00:00:00.0000000 through


23:59:59:9999999 and is accurate
to 100 nanoseconds. Storage depends on the precision and scale

selected, and runs from 3 to 5 bytes.



                    --Sample TIME output                        
                    DECLARE @timevariable as TIME
                    SET @timevariable = getdate()                       
                    PRINT @timevariable                        
                    Result: 14:26:52.3100000
                    
DATETIME2




DATETIME2 is very similar to
the older DATETIME data type, but has a greater range and


precision. The format is YYYY-MM-DD
hh:mm:ss:nnnnnnnm with a range of 0001-01-01


00:00:00.0000000 through 9999-12-31
23:59:59.9999999, with an accuracy of 100 nanoseconds.

depends on the precision and scale selected, and runs from 6 to 8 bytes.




        --Sample DATETIME2 output with a precision of 7
        DECLARE @datetime2variable datetime2(7)
        SET @datetime2variable = Getdate()
        PRINT @datetime2variable
        Result: 2008-08-15 14:27:51.5300000
        


DATETIMEOFFSET




DATETIMEOFFSET is similar to
DATETIME2, but includes additional information to track the


time zone. The format is YYYY-MM-DD
hh:mm:ss[.nnnnnnn] [+|-]hh:mm with a range of 0001-


01-01 00:00:00.0000000 through
0001-01-01 00:00:00.0000000 through 9999-12-31 23:59:59.9999999.

Universal Time (UTC), with an accuracy of 100 nanoseconds. Storage depends on the
and scale selected, and
runs from 8 to 10 bytes.

zone aware means a time zone identifier is stored as a part of DATETIMEOFFSET column.
time zone identification
is represented by a [-|+] hh:mm designation. A valid time zone falls in

range of -14:00 to +14:00, and
this value is added or subtracted from UTC to obtain the local 


 
--Sample DATETIMEOFFSET
output with a precision of 0



        --Specify a date, time, and time zone
        DECLARE @datetimeoffsetvariable DATETIMEOFFSET(0)
        SET @datetimeoffsetvariable ='2008-10-03 09:00:00 -10:00'
        --Specify a different date, time and time zone
        
        DECLARE @datetimeoffsetvariable1 DATETIMEOFFSET(0)
        SET @datetimeoffsetvariable1= '2008-10-04 18:00:00 +0:00'
        
        --Find the difference in hours between the above dates, times,and timezones
        SELECT DATEDIFF(hh,@datetimeoffsetvariable,@datetimeoffsetvariable1)                    
        
        Result: 23
        





Spatial

While spatial data has been stored
in many SQL Server databases for many years (using conventional

data types), SQL Server 2008 introduces two specific spatial data types that can
make it easier for
developers
to integrate spatial data in their SQL Server-based applications. In addition, by
storing
spatial data in
relational tables, it becomes much easier to combine spatial data with other kinds
of
business data. For
example, by combining spatial data (such as longitude and latitude) with the

physical address of a
business, applications can be created to map business locations on a map.


The two new spatial data types
in SQL 2008 are:





GEOMETRY: Used to store
planar (flat-earth) data. It is generally used to store XY


coordinates that represent points,
lines, and polygons in a two-dimensional space. For example


storing XY coordinates in the
GEOMETRY data type can be used to map the exterior of a


building.

GEOGRAPHY: Used to store
ellipsoidal (round-earth) data. It is used to store latitude and


longitude coordinates that represent
points, lines, and polygons on the earth’s surface. For


example, GPS data that represents
the lay of the land is one example of data that can be stored


in the GEOGRAPHY data type.



GEOMETRY and GEOGRAPHY
data types are implemented as .NET CLR data types. This means

that they can support various properties and methods specific to the data. For example,
a method
can be used to
calculate the distance between two GEOMETRY XY coordinates, or the distance

between two GEOGRAPHY latitude
and longitude coordinates. Another example is a method to see

if two spatial objects intersect or not. Methods defined by the Open Geospatial
Consortium
standard, and
Microsoft extensions to that standard, can be used. To take full advantage of these

methods, you will have to be
an expert in spatial data.
Another
feature of spatial data types is that they support special spatial indexes. Unlike
conventional
indexes,
spatial indexes consist of a grid-based hierarchy in which each level of the index
subdivides


the grid sector that is defined
in the level above. But like conventional indexes, the SQL Server query

optimizer can use spatial indexes
to speed up the performance of queries that return spatial data.
Spatial
data is an area unfamiliar to many DBAs. If this is a topic you want to learn more
about, you
will need a
good math background, otherwise you will get lost very quickly.







HIERARCHYID


While hierarchical tree structures
are commonly used in many applications, SQL Server has, up to

not made it easy to represent and store them in relational tables. In SQL Server
2008, the


HIERARCHYID data type
has been added to help resolve this problem. It is designed to store

that represent the position of
nodes in a hierarchal tree structure.

For example, the HIERARCHYID data type makes it easier to express the following
types of
relationships
without requiring multiple parent/child tables and complex joins:


  • Organizational structures

  • A set of tasks that make up a larger projects (like a GANTT chart)

  • File systems (folders and their sub-folders)

  • A classification of language terms

  • A bill of materials to assemble or build a product

  • A graphical representation of links between web pages

Unlike standard data types, the HIERARCHYID data type is a CLR user-defined
type, and it exposes
many
methods that allow you to manipulate the date stored within it. For example, there
are methods
to get the
current hierarchy level, get the previous level, get the next level, and many more.
In fact,
the HIERARCHYID
data type is only used to store hierarchical data; it does not automatically
represent a hierarchical
structure. It is the responsibility of the application to create and assign

HIERARCHYID values in a way that
represents the desired relationship. Think of a

HIERARCHYID data type as a place to store positional nodes of a tree structure,
not as a way to
create
the tree structure.






FILESTREAM


SQL Server is great for storing
relational data in a highly structured format, but it has never been

particularly good at storing
unstructured data, such as videos, graphic files, Word documents, Excel

spreadsheets, and so on. In the
past, when developers wanted to use SQL Server to manage such

unstructured data, they essentially had two choices:


  • Store it in VARBINARY(MAX) columns inside the database

  • Store the data outside of the database as part of the file system, and include pointers
    inside
    a column that pointed
    to the file’s location. This allowed an application that needed access

    to the file to find it by looking
    up the file’s location from inside a SQL Server table.Neither of these options was
    perfect. Storing unstructured data in VARBINARY(MAX) columns

    offers less than ideal performance, has a 2 GB size limit, and can dramatically
    increase the size of a

    database. Likewise, storing unstructured data in the file system requires the DBA
    to overcome several
    difficulties. 

For example:


  • Files have a unique naming system that allows hundreds, if not thousands of files
    to be keep
    track of and
    requires very careful management of the folders to store the data.

  • Security is a problem and often requires using NTFS permissions to keep people from
    accessing the files inappropriately.

  • The DBA has to perform separate backups of the database and the files

  • Problems can occur when outside files are modified or moved and the database is
    not updated
    to reflect
    this.



To help resolve these problems,
SQL Server 2008 has introduced what is called FILESTREAM

storage, essentially a hybrid approach that combines the best features of the previous
two options.





Benefits of FILESTREAM

FILESTREAM storage is
implemented in SQL Server 2008 by storing VARBINARY(MAX) binary

large objects (BLOBs) outside of the database and in the NTFS file system. While
this sounds very
similar
to the older method of storing unstructured data in the file system and pointing
to it from a
column, it
is much more sophisticated. Instead of a simple link from a column to an outside
file, the
SQL Server Database
Engine has been integrated with the NTFS file system for optimum
performance and ease of
administration. For example, FILESTREAM data uses the Windows OS

system cache for caching data
instead of the SQL Server buffer pool. This allows SQL Server to do
what it does best: manage structured
data, and allows the Windows OS to do what is does best:

manage large files. In addition, SQL Server handles all of the links between database
columns and
the files,
so we don’t have to.
In
addition, FILESTREAM storage offers these additional benefits:


  • Transact-SQL can be used to SELECT, INSERT, UPDATE, DELETE FILESTREAM data.

  • By default, FILESTREAM data is backed up and restored as part of the database file.
    If you want, there is an option available so you can backup a database without the
    FILESTREAM data.

  • The size of the stored data is only limited by the available space of the file system.
    Standard
    VARBINARY(MAX)
    data is limited to 2 GB.

Limitations of FILESTREAM


As you might expect, using FILESTREAM
storage is not right for every situation. For example, it is

best used under the following conditions:


  • When the BLOB file sizes average 1MB or higher.

  • When fast read access is important to your application.

  • When applications are being built that use a middle layer for application logic.

  • When encryption is not required, as it is not supported for FILESTREAM data.

    If your application doesn’t meet
    the above conditions, then using the standard VARBINARY(MAX) data type might be
    your best option.
    If you
    are used to storing binary data inside your database, or outside your database (but
    using
    pointers inside
    the database that point to the binary files), then you will find using FILESTREAM

    storage to be substantially different.
    You will want to thoroughly test your options before

    implementing one option or the other, in any new applications you build. 

How to Implement FILESTREAM
Storage
Enabling SQL Server
to use FILESTREAM data is a multiple-step process, which includes:


  • Enabling the SQL Server instance to use FILESTREAM data

  • Enabling a SQL Server database to use FILESTREAM data

  • Creating FILESTREAM-enabled columns in a table, by specifying the "VARBINARY(MAX)
    FILESTREAM" data type.




Wednesday, December 30, 2009

Some Sql Server Interview Questions

* Difference between Set and Select
Ans:
1) Set is a ANSI standard for variable assignment.
   Select is a Non-ANSI standard when assigning variables.We can assign only one variable at a time

2) We can assign multiple variable at a time.When assigning from a query that returns more than one value, SET will fail with an error.
   When assigning from a query that returns more than one value, SELECT will assign the last value returned by the query and hide the fact that the query returned

* What is Network Packet Size in SQL
Ans:
NPS(Network Packet Size) is the size of the TDS (tabular data stream) packets used to communicate between your applications and your relational database engine and default packet size is 4 kilobytes and its manily depends on network packet size configuration option.


* How many different locks in SQL SERVER
Ans:
(1)Intent
(2)shared
(3)Update
(4)Exclusive
(5)Schema
(6)Bulk Update

* Diffrence between temp table and table variable
Ans:
(1)Temp Tables are created in the SQL Server TEMPDB database and therefore require more IO resources and locking. Table Variables and Derived Tables are created in memory.

(2)Temp Tables will generally perform better for large amounts of data that can be worked on using parallelism whereas Table Variables are best used for small amounts of data (I use a rule of thumb of 100 or less rows) where parallelism would not provide a significant performance improvement.

(3)You cannot use a stored procedure to insert data into a Table Variable or Derived Table. For example, the following will work: INSERT INTO #MyTempTable EXEC dbo.GetPolicies_sp whereas the following will generate an error: INSERT INTO @MyTableVariable EXEC dbo.GetPolicies_sp.

(4)Derived Tables can only be created from a SELECT statement but can be used within an Insert, Update, or Delete statement.

(5) In order of scope endurance, Temp Tables extend the furthest in scope, followed by Table Variables, and finally Derived Tables.

* What is Cursor
Ans:
  Cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time. For example, you can use cursor to include a list of all user databases and make multiple operations against each database by passing each database name as a variable

* Write some disadvantage of Cursor
Ans:
  Cursor plays there row quite nicely but although there are some disadvantage of Cursor . Because we know cursor doing roundtrip it will make network line busy and also make time consuming methods. First of all select query gernate output and after that cursor goes one by one so roundtrip happen.Another disadvange of cursor are ther are too costly because they require lot of resources and temporary storage so network is quite busy.

* Can you tell me the difference between DELETE and TRUNCATE commands
Ans:
  Delete command removes the rows from a table based on the condition that we provide with a WHERE clause. Truncate will actually remove all the rows from a table and there will be no data in the table after we run the truncate command.

Thursday, December 24, 2009

How to pad a numeric type column with leading zeros?

Execute the following Microsoft SQL Server T-SQL example scripts to demonstrate SQL padding of numbers and strings with leading and trailing zeros as well as with other designated characters, and techniques
for removing (trim) leading zeros. 



-- SQL Server leading zero - T-SQL padding numbers - lpad tsql - sql pad zero

SELECT ListPrice,

  Padded=RIGHT('0000000000' + CONVERT(VARCHAR,ListPrice), 10)

FROM AdventureWorks2008.Production.Product WHERE ListPrice > 0.0

------------

-- T SQL pad leading zeros - transact sql leading zeros

SELECT ProductID, ReorderPoint = REPLACE(STR(ReorderPoint, 6), SPACE(1), '0')

FROM AdventureWorks2008.Production.Product

/* ProductID            ReorderPoint

      1                 000750

      2                 000750

      3                 000600  */

 ------------

-- SQL Server leading zero - SQL leading zeros - sql server pad leading zero

USE AdventureWorks2008;

DECLARE @Number int = 789

SELECT RIGHT ('000000'+ CAST (@Number AS varchar), 6)

-- 000789

------------

USE AdventureWorks;

-- SQL padding salary pay rate money data type with leading zeroes

-- SQL left pad number - prefix number with zeros - sql server leading zero

-- SQL convert number to text - cast number to varchar string

SELECT EmployeeID,

       Rate,

       PaddedRate = RIGHT(REPLICATE('0',8) + CAST(Rate AS VARCHAR(8)),8)

FROM   HumanResources.EmployeePayHistory
/* Partial results

EmployeeID        Rate              PaddedRate

1                 12.45             00012.45

2                 13.4615           00013.46

3                 43.2692           00043.27

*/


-- SQL zero padding ListPrice money data type - t sql leading zero

-- SQL left pad - T-SQL string concatenation - sql concat

-- SQL convert number to string - pad numeric with zeros

SELECT ProductID,

       ListPrice,

       PaddedListPrice = RIGHT(REPLICATE('0', 8) + CAST(ListPrice AS VARCHAR(8)),8)

FROM Production.Product

/* Results sample

ProductID               ListPrice               PaddedListPrice

965                     742.35                  00742.35

*/


-- SQL month leading zero - sql pad month number with zero

SELECT RIGHT('0' + convert(varchar(2), month(GetDate())), 2)

-- 06
----------

-- trim leading zeros - sql trim leading zeros - remove leading zeros sql

USE AdventureWorks2008;

DECLARE @num  varchar(32)= '00091234560'

SELECT      right(@num, len(@num)+1 - patindex('%[^0]%', @num))

-- 91234560

------------

-- SQL pad numeric data type - SQL pad digits - transact sql leading zeros

-- SQL pad with leading zeroes - append leading zeros - T-SQL top function

-- SQL pad with trailing zeroes - MSSQL append trailing zeros

-- SQL cast money to numeric - cast numeric to string - mssql newid function

SELECT TOP (5)

      ProductName = Name,

      ListPrice = RIGHT(REPLICATE('0', 10)

      + CAST(CAST(ListPrice AS NUMERIC(9,3)) AS VARCHAR) ,10)  

FROM AdventureWorks.Production.Product

ORDER BY NEWID()

/* ProductName                      ListPrice

LL Mountain Frame - Black, 40       000249.790

HL Touring Frame - Yellow, 46       001003.910

Bike Wash - Dissolver               000007.950

Metal Sheet 7                       000000.000

HL Road Frame - Red, 56             001431.500

*/

----------


-- PAD leading zeros function - sql server leading zeros - UDF

USE AdventureWorks2008;

GO

CREATE FUNCTION fnPadNumber

               (@n      DECIMAL(26,2),

                @length TINYINT)

RETURNS VARCHAR(32)

AS

  BEGIN

    RETURN ( replicate('0',@length - len(convert(VARCHAR(32),@n))) +

             convert(VARCHAR(32),@n))

  END

GO


SELECT dbo.fnPadNumber(1234567890.12,16)

-- 0001234567890.12
------------

-- T SQL computed column zero padding - sql generate alphanumeric sequence

USE tempdb; -- SQL Server 2008 T-SQL

CREATE TABLE Celebrity (

  ID           INT    IDENTITY ( 1 , 1 )    PRIMARY KEY,

  CelebrityID  AS 'CEL' + RIGHT('0000' + CAST( ID as varchar),5), -- computed column

  FirstName    VARCHAR(32),

  LastName     VARCHAR(32),

  ModifiedDate DATE    DEFAULT getdate())

GO


INSERT Celebrity

      (FirstName,

       LastName)

VALUES('Jennifer','Aniston'),

      ('Drew','Barrymore'),

      ('Diana','Princess of Wales'),

      ('Tom','Jones'),

      ('Lucille','Ball'),

      ('Frank','Sinatra'),

      ('Elvis','Presley')


SELECT * FROM   Celebrity

GO

-- CelebrityID is zero padded alphanumeric sequence

/* 

ID    CelebrityID       FirstName  LastName          ModifiedDate

1     CEL00001          Jennifer    Aniston          2012-07-04

2     CEL00002          Drew        Barrymore        2012-07-04

3     CEL00003          Diana       Princess of Wales 2012-07-04

4     CEL00004          Tom         Jones             2012-07-04

5     CEL00005          Lucille     Ball              2012-07-04

6     CEL00006          Frank       Sinatra          2012-07-04

7     CEL00007          Elvis       Presley          2012-07-04

*/

-- Cleanup demo

DROP TABLE Celebrity

GO
------------

-- SQL removing leading zeros when no spaces in string - trimming Leading Zeros

USE AdventureWorks2008;

DECLARE @NumberString varchar(16)='000000246'

SELECT REPLACE(LTRIM(REPLACE(@NumberString, '0', ' ')), ' ', '0')

-- 246

------------

-- SQL remove leading zeros - sql trim leading zeros - numeric test

DECLARE @StringWithLeadingZeros    VARCHAR(12) = '000000654321'

SELECT CAST(CAST(@StringWithLeadingZeros AS INT) AS VARCHAR(10))

WHERE ISNUMERIC (@StringWithLeadingZeros)=1

-- 654321

------------

-- LPAD & RPAD string scalar-valued user-defined functions (UDF)

USE AdventureWorks;

GO

-- Left pad string function

CREATE FUNCTION LPAD

               (@SourceString VARCHAR(MAX),

                @FinalLength  INT,

                @PadChar      CHAR(1))

RETURNS VARCHAR(MAX)

AS

  BEGIN

    RETURN

      (SELECT Replicate(@PadChar,@FinalLength - Len(@SourceString)) + @SourceString)

  END

GO


-- T-SQL Test left padding

SELECT LeftPaddedString = dbo.LPAD(Cast(84856 AS VARCHAR),12,'0')

GO

-- 000000084856


-- MSSQL right pad string function

CREATE FUNCTION RPAD

               (@SourceString VARCHAR(MAX),

                @FinalLength  INT,

                @PadChar      CHAR(1))

RETURNS VARCHAR(MAX)

AS

  BEGIN

    RETURN

      (SELECT @SourceString + Replicate(@PadChar,@FinalLength - Len(@SourceString)))

  END

GO


-- Test right padding

SELECT RightPaddedString = dbo.RPAD(Cast(84856 AS VARCHAR),12,'*')

GO

-- 84856*******

----------


-- Padding a money column with leading zeroes - sql convert leading zero

-- SQL convert money data type to string

SELECT PaddedUnitPrice = RIGHT(replicate('0',20) +

                         convert(varchar,UnitPrice,1), 20)

FROM Northwind.dbo.Products

/* Partial results


PaddedUnitPrice

00000000000000018.00

00000000000000019.00

00000000000000010.00

00000000000000022.00

*/


/**************** Zero padding other numeric data ****************/


-- SQL Server 2008 version featuring the LEFT function

-- SQL convert integer to text - convert integer to varchar

USE AdventureWorks2008;

DECLARE @InputNumber int = 522, @OutputLength tinyint = 12

DECLARE @PaddedString char(12)

SET @PaddedString = LEFT( replicate( '0', @OutputLength ),

    @OutputLength - len( @InputNumber ) ) + convert( varchar(12), @InputNumber)

SELECT PaddedNumber=@PaddedString

/* Result


PaddedNumber

000000000522

*/



-- SQL format currency and pad with leading spaces

-- SQL Server lpad to 9 characters

SELECT   TOP (3)  ProductName=Name,

                  Price=   CONVERT(char(9), ListPrice, 1)

FROM Production.Product

WHERE ListPrice > 0.0 ORDER BY Newid()

/*

ProductName                         Price

LL Touring Frame - Blue, 62           333.42

LL Road Seat Assembly                 133.34

Road-250 Red, 58                    2,443.35

*/

------------

-- Padding with zeroes in the middle of string

DECLARE @Number varchar(10)

SET @Number = '99999'

PRINT 'TRK' + REPLICATE('0', 12 - LEN(@Number)) + @Number

SELECT [Zero Padding]='TRK' + REPLICATE('0', 12 - LEN(@Number)) + @Number

-- Result: TRK000000099999


-- Applying the STUFF string function for zero padding

-- SQL convert integer data type to string

DECLARE @SerialNo int, @OutputSize tinyint

SET @OutputSize = 10

SET @SerialNo = 6543

SELECT PaddedSerialNo =  STUFF(replicate('0', @OutputSize),

    @OutputSize - len(@SerialNo)+1, len(@SerialNo), convert(varchar(9),@SerialNo))

-- Result: 0000006543

-- SQL pad integer with 0-s

-- SQL str function - numeric to character conversion

SELECT TOP 5

CAST(replace(str(ProductID,6),' ','0') AS char(6)) AS ProdID

FROM AdventureWorks.Production.Product

ORDER BY Name

/* Results

ProdID

000001

000879

000712

000003

000002

*/


-- SQL pad string with character

-- SQL create function

-- SQL user-defined function

CREATE FUNCTION dbo.fnLeftPadString (

      @Input VARCHAR(255),

      @PadChar CHAR(1),

      @LengthToPad TINYINT

      )

RETURNS VARCHAR(255) AS 

BEGIN

DECLARE @InputLen TINYINT

SET @InputLen = LEN(@Input)

RETURN

   CASE

      WHEN @InputLen < @LengthToPad

      THEN REPLICATE(@PadChar, @LengthToPad - @InputLen) + @Input

      ELSE @Input

   END -- CASE

END -- UDF

GO


-- SQL pad string – left padding - SQL left pad with asterisk

-- SQL check printing - SQL currency formatting

DECLARE @DollarAmount varchar(20)

SET @DollarAmount = '234.40'

SELECT PaddedString='$'+dbo.fnLeftPadString(@DollarAmount, '*', 10)

GO

-- Result: $****234.40


-- SQL currency formatting with asterisk-fill

DECLARE @Amount MONEY

SET @Amount = '3534.40'

SELECT CurrencyFormat = '$' + REPLACE(Convert(char(12),@Amount,1),' ','*')

-- $****3,534.40

SELECT      PaddedProductID =

                  dbo.fnLeftPadString (CONVERT(varchar, ProductID), '0', 6),

            ProductName=Name,

            ListPrice

FROM Production.Product

ORDER BY PaddedProductID

GO


/* Partial results


PaddedProductID         ProductName       ListPrice

000757                  Road-450 Red, 48  1457.99

000758                  Road-450 Red, 52  1457.99

000759                  Road-650 Red, 58  782.99

000760                  Road-650 Red, 60  782.99

000761                  Road-650 Red, 62  782.99

*/

----------

-- Generating tracking numbers

-- SQL pad zeroes

WITH cteSequence(SeqNo)

     AS (SELECT 1

         UNION ALL

         SELECT SeqNo + 1

         FROM   cteSequence

         WHERE  SeqNo < 1000000)

SELECT TOP 100 CAST(('TRK' + REPLICATE('0',

               7 - LEN(CAST(SeqNo AS VARCHAR(6)))) +

               CAST(SeqNo AS VARCHAR(6))) AS VARCHAR(10)) AS TrackingNo

FROM   cteSequence

OPTION (MAXRECURSION 0)

GO

/* Partial results


TrackingNo

TRK0000001

TRK0000002

TRK0000003

TRK0000004

TRK0000005

TRK0000006

TRK0000007

*/

----------

-- SQL server pad

-- SQL str function

-- SQL pad integer

-- SQL left pad

SELECT TOP (4) StaffName=LastName+', '+FirstName,

PaddedEmployeeID = REPLACE(STR(EmployeeID, 6, 0), ' ', '0')

FROM HumanResources.Employee e

INNER JOIN Person.Contact c

      ON e.ContactID = c.ContactID

ORDER BY NEWID()

/* Results


StaffName         PaddedEmployeeID

Dyck, Shelley     000214

Hines, Michael    000039

Ford, Jeffrey     000015

Caron, Rob        000168

*/


-- SQL asterisk padding

-- SQL pad with asterisk

-- SQL right pad

SELECT  TOP ( 2 * 2 )

              AddressID

            , City+REPLICATE('*', 20-len(City)) AS City

            , PostalCode

FROM AdventureWorks.Person.[Address]

WHERE LEN(City) <= 20

ORDER by NEWID()

GO

/*

AddressID   City                          PostalCode

13465       Imperial Beach******    91932

23217       Corvallis***********    97330

18548       Milwaukie***********    97222

24893       Goulburn************    2580

*/

------------

------------

-- SQL left pad any size string Alpha with any length string Beta

------------

-- SQL user-defined function - UDF - scalar-valued string function

-- T-SQL varchar(max)

USE AdventureWorks2008;

GO

CREATE FUNCTION fnMaxPad

               (@SourceString  VARCHAR(MAX),

                @PaddingString VARCHAR(MAX),

                @OutputLength  INT)

RETURNS VARCHAR(MAX)

AS

  BEGIN

    DECLARE  @WorkString VARCHAR(MAX) =

      ISNULL(REPLICATE(@PaddingString,@OutputLength - len(ISNULL(@SourceString,0))),

                            '') + @SourceString

    RETURN RIGHT(@WorkString, @OutputLength)

  END

GO


-- Execute UDF

SELECT TOP ( 5 ) [SQL Padding Demo] =

  dbo.fnMaxPad(CONVERT(VARCHAR,FLOOR(ListPrice)),'_/',21)

FROM     Production.Product

ORDER BY NEWID()

GO

/* Results


SQL Padding Demo

_/_/_/_/_/_/_/2443.00

/_/_/_/_/_/_/_/_/0.00

/_/_/_/_/_/_/_/147.00

/_/_/_/_/_/_/_/_/0.00

_/_/_/_/_/_/_/1003.00

*/

------------

-- SQL left pad unicode string Alpha with any length unicode string Beta

-- MSSQL pad international

-- SQL user-defined function - UDF - scalar-value function

-- T-SQL nvarchar(max)

USE AdventureWorks2008;

GO

CREATE FUNCTION fnMaxPadInternational

               (@SourceString  NVARCHAR(MAX),

                @PaddingString NVARCHAR(MAX),

                @OutputLength  INT)

RETURNS NVARCHAR(MAX)

AS

  BEGIN

    DECLARE  @WorkString NVARCHAR(MAX) =

      ISNULL(REPLICATE(@PaddingString,@OutputLength - len(ISNULL(@SourceString,0))),

                            '') + @SourceString

    

    RETURN RIGHT(@WorkString, @OutputLength)

  END

GO


-- Execute UDF

SELECT TOP ( 5 ) [SQL Padding Demo] =

dbo.fnMaxPadInternational(LEFT(Description,6),'_/',21)

FROM     Production.ProductDescription

ORDER BY NEWID()

GO

/* Results


SQL Padding Demo

/_/_/_/_/_/_/_/ล้อ ที่

/_/_/_/_/_/_/_/Roue d

/_/_/_/_/_/_/_/شوكة ط

/_/_/_/_/_/_/_/Each f

/_/_/_/_/_/_/_/Jeu de

*/

------------
Reference from sqlusa.com

Saturday, December 19, 2009

How to configure SQL Server 2005 to allow remote connections?

To configure SQL Server 2005 to allow remote connections, complete all the following steps.
• Enable remote connections on the instance of SQL Server that you want to connect to from a remote computer.
• Turn on the SQL Server Browser service.
• Configure the firewall to allow network traffic that is related to SQL Server and to the SQL Server Browser service.

Enable remote connections for SQL Server 2005 Express

1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Surface Area Configuration.
2. On the SQL Server 2005 Surface Area Configuration page, click Surface Area Configuration for Services and Connections.
3. On the Surface Area Configuration for Services and Connections page, expand Database Engine, click Remote Connections, click Local and remote connections, click the appropriate protocol to enable for your environment, and then click Apply.
Note Click OK when you receive the following message:
Changes to Connection Settings will not take effect until you restart the Database Engine service.
4. On the Surface Area Configuration for Services and Connections page, expand Database Engine, click Service, click Stop. wait until the MSSQLSERVER service stops, and then click Start to restart the MSSQLSERVER service.

Enable the SQL Server Browser service

1.Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Surface Area Configuration.
2.On the SQL Server 2005 Surface Area Configuration page, click Surface Area Configuration for Services and Connections.
3.On the Surface Area Configuration for Services and Connections page, click SQL Server Browser, click Automatic for Startup type, and then click Apply.
Note When you click the Automatic option, the SQL Server Browser service starts automatically every time that you start Microsoft Windows.
4.Click Start, and then click OK.

Create exceptions in Windows Firewall

SQL Server 2005 uses an instance ID as part of the path when you install its program files. To create an exception for each instance of SQL Server, you must identify the correct instance ID. To obtain an instance ID, follow these steps:
1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Configuration Manager.
2. In SQL Server Configuration Manager, click the SQL Server Browser service in the right pane, right-click the instance name in the main window, and then click Properties.
3. On the SQL Server Browser Properties page, click the Advanced tab, locate the instance ID in the property list, and then click OK.
To open Windows Firewall, click Start, click Run, type firewall.cpl, and then click OK.
Create an exception for SQL Server 2005 in Windows Firewall
To create an exception for SQL Server 2005 in Windows Firewall, follow these steps:
1. In Windows Firewall, click the Exceptions tab, and then click Add Program.
2. In the Add a Program window, click Browse.
3. Click the C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\sqlservr.exe executable program, click Open, and then click OK.
Note The path may be different depending on where SQL Server 2005 is installed. MSSQL.1 is a placeholder for the instance ID that you obtained in step 3 of the previous procedure.
4. Repeat steps 1 through 3 for each instance of SQL Server 2005 that needs an exception.
Create an exception for the SQL Server Browser service in Windows Firewall
To create an exception for the SQL Server Browser service in Windows Firewall, follow these steps:
1. In Windows Firewall, click the Exceptions tab, and then click Add Program.
2. In the Add a Program window, click Browse.
3. Click the C:\Program Files\Microsoft SQL Server\90\Shared\sqlbrowser.exe executable program, click Open, and then click OK.

---
Hope this helps everyone.

Friday, December 18, 2009

Differences between varchar and nvarchar in SQL Server

=>SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.
To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.

To solve this problem, Microsoft introduced the VARCHAR(MAX),  NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.



The broad range of data types in SQL Server can sometimes throw people through a loop, especially when the data types seem to be highly interchangeable. Two in particular that constantly spark questions are VARCHAR and NVARCHAR: what's the difference between the two, and how important is the difference? VARCHAR is an abbreviation for variable-length character string. It's a string of text characters that can be as large as the page size for the database table holding the column in question. The size for a table page is 8,196 bytes, and no one row in a table can be more than 8,060 characters. This in turn limits the maximum size of a VARCHAR to 8,000 bytes.
The "N" in NVARCHAR means uNicode. Essentially, NVARCHAR is nothing more than a VARCHAR that supports two-byte characters. The most common use for this sort of thing is to store character data that is a mixture of English and non-English symbols — in my case, English and Japanese.
The key difference between the two data types is how they're stored. VARCHAR is stored as regular 8-bit data. But NVARCHAR strings are stored in the database as UTF-16 — 16 bits or two bytes per character, all the time — and converted to whatever codepage is being used by the database connection on output (typically UTF-8). That said, NVARCHAR strings have the same length restrictions as their VARCHAR cousins — 8,000 bytes. However, since NVARCHARs use two bytes for each character, that means a given NVARCHAR can only hold 4,000 characters (not bytes) maximum. So, the amount of storage needed for NVARCHAR entities is going to be twice whatever you'd allocate for a plain old VARCHAR.
Because of this, some people may not want to use NVARCHAR universally, and may want to fall back on VARCHAR — which takes up less space per row — whenever possible.


Tuesday, November 24, 2009

Timestamp (rowversion) Data Type

A table can have only one timestamp column. The value in the timestamp column is updated every time a row containing a timestamp column is inserted or updated. Of course this means we should not use it as a primary key, because we can get many orphans quickly if updates are performed on other columns. As a row is modified in a table, the timestamp is updated with the current database timestamp value obtained from the @@DBTS function.

Timestamp is the wrong name, quite confusing in fact. It has nothing to do with time. Microsoft will rename it rowversion in the future. Rowversion is the synonym for timestamp in SQL Server 2005 and SQL Server 2008.

It is an 8 bytes unique binary key within the database.

Here is how it looks like: 0x0000000000000743. After an update to the row: 0x0000000000000745.
The rowversion(timestamp) starts changing as soon as the transaction begins. If the transaction is rolled back, it returns to the original value.

So how can we use it?

The main purpose is row versioning in multi user environment, in other words concurrency checking.
Assume you are a developer and developing a program in Visual Basic to update the name and address table of customers. There will be 100 staff member who can perform this application function. How can you be sure that while staff A typing in the change, staff X is not changing the same row?

Here is what you do:

1. Read the name and address table including the timestamp. You display the info to the user for update and save the timestamp.
2. Certain amount of time later, like 2 minutes, the user presses the submit button after changes were typed in.
3. You open a transaction with Begin Transaction
4. You read the timestamp of the name and address row
5. You compare the current timestamp to the saved timestamp.
6. If the timestamps are same, you update the row and commit the transaction
7. If timestamps are different, you roll back the transaction and notify the user about the fact that the data was changed by someone else. You can let the user decide what to do or follow the appropriate company business rule for data entry conflict resolution.
This is pretty common practice in multi user environment. The alternate would be to examine a datetime column, or the entire row which is more processing intensive.

The following example shows timestamp (rowversion in SQL Server 2008) in action:
-- SQL Server 2008 T-SQL Code
USE tempdb;

-- SQL create table for Concurrency Checking demo
CREATE TABLE Celebrity (
CelebrityID INT     IDENTITY     PRIMARY KEY,
FirstName   VARCHAR(25),
LastName     VARCHAR(30),
VERSIONSTAMP  ROWVERSION)
GO

-- SQL insert - populate table
INSERT Celebrity (FirstName, LastName)
VALUES
('Jessica', 'Simpson'),
('Nick', 'Carter'),
('Stevie', 'Brock'),
('Christina', 'Aguilera'),
('Frank','Sinatra'),
('Doris','Day'),
('Elvis', 'Presley')
GO

SELECT * FROM Celebrity
GO

/* Results
CelebrityID FirstName  VERSIONSTAMP
1             Jessica      Simpson      0x0000000000000876
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/

-- SQL update demo: SOMEONE UPDATED RECORD since it was read
CREATE TABLE #Semaphore (ID int identity(1,1) primary key,
StartVersion bigint,
PK int)
DECLARE @MyKey int
INSERT INTO #Semaphore (StartVersion, PK)
SELECT  VERSIONSTAMP, 1
FROM Celebrity WHERE CelebrityID=1
SELECT @MyKey = SCOPE_IDENTITY()

-- SIMULATION: somebody else updating the same record
UPDATE Celebrity
SET     FirstName = 'Celine',
LastName = 'Dion'
WHERE  CelebrityID = 1

-- We are attempting to update.
BEGIN TRANSACTION
IF (SELECT StartVersion
FROM   #Semaphore
WHERE  ID = @MyKey) = (SELECT VERSIONSTAMP
FROM   Celebrity
WHERE  CelebrityID = 1)
BEGIN
UPDATE Celebrity
SET     FirstName = 'Lindsay',
LastName = 'Lohan'
WHERE  CelebrityID = 1

COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK - UPDATE CONFLICT'
RAISERROR ('Celebrity update conflict.',10,0)
END
DELETE #Semaphore WHERE ID = @MyKey
SELECT * FROM   Celebrity
GO
/* CelebrityID     FirstName   LastName     VERSIONSTAMP
1             Celine       Dion         0x000000000000087D
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/

-- SQL UPDATE with NO CONFLICT
DECLARE @MyKey int
INSERT INTO #Semaphore (StartVersion, PK)
SELECT  VERSIONSTAMP, 1
FROM Celebrity WHERE CelebrityID=1
SELECT @MyKey = SCOPE_IDENTITY()

-- We are trying to update.
BEGIN TRANSACTION
IF (SELECT StartVersion
FROM   #Semaphore
WHERE  ID = @MyKey) = (SELECT VERSIONSTAMP
FROM   Celebrity
WHERE  CelebrityID = 1)
BEGIN
UPDATE Celebrity
SET     FirstName = 'Lindsay',
LastName = 'Lohan'
WHERE  CelebrityID = 1

COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK - UPDATE CONFLICT'
RAISERROR ('Celebrity update conflict.',10,0)
END
DELETE #Semaphore WHERE ID = @MyKey
SELECT * FROM   Celebrity
GO
/*
CelebrityID FirstName   LastName     VERSIONSTAMP
1             Lindsay      Lohan        0x000000000000087E
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/
-- Cleanup
DROP TABLE #Semaphore
DROP TABLE Celebrity

---
got mail from http://www.sqlusa.com/register/

Saturday, November 21, 2009

Time Delay While Running T-SQL Query – Sleep Facility in SQL Server

Today, When I was created one store procedure, I fetched one problem. I have created on store procedure which will
generate 25 digits Primary key based on date and time.

I have to generate the primary key for 100k records to insert into another table.
I was using my Store procedure to generate the ID, but it gives me error which is very familiar for us.

Violation of PRIMARY KEY constraint


Now i have searching for function or way to wait for seconds or mili seconds in sql which is vary familiar in our programming language Sleep method to wait for some time before completion of code.


Today I will explain one very small but interesting feature of SQL Server. This is not much known feature of SQL Server. In SQL Server sometime, there are requirement like I described above when T-SQL script has to wait for some time before executing next statement. It is quite common that developers depends on application to take over this delay issue. However, SQL Server itself has very strong time management function of WAITFOR. Let us see two usage of WAITFOR clause.

Official explanation of WAITFOR clause from Book Online is “Blocks the execution of a batch, stored procedure, or transaction until a specified time or time interval is reached, or a specified statement modifies or returns at least one row.”

Please run the following query into your sql server query analyzer.


SELECT GETDATE() WAITFOR DELAY '00:00:00:010' 
-- wait for 10 mill seconds to run the below code SELECT GETDATE()


--
Happy programming

Tuesday, November 17, 2009

Update data in one table with data from another table

How to update more than one column in a table with values
from columns in another table and explains how to do it in the three RDBMS that we support.

Table Structures and values:

TableA has four columns: a, b, c, d (a is the primary key column)
TableB has five columns: a1, b1, c1, d1, e1 (a1 and b1 together constitute the primary key for this table)

The foreign key relationship between the two tables is based on A.a = B.a1

The data in these 2 tables is as follows:
I.    TableA
a    b    c    d
1    x    y    z
2    a    b    c
3    t    x    z

II.    TableB
a1    b1    c1    d1    e1
1    x1    y1    z1    40
2    a1    b1    c1    50

The requirement is to write a SQL to update columns b, c and d in TableA from the columns b1, c1 and d1 from TableB where-ever the join condition satisfies and e1 > 40 in TABLEB.


SQL Server:

UPDATE TABLEA
SET     b = TABLEB.b1,
c = TABLEB.c1,
d = TABLEB.d1
FROM TABLEA, TABLEB
WHERE TABLEA.a = TABLEB.a1
AND TABLEB.e1 > 40
GO

Note: This is an extension in SQL Server i.e. the FROM clause – it does make it simple to understand and is a nice feature.

Results after the update:

a    b    c    d
————————————
1     x          y           z
2     a1        b1         c1
3     t           x           z



NOTE:

It is very important to make sure that your where clause for the update statement is correct since
that is what identifies the records that the update statement is going to qualify and do the update upon.
If it is incorrect, then you can get wrong results.  The reason I am mentioning this is
because I have seen people write wrong where clauses and then wondering what went wrong
because they specified the correct condition in the SET clause.

In the above example,
if the Where condition was omitted, the other record’s columns would be updated to NULL value
and this will be the final result set:

a    b    c    d
————————————
1     Null      Null      Null
2     a1        b1         c1
3     Null     Null      Null

--
happy programming

Monday, November 16, 2009

Date and time types in SQL Server

Prior to SQL Server 2008, there are two types in SQL Server to handle date and time. Since I will be referring to these types a lot, I introduce an abbreviation for each type in below two tables (the "sn" column):
Name
sn
Minimum value
Maximum value
Accuracy
Storage
smalldatetime
sdt
1900-01-01 00:00:00
2079-06-06 23:59:00
minute
4 bytes
datetime
dt
1753-01-01 00:00:00.000
9999-12-31 23:59:59.997
3.33 ms
8 bytes

Note that there is no type to store only date or only time. Above types, have both a date and a time portion.
If you only specify a date portion, then SQL Server will store 00:00:00.000 in the time portion.
And if you only specify a time portion, SQL Server will store 1900-01-01 in the date portion.

Above is important. Read it again.

SELECT CAST ('20041223' AS datetime)
-----------------------
2004-12-23 00:00:00.000

SELECT CAST ('14:23:58' AS datetime)
-----------------------
1900-01-01 14:23:58.000


As of SQL Server 2008, we have several new types related to date and time:
Name
sn
Minimum value
Maximum value
Accuracy
Storage
datetime2
dt2
0001-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
100ns
6-8 bytes
date
d
0001-01-01
9999-12-31
day
3 bytes
time
t
00:00:00.0000000
23:59:59.9999999
100ns
3-5 bytes
datetimeoffset
dto
0001-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
100ns
8-10 bytes

  • As you can see, we have finally a date-only and a time-only type.
  • Datetime2 is a "better datetime" for various reasons, and it does not cost any more storage than datetime - potentially less!
  • For the new types, which include a time portion, you can specify a "fractional seconds precision" part, which specifies how many digits you have to the right of the comma after the second. So, for instance time (3) can store a value like 14:23:12.567, where if you input 14:23:12.5677 it will be rounded to 14:23:12.568.
  • The new datetimeoffset includes a time zone offset part.

Date and time formats

A common misconception is that SQL Server stores these datatypes in some particular readable format. That is not the case. SQL Server stores these values in an internal format (for instance two integers for datetime and smalldatetime). However, when you use T-SQL to specify a value (in an INSERT statement, for instance) you express it as a string. And there are rules for how different datetime string formats are interpreted by SQL Server. But note that SQL Server does not remember the format in any way.



Date and time formats for input

There are many formats available for us to specify a date/time/datetime value. Some are "better" than others are and as you read this article, I hope you will understand what I mean by "better". Interestingly enough, all formats are available for all types. So even a time-only format is allowed for a date-only type etc.
(I am ignoring the time zone offset part, which is only used for the datetimeoffset type - you can read more about this in Books Online.)

Name
sn
Format
SET DATEFORMAT dependent
SET LANGUAGE dependent
language neutral
Unseparated
u
'19980223 14:23:05'
none
none
all
Separated
s
'02/23/1998 14:23:05'
all
all
no
ANSI SQL
ansisql
'1998-12-23 14:23:05'
sdt, dt
sdt, dt
not for sdt and dt
Alphabetic
a
'23 February 1998 14:23:05'
none
all (month name)
no
ODBC datetime
odt
{ts '1998-02-23 14:23:05'}
none
none
all
ODBC date
od
{d '1998-02-23'}
none
none
all
ODBC time
ot
{t '14:23:05'}
none
none
all
ISO 8601
iso
'1998-02-23T14:23:05'
none
none
all
Time
t
'14:23:05'
'2:23:05 PM'
none
none
all

  • Note that ANSI SQL is really only a special case of the separated (aka "numeric") format, which allow for separators dash (-), slash (/) and dot (.). But since this is the only format defined in the ANSI SQL standard, I think it is worth mentioning this as a special case.
  • Most formats allow for dropping the date and/or the time portion, and in some cases, it can look a bit ... strange. It might seem stupid to specify for instance '2008-08-25' for a time type, but doing that result in same as specifying nothing in the datetime string, ''. Consider below:
    SELECT CAST ('' AS time)
    SELECT CAST ('2008-08-25' AS time)
    above two gives, us the same result (time 00:00:00).
  • The ODBC formats are different in the sense that they have a marker (literal_type, t, d or ts), you need to define correctly depending on whether you specify both date and time, date only or time only.
  • The ISO 8601 format requires both a date and a time portion.
  • SET DATEFORMAT inherits its setting from SET LANGUAGE (but an explicit SET DATEFORMAT will override later SET LANGUAGE). The language defaults to each login's language. Default language for a login is specified using sp_configure.
  • The rules regarding the format for the date part and the new types can be confusing. Microsoft wanted to make the new date related types (date, datetime2 and datetimeoffset) less dependent on these settings and more ANSI SQL compliant. End result is that these new types are language neutral for a separated datetime literal as long as the year comes first. SQL Server need to determine that this is the year part, so it requires 4 numbers year (yyyy, not yy). If that is the case, then the string will be interpreted as year first, then month and finally day - regardless of DATEFORMAT or language setting. But if the month part comes first, then DATEFORMAT and language setting will be "honored":
SET LANGUAGE British --uses dmy
GO
SELECT CAST('02-23-1998 14:23:05' AS date) --Error
GO
SELECT CAST('2/23/1998 14:23:05' AS date) --Error
GO

SELECT CAST('1998-02-23 14:23:05' AS date) --Ok
GO
SELECT CAST('1998.02.23 14:23:05' AS date) --Ok
GO
SELECT CAST('1998/02/23 14:23:05' AS date) --Ok
GO


The first two fails because the year doesn't come first (and there are not 23 months in 1998). The last three doesn't fail because the year comes first (and we are using one of the new style date related types).
Crystal clear, huh? :-)
The available formats are documented in Books Online, so I won't go into details about each format. Here's a link to the online version of Books Online, see each subsection for the different formats.

Note that the ODBC time only format will give the current date, not 1900-01-01 as the other time-only formats.

Some conclusions:
  • The Alphabetic format is especially bad since this is SET LANGUAGE dependent for all types.
  • The smalldatetime and datetime types are unfortunate since they are dependent on not only SET LANGUAGE but also SET DATEFORMAT (including the login's language setting).
  • You might think that it doesn't matter if a format/type combination is SET LANGUAGE dependent or not since "I never execute that SET LANGUAGE command anyhow". But note very carefully that the default for DATEFORMAT is drawn from the login specification (CREATE LOGIN etc.). Imagine for instance an app developed in the US with login having default language (us_english) using m/d/y format and date time. Somebody in, say, Germany installs it and bang! The app does not work anymore. Do I have your attention now? :-)
  • The types introduced in 2008 are friendlier since they are not SET DATEFORMAT dependent if you for a separated format specify the year first. For the separated formats (including ANSI SQL), the new types always interprets month before date.


Recommendations for input

As you probably realize, you don't want to use a combination of format and type, which isn't language neutral unless you make sure that you have the correct DATEFORMAT and LANGUAGE settings in your applications. Be aware that setting these options inside a stored procedure can cause a recompile of the procedure at execution time!
My recommendation is to use a language neutral format. I typically use the unseparated format, mostly out of old habit. The ISO 8601 format has the advantage of being defined in an international standard.

SQL Server 2008 has just been released at the time of writing this, but I will probably move towards date, time, datetime2 and possibly datetimeoffset over time. Over time, I might be using the ANSI SQL format (being ANSI SQL compliant) with the new types, but i'd rather that ANSI SQL supports the ISO 8601 format. I will play it safe and use unseparated or ISO 8601 for a number of years to come - even when working with the new types.



Warnings and common misconceptions


Let me say this again: you don't want to use a combination of format and type, which isn't language neutral unless you make sure that you have the correct DATEFORMAT and LANGUAGE settings in your applications.

The Numeric format can use dash (-), dot (.) or slash (/) as separator. The rules for how SQL Server parses the string doesn't change depending on the separator. A common misconception is that the ANSI SQL format (sometime a bit incorrectly referred to as the "ISO format"), 1998-02-23, is language neutral for smalldatetime and datetime. It isn't. It is a numeric format and hence it is dependent on the SET DATEFORMAT and SET LANGUAGE setting:

SET LANGUAGE us_english
SELECT CAST('2003-02-28' AS datetime)
-----------------------
2003-02-28 00:00:00.000

SET LANGUAGE british
SELECT CAST('2003-02-28' AS datetime)


Server: Msg 242, Level 16, State 3, Line 4
The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value.




Please read the error message closely. It says exactly what the problem is. You specify the datetime value as a string, and because the string isn't formed according to the format you are using and the DATEFORMAT setting, SQL Server cannot convert the string to a datetime value.


Output of datetime values

When a datetime value leaves SQL Server is has no readable format. It is the client application that formats the value so it is human-readable. Some applications/developer environments will respect the regional settings of the client machine and format the value accordingly. But this is out of SQL Server's control. You can return the data from SQL Server using a specific format, but that requires you to convert it to a string, and the value will not be of the datetime type anymore. If you want to do this, use the CONVERT function, which has a useful 3:rd parameter that defines the format when you convert from datetime to a string. A few examples:



SELECT CONVERT(char(8), CURRENT_TIMESTAMP, 112)
--------
20040312

SELECT CONVERT(char(19), CURRENT_TIMESTAMP, 120)
-------------------
2004-03-12 18:08:43


SELECT CONVERT(char(10), CURRENT_TIMESTAMP, 103)
----------
12/03/2004

Searching for datetime values

The problem with datetime searches often comes from the fact that if you use smalldatetime, datetime, datetime2 or datetimeoffset then there is both a date and a time portion. Let's create table that we will use in a couple of examples:

CREATE TABLE #dts(c1 char(1), dt datetime)
INSERT INTO #dts (c1, dt) VALUES('a', '20040305 09:12:59')
INSERT INTO #dts (c1, dt) VALUES('b', '20040305 16:03:12')
INSERT INTO #dts (c1, dt) VALUES('c', '20040306 00:00:00')
INSERT INTO #dts (c1, dt) VALUES('d', '20040306 02:41:32')
INSERT INTO #dts (c1, dt) VALUES('e', '20040315 11:45:17')
INSERT INTO #dts (c1, dt) VALUES('f', '20040412 09:12:59')
INSERT INTO #dts (c1, dt) VALUES('g', '20040523 11:43:25')
Say that we want to find all rows from 2004-03-15. A common mistake is to search like this:

SELECT c1, dt FROM #dts WHERE dt = '20040305'

No rows returned. Why? What happened? We have two different types in the WHERE clause, a datetime type (the column side) and a string (the right side). SQL Server will convert one to the other according to the documented "Data Type Precedence" in Books Online. Datetime has higher precedence than a string so the string will first be converted to a datetime type. As we didn't specify a time portion, the string will be converted to the datetime value 2004-03-05 00:00:00. There are no rows with that datetime value.
OK, another alternative. We use the CONVERT function on the column side and convert it to a string so it matches the string format on the right side:


SELECT c1, dt FROM #dts WHERE CONVERT(char(8), dt, 112) = '20040305'

That returned the expected rows, but there's another problem. Since we do a calculation on the column side (using the CONVERT function), SQL Server cannot use an index to support this search criteria. This can be disastrous for performance! So let's try BETWEEN:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040306'

Because BETWEEN is inclusive, we also returned the row from 2004-03-06 00:00:00. Let's try to handle that:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:59.999'

What? We still got the row from 2004-03-06 00:00:00 back. Why? We need to look at the type definitions again. The resolution for the datetime type is 3.33 ms. That means that 2004-03-05 23:59:59.999 will be rounded up to 2004-03-06 00:00:00.000. Not what we expected. To get around that rounding problem, we try:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:59.997'

Above return the expected rows, but I would not call it intuitive and easy to read. Also, if the type in the table is smalldatetime instead of datetime, the time portion will again be rounded up, hence not producing the expected rows. So you find yourself having two ways of doing this dependent on whether you have datetime or smalldatetime. If the type is smalldatetime, you would use:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:00'

Having two different ways of doing this depending on the datatype is not something I recommend. And, who knows if Microsoft in the future will add types with higher precision for the time portion (which they did in SQL Server 2008)? Again you would have to adjust the time portion in the WHERE clause. Below you find my recommendation:

SELECT c1, dt FROM #dts WHERE dt >= '20040305' AND dt < '20040306'

We get the expected rows back and it is easy to understand what the criteria means. And SQL Server can use an index on the dt column. This is also easy to expand for searches on greater ranges. Say we want to return all rows for March 2004:

SELECT c1, dt FROM #dts WHERE dt >= '20040301' AND dt < '20040401'

Same logic. Here's another way to search for all rows from a specific month:

SELECT c1, dt FROM #dts WHERE DATEPART(year, dt) = 2004 AND DATENAME(month, dt) = 'March'

Above returns the expected rows, but there are two problems. Since there is a calculation on the datetime column, SQL Server cannot use an index on the dt column to support the query. And, the DATENAME function is dependent on SET LANGUAGE.

Getting rid of the time portion

It is very common that you only want to work with date, and you want to get rid of the time portion. As you should know by now it is impossible if you are a using smalldatetime, datetime, datetime2 or datetimeoffset type. If you are using date, then no problem - this is what date is for, after all! But if you insist for some reason to use smalldatetime, datetime, datetime2 or datetimeoffset then perhaps setting the time portion to 00:00:00 is fine for you? Here's my suggestion for an expression that returns a string that can be (implicitly) converted to a datetime value, regardless of SET LANGUAGE or DATEFORMAT:


CONVERT(char(8), CURRENT_TIMESTAMP, 112)

As you see, I use CONVERT with conversion code 112, which return the date in unseparated format, as a string. And you already know that the unseparated format is interpreted regardless of datetime or language settings when converted to datetime. Let us wrap above in a CAST to convert it to datetime:

SELECT CAST(CONVERT(char(8), CURRENT_TIMESTAMP, 112) AS datetime)


In case you truly only want to return the date and no time portion to the client application, then you have to return a string instead (or as of 2008, the date type). Again, read about the different conversion codes for the CONVERT function and you will probably find a format which suits you.


Hugo Kornelis had some feedback on this topic and suggests another technique for setting the time portion to 00:00:00. The idea is to decide on a base date (any date within SQL Server's datetime range will do), specify it so that time portion is 00:00:00. You first take the difference between today and this base date, in days. Then you add this number of days to the base date. Algorithm: [today]-[refdate]+[refdate], or expressed with TSQL (using 20040101 as base date):

SELECT DATEADD(day, DATEDIFF(day, '20040101', CURRENT_TIMESTAMP), '20040101')

An argument for above is that is is faster than doing string manipulations. My opinion has been that the difference is so small (a few microseconds according to tests I did about year 2000) so that this would probably not even be measurable. However SQL Server MVP Dan Guzman emailed me with some to me surprising information. Here's a quote from the email:

"I've found that the DATEADD technique significantly outperforms the string method when the expression is applied to a column in large queries. For example, I often use the DATEADD method to group by date when the datetime column includes a time component. I haven't tested the performance recently but I recall it was minutes in some cases."

Dan is someone I trust very much, so I guess I have some re-learning to do (start using the DATEADD method instead of the string method). Dan also followed up with a script file showing this. Here it is (with some minor formatting modifications made by me). My tests show consistent results on 2005 and 2008 where for 10,000,000 rows the DATEADD version took about 6 seconds and the string version took about 13 seconds. Dan reported even greater difference on 2000.


2009-04-03, I received an email from HÃ¥kan Borneland. He mentioned that is is common, for instance in the MSDN forums, to use first alternative below, instead of second alternative (which I had originally in my performance test script):
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP))
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, '', CURRENT_TIMESTAMP))

Note that both these differs from the DATEDIFF version discussed slightly higher above. So now we have 4 different ways to accomplish this. "My" original string manipulation method and three DATEDIFF alternatives:

SELECT CAST(CONVERT(CHAR(8), CURRENT_TIMESTAMP, 112) AS datetime) --1
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, '', CURRENT_TIMESTAMP)) --2
SELECT DATEADD(DAY, DATEDIFF(DAY, '20040101', CURRENT_TIMESTAMP), '20040101') --3
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)) --4

If you feel that you do operations so that performance can be a major concern, then I suggest you test. Here is an updated performance test script, with all four versions. Alternative 1 is slowest and the three DATEDIFF alternatives (2-4) are pretty close with alternative 4 as the winner. I also suggest you consider readability, which can be a matter of oppinion.

The DATEADD alternative also has the advantage of flexibility. Instead of re-typing the suggestions from Hugo, I've just copied the original text from Hugo in the email he sent to me:

"The version that uses string conversion can only be used to strip the time from the date. The dateadd/datediff version can easily be varied to get the start of the current month, quarter or year. SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '20000101') Or to discard only minutes and seconds, or only seconds SELECT DATEADD(hour, DATEDIFF(hour, '20000101', CURRENT_TIMESTAMP), '20000101') (Note - when dealing with minutes, seconds or milliseconds, the DATEDIFF function CAN overflow the integer range - the date constant might have to be adapted to the expected set of datetime values). Further variations are also possible: How to get the last day of the previous month? And the last day of the current month?

SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '19991231')
SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '20000131')

How to get tomorrow's date (without time)?

SELECT DATEADD(day, DATEDIFF(day, '20000101', CURRENT_TIMESTAMP), '20000102')

How to round the datetime to the nearest hour, or to the nearest day?

SELECT DATEADD(hour, DATEDIFF(hour, '20000101', DATEADD(minute, 30, CURRENT_TIMESTAMP)), '20000101')
SELECT DATEADD(day, DATEDIFF(day, '20000101', DATEADD(hour, 12, CURRENT_TIMESTAMP)), '20000101')

How to get last Friday's date, without using a calendar table and regardless of the current DATEFIRST setting?

SELECT DATEADD(day, (DATEDIFF (day, '20000107', CURRENT_TIMESTAMP) / 7) * 7, '20000107')
or
SELECT DATEADD(day, (DATEDIFF (day, '20000108', CURRENT_TIMESTAMP) / 7) * 7, '20000107')

The first will return the current day if run on friday, the latter will return the previous friday.

Once you understand the general principle, it isn't hard to come up with new variations, that might be useful in specific circumstances.

"Tip: Always set the time to the same value

If you aren't interested in the time portion and you can't use the date type, you can set the time portion to a standard value for all rows. I recommend 00:00:00.000. Say you want to set the value to the current date when the INSERT is performed. For that you can use a DEFAULT constraint:

CREATE TABLE #dts(id INT IDENTITY, d_date DATETIME DEFAULT CONVERT(char(8), CURRENT_TIMESTAMP, 112))

INSERT #dts DEFAULT VALUES
SELECT * FROM #dts

I'm using the code 112 in the CONVERT() function above, which returns a string with the date only, in the unseparated format. However, we have not protected us from if anyone now explicitly sets the value and set the time portion to something other than 00:00:00:000. We can do that with a constraint:


ALTER TABLE #dts ADD CONSTRAINT dts_chk_d_date CHECK(CONVERT(char(12), d_date, 114) = '00:00:00:000')

What advantage does above give us? If we now want to search for all rows from a certain date, we can write the WHERE clause like:

WHERE dt = '20040305'

Above might be considered easier to read and understand. Also, say you want to join between two table based on the day only, you can now do this directly. If you have a time portion that varies, you would have to use some CONVERT in the JOIN operation which again can be disastrous for performance.
Above technique can also be applied if you are only interested in the time portion, in that case I recommend that you always set the date portion to 1900-01-01.
Now, isn't it great that we as of 2008 have separate data and time types?

Other tips, from SQL Server MVP Steve Kass

The super-careful can specify all dates in queries as smalldatetime, using cast or convert. But note that this may have unexpected behavior in SQL Server 7.0 against a (not small-) datetime column.

Without a very good reason and comments everywhere, store dates only as datetime or smalldatetime, and not as strings. If dates must be stored as strings (there are some good reasons to, but they are few), use a language neutral format and a 24-hour clock if time is also stored, so that the strings work everywhere and also sort the way the dates sort. For example, you use "dt < '20030101'" in the WHERE clause and the query doesn't return expected rows because dt is stored as a string in yyyy-mm-dd format.

The base date in Enterprise Manager (and some other Microsoft tools and applications) is 1899-12-30. EM should never be used for data entry, and rarely if ever for viewing data. If you use EM to specify time-only data, EM will store the date 1899-12-30. If you really want to use EM, specify the date 1900-01-01 explicitly. Note also that EM will only show the time portion if the date is 1899-12-30, not what you expect if you use 1900-01-01 as a standard value for date.


Why is 1753 the earliest date for datetime?

Good question. It is for historical reasons. In what we sometimes refer to as the western world, we have had two calendars in modern time: the Julian and the Gregorian calendars. These calendars were a number of days apart (depending on which century you look at), so when a culture that used the Julian calendar moved to the Gregorian calendar, they dropped from 10 to 13 days. Great Britain made this shift in 1752 (1752-09-02 were followed by 1752-09-14). An educated guess why Sybase selected 1753 as earliest date is that if you were to store an earlier date than 1753, you would also have to know which country and also handle this 10-13 day jump. So they decided to not allow dates earlier than 1753. Note, however that other countries did the shift later than 1752. Turkey, for instance, did it as late as 1927.

Being Swedish, I find it a bit amusing that Sweden had the weirdest implementation. They decided to skip the leap day over a period of 40 years (from 1700 to 1740), and Sweden would be in sync with the Gregorian calendar after 1740 (but meanwhile not in sync with anyone). However, in 1704 and 1708 the leap day was not skipped for some reason, so in 1712 which was a leap year, they inserted yet an extra day (imagine being born in Feb 30!) and then did the shift over a day like everyone else, in 1753.