Tuesday, November 24, 2009

Timestamp (rowversion) Data Type

A table can have only one timestamp column. The value in the timestamp column is updated every time a row containing a timestamp column is inserted or updated. Of course this means we should not use it as a primary key, because we can get many orphans quickly if updates are performed on other columns. As a row is modified in a table, the timestamp is updated with the current database timestamp value obtained from the @@DBTS function.

Timestamp is the wrong name, quite confusing in fact. It has nothing to do with time. Microsoft will rename it rowversion in the future. Rowversion is the synonym for timestamp in SQL Server 2005 and SQL Server 2008.

It is an 8 bytes unique binary key within the database.

Here is how it looks like: 0x0000000000000743. After an update to the row: 0x0000000000000745.
The rowversion(timestamp) starts changing as soon as the transaction begins. If the transaction is rolled back, it returns to the original value.

So how can we use it?

The main purpose is row versioning in multi user environment, in other words concurrency checking.
Assume you are a developer and developing a program in Visual Basic to update the name and address table of customers. There will be 100 staff member who can perform this application function. How can you be sure that while staff A typing in the change, staff X is not changing the same row?

Here is what you do:

1. Read the name and address table including the timestamp. You display the info to the user for update and save the timestamp.
2. Certain amount of time later, like 2 minutes, the user presses the submit button after changes were typed in.
3. You open a transaction with Begin Transaction
4. You read the timestamp of the name and address row
5. You compare the current timestamp to the saved timestamp.
6. If the timestamps are same, you update the row and commit the transaction
7. If timestamps are different, you roll back the transaction and notify the user about the fact that the data was changed by someone else. You can let the user decide what to do or follow the appropriate company business rule for data entry conflict resolution.
This is pretty common practice in multi user environment. The alternate would be to examine a datetime column, or the entire row which is more processing intensive.

The following example shows timestamp (rowversion in SQL Server 2008) in action:
-- SQL Server 2008 T-SQL Code
USE tempdb;

-- SQL create table for Concurrency Checking demo
CREATE TABLE Celebrity (
CelebrityID INT     IDENTITY     PRIMARY KEY,
FirstName   VARCHAR(25),
LastName     VARCHAR(30),
VERSIONSTAMP  ROWVERSION)
GO

-- SQL insert - populate table
INSERT Celebrity (FirstName, LastName)
VALUES
('Jessica', 'Simpson'),
('Nick', 'Carter'),
('Stevie', 'Brock'),
('Christina', 'Aguilera'),
('Frank','Sinatra'),
('Doris','Day'),
('Elvis', 'Presley')
GO

SELECT * FROM Celebrity
GO

/* Results
CelebrityID FirstName  VERSIONSTAMP
1             Jessica      Simpson      0x0000000000000876
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/

-- SQL update demo: SOMEONE UPDATED RECORD since it was read
CREATE TABLE #Semaphore (ID int identity(1,1) primary key,
StartVersion bigint,
PK int)
DECLARE @MyKey int
INSERT INTO #Semaphore (StartVersion, PK)
SELECT  VERSIONSTAMP, 1
FROM Celebrity WHERE CelebrityID=1
SELECT @MyKey = SCOPE_IDENTITY()

-- SIMULATION: somebody else updating the same record
UPDATE Celebrity
SET     FirstName = 'Celine',
LastName = 'Dion'
WHERE  CelebrityID = 1

-- We are attempting to update.
BEGIN TRANSACTION
IF (SELECT StartVersion
FROM   #Semaphore
WHERE  ID = @MyKey) = (SELECT VERSIONSTAMP
FROM   Celebrity
WHERE  CelebrityID = 1)
BEGIN
UPDATE Celebrity
SET     FirstName = 'Lindsay',
LastName = 'Lohan'
WHERE  CelebrityID = 1

COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK - UPDATE CONFLICT'
RAISERROR ('Celebrity update conflict.',10,0)
END
DELETE #Semaphore WHERE ID = @MyKey
SELECT * FROM   Celebrity
GO
/* CelebrityID     FirstName   LastName     VERSIONSTAMP
1             Celine       Dion         0x000000000000087D
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/

-- SQL UPDATE with NO CONFLICT
DECLARE @MyKey int
INSERT INTO #Semaphore (StartVersion, PK)
SELECT  VERSIONSTAMP, 1
FROM Celebrity WHERE CelebrityID=1
SELECT @MyKey = SCOPE_IDENTITY()

-- We are trying to update.
BEGIN TRANSACTION
IF (SELECT StartVersion
FROM   #Semaphore
WHERE  ID = @MyKey) = (SELECT VERSIONSTAMP
FROM   Celebrity
WHERE  CelebrityID = 1)
BEGIN
UPDATE Celebrity
SET     FirstName = 'Lindsay',
LastName = 'Lohan'
WHERE  CelebrityID = 1

COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK - UPDATE CONFLICT'
RAISERROR ('Celebrity update conflict.',10,0)
END
DELETE #Semaphore WHERE ID = @MyKey
SELECT * FROM   Celebrity
GO
/*
CelebrityID FirstName   LastName     VERSIONSTAMP
1             Lindsay      Lohan        0x000000000000087E
2             Nick         Carter       0x0000000000000877
3             Stevie       Brock        0x0000000000000878
4             Christina   Aguilera     0x0000000000000879
5             Frank        Sinatra      0x000000000000087A
6             Doris        Day          0x000000000000087B
7             Elvis        Presley      0x000000000000087C
*/
-- Cleanup
DROP TABLE #Semaphore
DROP TABLE Celebrity

---
got mail from http://www.sqlusa.com/register/

Saturday, November 21, 2009

Time Delay While Running T-SQL Query – Sleep Facility in SQL Server

Today, When I was created one store procedure, I fetched one problem. I have created on store procedure which will
generate 25 digits Primary key based on date and time.

I have to generate the primary key for 100k records to insert into another table.
I was using my Store procedure to generate the ID, but it gives me error which is very familiar for us.

Violation of PRIMARY KEY constraint


Now i have searching for function or way to wait for seconds or mili seconds in sql which is vary familiar in our programming language Sleep method to wait for some time before completion of code.


Today I will explain one very small but interesting feature of SQL Server. This is not much known feature of SQL Server. In SQL Server sometime, there are requirement like I described above when T-SQL script has to wait for some time before executing next statement. It is quite common that developers depends on application to take over this delay issue. However, SQL Server itself has very strong time management function of WAITFOR. Let us see two usage of WAITFOR clause.

Official explanation of WAITFOR clause from Book Online is “Blocks the execution of a batch, stored procedure, or transaction until a specified time or time interval is reached, or a specified statement modifies or returns at least one row.”

Please run the following query into your sql server query analyzer.


SELECT GETDATE() WAITFOR DELAY '00:00:00:010' 
-- wait for 10 mill seconds to run the below code SELECT GETDATE()


--
Happy programming

Tuesday, November 17, 2009

Update data in one table with data from another table

How to update more than one column in a table with values
from columns in another table and explains how to do it in the three RDBMS that we support.

Table Structures and values:

TableA has four columns: a, b, c, d (a is the primary key column)
TableB has five columns: a1, b1, c1, d1, e1 (a1 and b1 together constitute the primary key for this table)

The foreign key relationship between the two tables is based on A.a = B.a1

The data in these 2 tables is as follows:
I.    TableA
a    b    c    d
1    x    y    z
2    a    b    c
3    t    x    z

II.    TableB
a1    b1    c1    d1    e1
1    x1    y1    z1    40
2    a1    b1    c1    50

The requirement is to write a SQL to update columns b, c and d in TableA from the columns b1, c1 and d1 from TableB where-ever the join condition satisfies and e1 > 40 in TABLEB.


SQL Server:

UPDATE TABLEA
SET     b = TABLEB.b1,
c = TABLEB.c1,
d = TABLEB.d1
FROM TABLEA, TABLEB
WHERE TABLEA.a = TABLEB.a1
AND TABLEB.e1 > 40
GO

Note: This is an extension in SQL Server i.e. the FROM clause – it does make it simple to understand and is a nice feature.

Results after the update:

a    b    c    d
————————————
1     x          y           z
2     a1        b1         c1
3     t           x           z



NOTE:

It is very important to make sure that your where clause for the update statement is correct since
that is what identifies the records that the update statement is going to qualify and do the update upon.
If it is incorrect, then you can get wrong results.  The reason I am mentioning this is
because I have seen people write wrong where clauses and then wondering what went wrong
because they specified the correct condition in the SET clause.

In the above example,
if the Where condition was omitted, the other record’s columns would be updated to NULL value
and this will be the final result set:

a    b    c    d
————————————
1     Null      Null      Null
2     a1        b1         c1
3     Null     Null      Null

--
happy programming

Monday, November 16, 2009

Date and time types in SQL Server

Prior to SQL Server 2008, there are two types in SQL Server to handle date and time. Since I will be referring to these types a lot, I introduce an abbreviation for each type in below two tables (the "sn" column):
Name
sn
Minimum value
Maximum value
Accuracy
Storage
smalldatetime
sdt
1900-01-01 00:00:00
2079-06-06 23:59:00
minute
4 bytes
datetime
dt
1753-01-01 00:00:00.000
9999-12-31 23:59:59.997
3.33 ms
8 bytes

Note that there is no type to store only date or only time. Above types, have both a date and a time portion.
If you only specify a date portion, then SQL Server will store 00:00:00.000 in the time portion.
And if you only specify a time portion, SQL Server will store 1900-01-01 in the date portion.

Above is important. Read it again.

SELECT CAST ('20041223' AS datetime)
-----------------------
2004-12-23 00:00:00.000

SELECT CAST ('14:23:58' AS datetime)
-----------------------
1900-01-01 14:23:58.000


As of SQL Server 2008, we have several new types related to date and time:
Name
sn
Minimum value
Maximum value
Accuracy
Storage
datetime2
dt2
0001-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
100ns
6-8 bytes
date
d
0001-01-01
9999-12-31
day
3 bytes
time
t
00:00:00.0000000
23:59:59.9999999
100ns
3-5 bytes
datetimeoffset
dto
0001-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
100ns
8-10 bytes

  • As you can see, we have finally a date-only and a time-only type.
  • Datetime2 is a "better datetime" for various reasons, and it does not cost any more storage than datetime - potentially less!
  • For the new types, which include a time portion, you can specify a "fractional seconds precision" part, which specifies how many digits you have to the right of the comma after the second. So, for instance time (3) can store a value like 14:23:12.567, where if you input 14:23:12.5677 it will be rounded to 14:23:12.568.
  • The new datetimeoffset includes a time zone offset part.

Date and time formats

A common misconception is that SQL Server stores these datatypes in some particular readable format. That is not the case. SQL Server stores these values in an internal format (for instance two integers for datetime and smalldatetime). However, when you use T-SQL to specify a value (in an INSERT statement, for instance) you express it as a string. And there are rules for how different datetime string formats are interpreted by SQL Server. But note that SQL Server does not remember the format in any way.



Date and time formats for input

There are many formats available for us to specify a date/time/datetime value. Some are "better" than others are and as you read this article, I hope you will understand what I mean by "better". Interestingly enough, all formats are available for all types. So even a time-only format is allowed for a date-only type etc.
(I am ignoring the time zone offset part, which is only used for the datetimeoffset type - you can read more about this in Books Online.)

Name
sn
Format
SET DATEFORMAT dependent
SET LANGUAGE dependent
language neutral
Unseparated
u
'19980223 14:23:05'
none
none
all
Separated
s
'02/23/1998 14:23:05'
all
all
no
ANSI SQL
ansisql
'1998-12-23 14:23:05'
sdt, dt
sdt, dt
not for sdt and dt
Alphabetic
a
'23 February 1998 14:23:05'
none
all (month name)
no
ODBC datetime
odt
{ts '1998-02-23 14:23:05'}
none
none
all
ODBC date
od
{d '1998-02-23'}
none
none
all
ODBC time
ot
{t '14:23:05'}
none
none
all
ISO 8601
iso
'1998-02-23T14:23:05'
none
none
all
Time
t
'14:23:05'
'2:23:05 PM'
none
none
all

  • Note that ANSI SQL is really only a special case of the separated (aka "numeric") format, which allow for separators dash (-), slash (/) and dot (.). But since this is the only format defined in the ANSI SQL standard, I think it is worth mentioning this as a special case.
  • Most formats allow for dropping the date and/or the time portion, and in some cases, it can look a bit ... strange. It might seem stupid to specify for instance '2008-08-25' for a time type, but doing that result in same as specifying nothing in the datetime string, ''. Consider below:
    SELECT CAST ('' AS time)
    SELECT CAST ('2008-08-25' AS time)
    above two gives, us the same result (time 00:00:00).
  • The ODBC formats are different in the sense that they have a marker (literal_type, t, d or ts), you need to define correctly depending on whether you specify both date and time, date only or time only.
  • The ISO 8601 format requires both a date and a time portion.
  • SET DATEFORMAT inherits its setting from SET LANGUAGE (but an explicit SET DATEFORMAT will override later SET LANGUAGE). The language defaults to each login's language. Default language for a login is specified using sp_configure.
  • The rules regarding the format for the date part and the new types can be confusing. Microsoft wanted to make the new date related types (date, datetime2 and datetimeoffset) less dependent on these settings and more ANSI SQL compliant. End result is that these new types are language neutral for a separated datetime literal as long as the year comes first. SQL Server need to determine that this is the year part, so it requires 4 numbers year (yyyy, not yy). If that is the case, then the string will be interpreted as year first, then month and finally day - regardless of DATEFORMAT or language setting. But if the month part comes first, then DATEFORMAT and language setting will be "honored":
SET LANGUAGE British --uses dmy
GO
SELECT CAST('02-23-1998 14:23:05' AS date) --Error
GO
SELECT CAST('2/23/1998 14:23:05' AS date) --Error
GO

SELECT CAST('1998-02-23 14:23:05' AS date) --Ok
GO
SELECT CAST('1998.02.23 14:23:05' AS date) --Ok
GO
SELECT CAST('1998/02/23 14:23:05' AS date) --Ok
GO


The first two fails because the year doesn't come first (and there are not 23 months in 1998). The last three doesn't fail because the year comes first (and we are using one of the new style date related types).
Crystal clear, huh? :-)
The available formats are documented in Books Online, so I won't go into details about each format. Here's a link to the online version of Books Online, see each subsection for the different formats.

Note that the ODBC time only format will give the current date, not 1900-01-01 as the other time-only formats.

Some conclusions:
  • The Alphabetic format is especially bad since this is SET LANGUAGE dependent for all types.
  • The smalldatetime and datetime types are unfortunate since they are dependent on not only SET LANGUAGE but also SET DATEFORMAT (including the login's language setting).
  • You might think that it doesn't matter if a format/type combination is SET LANGUAGE dependent or not since "I never execute that SET LANGUAGE command anyhow". But note very carefully that the default for DATEFORMAT is drawn from the login specification (CREATE LOGIN etc.). Imagine for instance an app developed in the US with login having default language (us_english) using m/d/y format and date time. Somebody in, say, Germany installs it and bang! The app does not work anymore. Do I have your attention now? :-)
  • The types introduced in 2008 are friendlier since they are not SET DATEFORMAT dependent if you for a separated format specify the year first. For the separated formats (including ANSI SQL), the new types always interprets month before date.


Recommendations for input

As you probably realize, you don't want to use a combination of format and type, which isn't language neutral unless you make sure that you have the correct DATEFORMAT and LANGUAGE settings in your applications. Be aware that setting these options inside a stored procedure can cause a recompile of the procedure at execution time!
My recommendation is to use a language neutral format. I typically use the unseparated format, mostly out of old habit. The ISO 8601 format has the advantage of being defined in an international standard.

SQL Server 2008 has just been released at the time of writing this, but I will probably move towards date, time, datetime2 and possibly datetimeoffset over time. Over time, I might be using the ANSI SQL format (being ANSI SQL compliant) with the new types, but i'd rather that ANSI SQL supports the ISO 8601 format. I will play it safe and use unseparated or ISO 8601 for a number of years to come - even when working with the new types.



Warnings and common misconceptions


Let me say this again: you don't want to use a combination of format and type, which isn't language neutral unless you make sure that you have the correct DATEFORMAT and LANGUAGE settings in your applications.

The Numeric format can use dash (-), dot (.) or slash (/) as separator. The rules for how SQL Server parses the string doesn't change depending on the separator. A common misconception is that the ANSI SQL format (sometime a bit incorrectly referred to as the "ISO format"), 1998-02-23, is language neutral for smalldatetime and datetime. It isn't. It is a numeric format and hence it is dependent on the SET DATEFORMAT and SET LANGUAGE setting:

SET LANGUAGE us_english
SELECT CAST('2003-02-28' AS datetime)
-----------------------
2003-02-28 00:00:00.000

SET LANGUAGE british
SELECT CAST('2003-02-28' AS datetime)


Server: Msg 242, Level 16, State 3, Line 4
The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value.




Please read the error message closely. It says exactly what the problem is. You specify the datetime value as a string, and because the string isn't formed according to the format you are using and the DATEFORMAT setting, SQL Server cannot convert the string to a datetime value.


Output of datetime values

When a datetime value leaves SQL Server is has no readable format. It is the client application that formats the value so it is human-readable. Some applications/developer environments will respect the regional settings of the client machine and format the value accordingly. But this is out of SQL Server's control. You can return the data from SQL Server using a specific format, but that requires you to convert it to a string, and the value will not be of the datetime type anymore. If you want to do this, use the CONVERT function, which has a useful 3:rd parameter that defines the format when you convert from datetime to a string. A few examples:



SELECT CONVERT(char(8), CURRENT_TIMESTAMP, 112)
--------
20040312

SELECT CONVERT(char(19), CURRENT_TIMESTAMP, 120)
-------------------
2004-03-12 18:08:43


SELECT CONVERT(char(10), CURRENT_TIMESTAMP, 103)
----------
12/03/2004

Searching for datetime values

The problem with datetime searches often comes from the fact that if you use smalldatetime, datetime, datetime2 or datetimeoffset then there is both a date and a time portion. Let's create table that we will use in a couple of examples:

CREATE TABLE #dts(c1 char(1), dt datetime)
INSERT INTO #dts (c1, dt) VALUES('a', '20040305 09:12:59')
INSERT INTO #dts (c1, dt) VALUES('b', '20040305 16:03:12')
INSERT INTO #dts (c1, dt) VALUES('c', '20040306 00:00:00')
INSERT INTO #dts (c1, dt) VALUES('d', '20040306 02:41:32')
INSERT INTO #dts (c1, dt) VALUES('e', '20040315 11:45:17')
INSERT INTO #dts (c1, dt) VALUES('f', '20040412 09:12:59')
INSERT INTO #dts (c1, dt) VALUES('g', '20040523 11:43:25')
Say that we want to find all rows from 2004-03-15. A common mistake is to search like this:

SELECT c1, dt FROM #dts WHERE dt = '20040305'

No rows returned. Why? What happened? We have two different types in the WHERE clause, a datetime type (the column side) and a string (the right side). SQL Server will convert one to the other according to the documented "Data Type Precedence" in Books Online. Datetime has higher precedence than a string so the string will first be converted to a datetime type. As we didn't specify a time portion, the string will be converted to the datetime value 2004-03-05 00:00:00. There are no rows with that datetime value.
OK, another alternative. We use the CONVERT function on the column side and convert it to a string so it matches the string format on the right side:


SELECT c1, dt FROM #dts WHERE CONVERT(char(8), dt, 112) = '20040305'

That returned the expected rows, but there's another problem. Since we do a calculation on the column side (using the CONVERT function), SQL Server cannot use an index to support this search criteria. This can be disastrous for performance! So let's try BETWEEN:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040306'

Because BETWEEN is inclusive, we also returned the row from 2004-03-06 00:00:00. Let's try to handle that:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:59.999'

What? We still got the row from 2004-03-06 00:00:00 back. Why? We need to look at the type definitions again. The resolution for the datetime type is 3.33 ms. That means that 2004-03-05 23:59:59.999 will be rounded up to 2004-03-06 00:00:00.000. Not what we expected. To get around that rounding problem, we try:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:59.997'

Above return the expected rows, but I would not call it intuitive and easy to read. Also, if the type in the table is smalldatetime instead of datetime, the time portion will again be rounded up, hence not producing the expected rows. So you find yourself having two ways of doing this dependent on whether you have datetime or smalldatetime. If the type is smalldatetime, you would use:

SELECT c1, dt FROM #dts WHERE dt BETWEEN '20040305' AND '20040305 23:59:00'

Having two different ways of doing this depending on the datatype is not something I recommend. And, who knows if Microsoft in the future will add types with higher precision for the time portion (which they did in SQL Server 2008)? Again you would have to adjust the time portion in the WHERE clause. Below you find my recommendation:

SELECT c1, dt FROM #dts WHERE dt >= '20040305' AND dt < '20040306'

We get the expected rows back and it is easy to understand what the criteria means. And SQL Server can use an index on the dt column. This is also easy to expand for searches on greater ranges. Say we want to return all rows for March 2004:

SELECT c1, dt FROM #dts WHERE dt >= '20040301' AND dt < '20040401'

Same logic. Here's another way to search for all rows from a specific month:

SELECT c1, dt FROM #dts WHERE DATEPART(year, dt) = 2004 AND DATENAME(month, dt) = 'March'

Above returns the expected rows, but there are two problems. Since there is a calculation on the datetime column, SQL Server cannot use an index on the dt column to support the query. And, the DATENAME function is dependent on SET LANGUAGE.

Getting rid of the time portion

It is very common that you only want to work with date, and you want to get rid of the time portion. As you should know by now it is impossible if you are a using smalldatetime, datetime, datetime2 or datetimeoffset type. If you are using date, then no problem - this is what date is for, after all! But if you insist for some reason to use smalldatetime, datetime, datetime2 or datetimeoffset then perhaps setting the time portion to 00:00:00 is fine for you? Here's my suggestion for an expression that returns a string that can be (implicitly) converted to a datetime value, regardless of SET LANGUAGE or DATEFORMAT:


CONVERT(char(8), CURRENT_TIMESTAMP, 112)

As you see, I use CONVERT with conversion code 112, which return the date in unseparated format, as a string. And you already know that the unseparated format is interpreted regardless of datetime or language settings when converted to datetime. Let us wrap above in a CAST to convert it to datetime:

SELECT CAST(CONVERT(char(8), CURRENT_TIMESTAMP, 112) AS datetime)


In case you truly only want to return the date and no time portion to the client application, then you have to return a string instead (or as of 2008, the date type). Again, read about the different conversion codes for the CONVERT function and you will probably find a format which suits you.


Hugo Kornelis had some feedback on this topic and suggests another technique for setting the time portion to 00:00:00. The idea is to decide on a base date (any date within SQL Server's datetime range will do), specify it so that time portion is 00:00:00. You first take the difference between today and this base date, in days. Then you add this number of days to the base date. Algorithm: [today]-[refdate]+[refdate], or expressed with TSQL (using 20040101 as base date):

SELECT DATEADD(day, DATEDIFF(day, '20040101', CURRENT_TIMESTAMP), '20040101')

An argument for above is that is is faster than doing string manipulations. My opinion has been that the difference is so small (a few microseconds according to tests I did about year 2000) so that this would probably not even be measurable. However SQL Server MVP Dan Guzman emailed me with some to me surprising information. Here's a quote from the email:

"I've found that the DATEADD technique significantly outperforms the string method when the expression is applied to a column in large queries. For example, I often use the DATEADD method to group by date when the datetime column includes a time component. I haven't tested the performance recently but I recall it was minutes in some cases."

Dan is someone I trust very much, so I guess I have some re-learning to do (start using the DATEADD method instead of the string method). Dan also followed up with a script file showing this. Here it is (with some minor formatting modifications made by me). My tests show consistent results on 2005 and 2008 where for 10,000,000 rows the DATEADD version took about 6 seconds and the string version took about 13 seconds. Dan reported even greater difference on 2000.


2009-04-03, I received an email from HÃ¥kan Borneland. He mentioned that is is common, for instance in the MSDN forums, to use first alternative below, instead of second alternative (which I had originally in my performance test script):
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP))
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, '', CURRENT_TIMESTAMP))

Note that both these differs from the DATEDIFF version discussed slightly higher above. So now we have 4 different ways to accomplish this. "My" original string manipulation method and three DATEDIFF alternatives:

SELECT CAST(CONVERT(CHAR(8), CURRENT_TIMESTAMP, 112) AS datetime) --1
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, '', CURRENT_TIMESTAMP)) --2
SELECT DATEADD(DAY, DATEDIFF(DAY, '20040101', CURRENT_TIMESTAMP), '20040101') --3
SELECT DATEADD(DAY, 0, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)) --4

If you feel that you do operations so that performance can be a major concern, then I suggest you test. Here is an updated performance test script, with all four versions. Alternative 1 is slowest and the three DATEDIFF alternatives (2-4) are pretty close with alternative 4 as the winner. I also suggest you consider readability, which can be a matter of oppinion.

The DATEADD alternative also has the advantage of flexibility. Instead of re-typing the suggestions from Hugo, I've just copied the original text from Hugo in the email he sent to me:

"The version that uses string conversion can only be used to strip the time from the date. The dateadd/datediff version can easily be varied to get the start of the current month, quarter or year. SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '20000101') Or to discard only minutes and seconds, or only seconds SELECT DATEADD(hour, DATEDIFF(hour, '20000101', CURRENT_TIMESTAMP), '20000101') (Note - when dealing with minutes, seconds or milliseconds, the DATEDIFF function CAN overflow the integer range - the date constant might have to be adapted to the expected set of datetime values). Further variations are also possible: How to get the last day of the previous month? And the last day of the current month?

SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '19991231')
SELECT DATEADD(month, DATEDIFF(month, '20000101', CURRENT_TIMESTAMP), '20000131')

How to get tomorrow's date (without time)?

SELECT DATEADD(day, DATEDIFF(day, '20000101', CURRENT_TIMESTAMP), '20000102')

How to round the datetime to the nearest hour, or to the nearest day?

SELECT DATEADD(hour, DATEDIFF(hour, '20000101', DATEADD(minute, 30, CURRENT_TIMESTAMP)), '20000101')
SELECT DATEADD(day, DATEDIFF(day, '20000101', DATEADD(hour, 12, CURRENT_TIMESTAMP)), '20000101')

How to get last Friday's date, without using a calendar table and regardless of the current DATEFIRST setting?

SELECT DATEADD(day, (DATEDIFF (day, '20000107', CURRENT_TIMESTAMP) / 7) * 7, '20000107')
or
SELECT DATEADD(day, (DATEDIFF (day, '20000108', CURRENT_TIMESTAMP) / 7) * 7, '20000107')

The first will return the current day if run on friday, the latter will return the previous friday.

Once you understand the general principle, it isn't hard to come up with new variations, that might be useful in specific circumstances.

"Tip: Always set the time to the same value

If you aren't interested in the time portion and you can't use the date type, you can set the time portion to a standard value for all rows. I recommend 00:00:00.000. Say you want to set the value to the current date when the INSERT is performed. For that you can use a DEFAULT constraint:

CREATE TABLE #dts(id INT IDENTITY, d_date DATETIME DEFAULT CONVERT(char(8), CURRENT_TIMESTAMP, 112))

INSERT #dts DEFAULT VALUES
SELECT * FROM #dts

I'm using the code 112 in the CONVERT() function above, which returns a string with the date only, in the unseparated format. However, we have not protected us from if anyone now explicitly sets the value and set the time portion to something other than 00:00:00:000. We can do that with a constraint:


ALTER TABLE #dts ADD CONSTRAINT dts_chk_d_date CHECK(CONVERT(char(12), d_date, 114) = '00:00:00:000')

What advantage does above give us? If we now want to search for all rows from a certain date, we can write the WHERE clause like:

WHERE dt = '20040305'

Above might be considered easier to read and understand. Also, say you want to join between two table based on the day only, you can now do this directly. If you have a time portion that varies, you would have to use some CONVERT in the JOIN operation which again can be disastrous for performance.
Above technique can also be applied if you are only interested in the time portion, in that case I recommend that you always set the date portion to 1900-01-01.
Now, isn't it great that we as of 2008 have separate data and time types?

Other tips, from SQL Server MVP Steve Kass

The super-careful can specify all dates in queries as smalldatetime, using cast or convert. But note that this may have unexpected behavior in SQL Server 7.0 against a (not small-) datetime column.

Without a very good reason and comments everywhere, store dates only as datetime or smalldatetime, and not as strings. If dates must be stored as strings (there are some good reasons to, but they are few), use a language neutral format and a 24-hour clock if time is also stored, so that the strings work everywhere and also sort the way the dates sort. For example, you use "dt < '20030101'" in the WHERE clause and the query doesn't return expected rows because dt is stored as a string in yyyy-mm-dd format.

The base date in Enterprise Manager (and some other Microsoft tools and applications) is 1899-12-30. EM should never be used for data entry, and rarely if ever for viewing data. If you use EM to specify time-only data, EM will store the date 1899-12-30. If you really want to use EM, specify the date 1900-01-01 explicitly. Note also that EM will only show the time portion if the date is 1899-12-30, not what you expect if you use 1900-01-01 as a standard value for date.


Why is 1753 the earliest date for datetime?

Good question. It is for historical reasons. In what we sometimes refer to as the western world, we have had two calendars in modern time: the Julian and the Gregorian calendars. These calendars were a number of days apart (depending on which century you look at), so when a culture that used the Julian calendar moved to the Gregorian calendar, they dropped from 10 to 13 days. Great Britain made this shift in 1752 (1752-09-02 were followed by 1752-09-14). An educated guess why Sybase selected 1753 as earliest date is that if you were to store an earlier date than 1753, you would also have to know which country and also handle this 10-13 day jump. So they decided to not allow dates earlier than 1753. Note, however that other countries did the shift later than 1752. Turkey, for instance, did it as late as 1927.

Being Swedish, I find it a bit amusing that Sweden had the weirdest implementation. They decided to skip the leap day over a period of 40 years (from 1700 to 1740), and Sweden would be in sync with the Gregorian calendar after 1740 (but meanwhile not in sync with anyone). However, in 1704 and 1708 the leap day was not skipped for some reason, so in 1712 which was a leap year, they inserted yet an extra day (imagine being born in Feb 30!) and then did the shift over a day like everyone else, in 1753.

Saturday, November 14, 2009

Difference between Managed and Unmanaged code.

Managed Code


Managed code is code that is written to target the services of the managed runtime execution environment (like Common Language Runtime in .NET Framework). The managed code is always executed by a managed runtime execution environment rather than the operating system directly. Managed refers to a method of exchanging information between the program and the runtime environment. Because the execution of code is governed by the runtime environment, the environment can guarantee what the code is going to do and provide the necessary security checks before executing any piece of code. Because of the same reason the managed code also gets different services from the runtime environment like Garbage Collection, type checking, exception handling, bounds checking, etc. This way managed code does not have to worry about memory allocations, type safety, etc. Applications written in Java, C#, VB.NET, etc target a runtime environment which manages the execution and the code written using these types of languages is known as Managed Code. Managed code is always compiled into an Intermediate Language (MSIL in case of .NET Framework). The compiler used by .NET framework to compile managed code compiles it into Intermediate Language and generates the necessary metadata, symbolic information that describes all of the entry points and the constructs exposed in the Intermediate Language (e.g., methods, properties) and their characteristics. The Common Language Infrastructure (CLI) Standard describes how the information is to be encoded, and programming languages that target the runtime emit the correct encoding.

In .NET Framework Managed Code runs within the .Net Framework’s CLR and benefits from the services provided by the CLR. When we compile the managed code, the code gets compiled to an intermediate language (MSIL) and an executable is created. When a user runs the executable the Just In Time Compiler of CLR compiles the intermediate language into native code specific to the underlying architecture. Since this translation happens by the managed execution environment (CLR), the managed execution environment can make guarantees about what the code is going to do, because it can actually reason about it. It can insert traps and sort of protection around, if it's running in a sandboxed environment, it can insert all the appropriate garbage collection hooks, exception handling, type safety, array bounce, index checking and so forth.

Managed code also provides platform independence. As the managed code is first compiled to intermediate language, the CLR’s JIT Compiler takes care of compiling this intermediate language into the architecture specific instructions.


Unmanaged Code


Code that is directly executed by the Operating System is known as un-managed code. Typically applications written in VB 6.0, C++, C, etc are all examples of unmanaged code. Unmanaged code typically targets the processor architecture and is always dependent on the computer architecture. Unmanaged code is always compiled to target a specific architecture and will only run on the intended platform. This means that if you want to run the same code on different architecture then you will have to recompile the code using that particular architecture. Unmanaged code is always compiled to the native code which is architecture specific. When we compile unmanaged code it gets compiled into a binary X86 image. And this image always depends on the platform on which the code was compiled and cannot be executed on the other platforms that are different that the one on which the code was compiled. Unmanaged code does not get any services from the managed execution environment.

In unmanaged code the memory allocation, type safety, security, etc needs to be taken care of by the developer. This makes unmanaged code prone to memory leaks like buffer overruns and pointer overrides and so forth.

Unmanaged executable files are basically a binary image, x86 code, loaded into memory. The program counter gets put there and that’s the last the Operating System knows. There are protections in place around memory management and port I/O and so forth, but the system doesn’t actually know what the application is doing.

Ten Common SQL Programming Mistakes

Without further ado, here is the list:
  1. NULLs and the NOT IN predicate
  2. Functions on indexed columns in predicates
  3. Incorrect subquery column
  4. Data type mismatch in predicates
  5. Predicate evaluation order
  6. Outer joins and placement of predicates
  7. Subqueries that return more than one value
  8. Use of SELECT *
  9. Scalar user-defined functions
  10. Overuse of cursors
The examples are presented using SQL Server's Transact-SQL dialect, but most of the concepts are valid in any SQL implementation.


Read full article here
--
Happy Programming

Friday, November 13, 2009

How to Get the Record Count of SQLDataSource?

To get the Record Count or Row Count returned by SQLDataSource control, we have to use DataSourceSelectArgument class.

Create an object to the DataSourceSelectArgument class, set its RetrieveTotalRowCount property to true and by using a DataView we can retrieve the Row Count of the SQLDataSource object. The sample block of code is given below. Try it.


<pre name="code" class="CSharp">

DataSourceSelectArguments dssa = new DataSourceSelectArguments();
dssa.AddSupportedCapabilities(DataSourceCapabilities.RetrieveTotalRowCount);
dssa.RetrieveTotalRowCount = true;
DataView dv = (DataView)SqlDataSource1.Select(dssa);
Response.Write("No of Records : " + dv.Table.Rows.Count);

</pre>

--
happy programming