Search This Blog

Wednesday, December 1, 2010

What are cursors in sql server?

SQL Server is very good at handling sets of data. For example, you can use a single UPDATE statement to update many rows of data. There are times when you want to loop through a series of rows a perform processing for each row. In this case you can use a cursor.

Please note that cursors are the SLOWEST way to access data inside SQL Server. The should only be used when you truly need to access one row at a time. The only reason for that is to call a stored procedure on each row.

The basic syntax of a cursor is:

DECLARE @AuthorID char(11)
 
DECLARE c1 CURSOR READ_ONLY
FOR
SELECT au_id
FROM authors

OPEN c1

FETCH NEXT FROM c1
INTO @AuthorID

WHILE @@FETCH_STATUS = 0
BEGIN

 PRINT @AuthorID

 FETCH NEXT FROM c1
 INTO @AuthorID

END

CLOSE c1
DEALLOCATE c1

The DECLARE CURSOR statement defines the SELECT statement that forms the basis of the cursor. You can do just about anything here that you can do in a SELECT statement. The OPEN statement statement executes the SELECT statement and populates the result set. The FETCH statement returns a row from the result set into the variable. You can select multiple columns and return them into multiple variables. The variable @@FETCH_STATUS is used to determine if there are any more rows. It will contain 0 as long as there are more rows. We use a WHILE loop to move through each row of the result set.

The READ_ONLY clause is important in the code sample above.  That dramatically improves the performance of the cursor.
In this example, I just print the contents of the variable. You can execute any type of statement you wish here. In a recent script I wrote I used a cursor to move through the rows in a table and call a stored procedure for each row passing it the primary key. Given that cursors are not very fast and calling a stored procedure for each row in a table is also very slow, my script was a resource hog. However, the stored procedure I was calling was written by the software vendor and was a very easy solution to my problem. In this case, I might have something like this:
EXEC spUpdateAuthor (@AuthorID)
instead of my Print statement. The CLOSE statement releases the row set and the DEALLOCATE statement releases the resources associated with a cursor.
If you are going to update the rows as you go through them, you can use the UPDATE clause when you declare a cursor. You'll also have to remove the READ_ONLY clause from above.
DECLARE c1 CURSOR FOR
SELECT au_id, au_lname
FROM authors
FOR UPDATE OF au_lname
You can code your UPDATE statement to update the current row in the cursor like this
UPDATE authors
SET au_lname = UPPER(Smith)
WHERE CURRENT OF c1

Performance Tuning SQL Server Cursors


1. If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications. If you need to perform row-by-row operations, try to find another method to perform the task.
Here are some alternatives to using a cursor:
  • Use WHILE LOOPS
  • Use temp tables
  • Use derived tables
  • Use correlated sub-queries
  • Use the CASE statement
  • Perform multiple queries
2. If you do find you must use a cursor, try to reduce the number of records to process. The lower the number of records to process, the faster the cursor will finish.

3. If the number of rows you need to return from a query is small, and you need to perform row-by-row operations on them, don't use a server-side cursor. Instead, consider returning the entire rowset to the client and have the client perform the necessary action on each row, then return any updated rows to the server.

4. If you have no choice but to use a server-side cursor in your application, try to use a FORWARD-ONLY or FAST-FORWARD, READ-ONLY cursor. When working with unidirectional, read-only data, use the FAST_FORWARD option instead of the FORWARD_ONLY option, as it has some internal performance optimizations to speed performance.

5. Avoid using static/insensitive and keyset cursors, unless you have no other choice. This is because they cause a temporary table to be created in TEMPDB, which increases overhead and can cause resource contention issues.

6. If you have no choice but to use cursors in your application, try to locate the SQL Server tempdb database on its own physical device for best performance. This is because cursors use the tempdb for temporary storage of cursor data. The faster your disk array running tempdb, the faster your cursor will be.

7. Using cursors can reduce concurrency and lead to unnecessary locking and blocking. To help avoid this, use the READ_ONLY cursor option if applicable, or if you need to perform updates, try to use the OPTIMISTIC cursor option to reduce locking. Try to avoid the SCROLL_LOCKS cursor option, which reduces concurrency.

8. When you are done using a cursor, don't just CLOSE it, you must also DEALLOCATE it. Deallocation is required to free up the SQL Server resources used by the cursor. If you only CLOSE the cursor, locks are freed, but SQL Server resources are not. If you don't DEALLOCATE your cursors, the resources used by the cursor will stay allocated, degrading the performance of your server until they are released.

9. If it is appropriate for your application, try to load the cursor as soon as possible by moving to the last row of the result set. This releases the share locks created when the cursor was built, freeing up SQL Server resources.

10. If you have to use a cursor because your application needs to manually scroll through records and update them, try to avoid client-side cursors, unless the number of rows is small or the data is static. If the number of rows is large, or the data is not static, consider using a server-side keyset cursor instead of a client-side cursor.

11. When using a server-side cursor, always try to fetch as small a result set as possible. This includes fetching only those rows and columns the client needs immediately. The smaller the cursor, no matter what type of server-side cursor it is, the fewer resources it will use, and performance will benefit.

Implicit and Explicit Cursores

An implicit cursor is one created "automatically" for you when you execute a query. It is simpler to code, but suffers from
  • inefficiency (the ANSI standard specifies that it must fetch twice to check if there is more than one record)
  • vulnerability to data errors
An explicit cursor is one you create yourself. It takes more code, but gives more control.

No comments:

Post a Comment