Different approaches for paging SQL data
Pagination is a common technique to improve user experience while pulling large chunk of data from SQL. But even pulling paged data could become an overhead while retrieving data across multiple tables. Here I will go through some of the techniques for pagination.
For illustration I am assuming there is an "Activities" table that holds logs for all activities throughout an application. I will be retriving paged data from this table.
Suppose there are some associations of the activities with other entities in the system and I need to join some other tables and/or addtitional where clauses on demand. Now it could be relatively slow when the joined tables contain large number of records based on the complexity of the query. Using temporary table can be another option in this scenario:
For illustration I am assuming there is an "Activities" table that holds logs for all activities throughout an application. I will be retriving paged data from this table.
Approach 1: using ROW_NUMBER
This is the most common approach for getting paged data from SQL query and it works pretty well for simple queries. Here is an example:DECLARE @StartIndex INT, @EndIndex INT; SET @StartIndex = 51 SET @EndIndex = 100 SELECT * FROM ( SELECT Count(1) OVER () AS TotalCount, ROW_NUMBER() OVER(ORDER BY CreatedDate DESC) AS RowNumber, * FROM Activities -- add additional filters ) AllRecords WHERE RowNumber >= @StartIndex AND RowNumber <= @EndIndex
Approach 2: using temporary table
Suppose there are some associations of the activities with other entities in the system and I need to join some other tables and/or addtitional where clauses on demand. Now it could be relatively slow when the joined tables contain large number of records based on the complexity of the query. Using temporary table can be another option in this scenario:DECLARE @StartIndex INT, @EndIndex INT; SET @StartIndex = 51 SET @EndIndex = 100 SELECT IDENTITY(INT,1,1) AS Id, a.ActivityType, u.Name AS CreatedBy, o.Name AS Organization, i.Name AS Individual, e.Name AS Employee INTO #temptable FROM Activities a INNER JOIN -- adding some complexity ( SELECT * FROM ( SELECT ActivityId, EntityType, EntityId FROM ActivityAssociations ) X PIVOT ( MAX(EntityId) FOR EntityType IN([Organization],[Individual],[Employee]) ) Y ) Associations ON Associations.ActivityId=a.Id LEFT JOIN Organizations o ON Associations.Organization = o.Id LEFT JOIN Individuals i ON Associations.Individual = i.Id LEFT JOIN Employees e ON Associations.Employee = e.Id JOIN Users u ON u.Guid = a.CreatedByGuid -- filters WHERE a.ActivityType = 'Email' AND a.CreatedByGuid = '03764AC4-011B-B41B-CA1D-EC80F9B5CCBA' -- sorting ORDER BY a.CreatedDate SELECT * FROM #temptable WHERE Id >= @StartIndex AND Id <= @EndIndex SELECT @TotalCount = MAX(Id) FROM #temptable DROP TABLE #temptable
Approach 3: using CTE
The temporary table creation part can be replaced using common table expression (CTE):
DECLARE @StartIndex INT, @EndIndex INT; SET @StartIndex = 51 SET @EndIndex = 100 WITH Associations AS ( SELECT ROW_NUMBER() OVER(ORDER BY CreatedDate DESC) AS RowNumber, * FROM Activities a INNER JOIN ( SELECT ActivityId, EntityType, EntityId FROM ActivityAssociations ) x ON x.ActivityId = a.Id PIVOT ( MAX(EntityId) FOR EntityType IN([Organization],[Individual],[Employee]) ) y WHERE ActivityType = 'Email' AND CreatedByGuid = '03764AC4-011B-B41B-CA1D-EC80F9B5CCBA' ) SELECT Count(1) OVER () AS TotalCount, a.ActivityType, u.Name as CreatedBy, o.Name AS Organization, i.Name AS Individual, e.Name AS Employee FROM Associations a LEFT JOIN Organizations o ON a.Organization = o.Id LEFT JOIN Individuals i ON a.Individual = i.Id LEFT JOIN Employees e ON a.Employee = e.Id JOIN Users u ON u.Guid = a.CreatedByGuid WHERE a.RowNumber >= @StartIndex AND a.RowNumber <= @EndIndex
Approach 4: paging with Fetch
This feature is available on SQL Server 2012 and usually performs faster than the previous techniques. Here is an example:
DECLARE @StartIndex INT, @MaxRecords INT; SET @StartIndex = 51 SET @MaxRecords = 50 SELECT TotalCount = COUNT(*) OVER(), * FROM ( SELECT y.Guid, u.Name as CreatedBy, y.CreatedDate, o.Name AS Organization, i.Name AS Individual, e.Name AS Employee FROM Activities a INNER JOIN ( SELECT ActivityId, EntityType, EntityId FROM ActivityAssociations ) x ON x.OwnerGuid = a.Guid PIVOT ( MAX(EntityId) FOR EntityType IN([Organization],[Individual],[Employee]) ) y LEFT JOIN Organizations o ON y.Organization = o.Id LEFT JOIN Individuals i ON y.Individual = i.Id LEFT JOIN Employees e ON y.Employee = e.Id JOIN Users u ON u.Guid = y.CreatedByGuid WHERE ActivityType = 'Email' AND CreatedByGuid = '03764AC4-011B-B41B-CA1D-EC80F9B5CCBA' ) AllRecords ORDER BY CreatedDate DESC OFFSET (@StartIndex-1) ROWS FETCH NEXT @MaxRecords ROWS ONLY
The ultimate approach for messy query
Sometimes the query along with its huge underlying data gets complicated enough to slower the performance and adding pagination techniques doesn't help that much. In that situation the best thing in my mind is to create a flat denormalized table containing all the information needed and to synchronize (eventually) the table from application/database when any relevant data get changed.
Also in certain cases we can use seperate simple COUNT() aggregate for getting total count, since getting the count using same query puts an overhead to its execution plan.
Also in certain cases we can use seperate simple COUNT() aggregate for getting total count, since getting the count using same query puts an overhead to its execution plan.
In conclusion, different approach can be used in diffrent scenarios. It's up to the developer to figure out the best possible technique for particular case.
Comments
Post a Comment