Monday 9 April 2018

Group by and Partition by - SQL Server

Difference between Group by and Partition by

Now we will list out below difference between two

Group by 
The SQL GROUP BY clause can be used in a SELECT statement to collect data across
multiple records and group the results by one or more columns.

In more simple words GROUP BY statement is used in conjunction with the
aggregate functions to group the result-set by one or more columns.
  1. Reduces the no. of records
  2. In select we need to use only columns which are used in group by. but we can use aggregate functions.
  3. In filter condition we need to use having clause instead of where clause.
Partition By
  1. No. of records will not be reduced. Instead of that it will add one extra column.
  2. In select we can use N no. of columns. No restrictions.
  3. We can use where clause in filter condition apart from partition column.
Now we will try to learn the difference by looking into below the examples.
Take below table data which is used in the the article.

id  firstname                   lastname                    Mark
-------------------------------------------------------------------
1   arun                        prasanth                    40
2   ann                         antony                      45
3   sruthy                      abc                         41
6   new                         abc                         47
1   arun                        prasanth                    45
1   arun                        prasanth                    49
2   ann                         antony                      49
Group by Syntax: 

SELECT expression1, expression2, ... expression_n, 
       aggregate_function (aggregate_expression)
FROM tables
WHERE conditions
GROUP BY expression1, expression2, ... expression_n;
We can apply GroupBy in our table

select SUM(Mark) marksum, firstname from TableA group by id,firstName
Result: 
marksum  firstname
----------------
94      ann                      
134     arun                     
47      new                      
41      sruthy   
In our real table we have 7 rows and when we apply group by id, the server group the results based on id

In simple words
here group by normally reduces the number of rows returned by rolling them up and calculating Sum for each row.

Partition by Query: 

select SUM(Mark) OVER (PARTITION BY id) AS marksum, firstname from TableA
marksum firstname 
-------------------
134     arun                     
134     arun                     
134     arun                     
94      ann                      
94      ann                      
41      sruthy                   
47      new  
look at the results it will partition the rows and results all rows not like group by.
****************************************************************

--BELOW IS A SAMPLE WHICH OUTLINES THE SIMPLE DIFFERENCES
--READ IT AND THEN EXECUTE IT
--THERE ARE THREE ROWS OF EACH COLOR INSERTED INTO THE TABLE
--CREATE A database called testDB

--CREATE DATABASE  testDB;

-- use testDB
USE [TestDB]
GO


-- create Paints table
CREATE TABLE [dbo].[Paints](
    [Color] [varchar](50) NULL,
    [glossLevel] [varchar](50) NULL
) ON [PRIMARY]

GO

-- Populate Table
insert into paints (color, glossLevel)
select 'red', 'eggshell'
union
select 'red', 'glossy'
union
select 'red', 'flat'
union
select 'blue', 'eggshell'
union
select 'blue', 'glossy'
union
select 'blue', 'flat'
union
select 'orange', 'glossy'
union
select 'orange', 'flat'
union
select 'orange', 'eggshell'
union
select 'green', 'eggshell'
union
select 'green', 'glossy'
union
select 'green', 'flat'
union
select 'black', 'eggshell'
union
select 'black', 'glossy'
union
select 'black', 'flat'
union
select 'purple', 'eggshell'
union
select 'purple', 'glossy'
union
select 'purple', 'flat'
union
select 'salmon', 'eggshell'
union
select 'salmon', 'glossy'
union
select 'salmon', 'flat'

-------------------------------------
SELECT * FROM paints

/*   COMPARE 'GROUP BY' color to 'OVER (PARTITION BY Color)'  */

-- GROUP BY Color 
-- row quantity defined by group by aggregate (count(*)) defined by group by
select count(*) as 'count',Color from paints group by color

-- OVER (PARTITION BY... Color 
-- row quantity defined by main query aggregate defined by OVER-PARTITION BY
select color, glossLevel, count(*) OVER (Partition by color) from paints

/* COMPARE 'GROUP BY' color, glossLevel to 'OVER (PARTITION BY Color, GlossLevel)'  */

-- GROUP BY Color, GlossLevel
-- row quantity defined by GROUP BY aggregate (count(*)) defined by GROUP BY
select count(*) from paints group by color, glossLevel

-- Partition by Color, GlossLevel
-- row quantity defined by main query
-- aggregate (count(*)) defined by OVER-PARTITION BY
select color, glossLevel, count(*) OVER (Partition by color, glossLevel) from paints




0 comments:

Post a Comment

Topics

ADFS (1) ADO .Net (1) Ajax (1) Angular (43) Angular Js (15) ASP .Net (14) Authentication (4) Azure (3) Breeze.js (1) C# (47) CD (1) CI (2) CloudComputing (2) Coding (7) CQRS (1) CSS (2) Design_Pattern (6) DevOps (4) DI (3) Dotnet (8) DotnetCore (16) Entity Framework (2) ExpressJS (4) Html (4) IIS (1) Javascript (17) Jquery (8) Lamda (3) Linq (11) microservice (3) Mongodb (1) MVC (46) NodeJS (8) React (11) SDLC (1) Sql Server (32) SSIS (3) SSO (1) TypeScript (1) UI (1) UnitTest (1) WCF (14) Web Api (15) Web Service (1) XMl (1)

Dotnet Guru Archives