ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Database Systems
    CC-215
    Progress0 / 34 topics
    Topics
    1. Basic Database Concepts2. Database Approach vs File Based System3. Database Architecture4. Three Level Schema Architecture5. Data Independence6. Relational Data Model7. Attributes8. Schemas9. Tuples10. Domains11. Relation Instances12. Keys of Relations13. Integrity Constraints14. Relational Algebra15. Selection in Relational Algebra16. Projection in Relational Algebra17. Cartesian Product in Relational Algebra18. Types of Joins19. Normalization20. Functional Dependencies21. Normal Forms22. Entity-Relationship Model23. Entity Sets24. Attributes in Entity-Relationship Model25. Relationship in Entity-Relationship Model26. Entity-Relationship Diagrams27. Structured Query Language (SQL)28. Joins in SQL29. Sub-Queries in SQL30. Grouping and Aggregation in SQL31. Concurrency Control32. Database Backup and Recovery33. Indexes34. NoSQL Systems
    CC-215›Grouping and Aggregation in SQL
    Database SystemsTopic 30 of 34

    Grouping and Aggregation in SQL

    6 minread
    1,070words
    Intermediatelevel

    Grouping and Aggregation in SQL

    Grouping and aggregation in SQL are used to group rows and perform calculations on them to summarize data. These operations are often performed together, especially when you're interested in summarizing large sets of data by certain categories, such as total sales per department or average salary per employee.

    SQL provides several important functions and clauses to perform these operations, such as GROUP BY, aggregation functions, and filtering with HAVING.


    1. The GROUP BY Clause

    The GROUP BY clause in SQL is used to group rows that have the same values in specified columns into summary rows, like finding the total number of employees in each department or calculating the average salary per department.

    Syntax:

    SELECT column1, column2, aggregation_function(column3)
    FROM table_name
    GROUP BY column1, column2;
    

    Example: Suppose we have an Employees table:

    EmployeeID Name DepartmentID Salary
    1 Alice 1 50000
    2 Bob 1 60000
    3 Charlie 2 70000
    4 David 2 80000

    To find the total salary for each department, we would use the GROUP BY clause:

    SELECT DepartmentID, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID;
    

    Result:

    DepartmentID TotalSalary
    1 110000
    2 150000

    Explanation:

    • The query groups rows by DepartmentID.
    • The SUM(Salary) function calculates the total salary for each department.

    2. Aggregation Functions

    SQL provides several aggregate functions to perform calculations on data. These are often used in conjunction with GROUP BY to summarize data.

    Common Aggregate Functions:

    1. COUNT():

      • Purpose: Returns the number of rows in a group.
      • Example: To count how many employees are in each department:
        SELECT DepartmentID, COUNT(EmployeeID) AS NumberOfEmployees
        FROM Employees
        GROUP BY DepartmentID;
        
    2. SUM():

      • Purpose: Returns the sum of values in a specified column.
      • Example: To calculate the total salary in each department:
        SELECT DepartmentID, SUM(Salary) AS TotalSalary
        FROM Employees
        GROUP BY DepartmentID;
        
    3. AVG():

      • Purpose: Returns the average value of a specified column.
      • Example: To find the average salary per department:
        SELECT DepartmentID, AVG(Salary) AS AverageSalary
        FROM Employees
        GROUP BY DepartmentID;
        
    4. MIN():

      • Purpose: Returns the minimum value in a specified column.
      • Example: To find the lowest salary in each department:
        SELECT DepartmentID, MIN(Salary) AS MinimumSalary
        FROM Employees
        GROUP BY DepartmentID;
        
    5. MAX():

      • Purpose: Returns the maximum value in a specified column.
      • Example: To find the highest salary in each department:
        SELECT DepartmentID, MAX(Salary) AS MaximumSalary
        FROM Employees
        GROUP BY DepartmentID;
        

    3. Filtering Groups with HAVING Clause

    While the WHERE clause is used to filter rows before the aggregation, the HAVING clause is used to filter the results after aggregation. This allows you to apply conditions to the aggregated results.

    Syntax:

    SELECT column1, aggregation_function(column2)
    FROM table_name
    GROUP BY column1
    HAVING aggregation_function(column2) condition;
    

    Example: If you want to find departments where the total salary exceeds 100,000:

    SELECT DepartmentID, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID
    HAVING SUM(Salary) > 100000;
    

    Result:

    DepartmentID TotalSalary
    2 150000

    Explanation:

    • The HAVING clause filters out departments where the total salary is not greater than 100,000 after aggregation.

    4. GROUP BY with Multiple Columns

    You can group data by multiple columns. In such cases, the result will be grouped by the combination of the values in those columns.

    Example: Suppose we add a JobTitle column to the Employees table:

    EmployeeID Name DepartmentID JobTitle Salary
    1 Alice 1 Manager 50000
    2 Bob 1 Developer 60000
    3 Charlie 2 Manager 70000
    4 David 2 Developer 80000

    To find the total salary in each department for each job title:

    SELECT DepartmentID, JobTitle, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID, JobTitle;
    

    Result:

    DepartmentID JobTitle TotalSalary
    1 Manager 50000
    1 Developer 60000
    2 Manager 70000
    2 Developer 80000

    Explanation:

    • The query groups the data by both DepartmentID and JobTitle, so you get a summary of salary for each job title within each department.

    5. Ordering Grouped Results

    You can use the ORDER BY clause to sort the results after applying GROUP BY and aggregation functions.

    Example: To get departments ordered by their total salary in descending order:

    SELECT DepartmentID, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID
    ORDER BY TotalSalary DESC;
    

    Result:

    DepartmentID TotalSalary
    2 150000
    1 110000

    Explanation:

    • The ORDER BY clause sorts the results based on the aggregated TotalSalary in descending order.

    6. Combining GROUP BY with DISTINCT

    You can also use the DISTINCT keyword to remove duplicates from the results before performing aggregation. This is particularly useful if you want to count unique items.

    Example: To find the number of distinct job titles in each department:

    SELECT DepartmentID, COUNT(DISTINCT JobTitle) AS NumberOfDistinctJobTitles
    FROM Employees
    GROUP BY DepartmentID;
    

    Result:

    DepartmentID NumberOfDistinctJobTitles
    1 2
    2 2

    Explanation:

    • The COUNT(DISTINCT JobTitle) counts the unique job titles in each department.

    7. The ROLLUP and CUBE Operators

    In addition to GROUP BY, SQL also supports ROLLUP and CUBE operators, which allow you to compute multiple levels of aggregation.

    • ROLLUP: Provides subtotals and a grand total by grouping values progressively from the most detailed to the least detailed.
    • CUBE: Computes all possible combinations of aggregates for the grouped columns, producing a more exhaustive set of summaries.

    ROLLUP Example:

    SELECT DepartmentID, JobTitle, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID, JobTitle
    WITH ROLLUP;
    

    Result:

    DepartmentID JobTitle TotalSalary
    1 Manager 50000
    1 Developer 60000
    1 NULL 110000
    2 Manager 70000
    2 Developer 80000
    2 NULL 150000
    NULL NULL 260000

    Explanation:

    • The ROLLUP adds subtotals for each department and a grand total at the bottom.

    CUBE Example:

    SELECT DepartmentID, JobTitle, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentID, JobTitle
    WITH CUBE;
    

    Result:

    DepartmentID JobTitle TotalSalary
    1 Manager 50000
    1 Developer 60000
    1 NULL 110000
    2 Manager 70000
    2 Developer 80000
    2 NULL 150000
    NULL Manager 120000
    NULL Developer 140000
    NULL NULL 260000

    Explanation:

    • The CUBE produces all possible combinations of aggregations across both DepartmentID and JobTitle, including subtotals for each job title and each department, as well as the grand total.

    Conclusion

    SQL grouping and aggregation are essential for summarizing and analyzing large sets of data. By using the GROUP BY clause with aggregation functions such as COUNT(), SUM(), AVG(), MIN(), and MAX(), you can generate meaningful insights from your data. Additionally, filtering grouped results with HAVING, ordering results with ORDER BY, and performing advanced operations like ROLLUP and CUBE enhance your ability to analyze and report data efficiently.

    Previous topic 29
    Sub-Queries in SQL
    Next topic 31
    Concurrency Control

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time6 min
      Word count1,070
      Code examples0
      DifficultyIntermediate