• About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us
AimactGrow
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing
No Result
View All Result
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing
No Result
View All Result
AimactGrow
No Result
View All Result

What performs higher, SQL FILTER or CASE?

Admin by Admin
May 3, 2025
Home Coding
Share on FacebookShare on Twitter


I’ve discovered an fascinating query on Twitter, just lately. Is there any efficiency influence of utilizing FILTER in SQL (PostgreSQL, particularly), or is it simply syntax sugar for a CASE expression in an mixture operate?

Is there a efficiency distinction between CASE and FILTER state of affairs or it’s only a syntax sugar?

— Ivan (@Ivan73965858) February 6, 2023

As a fast reminder, FILTER is an superior normal SQL extension to filter out values earlier than aggregating them in SQL. That is very helpful when aggregating a number of issues in a single question.

These two are the identical:

SELECT 
  fa.actor_id,

  -- These:
  SUM(size) FILTER (WHERE ranking = 'R'),
  SUM(size) FILTER (WHERE ranking = 'PG'),

  -- Are the identical as these:
  SUM(CASE WHEN ranking = 'R' THEN size END),
  SUM(CASE WHEN ranking = 'PG' THEN size END)
FROM film_actor AS fa 
LEFT JOIN movie AS f 
  ON f.film_id = fa.film_id
GROUP BY fa.actor_id

As of jOOQ 3.17, these SQL dialects are identified to help FILTER natively:

  • CockroachDB
  • Firebird
  • H2
  • HSQLDB
  • PostgreSQL
  • SQLite
  • YugabyteDB

Ought to it matter?

However again to the query. Does it actually matter by way of efficiency? Ought to it? Clearly, it shouldn’t matter. The 2 kinds of mixture operate expressions could be confirmed to imply precisely the identical factor. And actually, that’s what jOOQ does should you’re utilizing FILTER on another SQL dialect. Put the above question in our SQL translation instrument, translate to Oracle, for instance, and also you’ll be getting:

SELECT
  fa.actor_id,
  sum(CASE WHEN ranking = 'R' THEN size END),
  sum(CASE WHEN ranking = 'PG' THEN size END),
  sum(CASE WHEN ranking = 'R' THEN size END),
  sum(CASE WHEN ranking = 'PG' THEN size END)
FROM film_actor fa
  LEFT JOIN movie f
    ON f.film_id = fa.film_id
GROUP BY fa.actor_id

The opposite method needs to be attainable as effectively in an optimiser.

Does it matter?

However is that this being completed? Let’s attempt evaluating the next 2 queries on PostgreSQL, in opposition to the sakila database:

Question 1:

SELECT 
  fa.actor_id,
  SUM(size) FILTER (WHERE ranking = 'R'),
  SUM(size) FILTER (WHERE ranking = 'PG')
FROM film_actor AS fa 
LEFT JOIN movie AS f 
  ON f.film_id = fa.film_id
GROUP BY fa.actor_id

Question 2:

SELECT 
  fa.actor_id,
  SUM(CASE WHEN ranking = 'R' THEN size END),
  SUM(CASE WHEN ranking = 'PG' THEN size END)
FROM film_actor AS fa 
LEFT JOIN movie AS f 
  ON f.film_id = fa.film_id
GROUP BY fa.actor_id

I shall be utilizing this benchmark approach, and can submit the benchmark code on the finish of this weblog submit. The outcomes of working every question 500x are clear (much less time is best):

Run 1, Assertion 1: 00:00:00.786621
Run 1, Assertion 2: 00:00:00.839966

Run 2, Assertion 1: 00:00:00.775477
Run 2, Assertion 2: 00:00:00.829746

Run 3, Assertion 1: 00:00:00.774942
Run 3, Assertion 2: 00:00:00.834745

Run 4, Assertion 1: 00:00:00.776973
Run 4, Assertion 2: 00:00:00.836655

Run 5, Assertion 1: 00:00:00.775871
Run 5, Assertion 2: 00:00:00.845209

There’s a constant 8% efficiency penalty for utilizing the CASE syntax, in comparison with the FILTER syntax on my machine, working PostgreSQL 15 in docker. The precise distinction in a non-benchmark question might not be as spectacular, or extra spectacular, relying on {hardware} and information units. However clearly, one factor appears to be a bit higher on this case than the opposite.

Since these kind of syntaxes are usually utilized in a reporting context, the variations can positively matter.

Including an auxiliary predicate

You would possibly assume there’s extra optimisation potential, if we make the predicates on the RATING column redundant, like this:

Question 1:

SELECT 
  fa.actor_id,
  SUM(size) FILTER (WHERE ranking = 'R'),
  SUM(size) FILTER (WHERE ranking = 'PG')
FROM film_actor AS fa 
LEFT JOIN movie AS f 
  ON f.film_id = fa.film_id
  AND ranking IN ('R', 'PG') -- Redundant predicate right here
GROUP BY fa.actor_id

Question 2:

SELECT 
  fa.actor_id,
  SUM(CASE WHEN ranking = 'R' THEN size END),
  SUM(CASE WHEN ranking = 'PG' THEN size END)
FROM film_actor AS fa 
LEFT JOIN movie AS f 
  ON f.film_id = fa.film_id
  AND ranking IN ('R', 'PG')
GROUP BY fa.actor_id

Be aware it needs to be positioned within the LEFT JOIN‘s ON clause, so as to not tamper with the outcomes. It may well’t be positioned within the question’s WHERE clause. A proof for this distinction is right here.

What is going to the benchmark yield now?

Run 1, Assertion 1: 00:00:00.701943
Run 1, Assertion 2: 00:00:00.747103

Run 2, Assertion 1: 00:00:00.69377
Run 2, Assertion 2: 00:00:00.746252

Run 3, Assertion 1: 00:00:00.684777
Run 3, Assertion 2: 00:00:00.745419

Run 4, Assertion 1: 00:00:00.688584
Run 4, Assertion 2: 00:00:00.740979

Run 5, Assertion 1: 00:00:00.688878
Run 5, Assertion 2: 00:00:00.742864

So, certainly, the redundant predicate improved issues (in an ideal world, it shouldn’t, however right here we’re. The optimiser doesn’t optimise this in addition to it may). However nonetheless, the FILTER clause outperforms CASE clause utilization.

Conclusion

In an ideal world, two provably equal SQL syntaxes additionally carry out the identical method. However this isn’t all the time the case in the true world, the place optimisers make tradeoffs between:

  • Time spent optimising uncommon syntaxes
  • Time spent executing queries

In a earlier weblog submit (which might be outdated by now), I’ve proven plenty of these circumstances, the place the optimisation determination doesn’t rely on any value mannequin and information units and will all the time be completed, ideally. There was a bent of such optimisations being favoured by RDBMS which have an execution plan cache (e.g. Db2, Oracle, SQL Server), in case of which the optimisation must be completed solely as soon as per cached plan, after which the plan could be reused. In RDBMS that don’t have such a cache, optimisation time is extra expensive per question, so much less could be anticipated.

I believe it is a case the place it’s value trying into easy patterns of expressions in mixture capabilities. AGG(CASE ..) is such a preferred idiom, and eight% is kind of the numerous enchancment, that I believe PostgreSQL ought to repair this. We’ll see. In any case, since FILTER is already:

  • Higher performing
  • Higher trying

You’ll be able to safely change to this good normal SQL syntax already now.

Benchmarking

Whereas on this case, the development is value it regardless of precise measurements (as a result of efficiency can hardly be worse, and readability truly improves), all the time watch out with such benchmark outcomes. Do measure issues your self, and should you can’t reproduce a efficiency drawback, then don’t essentially contact completely sound logic simply because a weblog submit instructed you so.

Benchmark code

As promised, this was the benchmark code used for this weblog submit:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 500;
  rec RECORD;
BEGIN

  -- Repeat the entire benchmark a number of occasions to keep away from warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT 
          fa.actor_id,
          SUM(size) FILTER (WHERE ranking = 'R'),
          SUM(size) FILTER (WHERE ranking = 'PG')
        FROM film_actor AS fa 
        LEFT JOIN movie AS f 
          ON f.film_id = fa.film_id
          AND ranking IN ('R', 'PG')
        GROUP BY fa.actor_id
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Assertion 1: %', r, (clock_timestamp() - v_ts);
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT 
          fa.actor_id,
          SUM(CASE WHEN ranking = 'R' THEN size END),
          SUM(CASE WHEN ranking = 'PG' THEN size END)
        FROM film_actor AS fa 
        LEFT JOIN movie AS f 
          ON f.film_id = fa.film_id
          AND ranking IN ('R', 'PG')
        GROUP BY fa.actor_id
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Assertion 2: %', r, (clock_timestamp() - v_ts);
    RAISE INFO '';
  END LOOP;
END$$;

The benchmark approach is described right here.

Like this:

Like Loading…



Tags: CaseFILTERperformsSQL
Admin

Admin

Next Post
HoYoverse lastly proclaims Honkai Star Rail x Destiny launch date with 5-star Archer and 5-star Saber – plus Archer is free

HoYoverse lastly proclaims Honkai Star Rail x Destiny launch date with 5-star Archer and 5-star Saber – plus Archer is free

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recommended.

Gemini as a common AI assistant

Gemini as a common AI assistant

May 23, 2025
4-word recommendation | Seth’s Weblog

The essence of industrialism | Seth’s Weblog

April 16, 2025

Trending.

Industrial-strength April Patch Tuesday covers 135 CVEs – Sophos Information

Industrial-strength April Patch Tuesday covers 135 CVEs – Sophos Information

April 10, 2025
Expedition 33 Guides, Codex, and Construct Planner

Expedition 33 Guides, Codex, and Construct Planner

April 26, 2025
How you can open the Antechamber and all lever places in Blue Prince

How you can open the Antechamber and all lever places in Blue Prince

April 14, 2025
Important SAP Exploit, AI-Powered Phishing, Main Breaches, New CVEs & Extra

Important SAP Exploit, AI-Powered Phishing, Main Breaches, New CVEs & Extra

April 28, 2025
Wormable AirPlay Flaws Allow Zero-Click on RCE on Apple Units by way of Public Wi-Fi

Wormable AirPlay Flaws Allow Zero-Click on RCE on Apple Units by way of Public Wi-Fi

May 5, 2025

AimactGrow

Welcome to AimactGrow, your ultimate source for all things technology! Our mission is to provide insightful, up-to-date content on the latest advancements in technology, coding, gaming, digital marketing, SEO, cybersecurity, and artificial intelligence (AI).

Categories

  • AI
  • Coding
  • Cybersecurity
  • Digital marketing
  • Gaming
  • SEO
  • Technology

Recent News

What’s going to influencer advertising and marketing appear to be in 2025? Knowledgeable predictions + new knowledge

What’s going to influencer advertising and marketing appear to be in 2025? Knowledgeable predictions + new knowledge

June 18, 2025
Yoast AI Optimize now out there for Basic Editor • Yoast

Replace on Yoast AI Optimize for Traditional Editor  • Yoast

June 18, 2025
  • About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us

© 2025 https://blog.aimactgrow.com/ - All Rights Reserved

No Result
View All Result
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing

© 2025 https://blog.aimactgrow.com/ - All Rights Reserved