Problem - Searching for data records by a function result
There are different real life problems where you have to use a function in the WHERE statement of your query which will often slow it down to a level where it's not acceptable anymore.
Go through the script below and you'll probably understand in which situations you'd want to use a deterministic function.
Recipe #1 - Using a deterministic function to improve your query
First, a quick word about deterministic. A deterministic function always returns the same value if the input parameters are identical. 1+1 is always equal to 2 but a function called Get_Customer_Name(4711) won't return the same value because it fetches data from the database which changes.
Let's create an example including a custom function to show this possible improvement.
After you've created that table, use the query below to insert some example data. 100'000 records took about one second on my server, if you're running it on a slow computer you'll have to wait for a bit more.
INSERT INTO deterministic_test
SELECT ROWNUM N, DBMS_RANDOM.STRING ('X', 16) FROM dual
CONNECT BY LEVEL <= 100000;
Now that we have our test able including some data in it, we have to create our test function. In this case we're going to create a simple function which will return the average letter of a string. For example: If you use AC it will return B since the average ASCII code of A and C is B. Probably not very useful but simple enough to show you the concept of deterministic functions.
CREATE OR REPLACE FUNCTION Get_Average_Char(input_ VARCHAR2) RETURN VARCHAR2
IS
i_ NUMBER;
sum_ NUMBER := 0;
BEGIN
FOR i_ IN 1..LENGTH(input_) LOOP
sum_ := sum_ + ASCII(SUBSTR(input_,i_,1));
END LOOP;
RETURN CHR(sum_/LENGTH(input_));
END Get_Average_Char;
Now let's try to find all rows where the average character of column B is equal to G:
Even on a fast computer this is going to take a while. Let's look at the explain plan:
Description | Object Name | Cost | Cardinality | Bytes |
---|---|---|---|---|
SELECT STATEMENT, GOAL = ALL_ROWS | 566 | 1 | 202 | |
SORT AGGREGATE | 1 | 202 | ||
TABLE ACCESS FULL | DETERMINISTIC_TEST | 566 | 862 | 174124 |
Let's create that function again but this time using the magic keyword DETERMINISTIC:
CREATE OR REPLACE FUNCTION Get_Average_Char(input_ VARCHAR2)
RETURN VARCHAR2 DETERMINISTIC
IS
i_ NUMBER;
sum_ NUMBER := 0;
BEGIN
FOR i_ IN 1..LENGTH(input_) LOOP
sum_ := sum_ + ASCII(SUBSTR(input_,i_,1));
END LOOP;
RETURN CHR(sum_/LENGTH(input_));
END Get_Average_Char;
Now that we have a deterministic function we can tell our database to put an index on the result of that function. It's pretty straight forward - works exactly the same as it does for a normal table column:
Run the same query from above again:
The query returns in no time and when we look at the explain plan we can clearly see why:
Description | Object Name | Cost | Cardinality | Bytes |
---|---|---|---|---|
SELECT STATEMENT, GOAL = ALL_ROWS | 1 | 1 | 202 | |
SORT AGGREGATE | 1 | 202 | ||
INDEX RANGE SCAN | DETERMINISTIC_TEST_IX1 | 1 | 862 | 174124 |
We were able to replace the FULL SCAN with a much better INDEX RANGE SCAN and suddenly the cost dropped from 566 to 1. In this case Oracle doesn't have to execute the function at all after it created the index. It simply looks for the value in the index and returns the data records matching the requested value.
This works as well for built Oracle methods like UPPER when you want to create a case insensitive query on a database which is set up to do case sensitive matches.
It's not something you'll need on a daily basis but if there's a chance to add the keyword DETERMINISTIC, do it and you'll get a much better result in no time!
0 comments:
Post a Comment