In PL/SQL, the BULK COLLECT function is used to increase the performance of SQL queries by fetching multiple rows at once and storing them in collections (arrays or nested tables) instead of processing one row at a time. This reduces the number of context switches between SQL and PL/SQL, resulting in greater efficiency. Bulk Collect is often used in conjunction with FORALL statements, allowing you to efficiently perform bulk DML operations. Basic command syntax BULK COLLECT:
DECLARE
TYPE collection_type IS TABLE OF table_name%ROWTYPE;
collection_name collection_type;
BEGIN
SELECT column1, column2, ...
BULK COLLECT INTO collection_name
FROM table_name
WHERE condition;
-- Process the data in the collection
END;
The syntax is PL/SQL collection_type (either array or nested table), which should be compatible with the structure of the selected columns.Improved performance
Limited content switching: By fetching multiple rows at once, BULK COLLECT minimizes the number of context switches between SQL and PL/SQL, improving performance.
Reduced SQL statement overhead: Executing a single BULK COLLECT statement is often more efficient than executing multiple individual queries.
Usage Considerations
Memory usage: since BULK COLLECT loads multiple rows into the collection, you should potentially increase memory usage, especially when working with large result sets.
Selectivity: It is most advantageous when the KDE clause is sufficiently selected, i.e. there are more parameters in the condition to reduce the number of rows loaded.
An example for BULK COLLECT looks like this:
DECLARE
TYPE employee_collection IS TABLE OF employees%ROWTYPE;
emp_data employee_collection;
BEGIN
SELECT *
BULK COLLECT INTO emp_data
FROM employees
WHERE department_id = 10;
-- Process the data in the emp_data collection
FOR i IN 1..emp_data.COUNT LOOP
DBMS_OUTPUT.PUT_LINE('Employee ID: ' || emp_data(i).employee_id || ', Name: ' || emp_data(i).first_name);
END LOOP;
END;
In this example, multiple rows from the employee tables are loaded into the emp_data collection, and the data is then processed in a loop. That is, first all the data are connected in the collection and the following ones are called through the cycle. Another option, but more complicated than the torch, is to use the cursor to read the data and then BULK COLLECT:
DECLARE
--definition of cursor
TYPE empcurtyp IS REF CURSOR;
--table all names
TYPE namelist IS TABLE OF employees.last_name%TYPE;
---table all salaries
TYPE sallist IS TABLE OF employees.salary%TYPE;
emp_cv empcurtyp;
names namelist;
sals sallist;
BEGIN
--opening of cursor
OPEN emp_cv FOR
SELECT last_name, salary FROM employees
WHERE job_id = 'SA_REP'
ORDER BY salary DESC;
--Fetch collection
FETCH emp_cv
BULK COLLECT INTO names, sals;
--closing cursor
CLOSE emp_cv;
-- loop through the names and sals collections
FOR i IN names.FIRST .. names.LAST
LOOP
DBMS_OUTPUT.PUT_LINE
('Name = ' || names(i) || ', salary = ' || sals(i));
END LOOP;
END;
A very important tool for working with BULK collect is to make changes in the tables based on the results and use it as e.g. merging only with much higher performance. An example looks like this:
DECLARE
P_MESSAGE_TEXT VARCHAR2(1500);
TYPE recData IS RECORD (
row_id ROWID,
employee_id hr.employees.employee_id%TYPE,
first_name hr.employees.first_name%TYPE,
last_name hr.employees.last_name%TYPE,
email hr.employees.email%TYPE,
phone_number hr.employees.phone_number%TYPE,
salary hr.employees.salary%TYPE,
hire_date hr.employees.hire_date%TYPE,
job_id hr.employees.job_id%TYPE,
manager_id hr.employees.manager_id%TYPE,
department_id hr.employees.department_id%TYPE
);
TYPE tabDataType IS TABLE OF recData INDEX BY PLS_INTEGER;
TYPE tabPointerType IS TABLE OF PLS_INTEGER;
TYPE curDataType IS REF CURSOR;
curData curDataType;
tabData tabDataType;
tabInsPointer tabPointerType := tabPointerType();
tabUpdPointer tabPointerType := tabPointerType();
vSqlStatm varchar(3000);
nRecProcessed number;
BEGIN
vSqlStatm := 'SELECT e.ROWID row_id,
a.employee_id,
a.first_name first_name,
a.last_name last_name,
a.email email,
a.phone_number phone_number,
a.salary salary,
a.hire_date hire_date,
d.job_id job_id,
a.manager_id manager_id,
b.department_id department_id
FROM hr.emptst a,
hr.departments b,
hr.jobs d,
hr.employees e
WHERE b.department_id = a.department_id
and a.job_id=d.job_id
and a.employee_id=e.employee_id
union all
SELECT NULL row_id,
a.employee_id,
a.first_name first_name,
a.last_name last_name,
a.email email,
a.phone_number phone_number,
a.salary salary,
a.hire_date hire_date,
d.job_id job_id,
a.manager_id manager_id,
b.department_id department_id
FROM hr.emp_temp a,
hr.departments b,
hr.jobs d
WHERE b.department_id = a.department_id
and a.job_id=d.job_id
and NOT EXISTS (select 1 from employees where employee_id=a.employee_id)';
OPEN curData FOR vSqlStatm;
LOOP
tabData.DELETE;
tabInsPointer.DELETE;
tabUpdPointer.DELETE;
FETCH curData
BULK COLLECT INTO tabData;
FOR i IN tabData.FIRST .. tabData.LAST LOOP
CASE
WHEN tabData(i).row_id IS NULL
THEN
tabInsPointer.EXTEND;
tabInsPointer(tabInsPointer.LAST) := i;
ELSE
tabUpdPointer.EXTEND;
tabUpdPointer(tabUpdPointer.LAST) := i;
END CASE;
END LOOP;
FORALL i IN VALUES OF tabInsPointer
INSERT INTO hr.employees (
employee_id,
first_name,
last_name,
email,
phone_number,
salary,
hire_date,
job_id,
manager_id,
department_id
)
VALUES (
tabData(i).employee_id,
tabData(i).first_name,
tabData(i).last_name,
tabData(i).email,
tabData(i).phone_number,
tabData(i).salary,
tabData(i).hire_date,
tabData(i).job_id,
tabData(i).manager_id,
tabData(i).department_id
);
nRecProcessed := nRecProcessed + SQL%ROWCOUNT;
FORALL i IN VALUES OF tabUpdPointer
UPDATE hr.employees SET
employee_id = tabData(i).employee_id,
first_name = tabData(i).first_name,
last_name = tabData(i).last_name,
email = tabData(i).email,
phone_number = tabData(i).phone_number,
salary = tabData(i).salary,
hire_date =tabData(i).hire_date,
--job_id = tabData(i).job_id,
manager_id = tabData(i).manager_id
--department_id = tabData(i).department_id
WHERE ROWID = tabData(i).row_id;
nRecProcessed := nRecProcessed + SQL%ROWCOUNT;
EXIT WHEN curData%NOTFOUND;
END LOOP;
CLOSE curData;
exception when others
then DBMS_OUTPUT.PUT_LINE(P_MESSAGE_TEXT);
END;
In the example, the records are added only if there is a match between the new records listed in the emp_temp table, and only those in the emp_test table are modified, i.e. it only makes a slice of the data that they process in bulk. This is how the logic of processing a large amount of data via DWH is built, where BULK fits.
BULK COLLECT is a powerful function in PL/SQL that can significantly increase the performance of data retrieval operations, especially when working with large data sets. However, it should be used judiciously, taking into account factors such as memory consumption and selectivity. The ideal memory database.