AHOY – The ad-hoc parallel asynchronous task yielder for Oracle

,

At times, it becomes necessary to execute multiple one-time tasks on an Oracle database in addition to the standard orchestration. To avoid makeshift solutions for such situations, a simple solution has been developed. This solution consists of only two tables and one package and requires the DBMS_SCHEDULER privilege.

Thread-based execution of tasks

Table AHOY_TASK

Table of tasks for the Ad-hoc yield processor. The table contains both task definitions and the log of execution.

ColumnData typeComment
ATASK_KEYINTEGER The surrogate key, also influents the priority of execution
ATASK_CMDCLOBSQL command to be executed
ATASK_STATEVARCHAR2(20)State of the task:
TBD … to be done,
OK … finished successfully,
ERR … finished unsuccessfully,
RUN … just running,
BAT … waiting batches
ATASK_SPACE VARCHAR2(200)Task-space of independently running tasks
ATASK_STARTTIMESTAMP(6)Timestamp of the start of execution
ATASK_ENDTIMESTAMP(6)Timestamp of the end of execution
ATASK_MSGCLOBEventual error message
ATASK_JOBVARCHAR2(100)Name of the realization Oracle Scheduler Job
The AHOY_TASK table

Table AHOY_PAR

Table of parameters for the Ad-hoc yield processor.

ColumnData typeComment
APAR_SPACEVARCHAR2(200)Taskspace of these parameters
APAR_THREADSINTEGERMaximum number of threads running parallelly
APAR_LAST_ERRCLOBInformation about possible processor error
The AHOY_PAR table

The solution operates in a FIFO queue mode, with the first task defined being the first to be processed. There are no other dependencies. The number of parallel threads can be increased or decreased during runtime to enable a “watch and regulate” approach.

create or replace package AHOY_PROC as
--AdHOc Yield processor
--------------------------------------------------------
  procedure threadAsync( --run thread asynchronously
    p_space varchar2:=null -- optional parameter of a space not to mess up with other users 
 );
...

The package PL/SQL code: https://github.com/bobjankovsky/metaswamp/blob/main/oracle/ahoy/ahoy_proc.sql

Since this solution is ad hoc, it is important to clean up the AHOY_TASK table regularly after completing the tasks, as no aging feature has been implemented for the sake of simplicity.

The maximum number of parallel tasks in the solution is limited by the AHOY_PAR.APAR_THREADS parameter. If this parameter is set to 0, only manually started threads will process a single task and then stop.

Setting the parameter to 0 in the runtime also causes finishing all started tasks and not starting new ones.

There are four possible states that can be associated with the tasks defined in the AHOY_TASK table:

  • TBD: The task is ready to be started.
  • RUN: The task is currently running.
  • OK: The task has finished successfully.
  • ERR: The task has finished unsuccessfully.

Use cases (guide)

There are the following use cases to work with the AHOY.

Define the set of tasks

First, we have to fill the table AHOY_TASK. It is reasonable to empty its content first. We fill two columns:

  • ATASK_CMD … the SQL or PL/SQL command to be executed,
  • ATASK_STATE … TBD (to be done).

Example: Iddle wait.

begin 
  for i in 1..50 loop
    Insert into AHOY_TASK(ATASK_CMD,ATASK_STATE)values('begin DBMS_LOCK.sleep(100); end;','TBD');
  end loop;
end ;
/

Example: We want to update a new column to the default value through several partitions of a huge table parallel.

begin 
  for r_part in (
    select PARTITION_NAME from USER_TAB_PARTITIONS where TABLE_NAME = 'HTAB'
  ) loop
    Insert into AHOY_TASK(ATASK_CMD, ATASK_STATE)
       values('Update HTAB partition('||r_part.PARTITION_NAME||') set NCOL=''XNA''','TBD');
  end loop;
end ;
/

Start the parallel processing

Let’s start the processing in 10 parallel threads

--Allow 10 tasks simultanously
update AHOY_PAR set APAR_THREADS = 10;
commit;
--Start tasks
begin 
  AHOY_PROC.threadAsync;
end;
/

Monitoring tasks status

Monitor tasks status (running, done, failed) using the following query:

select ATASK_STATE, count(1) 
from AHOY_TASK 
group by ATASK_STATE 
order by ATASK_STATE ;

Check errors:

select * from AHOY_TASK where ATASK_STATE='ERR' order by ATASK_KEY desc;

Check relevant scheduler jobs:

select  job_name, comments from user_scheduler_jobs where job_name like 'AHOY_JOB_%';

Changing the number of running threads

Realizing, the process consumes too much of resources and blocks other processes, you can reduce the number of threads.

On the other hand, if you want to boost your process in more idle time, you can increase the number of threads.

Slowing down

Let’s reduce the number of our threads to 3:

update AHOY_PAR set APAR_THREADS = 3;
commit;

Boosting

Or boost it up to 50 threads:

update AHOY_PAR set APAR_THREADS = 50;
commit;

In the case of boosting, there could be reasonable to call:

begin 
  AHOY_PROC.threadAsync;
end;
/

too, what speeds up the thread increasing.

Soft terminating of the process

To stop the process while we let every particular running task finish, but not start new ones, we use the following:

update AHOY_PAR set APAR_THREADS = 3;
commit;

Hard terminating

We can also hard (immediately) terminate any task. First, get the job name:

select ATASK_JOB from AHOY_TASK where ATASK_STATE='RUN';

Second, drop the job:

begin
  dbms_scheduler.drop_job(job_name=>'AHOY_JOB_21', force=>true) ;
end;
/

Failover

After analyzing the error, you can switch the failed task back to processing:

update AHOY_TASK set ATASK_STATE='TBD' where ATASK_STATE='ERR';
commit;

Batches

We created the AHOY tool for ad-hoc processing when we wanted to execute a heap of tasks parallelly without dependencies. The implemented batches principle is a workaround for how to divide the heap into batches, e.g. BAT1, BAT2, BAT3 and not to start the BAT2 or BAT3 tasks before all previous (by the numbering of the batch) have been done.

Instead of ATASK_STATE = ‘TBD’ we specify ATASK_STATE = ‘BAT1’ for some tasks. After there will be no TBD, nor RUN, or ERR tasks to be processed, the first BAT# state in the row will be renamed to TBD automatically and the process will continue.

That is convenient when we compress partitions and then we recreate indexes. We set the state of compressing partitions as TBD and the rebuild of the index as BAT1. That way we will never start to rebuild the index of just a compressed partition.

Batch processing

Taskspaces

Another diversion from the ONE HEAP concept. This allows us to process several independent tasks under one instance of the tool. There is enough to specify TASKSPACE and your processing will be independent of the task running under other TASKSPACES or under the null (default) TASKSPACE.

Setup of tasks in a TASKSPACE:

...
    Insert into AHOY_TASK(ATASK_CMD,ATASK_SPACE,ATASK_STATE)
      values('begin DBMS_LOCK.sleep(100); end;','SPACE1','TBD');
...

Setup of threads in a TASKSPACE:

update AHOY_PAR set APAR_THREADS = 3 where APAR_SPACE = 'SPACE1';
commit;

Start threads in a TASKSPACE:

begin 
  AHOY_PROC.threadAsync('SPACE1');
end;
/

Conclusion

I hope you enjoy the tool either for quick help or for inspiration. It has been done as simply as possible.