Task Queuing System [5.3.0-B1]
Tasks in general can be classified in 2 ways:
- by user feedback:
  - interactive - a progress bar is displayed showing what percentage of the task is done
  - non-interactive - nothing is shown to the user
- by trigger:
  - user triggered - user presses a button (or similar control) to trigger a task
  - system triggered - tasks are created on their own
Here is the matrix of what's currently supported by In-Portal:
| | Interactive | Non-Interactive |
|---|---|---|
| User Triggered | | |
| System Triggered | | |
Thoughts:
- interactive tasks block the UI, so the user can't do anything else in the Admin Console while they are running
- users don't care about real-time status updates of interactive tasks in most (but not all) cases; they just want to know when the tasks are completed
- scheduled tasks (system triggered non-interactive tasks):
  - provide limited insight into task execution status
  - only retain last task execution status
- e-mail queue:
  - successfully executed tasks are removed from queue (the "E-mail Logs" section sort of compensates for that currently)
  - no error information is stored within task
Solution
Part 1 (db tables & units) - 4.5h (sum)
- create "ITaskHandler" interface with "handleTaskRun(kDBItem $task_run)" method - 0.5h
- create "TaskQueue" database table (unit name "task-queue") with following columns: - 1h
  - Id
  - TaskHandlerClass - FQCN of PHP class, which is responsible for processing this task queue record (must implement "ITaskHandler" interface)
  - TaskData - JSON-encoded data, that is needed for task execution (e.g. e-mail recipient, IDs of records to be processed)
  - ScheduledOn - when task needs to be processed (set at queuing time)
  - QueuedOn - when task was queued
  - QueuedById - who queued the task
  - LastStatus - status from last attempt of this queue record processing (same options as for "TaskRuns.Status" field)
  - MaxRetries - if task fails specified number of times (5 by default), then don't retry it
  - TaskRunsFailed - number of failed task runs (number is reset when task execution was successful)
- create "TaskRuns" database table (unit name: "task-run") with following columns: - 1h
  - Id
  - TaskQueueId - ID of record from "TaskQueue" table, that is responsible for creation of this run
  - StartedOn - when task run was started; NULL initially, set to the moment when status changes from "scheduled" to "running"
  - PercentsCompleted - "0" by default, but will be updated as task is being processed
  - FinishedOn - when task finished executing (regardless of status)
  - Results - JSON-encoded results in any format, that can later be displayed in human-readable form
  - Status - same statuses as for "TaskQueue.LastStatus" column:
    - "scheduled" - initially, when record is created
    - "running" - when somebody is processing the record
    - "success" - when execution finished without errors
    - "error" - when known error happened during processing
    - "timeout" - when associated task runner died unexpectedly
  - StandardOutput - what was written to "stdout" stream during this task run
  - ErrorOutput - what was written to "stderr" stream during this task run
  - ErrorCode - non-empty when known error happened
  - ErrorMessage - non-empty when known error happened
  - TaskRunnerId - NULL by default; ID of task runner that is processing/processed given task run
- create "TaskRunners" database table (unit name: "task-runner") with following columns: - 1h
  - Id
  - ProcessId - the PID of process, that started/created task runner
  - StartedOn - when process was started
  - FinishedOn - when process was finished; NULL initially
  - Status - status of task runner:
    - "running" - default; means task runner is running
    - "success" - set, when task runner decides to kill itself
    - "timeout" - set by overseer when task runner in "running" status and associated process isn't running
- in "task-run:OnAfterItemUpdate" event will: - 0.5h
  - load "task-queue" object associated with updated task run
  - get all "task-run" records for that "task-queue" (via sql); then sort them from recent to old (via php)
  - set following fields on "task-queue" object:
    - "LastStatus" to "Status" of most recent "task-run"
    - "TaskRunsFailed" to count of "task-run" records in "error" and "timeout" statuses (if last run failed)
    - "TaskRunsFailed" to "0" (if last run was successful)
- in "task-runner:OnBeforeItemCreate" event set "ProcessId" to PID of current process - 0.1h
- in "task-runner:OnAfterItemUpdate" event, when "Status" changes from "running" to "timeout", change status of all "task-run" records that are being processed by this task runner from "running" to "timeout" as well - 0.3h
- the "task-run" would be sub-item of "task-queue" unit - 0.1h
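
The bookkeeping planned for the "task-run:OnAfterItemUpdate" event can be sketched in plain PHP. This is only a minimal sketch, not In-Portal code: the function name "summarizeTaskRuns", the use of plain arrays instead of "kDBItem" objects, sorting by "Id" to find the most recent run, and keeping the counter unchanged while the latest run is still "scheduled" or "running" are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper: derives "LastStatus" and "TaskRunsFailed" of a
 * "task-queue" record from plain arrays standing in for its "task-run" rows.
 */
function summarizeTaskRuns(array $task_runs, $current_failed_count = 0)
{
    if (!$task_runs) {
        // No runs yet: nothing to derive.
        return array('LastStatus' => null, 'TaskRunsFailed' => $current_failed_count);
    }

    // Assumption: a higher Id means a more recent run (sort from recent to old).
    usort($task_runs, function (array $a, array $b) {
        return $b['Id'] - $a['Id'];
    });

    $failed_statuses = array('error', 'timeout');
    $last_status = $task_runs[0]['Status'];

    if ($last_status === 'success') {
        // The counter is reset after a successful run.
        $failed_count = 0;
    } elseif (in_array($last_status, $failed_statuses, true)) {
        // Last run failed: count all runs in "error" and "timeout" statuses.
        $failed_count = 0;
        foreach ($task_runs as $task_run) {
            if (in_array($task_run['Status'], $failed_statuses, true)) {
                $failed_count++;
            }
        }
    } else {
        // Assumption: "scheduled"/"running" runs don't change the counter.
        $failed_count = $current_failed_count;
    }

    return array('LastStatus' => $last_status, 'TaskRunsFailed' => $failed_count);
}

$summary = summarizeTaskRuns(array(
    array('Id' => 1, 'Status' => 'error'),
    array('Id' => 2, 'Status' => 'success'),
    array('Id' => 3, 'Status' => 'timeout'),
));
// $summary['LastStatus'] is "timeout" and $summary['TaskRunsFailed'] is 2
```

Note that, read literally, the rule counts every failed run of the queue record, including the ones that happened before an earlier successful run (as in the example above).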
Part 2 (adding tasks & runs) - 5h (sum)
- create "TaskQueue" class (not item of "task-queue" unit) - 0.1h
- add protected "TaskQueue::createTaskHandler($class_name)" method, that will: - 0.5h
  - create instance of given class or throw an exception when failed
  - if created object doesn't implement "ITaskHandler" interface, then throw an exception
  - return the object
- add public "TaskQueue::addTask($task_handler_class, array $task_data, $scheduled_on, $max_retries = null)" method, that will: - 0.4h
  - call "TaskQueue::createTaskHandler" method to verify, that class given in "$task_handler_class" parameter is valid
  - consider "$max_retries" as "5" when not given
  - create new db record using provided data using "task-queue" object
- add public "TaskQueue::refreshTaskRunnersStatus()" method that will: - 0.5h
  - get all "task-runner" records in "running" status
  - if associated process isn't running anymore, then set "task-runner" status from "running" to "timeout" (the "task-runner:OnAfterItemUpdate" event would update connected task runs)
- add protected "TaskQueue::createTaskRun(kDBItem $task_queue)" method, that will: - 0.5h
  - create new task run (and return its ID) for given queue record, when none of the following rules is violated:
    - only 1 (or less) running "task-run" can exist for one "task-queue" record
    - "TaskRunsFailed" must be smaller than "MaxRetries" on associated "task-queue" record
  - return "null" otherwise
- add protected "TaskQueue::createMissingTaskRuns()" method, that will: - 0.5h
  - get all records from "TaskQueue" table, for which task runs can be created:
    - ScheduledOn < NOW()
    - LastStatus is not "running"
    - TaskRunsFailed must be smaller than MaxRetries
  - call the "TaskQueue::createTaskRun" method on each of them (method can return "NULL" in some cases, but that's ok)
- add protected "TaskQueue::getTaskRunnerCount()" method, that will return number of "task-runner" in "running" status - 0.3h
- add protected "TaskQueue::getMissingTaskRunnerCount()" method, that will: - 0.2h
  - get value of "TaskRunnerLimit" system setting
  - call "TaskQueue::getTaskRunnerCount" method
  - return difference or 0, when difference is negative
- add public "TaskQueue::runStandalone()" method, that will: - 0.5h
  - call "TaskQueue::getTaskRunner" method
  - if object is returned call "->process()" method on it
- add public "TaskQueue::createMissingTaskRunners()" method, that will: - 0.5h
  - if in CLI:
    - call "TaskQueue::getMissingTaskRunnerCount" method
    - if it returned "0" do nothing
    - execute command (see Part 5) X number of times in background processes (X - number returned above)
  - if not in CLI:
    - call "TaskQueue::runStandalone" method
- add public "TaskQueue::processQueue()" method, that will: - 0.4h
  - call "TaskQueue::refreshTaskRunnersStatus" method
  - call "TaskQueue::createMissingTaskRuns" method
- create "task-queue:OnProcess" event, that: - 0.1h
  - would be called as Scheduled Task on a regular basis (e.g. each 5 minutes)
  - would call "TaskQueue::processQueue" method
- create "task-queue:OnCreateTaskRunners" event, that: - 0.3h
  - would be called as Scheduled Task on a regular basis (e.g. each 5 minutes)
  - can be disabled if needed
  - will call "TaskQueue::createMissingTaskRunners" method
- create "task-queue:OnDebug" event, that will call "TaskQueue::runStandalone" method - 0.2h
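
The validation and defaulting rules described for "TaskQueue::createTaskHandler", "TaskQueue::addTask" and "TaskQueue::getMissingTaskRunnerCount" can be sketched as standalone functions. This is a rough sketch under stated assumptions, not the planned class: the free functions, "buildTaskQueueRecord" (which returns an array instead of saving a "task-queue" record), the "EchoTaskHandler" class and the explicit "$limit"/"$running" arguments are hypothetical. The "kDBItem" type hint from the plan is omitted so that the code runs outside In-Portal.

```php
<?php
// Stand-in for the planned interface (the plan type-hints the argument as kDBItem).
interface ITaskHandler
{
    public function handleTaskRun($task_run);
}

// Hypothetical handler used only to demonstrate the checks below.
class EchoTaskHandler implements ITaskHandler
{
    public function handleTaskRun($task_run)
    {
        echo 'processing task run';
    }
}

// Creates the handler or throws when the class is missing or has the wrong type.
function createTaskHandler($class_name)
{
    if (!class_exists($class_name)) {
        throw new InvalidArgumentException('Class "' . $class_name . '" not found');
    }

    $handler = new $class_name();

    if (!($handler instanceof ITaskHandler)) {
        throw new InvalidArgumentException('Class "' . $class_name . '" must implement ITaskHandler');
    }

    return $handler;
}

// Builds the data of a would-be "TaskQueue" row (the real method would save it to db).
function buildTaskQueueRecord($task_handler_class, array $task_data, $scheduled_on, $max_retries = null)
{
    createTaskHandler($task_handler_class); // Called for validation only.

    return array(
        'TaskHandlerClass' => $task_handler_class,
        'TaskData' => json_encode($task_data),
        'ScheduledOn' => $scheduled_on,
        'QueuedOn' => time(),
        'MaxRetries' => $max_retries === null ? 5 : $max_retries,
        'TaskRunsFailed' => 0,
    );
}

// Difference between the runner limit and the running runners, never negative.
function getMissingTaskRunnerCount($limit, $running)
{
    return max(0, $limit - $running);
}

$record = buildTaskQueueRecord('EchoTaskHandler', array('ids' => array(1, 2)), time() + 60);
// $record['MaxRetries'] is 5, because "$max_retries" wasn't given
```

Validating the handler class at queuing time means a typo in the class name is reported to the caller of "addTask" right away, instead of surfacing later as a failed task run.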
Part 3 (running runs) - 4.5h (sum)
- add "declare(ticks = 1);" on top of "/tools/run_event.php" file - 0.1h
- create new "TaskRunnerLimit" system setting set to "8" by default - 0.2h
- create "TaskRunner" class with: - 0.2h
  - add "TaskRunner::taskRunnerId" property
  - add "TaskRunner::lastSignal" property
- add protected "TaskRunner::signalHandler" method, that will store received signal in the "TaskRunner::lastSignal" property - 0.5h
- add "TaskRunner::__construct($task_runner_id)" method, that will: - 0.5h
  - store given "$task_runner_id" into "TaskRunner::taskRunnerId" property
  - if executed from CLI (PHP_SAPI constant check), then use "pcntl_signal" function to register "TaskRunner::signalHandler" method as signal listener for following signals:
    - SIGINT
    - SIGTERM
    - SIGKILL (note: the operating system doesn't allow a process to catch SIGKILL, so a handler can't actually be registered for it)
    - SIGHUP
- add public "TaskQueue::getTaskRunner()" method, that will: - 0.5h
  - call "TaskQueue::getMissingTaskRunnerCount" method
  - if method returned "0", then return "null"
  - create new "task-runner" object
  - return instance of "TaskRunner" class initialized with ID of just created task runner
- add protected "TaskRunner::getNextTaskRunId()" method, that will: - 0.5h
  - acquire WRITE lock on "TaskRuns" database table (solves race condition in parallel environment)
  - pick 1st available "task-run" in "scheduled" status (FIFO logic)
  - release above acquired lock
  - return found task run id or "null" when nothing was found
- add protected "TaskRunner::processTaskRun($task_run_id)" method, that will: - 1h
  - load "task-run" by given ID from the database or throw an exception if it wasn't found
  - if given "task-run" isn't in "scheduled" status, then throw an exception
  - set following fields and save changes to db immediately:
    - "TaskRunnerId" to value of "TaskRunner::taskRunnerId" property
    - "Status" to "running"
  - create task handler by calling "TaskQueue::createTaskHandler" method
  - enable redirection of "stdout" and "stderr" into temp files
  - call the "handleTaskRun" method (wrapped within try/catch block) on that object providing task run object (was loaded above) as an argument
  - store contents of above temp files into "StandardOutput" and "ErrorOutput" fields of "task-run" object
  - the above method can update given object fields at will and save to db (e.g. "PercentsCompleted" and "Results")
  - when exception was caught, then:
    - set "Status" to "error"
    - set "ErrorCode" to exception code
    - set "ErrorMessage" to exception message
  - when no exception was caught, then:
    - set "Status" to "success"
    - set "ErrorCode" and "ErrorMessage" to empty value
  - set "FinishedOn" to time, when task was finished (with error or not)
  - save changes to db
- add public "TaskRunner::process()" method, that will consist of while loop, where each iteration will: - 0.5h
  - call "TaskRunner::getNextTaskRunId" method
  - call "TaskRunner::processTaskRun" with ID found above (if ID was found)
  - in either of following cases set "FinishedOn" to NOW() on associated "task-runner" record and exit:
    - "TaskRunner::lastSignal" is set
    - overall memory consumption is more than 100MB
    - it's not CLI mode
  - sleep for X seconds
- add "task-runner:OnProcess" event, that will: - 0.5h
  - call the "TaskQueue::getTaskRunner" method
  - if an object is returned, then call "->process()" method on it
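
The outcome handling described for "TaskRunner::processTaskRun" (try/catch around the handler, then filling the status and error fields) can be illustrated with a small standalone function. This is a simplified sketch with several assumptions: the function name "executeTaskRun" is hypothetical, a plain callable stands in for an "ITaskHandler" object, a plain array stands in for the "task-run" record, and PHP output buffering is used instead of the temp file redirection mentioned in the plan (so only "stdout" is captured here, "stderr" is not).

```php
<?php
/**
 * Hypothetical helper: runs the handler and returns the updated task run data.
 */
function executeTaskRun($handler, array $task_run)
{
    ob_start(); // Capture everything the handler prints.

    try {
        call_user_func($handler, $task_run);

        $task_run['Status'] = 'success';
        $task_run['ErrorCode'] = '';
        $task_run['ErrorMessage'] = '';
    } catch (Exception $e) {
        $task_run['Status'] = 'error';
        $task_run['ErrorCode'] = $e->getCode();
        $task_run['ErrorMessage'] = $e->getMessage();
    }

    $task_run['StandardOutput'] = ob_get_clean();
    $task_run['FinishedOn'] = time(); // Set regardless of the outcome.

    return $task_run;
}

$succeeded = executeTaskRun(function (array $task_run) {
    echo 'sent 3 e-mails';
}, array('Id' => 1, 'Status' => 'running'));

$failed = executeTaskRun(function (array $task_run) {
    throw new RuntimeException('SMTP server is down', 7);
}, array('Id' => 2, 'Status' => 'running'));
// $succeeded['Status'] is "success", $failed['Status'] is "error"
```

The "timeout" status is never set here on purpose: according to the plan it is assigned from outside, when the overseer detects that the process of the task runner is gone.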
Part 4 (rotation + UI) - 3h (sum)
- in the "Configuration > Website > Scheduled Tasks" section: - 1.5h
  - show "Scheduled Tasks" tab (would represent existing grid)
  - add "Task Queue" tab with list of records from "task-queue" unit sorted by "ScheduledOn DESC"
  - the "task-queue" record editing window will consist of 2 tabs:
    - General - all "task-queue" fields
    - Task Runs - grid of associated task runs; rows in "error" or "timeout" status would have red background
  - add "Task Runners" tab with list of records from "task-runner" unit sorted by "StartedOn DESC"; rows in "timeout" status would have red background
- create "TaskQueueRotationInterval" setting (same configuration as for e-mail logs) - 0.5h
- create "task-queue:OnRotate" event (scheduled task), that would delete old (same concept as for e-mail logs) successful "task-queue" records along with their "task-run" records and long ago finished "task-runner" records - 1h
Part 5 (usage)
- using the command "/usr/bin/env php /path/to/in-portal/tools/run_event.php task-runner:OnProcess password_here", configure either of the following (but not as a scheduled task, because it would block all other scheduled tasks):
  - setup "upstart" or "supervisord" or any other tool to ensure presence of X processes powered by above command
  - add X records to "crontab" file powered by above command
- task can be created through calling "TaskQueue::addTask" method by whoever needs it, e.g.:
  - user presses a button
  - scheduled task decides to offload some work
  - etc.
Quote: 17h*1.4=24h