...

  • interactive tasks block the UI, so the user can't do anything else in the Admin Console while they are running
  • in most (but not all) cases the user doesn't care about real-time status updates of interactive tasks, but just wants to know when they're completed
  • scheduled tasks (system triggered non-interactive tasks):
    • provide limited insight about task execution status
    • only retain last task execution status
  • e-mail queue:
    • successfully executed tasks are removed from queue (the "E-mail Logs" section sort of compensates for that currently)
    • no error information is stored within task

Solution

...

Part 1 (db tables & units)

  1. create "TaskQueueRunner" interface with "processTaskQueueRun(kDBItem $task_queue_run)" method
  2. create "TaskQueue" database table with following columns:
    • Id
    • QueuedOn - when task was queued
    • QueuedById - who queued the task
    • ScheduledOn - when task needs to be processed (set at queuing time)
    • TaskClass - FQCN of the PHP class that is responsible for processing this task queue record (must implement the "TaskQueueRunner" interface)
    • TaskData - JSON-encoded data, that is needed for task execution (e.g. e-mail recipient, IDs of records to be processed)
    • LastStatus - status from the last attempt to process this queue record - {scheduled (default), processing, success, failed, timeout}; when the queue is processed in parallel, this is the status of the task run that ended last
    • MaxRetries - if task fails specified number of times (5 by default), then don't retry it
    • FailedRetries - failed retry count (number is reset, when task execution was successful)
  3. create "TaskQueueRuns" database table with following columns:
    • Id
    • TaskQueueId - associated task queue record
    • StartedOn - when task execution started
    • PercentsCompleted - "0" by default, but will be updated as task is being processed
    • FinishedOn - when task execution finished (regardless of status)
    • Results - JSON-encoded results in any format, that can later be displayed in human-readable form
    • Status - same statuses as for the "TaskQueue.LastStatus" column:
      • "scheduled" - initially, when record is created;
      • "processing" - when somebody is processing the record;
      • "success" - when execution finished without errors;
      • "error" - when known error happened during processing;
      • "timeout" - when status wasn't updated within 1 day (configurable per-queue record or system-wide, but can't be empty)
    • ErrorCode - non-empty when known error happened
    • ErrorMessage - non-empty when known error happened
    • ProcessId - the PID of the process that is handling (or was handling in the past) this task run (the process that starts a task run puts its own PID here)
  4. probably there won't be any UI for this functionality, because it's too general to be usable by the user, but specialized sections (e.g. "E-mail Queue") can read data from these tables to keep the user updated
  5. a task can be created by whoever needs it, e.g. the user presses a button, a scheduled task decides to offload some work, etc.
  6. a scheduled task needs to be created to process records from the "TaskQueue" table:
    1. pick records with "LastStatus = scheduled" and "ScheduledOn < NOW()"
    2. create an instance of the class (from the "TaskClass" field) responsible for task processing
    3. if the class isn't found, then:
      1. create a record in the "TaskQueueRuns" table with "error" status and error message set to "Task class not found"
      2. set "FailedRetries" to "MaxRetries"
    4. if the class is found:
      1. create a record in the "TaskQueueRuns" table with "processing" status
      2. let the class process the task
      3. update the record in the "TaskQueueRuns" table by setting "success" status (if no errors happened) or "error" status (along with the ErrorCode/ErrorMessage fields) when an error happened
  7. the "task-queue-run" unit would automatically update the associated "task-queue" record to indicate the last status and similar fields
  8. delete data about successfully executed tasks 1 month after the task was executed (configurable, like log rotation)
  9. create units ("task-queue" and "task-queue-run") that correspond to the database tables described above
  10. in "TaskQueueRunEventHandler::OnAfterItemUpdate" aggregate totals from all runs of the associated task queue record and update that task queue record
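
The "TaskQueueRunner" contract from step 1 could look roughly like the sketch below. The "kDBItem" class here is only a simplified stand-in so that the example runs on its own, "EmailTaskRunner" is a hypothetical runner, and the sketch assumes that "TaskData" of the associated task queue record is readable from the run object.

```php
<?php
// Simplified stand-in for the framework's kDBItem, only so this sketch runs on its own.
class kDBItem
{
    private $fields = array();

    public function SetDBField($name, $value)
    {
        $this->fields[$name] = $value;
    }

    public function GetDBField($name)
    {
        return isset($this->fields[$name]) ? $this->fields[$name] : null;
    }
}

interface TaskQueueRunner
{
    public function processTaskQueueRun(kDBItem $task_queue_run);
}

// Hypothetical runner; assumes "TaskData" of the parent queue record is readable from the run.
class EmailTaskRunner implements TaskQueueRunner
{
    public function processTaskQueueRun(kDBItem $task_queue_run)
    {
        $data = json_decode($task_queue_run->GetDBField('TaskData'), true);
        $recipients = $data['recipients'];
        $sent = array();

        foreach ($recipients as $index => $recipient) {
            $sent[] = $recipient; // a real runner would send the e-mail here
            $percent = (int)floor(($index + 1) * 100 / count($recipients));
            $task_queue_run->SetDBField('PercentsCompleted', $percent);
        }

        $task_queue_run->SetDBField('Results', json_encode(array('sent' => $sent)));
    }
}

$run = new kDBItem();
$run->SetDBField('TaskData', json_encode(array('recipients' => array('a@example.com', 'b@example.com'))));

$runner = new EmailTaskRunner();
$runner->processTaskQueueRun($run);
```

The runner only reports progress and results; status, error fields and timestamps are left to the code that invokes it.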

Part 2 (adding tasks & runs)

  1. create new "ParallelTaskQueueRunCount" system setting set to "8" by default
  2. create "TaskQueueHelper" class
  3. add public "TaskQueueHelper::queueTask($task_class, $task_data, $scheduled_on, $max_retries = null)" method, that will:
    1. create task queue record with given settings
    2. consider "$max_retries" as "5" when not given
    3. throw an exception, when specified "$task_class" doesn't exist
  4. add protected "TaskQueueHelper::synchronizeTaskRunStatus" method that will:
    1. get all task runs, that are running currently
    2. get status of their PIDs
    3. for all task runs whose PIDs are dead, set their status to "timeout"
  5. add protected "TaskQueueHelper::createTaskRun(kDBItem $task_queue)" method, that:
    1. will create a new task run (and return its ID) for the given queue record, when none of the following rules is violated:
      1. only 1 active (status = processing) task run can exist at same time (for a given queue record)
      2. sequential failed task run count (both "error" and "timeout" statuses are considered as failed) can't be more than max allowed retry count
    2. return "null" otherwise
  6. add protected "TaskQueueHelper::createMissingTaskRuns()" method, that will:
    1. get value of "ParallelTaskQueueRunCount" system setting and compare it to number of currently running task runs
    2. if fewer task runs are running than the setting allows, then:
      1. get all task queue records, for which runs can be created (ScheduledOn < NOW() + not running right now + failed retry count < max retry count)
      2. call the "TaskQueueHelper::createTaskRun" on each of them
  7. add public "TaskQueueHelper::createTaskRuns()" method, that will:
    1. call "TaskQueueHelper::synchronizeTaskRunStatus" method
    2. call "TaskQueueHelper::createMissingTaskRuns" method
  8. add "TaskQueueEventHandler::OnCreateTaskRuns" event, that would be called as a Scheduled Task on a regular basis (e.g. every 5 minutes)
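
The rules that "createTaskRun" and "createMissingTaskRuns" rely on can be sketched as plain functions over arrays. This is only an illustration: the function names are hypothetical, records are represented as arrays keyed by the column names above, timestamps are plain integers, and the runs are assumed to be ordered from oldest to newest.

```php
<?php
// Counts failed runs ("error" or "timeout") at the end of the run history,
// stopping at the first run that isn't a failure.
function countSequentialFailures(array $runs)
{
    $count = 0;

    foreach (array_reverse($runs) as $run) {
        if ($run['Status'] !== 'error' && $run['Status'] !== 'timeout') {
            break;
        }

        $count++;
    }

    return $count;
}

// Tells if a new run may be created for the given task queue record.
function canCreateTaskRun(array $task_queue, array $runs, $now)
{
    if ($task_queue['ScheduledOn'] >= $now) {
        return false; // not due yet
    }

    foreach ($runs as $run) {
        if ($run['Status'] === 'processing') {
            return false; // only 1 active run per queue record
        }
    }

    return countSequentialFailures($runs) < $task_queue['MaxRetries'];
}

// Number of runs that can still be started under "ParallelTaskQueueRunCount".
function freeRunSlots($parallel_limit, $running_count)
{
    return max(0, $parallel_limit - $running_count);
}

$task_queue = ['ScheduledOn' => 100, 'MaxRetries' => 2];
$exhausted = [['Status' => 'error'], ['Status' => 'timeout']];
$recovered = [['Status' => 'error'], ['Status' => 'success'], ['Status' => 'error']];

var_dump(canCreateTaskRun($task_queue, [], 200));         // bool(true)
var_dump(canCreateTaskRun($task_queue, $exhausted, 200)); // bool(false)
var_dump(canCreateTaskRun($task_queue, $recovered, 200)); // bool(true)
```

Keeping the rules free of database access makes them easy to test separately from the helper that loads the records.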

Part 3 (running runs)

  1. add public "TaskQueueHelper::processTaskRun($task_run_id)", that will:
    1. load task run by given ID from the database (throw an exception on failure)
    2. set following fields and save changes to db immediately:
      1. "ProcessId" to current process id
      2. "Status" to "processing"
    3. create instance of class from "TaskClass" field of associated task queue record
    4. call the "processTaskQueueRun" method (wrapped in a try/catch block) on that object, passing the task run object loaded above as an argument
    5. the above method can update fields of the given object at will and save them to db (e.g. "PercentsCompleted" and "Results")
    6. when exception was caught, then:
      1. set "Status" to "error"
      2. set "ErrorCode" to exception code
      3. set "ErrorMessage" to exception message
    7. when no exception was caught, then:
      1. set "Status" to "success"
      2. set "ErrorCode" and "ErrorMessage" to empty value
    8. set "FinishedOn" to time, when task was finished (with error or not)
    9. save changes to db
  2. add "TaskQueueEventHandler::OnProcessTaskRun" event, that will:
    1. when executed from web context throw an exception
    2. acquire a WRITE lock on the "TaskQueueRuns" database table (prevents 2 events executed at the same time from using the same task run)
    3. pick 1st available run (FIFO logic)
    4. release above acquired lock
    5. if none found, then exit
    6. call "TaskQueueHelper::processTaskRun" method with found task run id
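
The status handling inside "processTaskRun" can be sketched as follows. This is a simplified illustration, not the actual helper: the task run is a plain array instead of a database record, the hypothetical "$task" callable stands in for the "processTaskQueueRun" call, and all saving to db is omitted.

```php
<?php
// Runs one task and records the outcome in the run data.
// $task receives the run array and must return the (possibly updated) array.
function processTaskRunSketch(array $run, callable $task)
{
    $run['ProcessId'] = getmypid();
    $run['Status'] = 'processing';
    // the real method would save these 2 fields to db at this point

    try {
        $run = $task($run);
        $run['Status'] = 'success';
        $run['ErrorCode'] = '';
        $run['ErrorMessage'] = '';
    } catch (Exception $e) {
        $run['Status'] = 'error';
        $run['ErrorCode'] = $e->getCode();
        $run['ErrorMessage'] = $e->getMessage();
    }

    // set regardless of the outcome
    $run['FinishedOn'] = time();

    return $run;
}

$ok = processTaskRunSketch(['Id' => 1], function (array $run) {
    $run['PercentsCompleted'] = 100;

    return $run;
});

$failed = processTaskRunSketch(['Id' => 2], function (array $run) {
    throw new RuntimeException('SMTP server is down', 42);
});

echo $ok['Status'], ' / ', $failed['Status'], ' (', $failed['ErrorMessage'], ')', "\n";
```

Only errors raised as exceptions end up as "error" here; a process that dies mid-run leaves the status at "processing", which is the case "synchronizeTaskRunStatus" is meant to turn into "timeout".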

Part 4 (rotation)

  1. create "TaskQueueRotationInterval" setting (same configuration as for e-mail logs)
  2. create "TaskQueueEventHandler::OnRotate" event (scheduled task), that would delete old (same concept as for e-mail logs) successful task queue records along with their runs
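
The selection of records for rotation might be sketched like this. It is an assumption-laden illustration: the function name is hypothetical, the rotation interval is taken to be a number of seconds, and the age of a task queue record is measured by the latest "FinishedOn" among its runs.

```php
<?php
// Returns IDs of successful task queue records whose latest run finished
// more than $rotation_interval seconds before $now.
function findRotatableTaskIds(array $tasks, $rotation_interval, $now)
{
    $ids = [];

    foreach ($tasks as $task) {
        if ($task['LastStatus'] !== 'success' || !$task['Runs']) {
            continue; // failed or never executed tasks are kept
        }

        $last_finished_on = max(array_column($task['Runs'], 'FinishedOn'));

        if ($last_finished_on < $now - $rotation_interval) {
            $ids[] = $task['Id'];
        }
    }

    return $ids;
}

$tasks = [
    ['Id' => 1, 'LastStatus' => 'success', 'Runs' => [['FinishedOn' => 1000]]],
    ['Id' => 2, 'LastStatus' => 'failed', 'Runs' => [['FinishedOn' => 1000]]],
    ['Id' => 3, 'LastStatus' => 'success', 'Runs' => [['FinishedOn' => 9000]]],
];

print_r(findRotatableTaskIds($tasks, 3600, 10000)); // only the record with Id 1 qualifies
```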

Part 5 (usage)

  1. configure one of the following, but not as a scheduled task, because it would block all other scheduled tasks:

    • command: /usr/bin/env php /path/to/in-portal/tools/run_event.php task-queue:OnProcessTaskRun password_here

    • set up "upstart" or "supervisord" or any other tool to ensure presence of X processes powered by above command

    • add X records to "crontab" file powered by above command

  2. there won't be any built-in UI for this functionality, because it's too general to be usable by user, but specialized sections (e.g. "E-mail Queue") can read data from these tables to keep user informed

  3. a task can be created by calling the "TaskQueueHelper::queueTask" method by whoever needs it, e.g.:
    • user presses a button
    • scheduled task decides to offload some work
    • etc.
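
Queuing a task could then look like the sketch below. "TaskQueueHelperSketch" is an in-memory stand-in that only mirrors the "queueTask" signature and rules from Part 2 (records are kept in an array instead of the database), and "ResizeImagesTask" is a hypothetical placeholder class that in real code would implement "TaskQueueRunner".

```php
<?php
// Hypothetical task class; a real one would implement "TaskQueueRunner".
class ResizeImagesTask
{
}

// In-memory stand-in for "TaskQueueHelper", for illustration only.
class TaskQueueHelperSketch
{
    public $records = [];

    public function queueTask($task_class, $task_data, $scheduled_on, $max_retries = null)
    {
        if (!class_exists($task_class)) {
            throw new InvalidArgumentException('Task class "' . $task_class . '" not found');
        }

        $record = [
            'QueuedOn' => time(),
            'ScheduledOn' => $scheduled_on,
            'TaskClass' => $task_class,
            'TaskData' => json_encode($task_data),
            'LastStatus' => 'scheduled',
            'MaxRetries' => $max_retries === null ? 5 : $max_retries,
            'FailedRetries' => 0,
        ];

        $this->records[] = $record;

        return $record;
    }
}

$helper = new TaskQueueHelperSketch();

// e.g. called from the event that handles a button press
$record = $helper->queueTask('ResizeImagesTask', ['image_ids' => [10, 11]], time());

try {
    $helper->queueTask('MissingTask', [], time());
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Rejecting an unknown class at queuing time surfaces the mistake to the caller immediately, instead of producing a failed run later.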

Related Discussions

...