Tasks in general can be classified along 2 dimensions:

  • by visibility:
    • interactive - a progress bar is displayed showing what percentage of the task is done
    • non-interactive - nothing is shown to the user
  • by trigger:
    • user triggered - the user presses a button (or similar control) to trigger a task
    • system triggered - tasks are created on their own

Here is the matrix of what's currently supported by In-Portal:

  • User Triggered
    • Interactive:
      • all import scripts (e.g. product import in catalog)
      • rebuilding of category permissions cache
      • template parser template recompilation
    • Non-Interactive:
      • e-mail queue
  • System Triggered
    • Interactive:
      • not supported
    • Non-Interactive:
      • all scheduled tasks

Thoughts:

  • interactive tasks block the UI, so the user can't do anything else in the Admin Console while they are running
  • in most (but not all) cases users don't care about real-time status updates of interactive tasks, they just want to know when the tasks are completed
  • scheduled tasks (system triggered non-interactive tasks):
    • provide limited insight into task execution status
    • only retain last task execution status
  • e-mail queue:
    • successfully executed tasks are removed from queue (the "E-mail Logs" section sort of compensates for that currently)
    • no error information is stored within task

Solution

Part 1 (db tables & units)

  1. create "ITaskQueueRunner" interface with "processTaskQueueRun(kDBItem $task_queue_run)" method
  2. create "TaskQueue" database table with following columns:
    • Id
    • QueuedOn - when task was queued
    • QueuedById - who queued the task
    • ScheduledOn - when task needs to be processed (set at queuing time)
    • TaskClass - FQCN of PHP class, which is responsible for processing this task queue record (must implement "ITaskQueueRunner" interface)
    • TaskData - JSON-encoded data, that is needed for task execution (e.g. e-mail recipient, IDs of records to be processed)
    • LastStatus - status from the last attempt to process this queue record - {scheduled (default), processing, success, error, timeout}; when the queue is processed in parallel, the status of the run that ended last
    • MaxRetries - if the task fails the specified number of times (5 by default), then it isn't retried anymore
    • FailedRetries - failed retry count (number is reset, when task execution was successful)
  3. create "TaskQueueRuns" database table with following columns:
    • Id
    • TaskQueueId - associated task queue record
    • StartedOn - when task execution started
    • PercentsCompleted - "0" by default, but will be updated as task is being processed
    • FinishedOn - when task execution finished (regardless of status)
    • Results - JSON-encoded results in any format, that can later be displayed in human-readable form
    • Status - same statuses as for "TaskQueue.LastStatus" column
      • "scheduled" - initially, when record is created;
      • "processing" - when somebody is processing the record;
      • "success" - when execution finished without errors;
      • "error" - when known error happened during processing;
      • "timeout" - when status wasn't updated within 1 day (configurable per-queue record or system-wide, but can't be empty)
    • ErrorCode - non-empty when known error happened
    • ErrorMessage - non-empty when known error happened
    • ProcessId - the process ID of process that runs this task run (can be a runner process id, or just a regular website visit process); NULL, when not being processed by anybody right now
  4. create "TaskQueueRunners" database table with following columns:
    • Id
    • ProcessId - the PID of associated task runner process
    • StartedOn - when process was started
    • FinishedOn - when process was finished; NULL initially
  5. create units ("task-queue", "task-queue-run" and "task-queue-runner") that correspond to the database tables described above
  6. in "TaskQueueRunEventHandler::OnAfterItemUpdate" aggregate totals from all runs of the associated task queue record and update that record with them
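
One possible reading of step 6 is that "LastStatus" mirrors the most recent run and "FailedRetries" counts the failures since the last success. Below is a minimal sketch of that aggregation. It assumes plain arrays in place of "TaskQueueRuns" rows, and the function name "aggregateTaskQueueRuns" is hypothetical (not part of In-Portal):

```php
<?php
/**
 * Derives "LastStatus" and "FailedRetries" of a task queue record from its runs.
 *
 * Hypothetical helper: each run is a plain array standing in for a
 * "TaskQueueRuns" row with at least "Id" and "Status" keys.
 */
function aggregateTaskQueueRuns(array $runs)
{
    if (!$runs) {
        // No runs yet: the record keeps its default values.
        return array('LastStatus' => 'scheduled', 'FailedRetries' => 0);
    }

    // Oldest run first. Only 1 run per record can be active at a time,
    // so creation order is assumed to match the order in which runs ended.
    usort($runs, function ($a, $b) {
        return $a['Id'] - $b['Id'];
    });

    $failed_retries = 0;

    foreach ($runs as $run) {
        if ($run['Status'] === 'success') {
            // A successful run resets the counter.
            $failed_retries = 0;
        } elseif ($run['Status'] === 'error' || $run['Status'] === 'timeout') {
            // Both "error" and "timeout" count as failed.
            $failed_retries++;
        }
    }

    $last_run = end($runs);

    return array('LastStatus' => $last_run['Status'], 'FailedRetries' => $failed_retries);
}

// Runs may arrive in any order; they are sorted by "Id" inside the helper.
$totals = aggregateTaskQueueRuns(array(
    array('Id' => 2, 'Status' => 'success'),
    array('Id' => 1, 'Status' => 'error'),
    array('Id' => 3, 'Status' => 'timeout'),
    array('Id' => 4, 'Status' => 'error'),
));
```

Here the two failures after the successful run give a "FailedRetries" of 2, and "LastStatus" is taken from run 4.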

Part 2 (adding tasks & runs)

  1. create "TaskQueueHelper" class
  2. add public "TaskQueueHelper::queueTask($task_class, $task_data, $scheduled_on, $max_retries = null)" method, that will:
    1. create task queue record with given settings
    2. consider "$max_retries" as "5" when not given
    3. throw an exception, when specified "$task_class" doesn't exist
  3. add protected "TaskQueueHelper::synchronizeTaskRunStatus" method that will:
    1. get all task runs, that are running currently
    2. get status of their PIDs
    3. for all task runs whose PIDs are dead set their status to "timeout"
  4. add protected "TaskQueueHelper::createTaskRun(kDBItem $task_queue)" method, that:
    1. will create new task run (and return its ID) for given queue record, when none of the following rules is violated:
      1. only 1 active (status = processing) task run can exist at same time (for a given queue record)
      2. sequential failed task run count (both "error" and "timeout" statuses are considered as failed) can't be more than max allowed retry count
    2. return "null" otherwise
  5. add protected "TaskQueueHelper::createMissingTaskRuns()" method, that will:
    1. get all records from "TaskQueue" table, for which task runs can be created:
      1. ScheduledOn < NOW()
      2. LastStatus is not "processing"
      3. FailedRetries < MaxRetries
    2. call the "TaskQueueHelper::createTaskRun" on each of them (method can return "NULL" in some cases, but that's ok)
  6. add public "TaskQueueHelper::createTaskRuns()" method, that will:
    1. call "TaskQueueHelper::synchronizeTaskRunStatus" method
    2. call "TaskQueueHelper::createMissingTaskRuns" method
  7. add "TaskQueueEventHandler::OnCreateTaskRuns" event, that would be called as Scheduled Task on a regular basis (e.g. every 5 minutes)
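
The rules that decide whether a new task run may be created can be sketched as a single check. This is an illustration only: the function name "canCreateTaskRun" is hypothetical, plain arrays stand in for "TaskQueue" and "TaskQueueRuns" rows, and "ScheduledOn" is assumed to be a Unix timestamp:

```php
<?php
/**
 * Tells whether a new run may be created for a task queue record.
 *
 * Hypothetical helper: $task_queue stands in for a "TaskQueue" row,
 * $runs for its "TaskQueueRuns" rows (oldest first), $now is a Unix timestamp.
 */
function canCreateTaskRun(array $task_queue, array $runs, $now)
{
    // The task isn't due yet.
    if ($task_queue['ScheduledOn'] >= $now) {
        return false;
    }

    $sequential_failures = 0;

    foreach ($runs as $run) {
        if ($run['Status'] === 'processing') {
            // Only 1 active run can exist per queue record.
            return false;
        }

        if ($run['Status'] === 'error' || $run['Status'] === 'timeout') {
            $sequential_failures++;
        } elseif ($run['Status'] === 'success') {
            $sequential_failures = 0;
        }
    }

    // Stop retrying once the allowed number of retries is used up.
    return $sequential_failures < $task_queue['MaxRetries'];
}

$task_queue = array('ScheduledOn' => 1000, 'MaxRetries' => 2);
$now = 2000;

$first_attempt = canCreateTaskRun($task_queue, array(), $now);
$retry = canCreateTaskRun($task_queue, array(array('Status' => 'error')), $now);
$exhausted = canCreateTaskRun(
    $task_queue,
    array(array('Status' => 'error'), array('Status' => 'timeout')),
    $now
);
$busy = canCreateTaskRun($task_queue, array(array('Status' => 'processing')), $now);
```

The rules as listed don't say how to treat records whose last run succeeded or that already have a "scheduled" run waiting, so the sketch leaves both cases open.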

Part 3 (running runs)

  1. create new "TaskQueueRunnerLimit" system setting set to "8" by default
  2. add public "TaskQueueHelper::processTaskRun($task_run_id)", that will:
    1. load task run by given ID from the database (if failed throw an exception)
    2. set following fields and save changes to db immediately:
      1. "ProcessId" to current process id
      2. "Status" to "processing"
    3. create instance of class from "TaskClass" field of associated task queue record
    4. call the "processTaskQueueRun" (wrapped within try/catch block) on that object providing task run object (was loaded above) as an argument
    5. the above method can update given object fields at will and save to db (e.g. "PercentsCompleted" and "Results")
    6. when exception was caught, then:
      1. set "Status" to "error"
      2. set "ErrorCode" to exception code
      3. set "ErrorMessage" to exception message
    7. when no exception was caught, then:
      1. set "Status" to "success"
      2. set "ErrorCode" and "ErrorMessage" to empty value
    8. set "FinishedOn" to time, when task was finished (with error or not)
    9. save changes to db
  3. add protected "TaskQueueHelper::getTaskQueueRunnerCount()" method, that will:
    1. get all task queue runners, that are running currently ("FinishedOn IS NULL")
    2. get status of their PIDs
    3. for all task queue runners whose PIDs are dead set their "FinishedOn" field to NOW()
    4. return number of running task runners (don't include ones updated above)
  4. add public "TaskQueueHelper::registerAsTaskQueueRunner()" method, that will:
    1. get value of "TaskQueueRunnerLimit" system setting
    2. call "TaskQueueHelper::getTaskQueueRunnerCount" method
    3. if currently running task queue runner count is greater than or equal to allowed count, then return "false"
    4. add record to "TaskQueueRunners" table with current PID
    5. return "true"
  5. add "TaskQueueEventHandler::OnProcessTaskRun" event, that will:
    1. call the "TaskQueueHelper::registerAsTaskQueueRunner" method
    2. if it returns "false", then do nothing and exit
    3. acquire a WRITE lock on the "TaskQueueRuns" database table (prevents 2 events, executed at the same time, from using the same task run)
    4. pick 1st available run (FIFO logic)
    5. release above acquired lock
    6. if none found, then exit
    7. call "TaskQueueHelper::processTaskRun" method with found task run id

Part 4 (rotation)

  1. create "TaskQueueRotationInterval" setting (same configuration as for e-mail logs)
  2. create "TaskQueueEventHandler::OnRotate" event (scheduled task), that would delete old (same concept as for e-mail logs) successful task queue records along with their runs

Part 5 (usage)

  1. run the command below using either of the following approaches (but not as a scheduled task, because it would block all other scheduled tasks):

    • command: /usr/bin/env php /path/to/in-portal/tools/run_event.php task-queue:OnProcessTaskRun password_here

    • set up "upstart", "supervisord" or any other tool to ensure that X processes running the above command are always present

    • add X records running the above command to the "crontab" file

  2. there won't be any built-in UI for this functionality, because it's too general to be usable by users, but specialized sections (e.g. "E-mail Queue") can read data from these tables to keep the user informed

  3. tasks can be created by calling the "TaskQueueHelper::queueTask" method from wherever they are needed, e.g.:
    • user presses a button
    • scheduled task decides to offload some work
    • etc.
