Task Abstraction
This page explains the definition of tasks as a unit of work and how the system executes them.
Task
A Task
is a reusable and composable unit of work that is executed within the data plane.
Applications define and invoke one or more Tasks to achieve their goals.
A Task
logically describes what should be done.
A task can be either a Unit Task, or a Composite Task. The former is a single atomic unit of work, while the latter is a composition of multiple Unit Tasks.
Examples
A Task
can be:
- a single inference of a neural network on GPUs (e.g., Text encoder, Vision encoder, Audio encoder, LLM, DiT, VAE decoder, Vocoder)
- a composition of the above inference units (e.g., a Vision-Language model, a Thinker-Talker architecture model)
Concretely, MLLMTask
is a composition of EncoderTask
and LLMTask
.
Properties
Recursive Composition
Tasks are recursively composed; a Task can be a single inference of a neural network on GPUs or a DAG of other Tasks that make up a larger chunk of coherent work.
Data Forwarding
The inputs and outputs to a Task are defined by the Task itself, and both are stored in the App Driver. However, intermediate data (particularly tensors) are forwarded to next Tasks within the data plane via the Sidecar. That is, when a Task is composed of multiple sub-Tasks, the output of one sub-Task is forwarded to the next sub-Task in the DAG.
Static Graph Given Task Input
The concrete execution DAG of a Task must be statically determined at the time of invocation by a request.
That is, the DAG must be completely determined by the App Driver given the request to the app.
In other words, there can not be any dynamic control flow that depends on unmaterialized intermediate data.
For instance, the input to a Task may hold a field image_url: str | None
, and whether the execution DAG includes the image encoder can be determined by inspecting whether the image_url
field is None
or not.
Specification
The core specification of a Task is by its execution DAG. Each node is a Task instance that has:
- Task execution descriptor: Descriptor instance that describes how the Task is executed.
- The
invoke
method: The method that executes the Task. Input and output are Pydantic models. The Python code ininvoke
puts together the Tasks invocation to implicitly define the execution DAG.
TaskExecutionDescriptor
A TaskExecutionDescriptor
strategy class that describes how a Task is executed.
Each concrete Task
subclass is associated with one TaskExecutionDescriptor
subclasses and takes an instance of the descriptor as an argument to its constructor.
Examples
The LLMTask
is compatible with the VLLMDescriptor
, which describes how to execute the LLM task using vLLM.
Currently, only vLLM is implemented, but other executors like TensorRT-LLM or Dynamo can be implemented in the future.
Similarly, the EncoderTask
is compatible with the EricDescriptor
, which describes how to execute the encoder task using Eric.
Task Lifecycle
Registration
Unit Tasks classes (e.g., LLMTask
) are registered with the whole system.
Their source code (concrete class definition) should be available to all services in the system.
At the moment, we create multiple built-in Unit Task classes under cornserve.task.builtins
.
Deployment
A Unit Task class that is registered in the system can be deployed on the data plane as a Unit Task instance (e.g., LLMTask(model_id="llama")
).
- The Unit Task object is instantiated externally, and then serialized into JSON via Pydantic.
- The name of the Unit Task (as registered in the system) and the serialized JSON are sent to the Gateway service.
- The Gateway service sends the unit task instance to the Resource Manager, which ensures that the Task Manager for the Unit Task is running on the data plane.
- When the Task Manager is running, the Resource Manager notifies the Task Dispatcher with the unit task instance and Task Manager deployment information.
Deployed Unit Tasks become invocable, either as part of Composite Tasks or directly. Invocation can be driven by a static App driver registered in the Gateway service, a human user via our Jupyter Notebook interface.
Invocation
Task invocations go to the Task Dispatcher by calling and awaiting on the async __call__
method of the Task.
This internally calls all invoke
methods of Tasks in the DAG, where each unit Task constructs a TaskInvocation
object (task, input, and output) to add to a task-specific TaskContext
object.
The list of TaskInvocation
objects are sent to the Taks Dispatcher.
The Task Dispatcher is responsible for actually constructing requests, dispatching them to Task Executors, waiting for the results to come back, and then returning task outputs to the App Driver.
Deregistration
When an App is unregistered and if there are no other active Apps that require the Task, the Resource Manager will kill the Task Manager and free up the resources.