Task Abstraction

This page explains the definition of tasks as a unit of work and how the system executes them.

`Task`

A Task is a reusable and composable unit of work that is executed within the data plane. Applications define and invoke one or more Tasks to achieve their goals. A Task logically describes what should be done.

A task can be either a Unit Task, or a Composite Task. The former is a single atomic unit of work, while the latter is a composition of multiple Unit Tasks.

Examples

A Task can be:

a single inference of a neural network on GPUs (e.g., Text encoder, Vision encoder, Audio encoder, LLM, DiT, VAE decoder, Vocoder)
a composition of the above inference units (e.g., a Vision-Language model, a Thinker-Talker architecture model)

The former is called a Unit Task, and concrete unit tasks inherit from cornserve.task.base.UnitTask. On the other hand, the latter is called a Composite Task, and they simply inherit from cornserve.task.base.Task and declare their subtasks and their execution logic in the invoke method.

Concretely, MLLMTask is a composition of EncoderTask and LLMTask.

Properties

Recursive Composition

Tasks are recursively composed; a Task can be a single inference of a neural network on GPUs or a DAG of other Tasks that make up a larger chunk of coherent work.

Data Forwarding

The inputs and outputs to a Task are defined by the Task itself, and both are stored in the App Driver. However, intermediate data (particularly tensors) are forwarded to next Tasks within the data plane via the Sidecar. That is, when a Task is composed of multiple sub-Tasks, the output of one sub-Task is forwarded to the next sub-Task in the DAG.

Static Graph Given Task Input

The concrete execution DAG of a Task must be statically determined at the time of invocation by a request. That is, the DAG must be completely determined by the App Driver given the request to the app. In other words, there can not be any dynamic control flow that depends on unmaterialized intermediate data. For instance, the input to a Task may hold a field image_url: str | None, and whether the execution DAG includes the image encoder can be determined by inspecting whether the image_url field is None or not.

Specification

The core specification of a Task is by its execution DAG. Each node is a Task instance that has:

Task execution descriptor: Descriptor instance that describes how the Task is executed.
The invoke method: The method that executes the Task. Input and output are Pydantic models. The Python code in invoke puts together the Tasks invocation to implicitly define the execution DAG.

`TaskExecutionDescriptor`

A TaskExecutionDescriptor strategy class that describes how a Task is executed. Each concrete Task subclass is associated with one TaskExecutionDescriptor subclasses and takes an instance of the descriptor as an argument to its constructor.

Examples

The LLMTask is compatible with the VLLMDescriptor, which describes how to execute the LLM task using vLLM. Currently, only vLLM is implemented, but other executors like TensorRT-LLM or Dynamo can be implemented in the future. Similarly, the EncoderTask is compatible with the EricDescriptor, which describes how to execute the encoder task using Eric, and the GeneratorTask is compatible with the GeriDescriptor, which describes how to execute the generator task using Geri.

CRD-Based Task Management

Cornserve uses Kubernetes Custom Resource Definitions (CRDs) to enable dynamic task and execution descriptor management, moving away from statically built-in tasks to a more flexible runtime system.

Motivation

Rather than having tasks and execution descriptors statically compiled into the system, the CRD-based approach allows:

Dynamic registration: Add new task types and execution descriptors at runtime without redeploying services
Runtime flexibility: Deploy, update, and remove tasks during system operation
Fault tolerance: Services can recover task definitions and instances after restarts by reading from CRDs
Single source of truth: CRDs serve as the authoritative state for task definitions, descriptors, and instances

CRD Types

The system defines Custom Resources for:

Unit Task Classes: Definitions of atomic task types (e.g., LLMTask, EncoderTask)
Composite Task Classes: Definitions of task compositions (e.g., MLLMTask)
Task Execution Descriptors: Execution strategies for unit tasks (e.g., VLLMDescriptor, EricDescriptor)
Unit Task Instances: Specific instantiations of tasks with concrete parameters

How It Works

Control plane services (Gateway, Resource Manager, Task Manager, Task Dispatcher) use a TaskRegistry that:

Watches CRDs: Monitors Kubernetes for task and descriptor Custom Resources using the "list then watch" pattern
Loads into runtime: Dynamically loads task classes and descriptors into Python's module system when CRs are created or updated
Maintains registries: Keeps in-memory mappings from task/descriptor names to their runtime class definitions
References by name: Services pass task instance references (CR names) rather than serializing full task objects

This architecture enables eventual consistency across services while leveraging Kubernetes' strong consistency guarantees for the underlying CR storage.

Note

The CRD-based management system is under active development and subject to change, particularly around versioning strategies and tear-down semantics.

Task Lifecycle

Registration

Unit Tasks classes (e.g., LLMTask) are registered with the whole system. Their source code (concrete class definition) should be available to all services in the system. At the moment, we create multiple built-in Unit Task classes under cornserve_tasklib.task.unit and compose them under cornserve_tasklib.task.composite.

Deployment

A Unit Task class that is registered in the system can be deployed on the data plane as a Unit Task instance (e.g., LLMTask(model_id="llama")).

The Unit Task object is instantiated externally, and then serialized into JSON via Pydantic.
The name of the Unit Task (as registered in the system) and the serialized JSON are sent to the Gateway service.
The Gateway service sends the unit task instance to the Resource Manager, which ensures that the Task Manager for the Unit Task is running on the data plane.
When the Task Manager is running, the Resource Manager notifies the Task Dispatcher with the unit task instance and Task Manager deployment information.

Deployed Unit Tasks become invocable, either as part of Composite Tasks or directly. Invocation can be driven by a static App driver registered in the Gateway service, a human user via our Jupyter Notebook interface.

Invocation

Task invocations go to the Task Dispatcher by calling and awaiting on the async __call__ method of the Task. This internally calls all invoke methods of Tasks in the DAG, where each unit Task constructs a TaskInvocation object (task, input, and output) to add to a task-specific TaskContext object. The list of TaskInvocation objects are sent to the Taks Dispatcher.

The Task Dispatcher is responsible for actually constructing requests, dispatching them to Task Executors, waiting for the results to come back, and then returning task outputs to the App Driver.

Deregistration

When an App is unregistered and if there are no other active Apps that require the Task, the Resource Manager will kill the Task Manager and free up the resources.