Grid task interface defines a task that can be executed on the grid. A grid task
is responsible for splitting business logic into multiple grid jobs, receiving
results from individual grid jobs executing on remote nodes, and reducing
(aggregating) the received job results into the final grid task result.
Grid Task Execution Sequence
-
Upon a request to execute a grid task with a given task name, the system will find
the deployed task with that name. The task needs to be deployed prior to execution
(see the
org.apache.ignite.IgniteCompute#localDeployTask(Class,ClassLoader) method); however, if the task does not specify
its name explicitly via the
ComputeTaskName annotation, it
will be auto-deployed the first time it gets executed.
-
The system will create a new distributed task session (see
ComputeTaskSession).
-
The system will inject all annotated resources (including the task session) into the grid task instance.
See the
org.apache.ignite.resources package for the list of injectable resources.
-
The system will call
#map(List,Object). This
method is responsible for splitting the business logic of the grid task into
multiple grid jobs (units of execution) and mapping them to
grid nodes. The
#map(List,Object) method returns
a map with grid jobs as keys and grid nodes as values.
-
The system will send the mapped grid jobs to their respective nodes.
-
Upon arrival on the remote node, a grid job will be handled by the collision SPI
(see
org.apache.ignite.spi.collision.CollisionSpi), which will determine how the job will be executed
on the remote node (immediately, buffered, or canceled).
-
Once job execution results become available, the
#result(ComputeJobResult,List) method will be called for each received job result. The policy returned by this method
determines how the task reacts to every job result:
-
If the
ComputeJobResultPolicy#WAIT policy is returned, the task will continue to wait
for other job results. If this result is the last job result, the
#reduce(List) method will be called.
-
If the
ComputeJobResultPolicy#REDUCE policy is returned, the
#reduce(List) method will be called right away without waiting for
the other jobs to complete (all remaining jobs will receive a cancel request).
-
If the
ComputeJobResultPolicy#FAILOVER policy is returned, the job will
be failed over to another node for execution. The node to which the job gets
failed over is decided by the
org.apache.ignite.spi.failover.FailoverSpi implementation.
Note that if you use the
ComputeTaskAdapter adapter for your
ComputeTask implementation, it will automatically fail jobs over to another node for two
known failure cases:
-
The job failed due to a node crash. In this case the
ComputeJobResult#getException() method will return an instance of the
org.apache.ignite.cluster.ClusterTopologyException exception.
-
Job execution was rejected, i.e. the remote node cancelled the job before it got
a chance to execute, while it was still on the waiting list. In this case the
ComputeJobResult#getException() method will return an instance of the
ComputeExecutionRejectedException exception.
-
Once all results are received, or the
#result(ComputeJobResult,List) method has returned the
ComputeJobResultPolicy#REDUCE policy, the
#reduce(List) method is called to aggregate the received results into one final result. Once this method finishes, the
execution of the grid task is complete. The result will be returned to the user through the
ComputeTaskFuture#get() method.
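The policy decisions in the sequence above can be sketched by overriding result(...). This is a minimal, hypothetical task (the WordCountTask name and the trivial inline jobs are illustrative, not part of the Ignite API); it fails over on the two recoverable errors and otherwise waits for all results before reducing:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.ignite.cluster.ClusterTopologyException;
import org.apache.ignite.compute.ComputeExecutionRejectedException;
import org.apache.ignite.compute.ComputeJob;
import org.apache.ignite.compute.ComputeJobAdapter;
import org.apache.ignite.compute.ComputeJobResult;
import org.apache.ignite.compute.ComputeJobResultPolicy;
import org.apache.ignite.compute.ComputeTaskSplitAdapter;

public class WordCountTask extends ComputeTaskSplitAdapter<String, Integer> {
    @Override
    protected Collection<? extends ComputeJob> split(int gridSize, final String arg) {
        Collection<ComputeJob> jobs = new ArrayList<>(gridSize);

        // Create one trivial job per node; each returns the argument length.
        for (int i = 0; i < gridSize; i++)
            jobs.add(new ComputeJobAdapter() {
                @Override public Object execute() { return arg.length(); }
            });

        return jobs;
    }

    @Override
    public ComputeJobResultPolicy result(ComputeJobResult res, List<ComputeJobResult> rcvd) {
        Exception err = res.getException();

        // Fail the job over if its node left the cluster or the job was
        // rejected before it got a chance to execute.
        if (err instanceof ClusterTopologyException || err instanceof ComputeExecutionRejectedException)
            return ComputeJobResultPolicy.FAILOVER;

        // Keep waiting; when the last result arrives, reduce(...) is
        // invoked automatically.
        return ComputeJobResultPolicy.WAIT;
    }

    @Override
    public Integer reduce(List<ComputeJobResult> results) {
        int sum = 0;

        for (ComputeJobResult res : results)
            sum += res.<Integer>getData();

        return sum;
    }
}
```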
Continuous Job Mapper
For cases when the jobs within a split are too large to fit in memory at once, or when
not all jobs in the task are known during the
#map(List,Object) step,
use
ComputeTaskContinuousMapper to continuously stream jobs from the task even after the
map(...) step is complete. With a continuous mapper the number of jobs within a task
may grow very large; in this case it may make sense to use it in combination with the
ComputeTaskNoResultCache annotation.
Task Result Caching
Sometimes job results are too large, or a task simply has too many jobs to keep track
of, which may hinder performance. In such cases it may make sense to disable task
result caching by attaching the
ComputeTaskNoResultCache annotation to the task class, and
processing all results as they come in the
#result(ComputeJobResult,List) method.
When Ignite sees this annotation it will disable tracking of job results, and the
list of job results passed into the
#result(ComputeJobResult,List) and
#reduce(List) methods will always be empty. Note that the list of
job siblings on
ComputeTaskSession will also be empty to prevent the number
of job siblings from growing as well.
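Disabling result caching can be sketched like this; CountingTask is hypothetical, and each job result is folded into a running total inside result(...) because the rcvd and results lists stay empty:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.ignite.compute.ComputeJob;
import org.apache.ignite.compute.ComputeJobAdapter;
import org.apache.ignite.compute.ComputeJobResult;
import org.apache.ignite.compute.ComputeJobResultPolicy;
import org.apache.ignite.compute.ComputeTaskNoResultCache;
import org.apache.ignite.compute.ComputeTaskSplitAdapter;

@ComputeTaskNoResultCache
public class CountingTask extends ComputeTaskSplitAdapter<String, Long> {
    // Running total, since Ignite does not cache job results for us.
    private final AtomicLong total = new AtomicLong();

    @Override
    protected Collection<? extends ComputeJob> split(int gridSize, final String arg) {
        Collection<ComputeJob> jobs = new ArrayList<>(gridSize);

        for (int i = 0; i < gridSize; i++)
            jobs.add(new ComputeJobAdapter() {
                @Override public Object execute() { return arg.length(); }
            });

        return jobs;
    }

    @Override
    public ComputeJobResultPolicy result(ComputeJobResult res, List<ComputeJobResult> rcvd) {
        // 'rcvd' is always empty with @ComputeTaskNoResultCache, so
        // process each result as it arrives.
        total.addAndGet(res.<Integer>getData());

        return ComputeJobResultPolicy.WAIT;
    }

    @Override
    public Long reduce(List<ComputeJobResult> results) {
        // 'results' is empty as well; return the running total.
        return total.get();
    }
}
```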
Resource Injection
A grid task implementation can be injected with Ignite resources using IoC
(dependency injection). Both field-based and method-based injection are supported.
The following Ignite resources can be injected:
-
org.apache.ignite.resources.TaskSessionResource
-
org.apache.ignite.resources.IgniteInstanceResource
-
org.apache.ignite.resources.LoggerResource
-
org.apache.ignite.resources.SpringApplicationContextResource
-
org.apache.ignite.resources.SpringResource
Refer to the corresponding resource documentation for more information.
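Field-based injection of these resources can be sketched as follows; InjectedTask and its job bodies are placeholders, and the annotated fields are populated by Ignite before map(...)/split(...) is called:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteLogger;
import org.apache.ignite.compute.ComputeJob;
import org.apache.ignite.compute.ComputeJobAdapter;
import org.apache.ignite.compute.ComputeJobResult;
import org.apache.ignite.compute.ComputeTaskSession;
import org.apache.ignite.compute.ComputeTaskSplitAdapter;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.resources.LoggerResource;
import org.apache.ignite.resources.TaskSessionResource;

public class InjectedTask extends ComputeTaskSplitAdapter<String, Integer> {
    // Injected by Ignite before the task runs.
    @IgniteInstanceResource
    private Ignite ignite;

    @LoggerResource
    private IgniteLogger log;

    @TaskSessionResource
    private ComputeTaskSession ses;

    @Override
    protected Collection<? extends ComputeJob> split(int gridSize, final String arg) {
        log.info("Splitting task within session: " + ses.getId());

        Collection<ComputeJob> jobs = new ArrayList<>(gridSize);

        for (int i = 0; i < gridSize; i++)
            jobs.add(new ComputeJobAdapter() {
                @Override public Object execute() { return arg.length(); }
            });

        return jobs;
    }

    @Override
    public Integer reduce(List<ComputeJobResult> results) {
        int sum = 0;

        for (ComputeJobResult res : results)
            sum += res.<Integer>getData();

        return sum;
    }
}
```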
Grid Task Adapters
ComputeTask comes with several convenience adapters to make the usage easier:
-
ComputeTaskAdapter provides a default implementation of the
ComputeTask#result(ComputeJobResult,List) method, which provides automatic fail-over to another node if a remote job has failed
due to a node crash (detected by the
org.apache.ignite.cluster.ClusterTopologyException exception) or due to job
execution rejection (detected by the
ComputeExecutionRejectedException exception).
Here is an example of how you would implement your task using
ComputeTaskAdapter:
public class MyFooBarTask extends ComputeTaskAdapter<String, String> {
    // Inject load balancer.
    @LoadBalancerResource
    ComputeLoadBalancer balancer;

    // Map jobs to grid nodes.
    public Map<? extends ComputeJob, ClusterNode> map(List<ClusterNode> subgrid, String arg) throws IgniteCheckedException {
        Map<MyFooBarJob, ClusterNode> jobs = new HashMap<MyFooBarJob, ClusterNode>(subgrid.size());

        // In more complex cases, you can actually do
        // more complicated assignments of jobs to nodes.
        for (int i = 0; i < subgrid.size(); i++) {
            MyFooBarJob job = new MyFooBarJob(arg);

            // Pick the next best balanced node for the job.
            jobs.put(job, balancer.getBalancedNode(job, null));
        }

        return jobs;
    }

    // Aggregate results into one compound result.
    public String reduce(List<ComputeJobResult> results) throws IgniteCheckedException {
        // For the purpose of this example we simply
        // concatenate string representations of all
        // job results.
        StringBuilder buf = new StringBuilder();

        for (ComputeJobResult res : results) {
            // Append string representation of the result
            // returned by every job.
            buf.append(res.getData().toString());
        }

        return buf.toString();
    }
}
-
ComputeTaskSplitAdapter hides the job-to-node mapping logic from the
user and provides a convenient
ComputeTaskSplitAdapter#split(int,Object) method for splitting a task into sub-jobs in homogeneous environments.
Here is an example of how you would implement your task using
ComputeTaskSplitAdapter:
public class MyFooBarTask extends ComputeTaskSplitAdapter<Object, String> {
    @Override
    protected Collection<? extends ComputeJob> split(int gridSize, Object arg) throws IgniteCheckedException {
        List<MyFooBarJob> jobs = new ArrayList<MyFooBarJob>(gridSize);

        for (int i = 0; i < gridSize; i++) {
            jobs.add(new MyFooBarJob(arg));
        }

        // Node assignment via load balancer
        // happens automatically.
        return jobs;
    }

    // Aggregate results into one compound result.
    public String reduce(List<ComputeJobResult> results) throws IgniteCheckedException {
        // For the purpose of this example we simply
        // concatenate string representations of all
        // job results.
        StringBuilder buf = new StringBuilder();

        for (ComputeJobResult res : results) {
            // Append string representation of the result
            // returned by every job.
            buf.append(res.getData().toString());
        }

        return buf.toString();
    }
}