C
- type of context required by this workerpublic abstract class Worker<C extends WorkerContext<?>> extends Object implements Serializable
Mapper
and Reducer
.
Created during a MapReduceJob
to process a given shard. Processes
data for zero or more shards, each subdivided into zero or more slices, where
sharding and slicing is up to the caller.
Each shard starts with a call to beginShard()
, then processes zero
or more slices, then finishes with a call to endShard()
. Between two
shards, before the first shard, or after the final shard, the worker may go
through serialization and deserialization.
If a shard is aborted, there is no guarantee whether endShard()
will be called; however, if it is not called, the worker will not be
serialized or used further in any way. If the shard is retried later, the
worker serialized after the previous shard will be deserialized again.
A worker may be used for processing multiple shards in parallel; in that case, it is deserialized multiple times to produce multiple copies. Each copy will process only one shard at a time.
Each slice starts with a call to beginSlice()
, then performs the
actual work with zero or more calls to BaseMapper.map(I)
or
Reducer.reduce(K, com.google.appengine.tools.mapreduce.ReducerInput<V>)
, then endSlice()
. Between two slices, before
the first slice, or after the final slice, the worker may go through
serialization and deserialization.
If a slice is aborted, there is no guarantee whether endSlice()
will be called; however, if it is not called, the worker will not be
serialized or used further in any way. If the slice is retried later, the
Worker
serialized after the previous slice will be deserialized
again.
Slices of the same shard are processed by the same worker sequentially.
This class is really an interface that might be evolving. In order to avoid breaking users when we change the interface, we made it an abstract class.
Constructor and Description |
---|
Worker() |
Modifier and Type | Method and Description |
---|---|
boolean |
allowSliceRetry()
Indicate if a slice retry is allowed.
|
void |
beginShard()
Prepares the worker for processing a new shard, after possibly having gone
through serialization and deserialization.
|
void |
beginSlice()
Prepares the worker for processing a new slice, after possibly having gone
through serialization and deserialization.
|
void |
endShard()
Notifies the worker that there is no more data to process in the current
shard, and prepares it for possible serialization.
|
void |
endSlice()
Notifies the worker that there is no more data to process in the current
slice, and prepares it for possible serialization.
|
long |
estimateMemoryRequirement()
Returns an estimate of the amount of memory needed for this worker in bytes.
|
C |
getContext()
Returns the current context, or null if none.
|
void |
setContext(C context)
Used internally to sets the context to be used for the processing that follows.
|
public void setContext(C context)
public final C getContext()
public void beginShard()
public void endShard()
public void beginSlice()
public void endSlice()
public long estimateMemoryRequirement()
public boolean allowSliceRetry()
Copyright © 2015 Google. All rights reserved.