public abstract class Kernel
extends java.lang.Object
implements java.lang.Cloneable
To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method.
To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime, Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.
Note that Kernel.run() is not called directly. Instead, the Kernel.execute(int globalSize) method will cause the overridden Kernel.run() method to be invoked once for each value in the range 0 to globalSize - 1.
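The invocation pattern described above can be modelled in plain Java. The sketch below is not Aparapi code: `ExecuteModel`, `MiniKernel`, and their members are hypothetical stand-ins that only mimic the documented behaviour (one `run()` call per global id, performed sequentially here), so it runs without Aparapi on the classpath.

```java
import java.util.Arrays;

public class ExecuteModel {
    /** Hypothetical stand-in for Kernel; NOT the Aparapi class. */
    abstract static class MiniKernel {
        private int globalId;

        /** Index of the invocation currently being run. */
        protected int getGlobalId() {
            return globalId;
        }

        /** Overridden by subclasses; user code never calls this directly. */
        public abstract void run();

        /** Invokes run() once for every id in 0..globalSize-1 (sequentially in this sketch). */
        public MiniKernel execute(int globalSize) {
            for (int id = 0; id < globalSize; id++) {
                globalId = id;
                run();
            }
            return this;
        }
    }

    /** Squares each input value using the stand-in kernel. */
    public static int[] squares(final int[] values) {
        final int[] result = new int[values.length];
        new MiniKernel() {
            @Override
            public void run() {
                int gid = getGlobalId();
                result[gid] = values[gid] * values[gid];
            }
        }.execute(values.length);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(new int[] {1, 2, 3, 4})));
    }
}
```

A real kernel gives no ordering guarantee between invocations, so each `run()` call should depend only on its own global id, as it does here.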
On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel. This decision is made dynamically based on two factors:
- Whether OpenCL is available.
- Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.
Below is an example Kernel that calculates the square of a set of input values.
    class SquareKernel extends Kernel {
        private int values[];
        private int squares[];

        public SquareKernel(int values[]) {
            this.values = values;
            squares = new int[values.length];
        }

        public void run() {
            // one invocation per global id; each writes its own slot
            int gid = getGlobalId();
            squares[gid] = values[gid] * values[gid];
        }

        public int[] getSquares() {
            return (squares);
        }
    }
To execute this kernel, first create a new instance of it and then call execute(Range _range).
    int[] values = new int[1024];
    // fill values array
    Range range = Range.create(values.length); // create a range of 1024 ids (0..1023)
    SquareKernel kernel = new SquareKernel(values);
    kernel.execute(range);
When execute(Range) returns, all the executions of Kernel.run() have completed and the results are available in the squares array.
    int[] squares = kernel.getSquares();
    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }
A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:
    final int[] values = new int[1024];
    // fill the values array
    final int[] squares = new int[values.length];
    final Range range = Range.create(values.length);

    Kernel kernel = new Kernel() {
        public void run() {
            int gid = getGlobalId();
            squares[gid] = values[gid] * values[gid];
        }
    };
    kernel.execute(range);

    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }
Modifier and Type | Class and Description |
---|---|
static interface | Kernel.Constant: We can use this Annotation to 'tag' intended constant buffers. |
class | Kernel.Entry |
static class | Kernel.EXECUTION_MODE: The execution mode ENUM enumerates the possible modes of executing a kernel. |
class | Kernel.KernelState: This class is for internal Kernel state management. |
static interface | Kernel.Local: We can use this Annotation to 'tag' intended local buffers. |
Modifier and Type | Field and Description |
---|---|
static java.lang.String | CONSTANT_SUFFIX: We can use this suffix to 'tag' intended constant buffers. |
static java.lang.String | LOCAL_SUFFIX: We can use this suffix to 'tag' intended local buffers. |
Constructor and Description |
---|
Kernel() |
Modifier and Type | Method and Description |
---|---|
void | addExecutionModes(Kernel.EXECUTION_MODE... platforms): Set possible fallback path for execution modes. |
Kernel | clone(): When using a Java Thread Pool, Aparapi uses clone to copy the initial instance to each thread. |
void | dispose(): Release any resources associated with this Kernel. |
Kernel | execute(int _range): Start execution of _range kernels. |
Kernel | execute(int _range, int _passes): Start execution of _passes iterations over the _range of kernels. |
Kernel | execute(Kernel.Entry _entry, Range _range): Start execution of _range kernels for the given entrypoint. |
Kernel | execute(Range _range): Start execution of _range kernels. |
Kernel | execute(Range _range, int _passes): Start execution of _passes iterations of _range kernels. |
Kernel | execute(java.lang.String _entrypoint, Range _range): Start execution of _range kernels for the given entrypoint. |
Kernel | execute(java.lang.String _entrypoint, Range _range, int _passes): Start execution of _range kernels for the given entrypoint. |
Kernel | get(boolean[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(byte[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(char[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(double[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(float[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(int[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(long[] array): Enqueue a request to return this buffer from the GPU. |
long | getAccumulatedExecutionTime(): Determine the total execution time of all previous Kernel.execute(range) calls. |
long | getConversionTime(): Determine the time taken to convert bytecode to OpenCL for the first Kernel.execute(range) call. |
Kernel.EXECUTION_MODE | getExecutionMode(): Return the current execution mode. |
long | getExecutionTime(): Determine the execution time of the previous Kernel.execute(range) call. |
Kernel.KernelState | getKernelState() |
static java.lang.String | getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) |
java.util.List<ProfileInfo> | getProfileInfo(): Get the profiling information from the last successful call to Kernel.execute(). |
boolean | hasNextExecutionMode() |
boolean | isExplicit(): For dev purposes (we should remove this for production), determine whether this Kernel uses explicit memory management. |
static boolean | isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean | isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
Kernel | put(boolean[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(byte[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(char[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(double[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(float[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(int[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(long[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
abstract void | run(): The entry point of a kernel. |
void | setExecutionMode(Kernel.EXECUTION_MODE _executionMode): Set the execution mode. |
void | setExplicit(boolean _explicit): For dev purposes (we should remove this for production), allow us to define that this Kernel uses explicit memory management. |
void | setFallbackExecutionMode() |
void | tryNextExecutionMode(): Try the next execution path in the list; if there aren't any more, give up. |
static boolean | usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean | usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
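public static final java.lang.String LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers. Either name the buffer with the suffix:
    int[] buffer_$local$ = new int[1024];
Or use the Annotation form:
    @Local int[] buffer = new int[1024];
public static final java.lang.String CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers. Either name the buffer with the suffix:
    int[] buffer_$constant$ = new int[1024];
Or use the Annotation form:
    @Constant int[] buffer = new int[1024];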
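Name-based tagging of this kind can be illustrated with a few lines of plain Java. This is only a sketch of the idea, not how Aparapi itself classifies fields: the suffix values are inferred from the buffer names shown above, and `SuffixTagModel`, `addressSpaceOf`, and the "global" default for untagged names are assumptions made for illustration.

```java
public class SuffixTagModel {
    // Assumed values, inferred from the example buffer names above.
    static final String LOCAL_SUFFIX = "_$local$";
    static final String CONSTANT_SUFFIX = "_$constant$";

    /** Returns "local", "constant", or (by assumption) "global" for a field name. */
    public static String addressSpaceOf(String fieldName) {
        if (fieldName.endsWith(LOCAL_SUFFIX)) {
            return "local";
        }
        if (fieldName.endsWith(CONSTANT_SUFFIX)) {
            return "constant";
        }
        return "global"; // assumption: untagged buffers are treated as ordinary global buffers
    }

    public static void main(String[] args) {
        System.out.println(addressSpaceOf("buffer_$local$"));
        System.out.println(addressSpaceOf("buffer_$constant$"));
        System.out.println(addressSpaceOf("buffer"));
    }
}
```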
public abstract void run()
Every kernel must override this method.
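public Kernel clone()
When using a Java Thread Pool, Aparapi uses clone to copy the initial instance to each thread.
If you choose to override clone(), you are responsible for delegating to super.clone().
Overrides: clone in class java.lang.Object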
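The delegation rule can be shown in plain Java. In this sketch `BaseKernel` is a hypothetical stand-in for Kernel (it is not the Aparapi class) and `CountingKernel` is an invented subclass; the point is only the order of operations inside the override: call `super.clone()` first, then adjust the copy.

```java
public class CloneModel {
    /** Hypothetical stand-in for Kernel; NOT the Aparapi class. */
    static class BaseKernel implements Cloneable {
        @Override
        public BaseKernel clone() {
            try {
                return (BaseKernel) super.clone(); // Object.clone(): shallow field copy
            } catch (CloneNotSupportedException e) {
                throw new AssertionError("BaseKernel implements Cloneable", e);
            }
        }
    }

    /** Invented subclass with one shared array and one per-instance counter. */
    static class CountingKernel extends BaseKernel {
        final int[] data = new int[8]; // still shared by reference after a shallow clone
        int invocations = 3;           // per-instance state that the copy should reset

        @Override
        public CountingKernel clone() {
            CountingKernel copy = (CountingKernel) super.clone(); // delegate first
            copy.invocations = 0;                                  // then adjust the copy
            return copy;
        }
    }

    /** Returns true when the clone is a distinct object sharing data but not the counter. */
    public static boolean demo() {
        CountingKernel original = new CountingKernel();
        CountingKernel copy = original.clone();
        return copy != original
                && copy.data == original.data
                && copy.invocations == 0
                && original.invocations == 3;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Skipping the `super.clone()` call and constructing the copy with `new` would bypass whatever state the superclass sets up in its own `clone()`, which is why the delegation is required.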
public Kernel.KernelState getKernelState()
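public long getExecutionTime()
Determine the execution time of the previous Kernel.execute(range) call.
See also: getConversionTime(), getAccumulatedExecutionTime()
public long getAccumulatedExecutionTime()
Determine the total execution time of all previous Kernel.execute(range) calls.
See also: getExecutionTime(), getConversionTime()
public long getConversionTime()
Determine the time taken to convert bytecode to OpenCL for the first Kernel.execute(range) call.
See also: getExecutionTime(), getAccumulatedExecutionTime()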
public Kernel execute(Range _range)
Start execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
_range - The number of kernels that we would like to initiate.
public Kernel execute(int _range)
Start execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
Since the new Range class was added, this method offers backward compatibility and merely defers to execute(Range.create(_range), 1).
_range - The number of kernels that we would like to initiate.
public Kernel execute(Range _range, int _passes)
Start execution of _passes iterations of _range kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
_range - The number of kernels that we would like to initiate.
_passes - The number of passes to make.
public Kernel execute(int _range, int _passes)
Start execution of _passes iterations over the _range of kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
Since the new Range class was added, this method offers backward compatibility and merely defers to execute(Range.create(_range), _passes).
_range - The number of kernels that we would like to initiate.
_passes - The number of passes to make.
public Kernel execute(Kernel.Entry _entry, Range _range)
Start execution of _range kernels for the given entrypoint.
When kernel.execute(_entry, _range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
_entry - The entrypoint we wish to use for the kernel.
_range - The number of kernels that we would like to initiate.
public Kernel execute(java.lang.String _entrypoint, Range _range)
Start execution of _range kernels for the given entrypoint.
When kernel.execute("entrypoint", _range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
_entrypoint - The name of the method we wish to use as the entrypoint to the kernel.
_range - The number of kernels that we would like to initiate.
public Kernel execute(java.lang.String _entrypoint, Range _range, int _passes)
Start execution of _range kernels for the given entrypoint.
When kernel.execute("entrypoint", _range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute in a pool of Java threads on the CPU.
_entrypoint - The name of the method we wish to use as the entrypoint to the kernel.
_range - The number of kernels that we would like to initiate.
_passes - The number of passes to make.
public void dispose()
Release any resources associated with this Kernel.
When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.
If execute(int _globalSize) is called after dispose() is called, the results are undefined.
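One way to make sure the release always happens is a try/finally block around the execute call. The sketch below shows only the call pattern: `ResourceKernel` is a hypothetical stand-in (not the Aparapi Kernel), and its choice to throw on execute-after-dispose is an assumption made so the misuse is visible, whereas the real behaviour is documented as undefined.

```java
public class DisposeModel {
    /** Hypothetical stand-in for a kernel that holds native resources. */
    static class ResourceKernel {
        private boolean disposed;

        ResourceKernel execute(int range) {
            // Assumption for illustration: fail loudly instead of behaving unpredictably.
            if (disposed) {
                throw new IllegalStateException("execute() called after dispose()");
            }
            return this;
        }

        void dispose() {
            disposed = true;
        }

        boolean isDisposed() {
            return disposed;
        }
    }

    /** Executes once and always releases the kernel, even if execute() throws. */
    public static boolean runAndRelease(int range) {
        ResourceKernel kernel = new ResourceKernel();
        try {
            kernel.execute(range);
        } finally {
            kernel.dispose();
        }
        return kernel.isDisposed();
    }

    public static void main(String[] args) {
        System.out.println(runAndRelease(1024));
    }
}
```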
public Kernel.EXECUTION_MODE getExecutionMode()
Return the current execution mode.
After a Kernel executes, the return value will be the mode in which the Kernel actually executed.
See also: setExecutionMode(EXECUTION_MODE)
public void setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
Set the execution mode.
This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.
_executionMode - the requested execution mode.
See also: getExecutionMode()
public void setFallbackExecutionMode()
public static java.lang.String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry)
public static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
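public void setExplicit(boolean _explicit)
For dev purposes (we should remove this for production), allow us to define that this Kernel uses explicit memory management.
_explicit - true if we want explicit memory management
public boolean isExplicit()
For dev purposes (we should remove this for production), determine whether this Kernel uses explicit memory management.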
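public Kernel put(long[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(double[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(float[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(int[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(byte[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(char[] array)
array - the array to enqueue before the kernel is executed
public Kernel put(boolean[] array)
array - the array to enqueue before the kernel is executed
public Kernel get(long[] array)
array - the buffer to return from the GPU
public Kernel get(double[] array)
array - the buffer to return from the GPU
public Kernel get(float[] array)
array - the buffer to return from the GPU
public Kernel get(int[] array)
array - the buffer to return from the GPU
public Kernel get(byte[] array)
array - the buffer to return from the GPU
public Kernel get(char[] array)
array - the buffer to return from the GPU
public Kernel get(boolean[] array)
array - the buffer to return from the GPU
public java.util.List<ProfileInfo> getProfileInfo()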
public void addExecutionModes(Kernel.EXECUTION_MODE... platforms)
public boolean hasNextExecutionMode()
public void tryNextExecutionMode()
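The three fallback methods suggest a simple "try each mode in order" loop. The following plain-Java model is a guess at that control flow, not the Aparapi implementation: `FallbackModel`, its `Mode` enum, the queue-based bookkeeping, and the `canRun` predicate are all hypothetical, and only the method names are borrowed from the listing above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.function.Predicate;

public class FallbackModel {
    /** Hypothetical subset of execution modes, named after those mentioned in this page. */
    enum Mode { GPU, CPU, JTP }

    private final Deque<Mode> pending = new ArrayDeque<>();
    private Mode current;

    /** Appends fallback candidates; the first one supplied becomes the current mode. */
    public void addExecutionModes(Mode... modes) {
        pending.addAll(Arrays.asList(modes));
        if (current == null) {
            current = pending.poll(); // stays null only when no modes were supplied
        }
    }

    public Mode getExecutionMode() {
        return current;
    }

    public boolean hasNextExecutionMode() {
        return !pending.isEmpty();
    }

    /** Moves to the next candidate; if none is left, gives up and keeps the current mode. */
    public void tryNextExecutionMode() {
        if (hasNextExecutionMode()) {
            current = pending.poll();
        }
    }

    /** Walks the preferred modes (at least one) until canRun accepts one or the list runs out. */
    public static Mode firstWorking(Predicate<Mode> canRun, Mode... preferred) {
        FallbackModel kernel = new FallbackModel();
        kernel.addExecutionModes(preferred);
        while (!canRun.test(kernel.getExecutionMode()) && kernel.hasNextExecutionMode()) {
            kernel.tryNextExecutionMode();
        }
        return kernel.getExecutionMode();
    }

    public static void main(String[] args) {
        // Pretend that only the Java thread pool is usable on this machine.
        System.out.println(firstWorking(m -> m == Mode.JTP, Mode.GPU, Mode.CPU, Mode.JTP));
    }
}
```

In this model, when no candidate is accepted the last mode tried is returned; what the real class does in that situation is not stated in this page.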