Layer Fusion¶

GPU kernel fusion API reference

miopenFusionDirection_t¶

enum miopenFusionDirection_t¶

Kernel fusion direction in the network.

Values:

enumerator miopenVerticalFusion¶: fuses layers vertically; currently the only supported mode

enumerator miopenHorizontalFusion¶: fuses layers horizontally; currently unimplemented
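
As a rough illustration of what vertical fusion means, the sketch below runs two consecutive layers (a bias add followed by a ReLU) first as separate passes and then fused into a single pass. It is plain CPU C, not MIOpen code, and it assumes the common reading of "vertical" as chaining layers that feed one another. The function names `bias_then_relu_unfused` and `bias_relu_fused` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused: one pass over memory per layer, with an intermediate buffer. */
void bias_then_relu_unfused(const float *x, float bias,
                            float *tmp, float *y, size_t n)
{
    assert(x && tmp && y);
    for (size_t i = 0; i < n; ++i)
        tmp[i] = x[i] + bias;                  /* layer 1: bias add */
    for (size_t i = 0; i < n; ++i)
        y[i] = tmp[i] > 0.0f ? tmp[i] : 0.0f;  /* layer 2: ReLU */
}

/* Fused: both layers applied per element in a single pass,
 * so the intermediate buffer and the second sweep disappear. */
void bias_relu_fused(const float *x, float bias, float *y, size_t n)
{
    assert(x && y);
    for (size_t i = 0; i < n; ++i) {
        float v = x[i] + bias;
        y[i] = v > 0.0f ? v : 0.0f;
    }
}
```

Both functions produce the same output. The fused form avoids the intermediate buffer and the second sweep over memory, which is the kind of saving a fused GPU kernel aims for.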

miopenCreateFusionPlan¶

miopenStatus_t miopenCreateFusionPlan(miopenFusionPlanDescriptor_t *fusePlanDesc, const miopenFusionDirection_t fuseDirection, const miopenTensorDescriptor_t inputDesc)¶

Creates the kernel fusion plan descriptor object.

Parameters

fusePlanDesc – Pointer to a fusion plan (output)
fuseDirection – Horizontal or Vertical fusion (input)
inputDesc – Descriptor to tensor for the input (input)

Returns

miopenStatus_t

miopenDestroyFusionPlan¶

miopenStatus_t miopenDestroyFusionPlan(miopenFusionPlanDescriptor_t fusePlanDesc)¶

Destroys the fusion plan descriptor object.

Parameters: fusePlanDesc – A fusion plan descriptor (input)
Returns: miopenStatus_t

miopenCompileFusionPlan¶

miopenStatus_t miopenCompileFusionPlan(miopenHandle_t handle, miopenFusionPlanDescriptor_t fusePlanDesc)¶

Compiles the fusion plan.

Parameters

handle – MIOpen handle (input)
fusePlanDesc – A fusion plan descriptor (input)

Returns

miopenStatus_t

miopenFusionPlanGetOp¶

miopenStatus_t miopenFusionPlanGetOp(miopenFusionPlanDescriptor_t fusePlanDesc, const int op_idx, miopenFusionOpDescriptor_t *op)¶

Allows access to the operators in a fusion plan.

This API call bounds-checks the supplied op_idx and returns miopenStatusError if the index is out of bounds.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
op_idx – Index of the required operator in the fusion plan, in the order of insertion (input)
op – Returned pointer to the operator (output)

Returns

miopenStatus_t
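
The contract described above (operators are retrieved by insertion index, and an out-of-range index yields an error status) can be modeled with a small stand-in. Everything in this sketch is hypothetical: `plan_t`, `op_kind_t`, `status_t`, `plan_add_op`, and `plan_get_op` are not MIOpen types or functions, and the real library's internals may look nothing like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT MIOpen types. */
typedef enum { STATUS_SUCCESS = 0, STATUS_ERROR = 1 } status_t;
typedef enum { OP_CONV, OP_BIAS, OP_ACTIVATION } op_kind_t;

#define MAX_OPS 8

typedef struct {
    op_kind_t ops[MAX_OPS]; /* stored in insertion order */
    int       op_count;
} plan_t;

/* Append an operator; its index is its position in insertion order. */
status_t plan_add_op(plan_t *plan, op_kind_t kind)
{
    assert(plan);
    if (plan->op_count >= MAX_OPS)
        return STATUS_ERROR;
    plan->ops[plan->op_count++] = kind;
    return STATUS_SUCCESS;
}

/* Bounds-checked access: an invalid index returns an error status
 * and leaves *op untouched. */
status_t plan_get_op(plan_t *plan, int op_idx, op_kind_t **op)
{
    assert(plan && op);
    if (op_idx < 0 || op_idx >= plan->op_count)
        return STATUS_ERROR;
    *op = &plan->ops[op_idx];
    return STATUS_SUCCESS;
}
```

With a convolution added first and a bias second, index 1 yields the bias operator and index 2 yields the error status. Callers of the real API would likewise check the returned status before using the operator pointer.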

miopenFusionPlanGetWorkSpaceSize¶

miopenStatus_t miopenFusionPlanGetWorkSpaceSize(miopenHandle_t handle, miopenFusionPlanDescriptor_t fusePlanDesc, size_t *workSpaceSize, miopenConvFwdAlgorithm_t algo)¶

Queries the workspace size required for the fusion plan.

Parameters

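handle – MIOpen handle (input)
fusePlanDesc – A fusion plan descriptor (input)
workSpaceSize – Pointer to memory to return size in bytes (output)
algo – Convolution forward algorithm for which the workspace size is queried (input)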

Returns

miopenStatus_t

miopenFusionPlanConvolutionGetAlgo¶

miopenStatus_t miopenFusionPlanConvolutionGetAlgo(miopenFusionPlanDescriptor_t fusePlanDesc, const int requestAlgoCount, int *returnedAlgoCount, miopenConvFwdAlgorithm_t *returnedAlgos)¶

Returns the supported algorithms for the convolution operator in the Fusion Plan.

A convolution operator in a fusion plan may be implemented by different algorithms representing different tradeoffs of memory and performance. The returned list of algorithms is sorted in decreasing order of priority. Therefore, if the user does not set an algorithm with the miopenFusionPlanConvolutionSetAlgo call, the first algorithm in the list is used to execute the convolution in the fusion plan. Moreover, this call must be immediately preceded by the miopenCreateOpConvForward call for the op in question.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
requestAlgoCount – Number of algorithms to return (input)
returnedAlgoCount – The actual number of returned algorithms; always less than or equal to requestAlgoCount (output)
returnedAlgos – Pointer to the list of supported algorithms (output)

Returns

miopenStatus_t
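
The request/returned count contract and the "first entry is the default" rule can be sketched with a mock query. The names `algo_t`, `ALGO_*`, `get_algos`, and `pick_algo` are hypothetical, the priority order shown is arbitrary, and none of this is MIOpen code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm identifiers; NOT MIOpen enumerators. */
typedef enum { ALGO_DIRECT, ALGO_WINOGRAD, ALGO_GEMM } algo_t;

/* Hypothetical supported list, already sorted by decreasing priority. */
static const algo_t k_supported[] = { ALGO_WINOGRAD, ALGO_DIRECT };
#define SUPPORTED_COUNT ((int)(sizeof k_supported / sizeof k_supported[0]))

/* Copy at most request_count entries, in priority order, and report
 * how many were written. Returns 0 on success, 1 on a bad request. */
int get_algos(int request_count, int *returned_count, algo_t *returned)
{
    assert(returned_count && returned);
    if (request_count < 0)
        return 1;
    int n = request_count < SUPPORTED_COUNT ? request_count : SUPPORTED_COUNT;
    for (int i = 0; i < n; ++i)
        returned[i] = k_supported[i];
    *returned_count = n; /* never exceeds request_count */
    return 0;
}

/* An explicit user choice wins; otherwise the first (highest priority)
 * entry of the returned list is used. */
algo_t pick_algo(const algo_t *returned, int returned_count,
                 const algo_t *user_choice)
{
    assert(returned && returned_count > 0);
    return user_choice ? *user_choice : returned[0];
}
```

Requesting four algorithms from this mock returns two, and requesting one returns only the highest-priority entry. Passing `NULL` as `user_choice` models the case where no algorithm was set explicitly.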

miopenCreateOpConvForward¶

miopenStatus_t miopenCreateOpConvForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *convOp, miopenConvolutionDescriptor_t convDesc, const miopenTensorDescriptor_t wDesc)¶

Creates a forward convolution operator.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
convOp – Pointer to an operator type (output)
convDesc – Convolution layer descriptor (input)
wDesc – Descriptor for the weights tensor (input)

Returns

miopenStatus_t

miopenCreateOpActivationForward¶

miopenStatus_t miopenCreateOpActivationForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *activFwdOp, miopenActivationMode_t mode)¶

Creates a forward activation operator.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
activFwdOp – Pointer to an operator type (output)
mode – Activation mode (input)

Returns

miopenStatus_t

miopenCreateOpBiasForward¶

miopenStatus_t miopenCreateOpBiasForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *biasOp, const miopenTensorDescriptor_t bDesc)¶

Creates a forward bias operator.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
biasOp – Pointer to an operator type (output)
bDesc – bias tensor descriptor (input)

Returns

miopenStatus_t

miopenCreateOpBatchNormInference¶

miopenStatus_t miopenCreateOpBatchNormInference(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *bnOp, const miopenBatchNormMode_t bn_mode, const miopenTensorDescriptor_t bnScaleBiasMeanVarDesc)¶

Creates a forward inference batch normalization operator.

Parameters

fusePlanDesc – A fusion plan descriptor (input)
bnOp – Pointer to an operator type (output)
bn_mode – Batch normalization layer mode (input)
bnScaleBiasMeanVarDesc – Gamma, beta, mean, variance tensor descriptor (input)

Returns

miopenStatus_t

miopenCreateOperatorArgs¶

miopenStatus_t miopenCreateOperatorArgs(miopenOperatorArgs_t *args)¶

Creates an operator argument object.

Parameters: args – Pointer to an operator argument type (output)
Returns: miopenStatus_t

miopenDestroyOperatorArgs¶

miopenStatus_t miopenDestroyOperatorArgs(miopenOperatorArgs_t args)¶

Destroys an operator argument object.

Parameters: args – An operator argument type (input)
Returns: miopenStatus_t

miopenSetOpArgsConvForward¶

miopenStatus_t miopenSetOpArgsConvForward(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t convOp, const void *alpha, const void *beta, const void *w)¶

Sets the arguments for forward convolution op.

Parameters

args – An arguments object type (output)
convOp – Forward convolution operator (input)
alpha – Floating point scaling factor, allocated on the host (input)
beta – Floating point shift factor, allocated on the host (input)
w – Pointer to the weights tensor memory (input)

Returns

miopenStatus_t

miopenSetOpArgsBatchNormInference¶

miopenStatus_t miopenSetOpArgsBatchNormInference(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t bnOp, const void *alpha, const void *beta, const void *bnScale, const void *bnBias, const void *estimatedMean, const void *estimatedVariance, double epsilon)¶

Sets the arguments for inference batch normalization op.

Parameters

args – An arguments object type (output)
bnOp – Batch normalization inference operator (input)
alpha – Floating point scaling factor, allocated on the host (input)
beta – Floating point shift factor, allocated on the host (input)
bnScale – Pointer to the gamma tensor memory (input)
bnBias – Pointer to the beta tensor memory (input)
estimatedMean – Pointer to population mean memory (input)
estimatedVariance – Pointer to population variance memory (input)
epsilon – Scalar value for numerical stability (input)

Returns

miopenStatus_t
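
For orientation, the parameters above map onto the standard batch normalization inference formula. The source does not state the formula, so the following is the textbook form given as an assumption, with $\gamma$ = bnScale, $\beta$ = bnBias, $\mu$ = estimatedMean, $\sigma^2$ = estimatedVariance, and $\epsilon$ = epsilon. The index $k(i)$ denotes the entry of the parameter tensors that applies to input element $i$ (for example, the channel of element $i$ in a per-channel mode). Note that $\beta$ here is the bnBias tensor, not the `beta` shift factor argument, whose effect is not shown.

```latex
y_i = \gamma_{k(i)} \, \frac{x_i - \mu_{k(i)}}{\sqrt{\sigma^{2}_{k(i)} + \epsilon}} + \beta_{k(i)}
```

Because the mean and variance are fixed population estimates at inference time, the expression is a per-element affine transform, which is what makes it a natural candidate for fusing with a preceding operator.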

miopenSetOpArgsBiasForward¶

miopenStatus_t miopenSetOpArgsBiasForward(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t biasOp, const void *alpha, const void *beta, const void *bias)¶

Sets the arguments for forward bias op.

Parameters

args – An arguments object type (output)
biasOp – Forward bias operator (input)
alpha – Floating point scaling factor, allocated on the host (input)
beta – Floating point shift factor, allocated on the host (input)
bias – Pointer to the forward bias input tensor memory (input)

Returns

miopenStatus_t

miopenExecuteFusionPlan¶

miopenStatus_t miopenExecuteFusionPlan(const miopenHandle_t handle, const miopenFusionPlanDescriptor_t fusePlanDesc, const miopenTensorDescriptor_t inputDesc, const void *input, const miopenTensorDescriptor_t outputDesc, void *output, miopenOperatorArgs_t args)¶

Executes the fusion plan.

Parameters

handle – MIOpen handle (input)
fusePlanDesc – Fusion plan descriptor (input)
inputDesc – Descriptor of the input tensor (input)
input – Source data tensor (input)
outputDesc – Descriptor of the output tensor (input)
output – Destination data tensor (output)
args – An argument object of the fused kernel (input)

Returns

miopenStatus_t