opm-simulators
Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy > Class Template Reference

GpuSparseMatrixWrapper checks the CUDA/HIP version and dispatches to an implementation that uses either the legacy or the generic cuSPARSE API.

#include <GpuSparseMatrixWrapper.hpp>

Public Types

using field_type = T
using matrix_type = GpuSparseMatrix<T>

Public Member Functions

matrix_type * operator-> ()
const matrix_type * operator-> () const
 GpuSparseMatrixWrapper (const T *nonZeroElements, const int *rowIndices, const int *columnIndices, std::size_t numberOfNonzeroBlocks, std::size_t blockSize, std::size_t numberOfRows)
 Create the sparse matrix specified by the raw data.
 GpuSparseMatrixWrapper (const GpuVector< int > &rowIndices, const GpuVector< int > &columnIndices, std::size_t blockSize)
 Create a sparse matrix by copying the sparsity structure of another matrix, not filling in the values.
 GpuSparseMatrixWrapper (const GpuSparseMatrixWrapper &other)
GpuSparseMatrixWrapper & operator= (const GpuSparseMatrixWrapper &)=delete
const matrix_type & get () const
template<class M = matrix_type, typename = std::enable_if_t<std::is_same_v<M, GpuSparseMatrix<T>>>>
void setUpperTriangular ()
template<class M = matrix_type, typename = std::enable_if_t<std::is_same_v<M, GpuSparseMatrix<T>>>>
void setLowerTriangular ()
 setLowerTriangular sets the cuSPARSE flag that this is a lower triangular matrix (with non-unit diagonal).
template<class M = matrix_type, typename = std::enable_if_t<std::is_same_v<M, GpuSparseMatrix<T>>>>
void setUnitDiagonal ()
 setUnitDiagonal sets the cuSPARSE flag that this matrix has a unit diagonal.
template<class M = matrix_type, typename = std::enable_if_t<std::is_same_v<M, GpuSparseMatrix<T>>>>
void setNonUnitDiagonal ()
 setNonUnitDiagonal sets the cuSPARSE flag that this matrix has a non-unit diagonal.
std::size_t N () const
 N returns the number of rows (which is equal to the number of columns).
std::size_t nonzeroes () const
 nonzeroes behaves like the Dune::BCRSMatrix::nonzeroes() function and returns the number of non-zero blocks
GpuVector< T > & getNonZeroValues ()
 getNonZeroValues returns the GPU vector containing the non-zero values (ordered by block)
const GpuVector< T > & getNonZeroValues () const
 getNonZeroValues returns the GPU vector containing the non-zero values (ordered by block)
GpuVector< int > & getRowIndices ()
 getRowIndices returns the row indices used to represent the BSR structure.
const GpuVector< int > & getRowIndices () const
 getRowIndices returns the row indices used to represent the BSR structure.
GpuVector< int > & getColumnIndices ()
 getColumnIndices returns the column indices used to represent the BSR structure.
const GpuVector< int > & getColumnIndices () const
 getColumnIndices returns the column indices used to represent the BSR structure.
std::size_t dim () const
 dim returns the dimension of the vector space on which this matrix acts
std::size_t blockSize () const
 blockSize returns the size of the blocks
detail::GpuSparseMatrixDescription & getDescription ()
 getDescription returns the cuSPARSE matrix description.
virtual void mv (const GpuVector< T > &x, GpuVector< T > &y) const
 mv performs matrix vector multiply y = Ax
virtual void umv (const GpuVector< T > &x, GpuVector< T > &y) const
 umv computes y=Ax+y
virtual void usmv (T alpha, const GpuVector< T > &x, GpuVector< T > &y) const
 usmv computes y = alpha * Ax + y
template<class MatrixType>
void updateNonzeroValues (const MatrixType &matrix, bool copyNonZeroElementsDirectly=false)
 updateNonzeroValues updates the non-zero values by using the non-zero values of the supplied matrix
template<bool OtherForceLegacy>
void updateNonzeroValues (const GpuSparseMatrixWrapper< T, OtherForceLegacy > &matrix)
 updateNonzeroValues updates the non-zero values by using the non-zero values of the supplied matrix
void setToZero ()
 setToZero resets the matrix to zero values.
template<class FunctionType>
auto dispatchOnBlocksize (FunctionType function) const
 Dispatches a function based on the block size of the matrix.

Static Public Member Functions

template<class MatrixType>
static GpuSparseMatrixWrapper< T, ForceLegacy > fromMatrix (const MatrixType &matrix, bool copyNonZeroElementsDirectly=false)
 fromMatrix creates a new matrix with the same block size and values as the given matrix

Static Public Attributes

static constexpr int max_block_size = 6
 Maximum block size supported by this implementation.

Detailed Description

template<typename T, bool ForceLegacy = false>
class Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >

GpuSparseMatrixWrapper checks the CUDA/HIP version and dispatches to an implementation that uses either the legacy or the generic cuSPARSE API.

Note
We currently only support simple raw primitives for T (double and float). Block size is handled through the block size parameter.
Template Parameters
T: the type to store. Can be either float, double or int.
Note
We only support square matrices.
We only support Block Compressed Sparse Row Format (BSR) for now.
This class uses the legacy cuSPARSE API to stay compatible with cuSPARSE's ilu0 preconditioner. However, this preconditioner is deprecated and will be removed in future versions of cuSPARSE, so we should migrate to the new cuSPARSE generic API in the future.
To also support block size 1, we use the GpuSparseMatrixGeneric class which uses the new cuSPARSE generic API. This is a temporary solution, and we should migrate to the new API for all block sizes in the future by replacing this class with GpuSparseMatrixGeneric.

Constructor & Destructor Documentation

◆ GpuSparseMatrixWrapper() [1/2]

template<typename T, bool ForceLegacy = false>
Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::GpuSparseMatrixWrapper ( const T * nonZeroElements,
const int * rowIndices,
const int * columnIndices,
std::size_t numberOfNonzeroBlocks,
std::size_t blockSize,
std::size_t numberOfRows )
inline

Create the sparse matrix specified by the raw data.

Note
Prefer to use the constructor taking a const reference to a matrix instead.
Parameters
[in] nonZeroElements: the non-zero values of the matrix
[in] rowIndices: the row indices of the non-zero elements
[in] columnIndices: the column indices of the non-zero elements
[in] numberOfNonzeroBlocks: number of non-zero blocks
[in] blockSize: size of each block matrix (typically 3)
[in] numberOfRows: the number of rows
Note
We assume numberOfNonzeroBlocks, blockSize and numberOfRows all are representable as int due to restrictions in the current version of cusparse. This might change in future versions.
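As an illustration of the raw data this constructor expects, the following host-side sketch builds the three BSR arrays for a small matrix. The `BsrRawData` struct and `makeExampleBsr` function are hypothetical helpers, not part of opm-simulators. The sketch assumes that `rowIndices` follows the cuSPARSE BSR row-pointer convention (numberOfRows + 1 entries) and that values are stored block by block in row-major order within each block; check the cuSPARSE documentation for the exact ordering.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container for BSR raw data (illustration only).
struct BsrRawData {
    std::vector<double> nonZeroElements; // numberOfNonzeroBlocks * blockSize * blockSize values
    std::vector<int> rowIndices;         // assumed row-pointer array: numberOfRows + 1 entries
    std::vector<int> columnIndices;      // block column of each non-zero block
    std::size_t numberOfNonzeroBlocks = 0;
    std::size_t blockSize = 0;
    std::size_t numberOfRows = 0;
};

// Builds a matrix with 2 block rows, 2x2 blocks, and three non-zero blocks:
//   [ A  B ]
//   [ 0  C ]
inline BsrRawData makeExampleBsr()
{
    BsrRawData m;
    m.blockSize = 2;
    m.numberOfRows = 2;
    m.numberOfNonzeroBlocks = 3;
    m.rowIndices = {0, 2, 3};    // row 0 owns blocks [0, 2), row 1 owns blocks [2, 3)
    m.columnIndices = {0, 1, 1}; // A in column 0, B in column 1, C in column 1
    m.nonZeroElements = {
        1.0, 2.0, 3.0, 4.0,    // block A
        5.0, 6.0, 7.0, 8.0,    // block B
        9.0, 10.0, 11.0, 12.0  // block C
    };

    // Consistency checks between the arrays and the size arguments.
    assert(m.rowIndices.size() == m.numberOfRows + 1);
    assert(m.columnIndices.size() == m.numberOfNonzeroBlocks);
    assert(m.nonZeroElements.size() == m.numberOfNonzeroBlocks * m.blockSize * m.blockSize);
    return m;
}
```

The `data()` pointers of the three vectors and the three size fields correspond, in order, to the six constructor parameters listed above.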

◆ GpuSparseMatrixWrapper() [2/2]

template<typename T, bool ForceLegacy = false>
Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::GpuSparseMatrixWrapper ( const GpuVector< int > & rowIndices,
const GpuVector< int > & columnIndices,
std::size_t blockSize )
inline

Create a sparse matrix by copying the sparsity structure of another matrix, not filling in the values.

Note
Prefer to use the constructor taking a const reference to a matrix instead.
Parameters
[in] rowIndices: the row indices of the non-zero elements
[in] columnIndices: the column indices of the non-zero elements
[in] blockSize: size of each block matrix (typically 3)
Note
We assume numberOfNonzeroBlocks, blockSize and numberOfRows all are representable as int due to restrictions in the current version of cusparse. This might change in future versions.

Member Function Documentation

◆ dim()

template<typename T, bool ForceLegacy = false>
std::size_t Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::dim ( ) const
inline

dim returns the dimension of the vector space on which this matrix acts

This is equivalent to matrix.N() * matrix.blockSize()

Returns
matrix.N() * matrix.blockSize()

◆ dispatchOnBlocksize()

template<typename T, bool ForceLegacy = false>
template<class FunctionType>
auto Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::dispatchOnBlocksize ( FunctionType function) const
inline

Dispatches a function based on the block size of the matrix.

This method allows executing different code paths depending on the block size of the matrix, up to the maximum block size specified by max_block_size.

Use this function if you need the block size to be known at compile time.

Template Parameters
FunctionType: Type of the function to be dispatched
Parameters
function: The function to be executed based on the block size
Returns
The result of the function execution

You can use this function as
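A minimal sketch of the dispatch pattern, written against a hypothetical stand-in class rather than the real wrapper. It assumes the callable receives a `std::integral_constant<int, N>` tag carrying the block size and that the callable returns the same type for every block size; `BlockSizeDispatcher`, `dispatchImpl` and `entriesPerBlock` are illustrative names only.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Hypothetical stand-in for the wrapper; only the dispatch idea is shown.
struct BlockSizeDispatcher {
    static constexpr int max_block_size = 6;
    int blockSize;

    explicit BlockSizeDispatcher(int bs) : blockSize(bs)
    {
        assert(bs >= 1);
    }

private:
    // Tries Candidate, Candidate + 1, ... up to max_block_size at runtime and
    // calls the function with the matching compile-time constant.
    template <int Candidate, class FunctionType>
    auto dispatchImpl(FunctionType& function) const
    {
        if (blockSize == Candidate) {
            return function(std::integral_constant<int, Candidate>{});
        }
        if constexpr (Candidate < max_block_size) {
            return dispatchImpl<Candidate + 1>(function);
        } else {
            throw std::invalid_argument("unsupported block size");
        }
    }

public:
    template <class FunctionType>
    auto dispatchOnBlocksize(FunctionType function) const
    {
        return dispatchImpl<1>(function);
    }
};

// Example callable: the block size is available as a constant expression.
inline int entriesPerBlock(const BlockSizeDispatcher& matrix)
{
    return matrix.dispatchOnBlocksize([](auto tag) {
        constexpr int blockSize = decltype(tag)::value;
        return blockSize * blockSize;
    });
}
```

Because one instantiation of the callable is generated per supported block size, raising `max_block_size` increases compilation time, as noted for that constant.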

◆ fromMatrix()

template<typename T, bool ForceLegacy = false>
template<class MatrixType>
GpuSparseMatrixWrapper< T, ForceLegacy > Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::fromMatrix ( const MatrixType & matrix,
bool copyNonZeroElementsDirectly = false )
inline static

fromMatrix creates a new matrix with the same block size and values as the given matrix

Parameters
matrix: the matrix to copy from
copyNonZeroElementsDirectly: if true, does a memcpy from matrix[0][0][0][0]; otherwise builds up the non-zero elements by looping over the matrix. Setting this to true yields a performance increase, but might not always yield correct results depending on how the matrix has been initialized. If unsure, leave it as false.
Template Parameters
MatrixType: is assumed to be a Dune::BCRSMatrix compatible matrix.

◆ getColumnIndices() [1/2]

template<typename T, bool ForceLegacy = false>
GpuVector< int > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getColumnIndices ( )
inline

getColumnIndices returns the column indices used to represent the BSR structure.

Returns
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ getColumnIndices() [2/2]

template<typename T, bool ForceLegacy = false>
const GpuVector< int > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getColumnIndices ( ) const
inline

getColumnIndices returns the column indices used to represent the BSR structure.

Returns
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ getDescription()

template<typename T, bool ForceLegacy = false>
detail::GpuSparseMatrixDescription & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getDescription ( )
inline

getDescription returns the cuSPARSE matrix description.

This description is needed for most calls to the cuSPARSE library.

◆ getNonZeroValues() [1/2]

template<typename T, bool ForceLegacy = false>
GpuVector< T > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getNonZeroValues ( )
inline

getNonZeroValues returns the GPU vector containing the non-zero values (ordered by block)

Note
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ getNonZeroValues() [2/2]

template<typename T, bool ForceLegacy = false>
const GpuVector< T > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getNonZeroValues ( ) const
inline

getNonZeroValues returns the GPU vector containing the non-zero values (ordered by block)

Note
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ getRowIndices() [1/2]

template<typename T, bool ForceLegacy = false>
GpuVector< int > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getRowIndices ( )
inline

getRowIndices returns the row indices used to represent the BSR structure.

Note
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ getRowIndices() [2/2]

template<typename T, bool ForceLegacy = false>
const GpuVector< int > & Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::getRowIndices ( ) const
inline

getRowIndices returns the row indices used to represent the BSR structure.

Note
Read the CuSPARSE documentation on Block Compressed Sparse Row Format (BSR) for the exact ordering.

◆ mv()

template<typename T, bool ForceLegacy = false>
virtual void Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::mv ( const GpuVector< T > & x,
GpuVector< T > & y ) const
inline virtual

mv performs matrix vector multiply y = Ax

Parameters
[in] x: the vector to multiply the matrix with
[out] y: the output vector

◆ nonzeroes()

template<typename T, bool ForceLegacy = false>
std::size_t Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::nonzeroes ( ) const
inline

nonzeroes behaves like the Dune::BCRSMatrix::nonzeroes() function and returns the number of non-zero blocks

Returns
number of non zero blocks.

◆ umv()

template<typename T, bool ForceLegacy = false>
virtual void Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::umv ( const GpuVector< T > & x,
GpuVector< T > & y ) const
inline virtual

umv computes y=Ax+y

Parameters
[in] x: the vector to multiply with A
[in,out] y: the vector to add and store the output in

◆ updateNonzeroValues() [1/2]

template<typename T, bool ForceLegacy = false>
template<bool OtherForceLegacy>
void Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::updateNonzeroValues ( const GpuSparseMatrixWrapper< T, OtherForceLegacy > & matrix)
inline

updateNonzeroValues updates the non-zero values by using the non-zero values of the supplied matrix

Parameters
matrix: the matrix to extract the non-zero values from
Note
This assumes the given matrix has the same sparsity pattern.

◆ updateNonzeroValues() [2/2]

template<typename T, bool ForceLegacy = false>
template<class MatrixType>
void Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::updateNonzeroValues ( const MatrixType & matrix,
bool copyNonZeroElementsDirectly = false )
inline

updateNonzeroValues updates the non-zero values by using the non-zero values of the supplied matrix

Parameters
matrix: the matrix to extract the non-zero values from
copyNonZeroElementsDirectly: if true, does a memcpy from matrix[0][0][0][0]; otherwise builds up the non-zero elements by looping over the matrix. Setting this to true yields a performance increase, but might not always yield correct results depending on how the matrix has been initialized. If unsure, leave it as false.
Note
This assumes the given matrix has the same sparsity pattern.
Template Parameters
MatrixType: is assumed to be a Dune::BCRSMatrix compatible matrix.

◆ usmv()

template<typename T, bool ForceLegacy = false>
virtual void Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::usmv ( T alpha,
const GpuVector< T > & x,
GpuVector< T > & y ) const
inline virtual

usmv computes y = alpha * Ax + y

Parameters
[in] alpha: Scaling factor
[in] x: the vector to multiply with A
[in,out] y: the vector to add and store the output in
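To make the semantics of the three multiplication routines concrete, here is a host-side reference sketch of the scaled update on BSR data. It is not the GPU implementation. `referenceUsmv` is a hypothetical helper, and it assumes a row-pointer array with numberOfRows + 1 entries and row-major storage within each block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for y = alpha * A * x + y on BSR data (illustration only).
// Assumptions: rowIndices is a row-pointer array, blocks are stored row-major.
inline void referenceUsmv(double alpha,
                          const std::vector<double>& values,
                          const std::vector<int>& rowIndices,
                          const std::vector<int>& columnIndices,
                          std::size_t blockSize,
                          const std::vector<double>& x,
                          std::vector<double>& y)
{
    assert(!rowIndices.empty());
    const std::size_t numberOfRows = rowIndices.size() - 1;
    assert(x.size() == numberOfRows * blockSize);
    assert(y.size() == numberOfRows * blockSize);

    for (std::size_t row = 0; row < numberOfRows; ++row) {
        // Loop over the non-zero blocks of this block row.
        for (int k = rowIndices[row]; k < rowIndices[row + 1]; ++k) {
            const std::size_t col = static_cast<std::size_t>(columnIndices[k]);
            const double* block
                = values.data() + static_cast<std::size_t>(k) * blockSize * blockSize;
            // Dense block times the matching slice of x.
            for (std::size_t i = 0; i < blockSize; ++i) {
                double sum = 0.0;
                for (std::size_t j = 0; j < blockSize; ++j) {
                    sum += block[i * blockSize + j] * x[col * blockSize + j];
                }
                y[row * blockSize + i] += alpha * sum;
            }
        }
    }
}
```

In terms of this sketch, umv corresponds to alpha = 1, and mv corresponds to alpha = 1 applied to a y that has first been set to zero.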

Member Data Documentation

◆ max_block_size

template<typename T, bool ForceLegacy = false>
int Opm::gpuistl::GpuSparseMatrixWrapper< T, ForceLegacy >::max_block_size = 6
static constexpr

Maximum block size supported by this implementation.

This constant defines an upper bound on the block size to ensure reasonable compilation times. While this class itself could support larger values, functions that call dispatchOnBlocksize() might have limitations. This value can be increased if needed, but will increase compilation time due to template instantiations.


The documentation for this class was generated from the following files: