A NUMA-aware creator for matrix sparsity patterns. More...

#include <threadedMatrix.hh>

Detailed Description

template<class Index = size_t>
class Kaskade::NumaCRSPatternCreator< Index >

A NUMA-aware creator for matrix sparsity patterns.

This supports a two-phase creation of sparsity pattern. Nonzero entry positions can be added to the creator at will. At any time, a NumaCRSPattern sparsity pattern can be constructed efficiently by supplying this creator.

Template Parameters

Index the integral type used for row/column indices (defaults to size_t)

Definition at line 1616 of file threadedMatrix.hh.

Public Member Functions
	NumaCRSPatternCreator (Index rows_, Index cols, bool symmetric=false, int nnzPerRow=0)
	Constructs a rows times cols matrix sparsity structure creator. More...

	~NumaCRSPatternCreator ()

void	addElement (Index row, Index col)
	Enters a single entry into the sparsity pattern. More...

void	addDenseBlock (Index fromRow, Index toRow, Index fromCol, Index toCol)
	Enters a contiguous dense block into the sparsity pattern. More...

template<class IterRow , class IterCol >
void	addElements (IterRow const fromRow, IterRow const toRow, IterCol const fromCol, IterCol const toCol, bool colIsSorted=false)
	Enters entries into the sparsity pattern. More...

template<class RowRangeSequence , class ColRangeSequence >
void	addElements (RowRangeSequence const &rrs, ColRangeSequence const &crs, bool colsAreSorted=false)
	Enters elements into the sparsity pattern. More...

void	addAllElements ()
	Enters all possible elements (defining a dense matrix). More...

void	addDiagonal ()
	Enters the diagonal elements. More...

size_t	nonzeroes () const
	The number of structurally nonzero elements. More...

Index	cols () const
	The number of columns. More...

void	balance ()
	Redistributes the rows to the NUMA chunks in order to have the same number of entries in each chunk. More...

int	nodes () const
	Returns the number of NUMA nodes/chunks used. More...

ChunkCreator const &	creator (int node) const
	Returns the chunk creator. More...

bool	isSymmetric () const
	Returns the symmetry status of the pattern. More...

Constructor & Destructor Documentation

◆ NumaCRSPatternCreator()

template<class Index = size_t>

Kaskade::NumaCRSPatternCreator< Index >::NumaCRSPatternCreator	(	Index	rows_,
		Index	cols,
		bool	symmetric = `false`,
		int	nnzPerRow = `0`
	)

inline

Constructs a rows times cols matrix sparsity structure creator.

Parameters

rows	the number of rows.
cols	the number of columns.
symmetric	if true, the matrix is symmetric and only the lower triangular part is actually stored (only for square matrices).
nnzPerRow	hint for the expected number of nonzeroes per row

The rows are distributed evenly among the NUMA nodes in case of unsymmetric patterns, and roughly proportional to \( i^{-1/2} \) for symmetric ones. This balances the number of elements in each chunk if there is the same number of elements in each row.

Providing a sufficiently large nnzPerRow hint can reduce the number of memory reallocations performed during element insertion and hence speed up the insertion significantly (a factor of 40 has been observed!) at the cost of potentially increased memory footprint. A guideline to what is "sufficiently large": The memory buffer of a row needs to temporarily store the number of entries in this row plus the entries that are inserted by the addElements operation to the row. E.g., for 2D linear finite elements on a triangular grid, this would be 7 (number of entries per row in a regular grid, say 8 or 9 as a precaution for unstructured grids) + 3 (elemental matrices are 3x3, so 3 elements are added by an addElements operation), in total 10 to 12. If memory consumption is not the prime concern, err on the larger side.

The effect need not be present, and in fact, specifying a nonzero nnzPerRow may actually be slower since all chunks request memory at the same time and put a high load on the memory management system. This has been observed to parallelize rather badly.

In the end, choose a value based on profiling your application.

Warning: Memory usage may be exhaustive if there are few essentially dense rows into which entries are added alternatingly with entries into many sparse rows!

Definition at line 1652 of file threadedMatrix.hh.

◆ ~NumaCRSPatternCreator()

template<class Index = size_t>

Kaskade::NumaCRSPatternCreator< Index >::~NumaCRSPatternCreator ( )

inline

Definition at line 1674 of file threadedMatrix.hh.

Member Function Documentation

◆ addAllElements()

template<class Index = size_t>

void Kaskade::NumaCRSPatternCreator< Index >::addAllElements ( )

inline

Enters all possible elements (defining a dense matrix).

Sometimes (e.g. with spatially constant variables), FE matrices are actually dense. In this case we need not create a sparsity pattern, but can simplify the process.

Todo:: This should be statically encoded in a different matrix type.

Definition at line 1770 of file threadedMatrix.hh.

◆ addDenseBlock()

template<class Index = size_t>

void Kaskade::NumaCRSPatternCreator< Index >::addDenseBlock	(	Index	fromRow,
		Index	toRow,
		Index	fromCol,
		Index	toCol
	)

inline

Enters a contiguous dense block into the sparsity pattern.

Parameters

fromRow	first row in the block
toRow	one behind last row in the block
fromCol	first column in the block
toCol	one behind last column in the block

Definition at line 1701 of file threadedMatrix.hh.

Referenced by Kaskade::BDDC::KKTSolver< Domain >::KKTSolver().

◆ addDiagonal()

template<class Index = size_t>

void Kaskade::NumaCRSPatternCreator< Index >::addDiagonal ( )

inline

Enters the diagonal elements.

Definition at line 1780 of file threadedMatrix.hh.

Referenced by Kaskade::NumaBCRSMatrix< Entry, Index >::conjugation().

◆ addElement()

template<class Index = size_t>

void Kaskade::NumaCRSPatternCreator< Index >::addElement	(	Index	row,
		Index	col
	)

inline

Enters a single entry into the sparsity pattern.

Note that entering individual entries has a significant overhead. Use one of the other methods if possible.

Definition at line 1688 of file threadedMatrix.hh.

Referenced by Kaskade::NumaCRSPatternCreator< Index >::addDenseBlock(), Kaskade::NumaCRSPatternCreator< Index >::addDiagonal(), Kaskade::NumaBCRSMatrix< Entry, Index >::diagcat(), Kaskade::MGProlongation::galerkinProjection(), Kaskade::getContactConstraints(), Kaskade::NumaBCRSMatrix< Entry, Index >::horzcat(), Kaskade::nonNestedProlongation(), Kaskade::operator+(), Kaskade::prolongation(), Kaskade::NumaBCRSMatrix< Entry, Index >::reshapeBlocks(), Kaskade::sparseUnitMatrix(), Kaskade::twoGridProlongation(), and Kaskade::NumaBCRSMatrix< Entry, Index >::vertcat().

◆ addElements() [1/2]

template<class Index = size_t>

template<class IterRow , class IterCol >

void Kaskade::NumaCRSPatternCreator< Index >::addElements	(	IterRow const	fromRow,
		IterRow const	toRow,
		IterCol const	fromCol,
		IterCol const	toCol,
		bool	colIsSorted = `false`
	)

inline

Enters entries into the sparsity pattern.

Template Parameters

IterRow	random access iterator for a range of row indices
IterCol	random access iterator for a range of column indices

Parameters

fromRow	start of row indices
toRow	one behind of last row index
fromCol	start of sorted column indices
toCol	one behind of last column index
colIsSorted	true if the column index range is sorted ascendingly

All entries \( (i,j) \) with \( i \) in the given row indices and \( j \) in the given column indices are entered into the sparsity structure. Note that the column indices have to be sorted. No index in the row range shall occur twice.

Definition at line 1726 of file threadedMatrix.hh.

Referenced by Kaskade::NumaCRSPatternCreator< Index >::addElement(), Kaskade::NumaCRSPatternCreator< Index >::addElements(), Kaskade::MGProlongation::asMatrix(), Kaskade::NumaBCRSMatrix< Entry, Index >::conjugation(), Kaskade::MGProlongation::galerkinProjection(), Kaskade::insertMatrixBlockIndices(), and Kaskade::NumaBCRSMatrix< Entry, Index >::operator*().

◆ addElements() [2/2]

template<class Index = size_t>

template<class RowRangeSequence , class ColRangeSequence >

void Kaskade::NumaCRSPatternCreator< Index >::addElements	(	RowRangeSequence const &	rrs,
		ColRangeSequence const &	crs,
		bool	colsAreSorted = `false`
	)

inline

Enters elements into the sparsity pattern.

Template Parameters

RowRangeSequence	a sequence type with value type representing a row index range
ColRangeSequence	a sequence type with value type representing a column index range

Parameters

rrs	the sequence of row ranges.
crs	the sequence of column ranges. The size has to be the same as that of rrs.
colsAreSorted	true if all the column index ranges are sorted ascendingly

For each corresponding pair of row and column ranges in the sequences

rrs and
crs, the elements of the cartesian product of the row and column ranges are entered into the sparsity structure.

The insertion of indices is done in parallel on the NUMA chunks. In order to maintain efficiency, the number or size of the ranges provided in the sequences should not be too small (a couple of ten ranges should be ok).

Definition at line 1749 of file threadedMatrix.hh.

◆ balance()

template<class Index = size_t>

void Kaskade::NumaCRSPatternCreator< Index >::balance ( )

Redistributes the rows to the NUMA chunks in order to have the same number of entries in each chunk.

The association of matrix rows to chunks is recalculated in order to have approximately the same number of nonzeroes in every chunk. While this is not an extremely expensive operation, it does several reallocations. Call this once before creating a sparsity pattern.

◆ cols()

template<class Index = size_t>

Index Kaskade::NumaCRSPatternCreator< Index >::cols ( ) const

inline

The number of columns.

Definition at line 1802 of file threadedMatrix.hh.

Referenced by Kaskade::insertMatrixBlockIndices(), and Kaskade::NumaCRSPatternCreator< Index >::NumaCRSPatternCreator().

◆ creator()

template<class Index = size_t>

ChunkCreator const & Kaskade::NumaCRSPatternCreator< Index >::creator ( int node ) const

inline

Returns the chunk creator.

Parameters

node	the number of the NUMA node / chunk (0<=node<nodes()).

Definition at line 1824 of file threadedMatrix.hh.

Referenced by Kaskade::NumaCRSPattern< Index >::NumaCRSPattern().

◆ isSymmetric()

template<class Index = size_t>

bool Kaskade::NumaCRSPatternCreator< Index >::isSymmetric ( ) const

inline

Returns the symmetry status of the pattern.

If true, the matrix is symmetric, and only the lower triangular entries are actually stored.

Definition at line 1831 of file threadedMatrix.hh.

Referenced by Kaskade::insertMatrixBlockIndices().

◆ nodes()

template<class Index = size_t>

int Kaskade::NumaCRSPatternCreator< Index >::nodes ( ) const

inline

Returns the number of NUMA nodes/chunks used.

This is guaranteed not to exceed the number of NUMA nodes, but can be less.

Definition at line 1818 of file threadedMatrix.hh.

Referenced by Kaskade::NumaCRSPatternCreator< Index >::addAllElements(), Kaskade::NumaCRSPatternCreator< Index >::addElements(), Kaskade::NumaCRSPattern< Index >::NumaCRSPattern(), Kaskade::NumaCRSPatternCreator< Index >::NumaCRSPatternCreator(), and Kaskade::NumaCRSPatternCreator< Index >::~NumaCRSPatternCreator().

◆ nonzeroes()

template<class Index = size_t>

size_t Kaskade::NumaCRSPatternCreator< Index >::nonzeroes ( ) const

inline

The number of structurally nonzero elements.

For symmetric storage without superdiagonal elements stored, this counts subdiagonal elements twice.

Definition at line 1791 of file threadedMatrix.hh.

The documentation for this class was generated from the following file:

threadedMatrix.hh

Detailed Description

Public Member Functions

Constructor & Destructor Documentation

◆ NumaCRSPatternCreator()

◆ ~NumaCRSPatternCreator()

Member Function Documentation

◆ addAllElements()

◆ addDenseBlock()

◆ addDiagonal()

◆ addElement()

◆ addElements() [1/2]

◆ addElements() [2/2]

◆ balance()

◆ cols()

◆ creator()

◆ isSymmetric()

◆ nodes()

◆ nonzeroes()