In this section we will focus on the PDE in it’s weak form, i.e. we seek (first order) stationary points of the variational functional \(J \). We therefore need to solve the variational equality, here in the general case of a linear PDE
\[ \langle J’(u), v \rangle = \langle Du - f, v \rangle = 0 \quad \forall v \in V. \]
Notation:
For the numerical treatment of such systems we don’t look for solutions in the space \( V \), but we compute approximate solutions \( u_h \) in a finite dimensional subspace \( V_h \subset V\), i.e. we solve
\[ \langle Du_h, v \rangle = \langle f, v \rangle \quad \forall v \in V_h. \]
If \( \{ \varphi_1, \varphi_2, \dots, \varphi_N \} \) denotes a basis of \( V_h \) it is enough to test the equation only with the basis functions, so we can equivalently write
\[ \langle Du_h, \varphi_i \rangle = \langle f ,\varphi_i \rangle \quad \text{for} \quad i=1, \dots, N. \]
If we expand the discrete solution in the basis \( u_h(x) = \sum_{k=1}^N \mu_k \varphi_k(x) \in V_h \) and plug this into the previous equation we get a system of linear equations, which we can also write in matrix form
\[ \sum_{k=1}^N \langle D\varphi_k, \varphi_i \rangle \mu_k = \langle f, \varphi_i \rangle, \quad i=1,\dots,N \qquad \qquad \longleftrightarrow \qquad \qquad A \mu = F, \]
with the system matrix or stiffness matrix defined as \( A = \big( \langle D\varphi_k, \varphi_i \rangle \big)_{i,k=1}^N \), the right-hand side vector \( F = \big( \langle f, \varphi_i \rangle \big)_{i=1}^N \) and the vector of coefficients \( \mu = (\mu_i)_{i=1}^N \). For a fixed choice of subspace \( V_h \) the basis expansion defines a bijective map between \( V_h \) and \( \mathbb{R}^N \) and therefore the coefficient vector \( \mu \) uniquely determines the discrete solution \( u_h \).
The numerical solution of an elliptic PDE basically consist of setting up the matrix \( A \) and the right-hand side vector \( F \) (assembly) and solving the corresponding system for the coefficient vector \( \mu \). For a more in-depth discussion and related topics we refer to DeuflhardWeiser2012.
As a first step to compute PDE problems a discrete representation of the domain \( \Omega \) is required. Therefore we sub-divide the domain into finitely many polygonal and disjoint sub-domains \( T \) called elements, often triangles or quadrilaterals in 2D and tetrahedra or hexahedra in 3D. The union of the elements is called a triangulation \( \mathcal{T}_h = \{ T_1, \dots, T_N \} \) and it should cover the whole domain \( \displaystyle \bar{\Omega} = \cup_{i=1}^N T_i \). The parameter \( h \) is called mesh width and it is defined as the maximum diameter of all the elements in the triangulation. Based on the triangulation, finite element spaces are defined as local polynomial spaces
\[ V_h^p = \{ u \in V \colon u|_{T} \in \mathbb{P}_p(T) \quad \forall T \in \mathcal{T}_h \}, \]
i.e., they consist of all functions which are polynomials of order \( \leq p \) on each element \( T \). For \( h \rightarrow 0 \) the triangulation gets finer and the number of elements increases. This makes the ansatz space \( V_h^p \) larger and allows for more accurate approximations of solutions. Similarly, increasing the polynomial order \( p \) will lead to more accurate results.
The choice of basis functions for \( V_h^p \) is crucial since it determines the properties of the system matrix \( A \). This again has an influence on the efficiency of storing and solving the discrete system \( A \mu = F\). In particular basis functions with small/local support lead to a sparse system matrix \( A \). A popular choice are nodal Lagrange basis functions. Again we refer to DeuflhardWeiser2012 for a detailed discussion on the topic.
Basis functions \( \varphi \) are defined in terms of polynomial shape functions \( \phi \) which in turn are defined on reference elements \( \bar T \), usually the unit simplex or unit square/cube. Shape functions can be scalar (for heat transfer and solid/fluid mechanics problems) or vectorial (e.g., for Maxwell’s equations or first order least squares methods).
Ansatz functions \( \varphi_i |_T \) restricted to some actual element \( T \) are given as linear combinations of modified shape functions as \[ \varphi_i|_T (x) = \sum_j K_{T,ij} \psi_T(x,\phi_j(\xi(x))) . \] We will break down this representation in the following, and refer to GötschelSchielaWeiser2021 for details.
The mapping \( \xi: T \to \bar T \) is the bijective coordinate transform between the actual element \( T \) and the reference element \( \bar T \), usually an affine map. Conversely, we may write \( x(\xi) \) for its inverse. Consequently, we denote the global coordinates by \( x \) and the local coordinates relative to the reference element by \( \xi \).
Note that moving the shape functions to the actual element by evaluating \( \phi(\xi(x)) \) does not change the values of the shape function, but affects the gradients due to the chain rule: \( \partial_x \phi(\xi(x)) = \phi_\xi \xi_x \).
The mapping \( \psi_T \) is a converter with the task to modify the values of shape functions when mapping them to the actual element. This is mostly relevant for vector-valued shape functions such as used for edge elements or for \( H(\mathrm{div}) \) discretizations. There, shape functions need to be orthogonal to certain element edges or faces - a property that gets lost when simply transforming from the reference element to the actual element via evaluating \( \phi(\xi(x)) \). These properties can be restored by the converter.
For scalar shape functions, the converter is generally trivial, i.e. it is just the identity.
In some cases, ansatz functions \( \varphi_i \) do not correspond one-to-one to standard shape functions \( \phi_j \). One example are hierarchic finite element spaces, where cubic elements are formed by basis extension. Along edges, cubic shape functions then have a positive part and a negative part. Depending on the ordering and orientation of elements in the grid, these orientations need not match - leading to discontinuous ansatz functions if simply glued together. This can be prevented by multiplying one of the shape functions with -1, which can be expressed as the combiner \( K_T \) being a diagonal matrix with entries +1 or -1 depending on the actual grid structure. For Morley elements, where restricted ansatz functions are characterized by values at triangle vertices and normal derivative at edge midpoints, these ansatz functions can be expressed as element-specific linear combination of standard quadratic Lagrangian shape functions.
For the most widely used scalar Lagrange elements, however, the combiner is always the identity matrix.