Everyone is familiar with the standard full-factorial Design of Experiments (DoE):
In this DoE, all the variables are sampled independently of each other and uniformly between their respective minimum and maximum values.
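For instance, a minimal Python sketch (illustrative only, not tied to pSeven Core) generates such a DoE as the Cartesian product of uniformly spaced levels of each variable:

```python
import itertools
import numpy as np

def full_factorial(bounds, levels):
    """Full-factorial DoE: every combination of uniformly spaced levels.

    bounds -- list of (min, max) pairs, one per variable
    levels -- number of levels per variable (same for all, for simplicity)
    """
    grids = [np.linspace(lo, hi, levels) for lo, hi in bounds]
    # Cartesian product of the per-variable grids
    return np.array(list(itertools.product(*grids)))

# Example: 3 variables, 5 levels each -> 5**3 = 125 points
doe = full_factorial([(0.0, 1.0), (-1.0, 1.0), (10.0, 20.0)], levels=5)
print(doe.shape)  # (125, 3)
```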
In this post, we consider a far-reaching generalization of this structure. Observe the following DoE in 3D (you can rotate it and view it from different angles; an up-to-date browser supporting WebGL is required):
What are the characteristic properties of this set?
More generally, for any collection of variables \(S=\{x_1, x_2, \dots, x_N\}\), we may split the variables into several disjoint groups (the factors), choose a separate DoE for each group, and define the full DoE as the set of all combinations of points from these factor DoEs.
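As a minimal sketch of this construction (the group sizes and factor DoEs below are arbitrary illustrative choices), combining an irregular 2-D DoE in the group \((x_1, x_2)\) with a uniform 1-D grid for \(x_3\) produces a structured, yet non-rectangular, 3-D set of the same general kind as the example above:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Factor 1: an arbitrary (here random) 2-D DoE in the group (x1, x2)
factor1 = rng.random((15, 2))
# Factor 2: a uniform 1-D grid of levels for x3
factor2 = np.linspace(0.0, 1.0, 8)[:, None]

# The full DoE combines every point of factor 1 with every level of factor 2
doe = np.array([np.concatenate(pair) for pair in itertools.product(factor1, factor2)])
print(doe.shape)  # (15 * 8, 3) = (120, 3)
```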
Why does this construction deserve our attention? Two reasons:
Given a training set (collection of input-output pairs), the Tensor Approximation (TA) feature of pSeven Core GT Approx allows us to:
Mathematically, most conventional methods of constructing approximations (linear regression, splines, kriging, neural networks) boil down to representing them as linear combinations of some basis functions. Tensor approximations can then be roughly described as those whose basis functions are products of basis functions associated with the individual factors.
Accordingly, tensor approximations "inherit" to some extent the properties of the elementary approximation methods chosen for their factors. pSeven Core GT Approx automatically chooses a reasonable elementary method for each factor, but the user can also set them manually if needed.
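Schematically (in notation introduced here just for illustration), if the variables are split into factors \(u_1,\dots,u_K\) and the elementary method chosen for factor \(k\) uses basis functions \(\psi^{(k)}_i\), a tensor approximation has the form
\[
\hat f(x) = \sum_{i_1,\dots,i_K} c_{i_1\dots i_K}\,\psi^{(1)}_{i_1}(u_1)\cdots\psi^{(K)}_{i_K}(u_K),
\]
with the coefficients \(c_{i_1\dots i_K}\) fitted to the training data.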
A factored DoE is very often anisotropic, and this anisotropy can take many forms:
Standard approximation methods are usually not flexible enough to handle such issues and the resulting approximation models turn out to be very inaccurate.
In 1D, splines are a very efficient, fast and accurate method that can handle millions of training points, but they do not directly extend to multi-dimensional scattered data. Linear regression, on the other hand, can be used efficiently in any dimension with any data, but it is very inaccurate for nonlinear functions. Nonlinear methods such as kriging or neural networks are accurate for nonlinear data, but they are much slower and are usually limited to smaller training sets due to memory constraints.
Tensor Approximations can efficiently combine the strengths of different methods by exploiting the factored structure of the data set and choosing (or letting the user specify) an appropriate modeling method for each individual factor.
Even if the data is not anisotropic, Tensor Approximations substantially augment the modeling options. For example, the tensor product of splines, where applicable, is often simultaneously faster and more accurate than alternative approximations, and has a smaller memory footprint.
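As a small illustration of a tensor product of splines on a factored two-factor DoE (a generic SciPy sketch, not the pSeven Core API; the grid sizes and the toy response are made up):

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Factored 2-D DoE: a 200 x 50 grid (each factor is a 1-D set of levels)
x1 = np.linspace(0.0, 1.0, 200)
x2 = np.linspace(0.0, 1.0, 50)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
Y = np.sin(6 * X1) * np.exp(-X2)          # toy response given on the grid

# Tensor product of cubic splines fitted to the gridded values
model = RectBivariateSpline(x1, x2, Y, kx=3, ky=3, s=0)

# Evaluate the approximation at an arbitrary off-grid point
print(model.ev(0.33, 0.77), np.sin(6 * 0.33) * np.exp(-0.77))
```

The fit exploits the grid structure of the data, which is what keeps the spline factors fast and memory-friendly even for large samples.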
One application of pSeven Core Tensor Approximations was the approximation modeling of an axisymmetric aircraft engine.
The engine model in question could be mathematically represented as a function \(y=f(x_1,x_2,x_3,x_4,x_5,x_6,x_7)\).
The goal of the project was to create an approximation model for \(f\), as simulations tended to be lengthy and expensive. The function \(f\) was highly anisotropic: on the one hand, it had a highly nontrivial dependence on the spatial position:
(here, different curves correspond to different tuples \((x_3,x_4,x_5,x_6,x_7)\)). On the other hand, an examination had shown that \(f\) was close to linear with respect to the parameters \((x_3,x_4,x_5,x_6)\).
In this problem, we constructed a tensor approximation with 3 factors:
For the training set, factor 2 was reduced to just 10 randomly chosen tuples (out of the 150 available in the database). All the remaining data was used as a test set, and the approximation was found to be very accurate, as seen on this scatter plot:
Factorization of DoE is a rather strong requirement, and we would like to relax it so as to extend TA to a wider range of problems.
Sometimes the DoE has an incomplete factorial structure, that is, a factored DoE with some points missing, as in this figure:
For such problems, pSeven Core GT Approx offers the incomplete Tensor Approximation (iTA) method. Mathematically, it is also based on tensor products of factor basis functions, but the algorithm is more involved, since the training data is now less structured and less abundant. iTA is usually quite fast and, if there are not too many omissions, it tends to produce accurate approximations.
Currently, pSeven Core iTA can only work with 1D factors and tensor products of splines.
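To give a rough idea of what fitting a tensor-product basis to an incomplete grid involves (a simplified sketch only, not the iTA algorithm; it uses a tensor product of low-degree polynomials instead of splines to keep the code short), the coefficients can be fitted by least squares using only the observed grid cells:

```python
import numpy as np

rng = np.random.default_rng(0)

# Full 30 x 20 grid, but only a subset of the values is observed
x1 = np.linspace(0.0, 1.0, 30)
x2 = np.linspace(0.0, 1.0, 20)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
Y = np.cos(3 * X1) * (1 + X2**2)                 # "true" field on the grid
observed = rng.random(Y.shape) < 0.5             # keep roughly 50% of the values

# Tensor-product polynomial basis: phi_ij(x1, x2) = x1**i * x2**j
deg = 5
B1 = np.vander(x1, deg + 1, increasing=True)     # (30, deg+1)
B2 = np.vander(x2, deg + 1, increasing=True)     # (20, deg+1)

# Design matrix restricted to the observed grid cells
rows, cols = np.nonzero(observed)
A = np.einsum("ni,nj->nij", B1[rows], B2[cols]).reshape(len(rows), -1)
coef, *_ = np.linalg.lstsq(A, Y[observed], rcond=None)

# Reconstruct the full field from the fitted coefficients
Y_hat = B1 @ coef.reshape(deg + 1, deg + 1) @ B2.T
print(np.max(np.abs(Y_hat - Y)))                 # reconstruction error
```

Reconstructing the missing values then amounts to evaluating the fitted tensor-product expansion over the whole grid.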
We consider the problem of reconstructing the pressure distribution on a wing. The airflow around an aircraft wing was simulated in a 3D aerodynamic solver, producing a pressure field on the wing (the pressure distribution is used by engineers to compute the forces and moments acting on the wing). For simplicity, we consider only the upper surface of the left half-wing.
The pressure field is defined on a mesh of 200×29 points:
Though the actual 3D locations of the mesh points do not form a regular grid, the mesh indices do, and we use this regular grid to parametrize the points, which makes iTA directly applicable to this problem. Now suppose that we are given the pressure field, but with its values erased at some mesh points:
Exactly 50% of the pressure values have been erased here. The bluish area in the middle of the wing corresponds to lower pressure, while the yellowish and reddish areas near the edges correspond to higher pressure.
Now, let us apply iTA and compare the true distribution with the reconstructed one:
We can barely see the difference! Now let us try to increase the number of lost values to 95%:
This time we get the following result:
The difference is small but clearly present: the approximation is smoother than the original field (which is not necessarily a drawback).
Now let us push the loss of data to the limit and consider a 99% loss rate -- that leaves just 58 values out of the original 5800! Note that there are now big areas on the wing's surface without any pressure values.
Here is the result of iTA:
The approximation has suffered a noticeable loss of accuracy, but still correctly reflects the general trends.
We remark that in all these cases the training time for the approximation was no longer than a few seconds.
Accuracy Evaluation for an approximation built on a factorial Design of Experiments (DoE) can be performed using another pSeven Core GT Approx technique: Tensor Gaussian Processes (TGP). TGP is an adaptation of the Gaussian Processes (GP) technique in GT Approx to factorial DoE: it takes the special structure of the training sample into account, which makes the technique extremely efficient and accurate.
The main features of the TGP technique are:
As noted above, factorization of the DoE is often accompanied by anisotropy of various forms:
The standard GP technique is not flexible enough to provide an accurate approximation for such DoEs, and it can handle only relatively small samples. The TGP technique is well suited to this problem.
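To illustrate the idea behind TGP (a rough, generic sketch with a fixed squared-exponential kernel and no hyperparameter tuning; this is not the actual GT Approx implementation): on a factored DoE the GP covariance matrix is a Kronecker product of small per-factor matrices, so both the fit and the predictive variance used for Accuracy Evaluation can be computed factor by factor.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel matrix between 1-D point sets a and b."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

# Factored DoE: a 40 x 25 grid; responses stored as a matrix Y
x1 = np.linspace(0.0, 1.0, 40)
x2 = np.linspace(0.0, 1.0, 25)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
Y = np.sin(4 * X1) + X2**2                      # toy response on the grid
noise = 1e-4                                    # assumed noise variance

# Full covariance is K1 kron K2; only the small factor matrices are needed
w1, V1 = np.linalg.eigh(rbf(x1, x1))
w2, V2 = np.linalg.eigh(rbf(x2, x2))
denom = np.outer(w1, w2) + noise                # eigenvalues of K + noise*I

# alpha = (K + noise*I)^{-1} y, computed via the per-factor eigendecompositions
alpha = V1 @ ((V1.T @ Y @ V2) / denom) @ V2.T

def predict(t1, t2):
    """Posterior mean and variance of the GP at a single test point (t1, t2)."""
    k1 = rbf(np.atleast_1d(t1), x1).ravel()
    k2 = rbf(np.atleast_1d(t2), x2).ravel()
    mean = k1 @ alpha @ k2
    P = V1.T @ np.outer(k1, k2) @ V2
    var = 1.0 - np.sum(P**2 / denom)            # k(x*, x*) = 1 for this kernel
    return mean, var

m, v = predict(0.31, 0.62)
print(m, np.sin(4 * 0.31) + 0.62**2, 3 * np.sqrt(max(v, 0.0)))  # mean, truth, 3-sigma
```

Here the eigendecompositions involve only 40×40 and 25×25 matrices instead of the full 1000×1000 covariance matrix, which is what makes the approach scale to large factorial samples.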
Accuracy Evaluation for factorial DoE – example
Let us consider the rotating disc problem as an example application of Tensor Gaussian Processes. In this problem, an impeller disc rotating around a shaft is considered. The geometrical shape of the disc is parameterized by 6 variables \(h_1, h_2, h_3, h_4, r_2, r_3\). The task is to model the maximum radial stress, which can be expressed as a function of these 6 variables: \(y = f(h_1,h_2,h_3,h_4,r_2,r_3)\).
The training sample is a factorial DoE. After the approximation model is constructed, we need to assess its quality. For this purpose, the TGP technique is used: it allows building confidence intervals with the Accuracy Evaluation (AE) feature.
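For instance (the numbers below are made up purely for illustration), once the model returns a prediction and an AE estimate at each test point, a confidence interval such as prediction ± 3·AE can be formed and checked against reference simulation results:

```python
import numpy as np

# Hypothetical values: model predictions, AE estimates and reference results
y_pred = np.array([101.3, 95.7, 110.2, 87.4])
ae     = np.array([  1.1,  0.8,   2.5,  1.9])
y_true = np.array([100.9, 96.4, 112.0, 88.1])

lower, upper = y_pred - 3 * ae, y_pred + 3 * ae      # 3-sigma confidence band
covered = (y_true >= lower) & (y_true <= upper)
print(covered.mean())   # fraction of reference points inside the interval
```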