## Name:

fpminimax computes a good polynomial approximation with fixed-point or floating-point coefficients

## Library names:

sollya_obj_t sollya_lib_fpminimax(sollya_obj_t, sollya_obj_t, sollya_obj_t,                                   sollya_obj_t, ...) sollya_obj_t sollya_lib_v_fpminimax(sollya_obj_t, sollya_obj_t,                                     sollya_obj_t, sollya_obj_t, va_list)

## Usage:

fpminimax(f, n, formats, range, indic1, indic2, indic3, P) : (function, integer, list, range, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, function) -> function fpminimax(f, monomials, formats, range, indic1, indic2, indic3, P) : (function, list, list, range, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, function) -> function fpminimax(f, n, formats, L, indic1, indic2, indic3, P) : (function, integer, list, list, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, function) -> function fpminimax(f, monomials, formats, L, indic1, indic2, indic3, P) : (function, list, list, list, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, absolute|relative | fixed|floating | function, function) -> function

## Parameters:

• f is the function to be approximated
• n is the degree of the polynomial that must approximate f
• monomials is a list of integers or a list of function. It indicates the basis for the approximation of f
• formats is a list indicating the formats that the coefficients of the polynomial must have
• range is the interval where the function must be approximated
• L is a list of interpolation points used by the method
• indic1 (optional) is one of the optional indication parameters. See the detailed description below.
• indic2 (optional) is one of the optional indication parameters. See the detailed description below.
• indic3 (optional) is one of the optional indication parameters. See the detailed description below.
• P (optional) is the minimax polynomial to be considered for solving the problem.

## Description:

• fpminimax uses a heuristic (but practically efficient) method to find a good polynomial approximation of a function f on an interval range. It implements the method published in the article:
Efficient polynomial L^\infty - approximations Nicolas Brisebarre and Sylvain Chevillard
Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH 18)
pp. 169-176
• The basic usage of this command is fpminimax(f, n, formats, range). It computes a polynomial approximation of f with degree at most n on the interval range. formats is a list of integers or format types (such as double, doubledouble, etc.). The polynomial returned by the command has its coefficients that fit the formats indications. For instance, if formats[0] is 35, the coefficient of degree 0 of the polynomial will fit a floating-point format of 35 bits. If formats[1] is D, the coefficient of degree 1 will be representable by a floating-point number with a precision of 53 bits (which is not necessarily an IEEE 754 double precision number. See the remark below), etc.
• In its most general form, fpminimax takes a function f to be approximated, a list of basis functions g0, ..., gn, a list of points L, and a constrained part q. It returns an expression p of the form p = q+a0*g0+...+an*gn such that (p-f) or (p-f)/f (depending whether the error to be considered is the absolute of relative error) is as small as possible on the points of L and with coefficients a0, ..., an fitting on given formats. fpminimax admits several syntaxes listed above and described below to conveniently handle usual cases (such as when g0, ..., gn is the standard monomial basis X^0, ..., X^n) and to automatically compute a suitable list of points L on a given range.
• The second argument may be either an integer, a list of integers or a list of functions. An integer indicates the degree of the desired polynomial approximation. A list of integers indicates the list of desired monomials. For instance, the list [|0,2,4,6|] indicates that the polynomial must be even and of degree at most 6. Giving an integer n as second argument is equivalent as giving [|0,...,n|]. Finally, a list of function g_k indicates that the desired approximation must be a linear combination of the g_k.
The list of formats is interpreted with respect to the list of monomials. For instance, if the list of monomials is [|0,2,4,6|] and the list of formats is [|161,107,53,24|], the coefficients of degree 0 is searched as a floating-point number with precision 161, the coefficient of degree 2 is searched as a number of precision 107, and so on.
• The list of formats may contain either integers or format types (halfprecision, single, double, doubledouble, tripledouble, doubleextended and quad). The list may be too large or even infinite. Only the first indications will be considered. For instance, for a degree n polynomial, formats[n+1] and above will be discarded. This lets one use elliptical indications for the last coefficients.
• The floating-point coefficients considered by fpminimax do not have an exponent range. In particular, in the format list, double is an exact synonym for 53. Currently, fpminimax only ensures that the corresponding coefficient has at most 53 bits of mantissa. It does not imply that it is an IEEE-754 double.
• By default, the list of formats is interpreted as a list of floating-point formats. This may be changed by passing fixed as an optional argument (see below). Let us take an example: fpminimax(f, 2, [|107, DD, 53|], [0;1]). Here the optional argument is missing (we could have set it to floating). Thus, fpminimax will search for a polynomial of degree 2 with a constant coefficient that is a 107 bits floating-point number, etc.
Currently, doubledouble is just a synonym for 107 and tripledouble a synonym for 161. This behavior may change in the future (taking into account the fact that some double-doubles are not representable with 107 bits).
Second example: fpminimax(f, 2, [|25, 18, 30|], [0;1], fixed). In this case, fpminimax will search for a polynomial of degree 2 with a constant coefficient of the form m/2^25 where m is an integer. In other words, it is a fixed-point number with 25 bits after the point. Note that even with argument fixed, the formats list is allowed to contain halfprecision, single, double, doubleextended, doubledouble, quad or tripledouble. In this this case, it is just a synonym for 11, 24, 53, 64, 107, 113 or 161. This is deprecated and may change in the future.
• The fourth argument may be a range or a list. Lists are for advanced users that know what they are doing. The core of the method is a kind of approximated interpolation. The list given here is a list of points that must be considered for the interpolation. It must contain at least as many points as unknown coefficients. If you give a list, it is also recommended that you provide the minimax polynomial as last argument. If you give a range, the list of points will be automatically computed.
• The fifth, sixth and seventh arguments are optional. By default, fpminimax will approximate f while optimizing the relative error, and interpreting the list of formats as a list of floating-point formats.
This default behavior may be changed with these optional arguments. You may provide zero, one, two or three of the arguments in any order. This lets the user indicate only the non-default arguments.
The three possible arguments are: The constrained part lets the user assign in advance some of the coefficients. For instance, for approximating exp(x), it may be interesting to search for a polynomial p of the form p = 1 + x + x^2/2 + a3 x^3 + a4 x^4. Thus, there is a constrained part q = 1 + x + x^2/2 and the unknown polynomial should be considered in the monomial basis [|3, 4|]. Calling fpminimax with monomial basis [|3,4|] and constrained part q, will return a polynomial with the right form.
Notice that it is perfectly allowed that the monomial basis contains monomials also present in the constrained part. This can be useful for instance to fix the upper part of a doubledouble and look only for the second double of the expansion. For instance, in the previous example, we could use the same constrained part q = 1 + x + x^2/2 but with the full monomial basis [|0,..., 4|]. The resulting polynomial will be of the form p = 1 + x + x^2/2 + a0 + a1 x + a2 x^2 + a3 x^3 + a4 x^4, or equivalently (1+a0) + (1+a1) x + (1/2 + a2) x^2 + a3 x^3 + a4 x^4. Therefore, if for instance the coefficients a0 to a4 are asked to be double, the resulting polynomial p has its three first coefficients that are doubledouble with fixed upper parts.
• The last argument is for advanced users. It is the minimax polynomial that approximates the function f in the given basis. If it is not given this polynomial will be automatically computed by fpminimax.
This minimax polynomial is used to compute the list of interpolation points required by the method. It is also used, when floating-point coefficients are desired, to give an initial assumption for the exponents of the coeffcients. In general, you do not have to provide this argument. But if you want to obtain several polynomials of the same degree that approximate the same function on the same range, just changing the formats, you should probably consider computing only once the minimax polynomial and the list of points instead of letting fpminimax recompute them each time.
Note that in the case when a constrained part is given, the minimax polynomial must take that into account. For instance, in the previous example, the minimax would be obtained by the following command: P = remez(1-(1+x+x^2/2)/exp(x), [|3,4|], range, 1/exp(x)); Note that the constrained part is not to be added to P.
In the case when the second argument is an integer or a list of integers, there is no restriction for P, as long as it is a polynomial. However, when the second argument is a list of functions, and even if these functions are all polynomials, P must be expanded in the given basis. For instance, if the second argument is 2 or [|0, 1, 2|], P can be given in Horner form. However, if the second argument is [|1, x, x^2|], P must be written as a linear combination of 1, x and x^2, otherwise, the algorithm will fail to recover the coefficients of P and will fail with an error message.
Please also note that recovering the coefficients of P in an arbitrary basis is performed heuristically and no verification is performed to check that P does not contain other functions than the functions of the basis.
• Note that fpminimax internally computes a minimax polynomial (using the same algorithm as remez command). Thus fpminimax may encounter the same problems as remez. In particular, it may be very slow when Haar condition is not fulfilled. Another consequence is that currently fpminimax has to be run with a sufficiently high working precision.

## Example 1:

> P = fpminimax(cos(x),6,[|DD, DD, D...|],[-1b-5;1b-5]);
> printexpansion(P);
(0x3ff0000000000000 + 0xbc09fda15e029b00) + x * ((0x3af9eb57163024a8 + 0x37942c2f3f3e3839) + x * (0xbfdfffffffffff98 + x * (0xbbd1693f9c028849 + x * (0x3fa5555555145337 + x * (0x3c7a25f610ad9ebc + x * 0xbf56c138142da5b0)))))

## Example 2:

> P = fpminimax(sin(x),6,[|32...|],[-1b-5;1b-5], fixed, absolute);
> display = powers!;
> P;
x * (1 + x^2 * (-357913941 * 2^(-31) + x^2 * (35789873 * 2^(-32))))

## Example 3:

> P = fpminimax(exp(x), [|3,4|], [|D,24|], [-1/256; 1/246], 1+x+x^2/2);
> display = powers!;
> P;
1 + x * (1 + x * (1 * 2^(-1) + x * (375300225001191 * 2^(-51) + x * (5592621 * 2^(-27)))))

## Example 4:

> I = [-1b-20,  1b-22];
> P0 = fpminimax(exp(x), 4, [|DD,DD,DD,D,SG|], I);
> P1 = fpminimax(exp(x), 4, [|D,D,D,D,SG|], I, 1+x+x^2/2);
> for k from 0 to 2 do { DD(coeff(P0,k)) == coeff(P0,k); };
true
true
true
> for k from 0 to 2 do { DD(coeff(P1,k)) == coeff(P1,k); };
true
true
true
> dirtyinfnorm(P0/exp(x)-1, I);
6.2744565211592155928616387674345117702132844508175e-35
> dirtyinfnorm(P1/exp(x)-1, I);
5.8284777124834959383794298527146965060524845952998e-35

## Example 5:

> f = cos(exp(x));
> pstar = remez(f, 5, [-1b-7;1b-7]);
> listpoints = dirtyfindzeros(f-pstar, [-1b-7; 1b-7]);
> P1 = fpminimax(f, 5, [|DD...|], listpoints, absolute, default, default, pstar);
> P2 = fpminimax(f, 5, [|D...|], listpoints, absolute, default, default, pstar);
> P3 = fpminimax(f, 5, [|D, D, D, 24...|], listpoints, absolute, default, default, pstar);
> print("Error of pstar: ", dirtyinfnorm(f-pstar, [-1b-7; 1b-7]));
Error of pstar:  7.9048441259903026332577436001060063099817726177425e-16
> print("Error of P1:    ", dirtyinfnorm(f-P1, [-1b-7; 1b-7]));
Error of P1:     7.9048441259903026580081299123420463921479618202064e-16
> print("Error of P2:    ", dirtyinfnorm(f-P2, [-1b-7; 1b-7]));
Error of P2:     8.2477144579950871737109573536791331686347620955985e-16
> print("Error of P3:    ", dirtyinfnorm(f-P3, [-1b-7; 1b-7]));
Error of P3:     1.08454277156993282593701156841863009789063333951055e-15

## Example 6:

> L = [|exp(x), sin(x), cos(x)-1, sin(x^3)|];
> g = (2^x-1)/x;
> p = fpminimax(g, L, [|D...|], [-1/16;1/16],absolute);
> display = powers!;
> p;
-3267884797436153 * 2^(-54) * sin(x^3) + 5247089102535885 * 2^(-53) * (cos(x) - 1) + -8159095033730771 * 2^(-54) * sin(x) + 6243315658446641 * 2^(-53) * exp(x)

## Example 7:

> n = 9;
> T = [|1, x|];
> for i from 2 to n do T[i] = canonical(2*x*T[i-1]-T[i-2]);
> g = (2^x-1)/x;
> PCheb = fpminimax(g, T, [|DD,DE...|], [-1/16;1/16],absolute);
> display = dyadic!;
> print(PCheb);
8733930098894247371b-98 * (9 * x + -120 * x^3 + 432 * x^5 + -576 * x^7 + 256 * x^9) + 15750497046710770365b-94 * (1 + -32 * x^2 + 160 * x^4 + -256 * x^6 + 128 * x^8) + 6467380330985872933b-88 * (-7 * x + 56 * x^3 + -112 * x^5 + 64 * x^7) + 9342762606926218927b-84 * (-1 + 18 * x^2 + -48 * x^4 + 32 * x^6) + 11814521367456461131b-80 * (5 * x + -20 * x^3 + 16 * x^5) + 6405479474328570593b-75 * (1 + -8 * x^2 + 8 * x^4) + 11584457324781949889b-72 * (-3 * x + 4 * x^3) + 16779705312447201161b-69 * (-1 + 2 * x^2) + 18265014280997359319b-66 * x + 117054497448175143902009975397253b-107
Go back to the list of commands