Evaluating type-A uncertainty

Type-A evaluation of uncertainty involves statistical analysis of data (in contrast to type-B evaluation, which uses some means other than statistical analysis).

The shorter name ta has been defined as an alias for type_a, to resolve the names of objects defined in this module.

Sample estimates

estimate() returns an uncertain number defined from the statistics of a sample of data.

multi_estimate_real() returns a sequence of related uncertain real numbers defined from the multivariate statistics calculated from a sample of data.

multi_estimate_complex() returns a sequence of related uncertain complex numbers defined from the multivariate statistics of a sample of data.

estimate_digitized() returns an uncertain number for the mean of a sample of digitized data.

mean() returns the mean of a sample of data.

standard_uncertainty() evaluates the standard uncertainty associated with the sample mean.

standard_deviation() evaluates the standard deviation of a sample of data.

variance_covariance_complex() evaluates the variance and covariance associated with the mean real component and mean imaginary component of the data.

Least squares regression

line_fit() performs an ordinary least-squares straight line fit to a sample of data.

line_fit_wls() performs a weighted least-squares straight line fit to a sample of data. The weights are assumed to be exact.

line_fit_rwls() performs a weighted least-squares straight line fit to a sample of data. The weights are only assumed to normalise the variability of observations.

line_fit_wtls() performs a weighted total least-squares straight line fit to a sample of data.

Merging uncertain components

merge() combines results from a type-A and type-B analysis of the same data.

Note

Many functions in type_a treat data as pure numbers. Sequences of uncertain numbers can be passed to these functions, but only the uncertain-number values will be used.

merge() is provided so that the results of type-A and type-B analyses on the same data can be combined.

Module contents

estimate(seq, label=None)

Return an uncertain number for the mean of the data in seq.

Parameters:
    seq – a sequence of data
    label (str) – a label for the returned uncertain number
Return type:	`UncertainReal` or `UncertainComplex`

The elements of seq may be real numbers, complex numbers, or uncertain real or complex numbers. Note that only the value of uncertain numbers will be used.

The function returns an UncertainReal when the mean of the data is real, and an UncertainComplex when the mean of the data is complex.

In a type-A evaluation, the sample mean provides an estimate of the quantity of interest. The uncertainty in this estimate is the standard deviation of the sample mean (or the sample covariance of the mean, in the complex case).

Examples:

>>> data = range(15)
>>> type_a.estimate(data)
ureal(7.0,1.1547005383792515,14)

>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]

>>> type_a.estimate(data)
ucomplex((1.059187840567141+0.9574410497332932j), u=[0.28881665310241805,0.2655555630050262], r=-0.3137404512459525, df=9)
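
The real-valued example above can be reproduced by hand. The following is a minimal sketch using only the Python standard library, not the library's implementation; the helper name `mean_and_standard_uncertainty` is hypothetical. It evaluates the sample mean, the standard deviation of the sample mean, and the degrees of freedom.

```python
import math

def mean_and_standard_uncertainty(seq):
    """Return (mean, standard uncertainty of the mean, degrees of freedom).

    Hypothetical helper for illustration; it is not part of type_a.
    """
    data = [float(x) for x in seq]
    n = len(data)
    mean = math.fsum(data) / n
    # Sample variance, with n - 1 in the denominator
    variance = math.fsum((x - mean) ** 2 for x in data) / (n - 1)
    # Standard deviation of the sample mean
    u = math.sqrt(variance / n)
    return mean, u, n - 1

m, u, df = mean_and_standard_uncertainty(range(15))
print(m, u, df)
```

The three numbers printed correspond to the value, standard uncertainty and degrees of freedom shown for `type_a.estimate(range(15))`.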

estimate_digitized(seq, delta, label=None, truncate=False)

Return an uncertain number for the mean of digitized data in seq.

Parameters:
    seq (float, `UncertainReal` or `UncertainComplex`) – data
    delta (float) – digitization step size
    label (str) – label for uncertain number returned
    truncate (bool) – if `True`, truncation, rather than rounding, is assumed
Return type:	`UncertainReal` or `UncertainComplex`

A sequence of data that has been formatted with fixed precision can completely conceal a small amount of variability in the original values, or merely obscure that variability.

This function recognises the possible interaction between truncation, or rounding, errors and random errors in the underlying data. The function evaluates the mean of the data and evaluates the uncertainty in this mean.

Set the argument truncate to True if data have been truncated, instead of rounded.

See reference: R Willink, Metrologia, 44 (2007) 73-81

Examples:

# LSD = 0.0001, data varies between -0.0055 and -0.0057
>>> seq = (-0.0056,-0.0055,-0.0056,-0.0056,-0.0056,
...      -0.0057,-0.0057,-0.0056,-0.0056,-0.0057,-0.0057)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.005627272727272...,1.9497827808661...e-05,10)

# LSD = 0.0001, data varies between -0.0056 and -0.0057
>>> seq = (-0.0056,-0.0056,-0.0056,-0.0056,-0.0056,
... -0.0057,-0.0057,-0.0056,-0.0056,-0.0057,-0.0057)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.005636363636363...,1.52120004824377...e-05,10)

# LSD = 0.0001, no spread in data values
>>> seq = (-0.0056,-0.0056,-0.0056,-0.0056,-0.0056,
... -0.0056,-0.0056,-0.0056,-0.0056,-0.0056,-0.0056)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.0056,2.8867513459481...e-05,10)

# LSD = 0.0001, no spread in data values, fewer points
>>> seq = (-0.0056,-0.0056,-0.0056)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.0056,3.2914029430219...e-05,2)
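
Two of the results above can be checked with elementary formulas. This sketch is an observation about the printed outputs, not a description of the algorithm in the cited reference, and the helper name is hypothetical. In the first example, where the scatter exceeds one digitization step, the reported uncertainty agrees with the ordinary standard uncertainty of the mean. In the third example, with no scatter and eleven readings, it agrees with delta/sqrt(12), the standard deviation of a rectangular distribution of width delta.

```python
import math

def ordinary_standard_uncertainty(seq):
    """Standard deviation of the sample mean (hypothetical helper)."""
    n = len(seq)
    mean = math.fsum(seq) / n
    variance = math.fsum((x - mean) ** 2 for x in seq) / (n - 1)
    return math.sqrt(variance / n)

delta = 0.0001  # digitization step size used in the examples

# First example: readings spread over more than one step
scattered = (-0.0056, -0.0055, -0.0056, -0.0056, -0.0056,
             -0.0057, -0.0057, -0.0056, -0.0056, -0.0057, -0.0057)
u_scattered = ordinary_standard_uncertainty(scattered)

# Third example: identical readings, so the sample statistics alone
# would suggest an uncertainty of (essentially) zero
constant = (-0.0056,) * 11
u_naive = ordinary_standard_uncertainty(constant)

# Rectangular distribution of width delta
u_rectangular = delta / math.sqrt(12)

print(u_scattered, u_naive, u_rectangular)
```

The small-sample result in the last example (three readings) is larger than delta/sqrt(12); this sketch does not attempt to reproduce it.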

multi_estimate_real(seq_of_seq, labels=None)

Return a sequence of uncertain real numbers.

Parameters:
    seq_of_seq – a sequence of sequences of data
    labels – a sequence of str labels
Return type:	seq of `UncertainReal`

The sequences in seq_of_seq must all be the same length. Each sequence contains a sample of data associated with a particular quantity. An uncertain number will be created for the quantity from sample statistics. The covariance between the different quantities will also be evaluated.

A sequence of elementary uncertain numbers is returned. These uncertain numbers are considered to be related, allowing degrees-of-freedom calculations to be performed on derived quantities.

Example:

# From Appendix H2 in the GUM

>>> V = [5.007,4.994,5.005,4.990,4.999]
>>> I = [19.663E-3,19.639E-3,19.640E-3,19.685E-3,19.678E-3]
>>> phi = [1.0456,1.0438,1.0468,1.0428,1.0433]
>>> v,i,p = type_a.multi_estimate_real((V,I,phi),labels=('V','I','phi'))
>>> v
ureal(4.999000...,0.0032093613071761...,4, label='V')
>>> i
ureal(0.019661,9.471008394041335...e-06,4, label='I')
>>> p
ureal(1.044460...,0.0007520638270785...,4, label='phi')

>>> r = v/i*cos(p)
>>> r
ureal(127.732169928102...,0.071071407396995...,4.0)
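
The multivariate statistics involved can be sketched with the standard library alone. The code below is an illustration under the usual definitions (sample covariance divided by the number of observations gives the covariance of the means); it is not the library's implementation and the helper name `sample_covariance` is hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance with n - 1 in the denominator (hypothetical helper)."""
    n = len(xs)
    mx = math.fsum(xs) / n
    my = math.fsum(ys) / n
    return math.fsum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

V = [5.007, 4.994, 5.005, 4.990, 4.999]
I = [19.663E-3, 19.639E-3, 19.640E-3, 19.685E-3, 19.678E-3]
n = len(V)

# Standard uncertainties of the two sample means
u_v = math.sqrt(sample_covariance(V, V) / n)
u_i = math.sqrt(sample_covariance(I, I) / n)

# Covariance and correlation coefficient between the two means
cov_vi = sample_covariance(V, I) / n
r_vi = cov_vi / (u_v * u_i)

print(u_v, u_i, r_vi)
```

The two standard uncertainties match those of `v` and `i` in the example, and the correlation coefficient is close to the value of -0.36 quoted for these data in the GUM.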

multi_estimate_complex(seq_of_seq, labels=None)

Return a sequence of uncertain complex numbers.

Parameters:
    seq_of_seq – a sequence of sequences of data
    labels – a sequence of str labels
Return type:	a sequence of `UncertainComplex`

The sequences in seq_of_seq must all be the same length. Each sequence contains data that is associated with a particular quantity. An uncertain number for that quantity will be created from sample statistics. The covariance between the different quantities will also be evaluated.

A sequence of elementary uncertain complex numbers is returned. These uncertain numbers are considered to be related, allowing degrees-of-freedom calculations to be performed on derived quantities.

Example:

# From Appendix H2 in the GUM

>>> I = [ complex(x) for x in (19.663E-3,19.639E-3,19.640E-3,19.685E-3,19.678E-3) ]
>>> V = [ complex(x) for x in (5.007,4.994,5.005,4.990,4.999)]
>>> P = [ complex(0,p) for p in (1.0456,1.0438,1.0468,1.0428,1.0433) ]

>>> v,i,p = type_a.multi_estimate_complex( (V,I,P) )

>>> get_correlation(v.real,i.real)
-0.355311219817512

>>> z = v/i*exp(p)
>>> z.real
ureal(127.732169928102...,0.071071407396995...,4.0)
>>> get_correlation(z.real,z.imag)
-0.588429784423515...

mean(seq, *args, **kwargs)

Return the arithmetic mean of data in seq.

Parameters:
    seq – a sequence, `ndarray`, or iterable, of numbers or uncertain numbers
    args – optional arguments when `seq` is an `ndarray`
    kwargs – optional keyword arguments when `seq` is an `ndarray`

If seq contains real or uncertain real numbers, a real number is returned.

If seq contains complex or uncertain complex numbers, a complex number is returned.

Example:

>>> data = range(15)
>>> type_a.mean(data)
7.0

standard_deviation(seq, mu=None)

Return the sample standard deviation.

Parameters:
    seq – sequence of data
    mu – the arithmetic mean of `seq`

If seq contains real or uncertain real numbers, the sample standard deviation is returned.

If seq contains complex or uncertain complex numbers, the standard deviation in the real and imaginary components is evaluated, as well as the correlation coefficient between the components. The results are returned in a pair of objects: a StandardDeviation namedtuple and a correlation coefficient.

Only the values of uncertain numbers are used in calculations.

Examples:

>>> data = range(15)
>>> type_a.standard_deviation(data)
4.47213595499958

>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> sd,r = type_a.standard_deviation(data)
>>> sd
StandardDeviation(real=0.913318449990377, imag=0.8397604244242309)
>>> r
-0.31374045124595246

standard_uncertainty(seq, mu=None)

Return the standard uncertainty associated with the sample mean.

Parameters:
    seq – sequence of data
    mu – the arithmetic mean of `seq`
Return type:	float or `StandardUncertainty`

If seq contains real or uncertain real numbers, the standard uncertainty of the sample mean is returned.

If seq contains complex or uncertain complex numbers, the standard uncertainties of the real and imaginary components are evaluated, as well as the sample correlation coefficient. The results are returned as a pair of objects: a StandardUncertainty namedtuple and a correlation coefficient.

Only the values of uncertain numbers are used in calculations.

Example:

>>> data = range(15)
>>> type_a.standard_uncertainty(data)
1.1547005383792515

>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> u,r = type_a.standard_uncertainty(data)
>>> u
StandardUncertainty(real=0.28881665310241805, imag=0.2655555630050262)
>>> u.real
0.28881665310241805
>>> r
-0.31374045124595246

variance_covariance_complex(seq, mu=None)

Return the sample variance-covariance matrix.

Parameters:
    seq – sequence of data
    mu – the arithmetic mean of `seq`
Returns:	a 4-element sequence

If mu is None the mean will be evaluated by mean().

seq may contain numbers or uncertain numbers. Only the values of uncertain numbers are used in calculations.

Variance-covariance matrix elements are returned in a VarianceCovariance namedtuple; they can be accessed using the attributes .rr, .ri, .ir and .ii.

Example:

>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> type_a.variance_covariance_complex(data)
VarianceCovariance(rr=0.8341505910928249, ri=-0.24062910264062262, ir=-0.24062910264062262, ii=0.7051975704291644)

>>> v = type_a.variance_covariance_complex(data)
>>> v[0]
0.8341505910928249
>>> v.rr
0.8341505910928249
>>> v.ii
0.7051975704291644
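
The outputs printed for the ten-point complex sample by standard_deviation(), standard_uncertainty() and variance_covariance_complex() are related by simple formulas. The sketch below checks those relations using the numbers shown in the examples; it uses only the standard library, and the variable names are illustrative.

```python
import math

# Values printed by type_a.standard_deviation(data) in the example
sd_re = 0.913318449990377
sd_im = 0.8397604244242309
r = -0.31374045124595246
n = 10  # number of observations in the sample

# Variance-covariance elements follow from the standard deviations
# and the correlation coefficient
rr = sd_re ** 2
ii = sd_im ** 2
ri = r * sd_re * sd_im

# Standard uncertainties of the mean are the standard deviations
# divided by sqrt(n)
u_re = sd_re / math.sqrt(n)
u_im = sd_im / math.sqrt(n)

print(rr, ri, ii)
print(u_re, u_im)
```

The correlation coefficient is the same for the sample and for the mean, because both components are scaled by the same factor.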

line_fit(x, y, label=None)

Return a least-squares straight-line fit to the data.

New in version 1.2.

Parameters:
    x – sequence of stimulus data (independent variable)
    y – sequence of response data (dependent variable)
    label – suffix to label the uncertain numbers a and b
Returns:	an object containing regression results
Return type:	`LineFitOLS`

Performs an ordinary least-squares regression of y on x.

Example:

>>> x = [1,2,3,4,5,6,7,8,9]
>>> y = [15.6,17.5,36.6,43.8,58.2,61.6,64.2,70.4,98.8]
>>> result = type_a.line_fit(x,y)
>>> a,b = result.a_b
>>> a
ureal(4.8138888888888...,4.8862063121833...,7)
>>> b
ureal(9.4083333333333...,0.8683016476563...,7)

>>> y_p = a + b*5.5
>>> dof(y_p)
7.0
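
The intercept, slope and their standard uncertainties in this example follow from the textbook ordinary least-squares formulas. The sketch below uses only the standard library; it is an illustration of those formulas, not the library's code, and the function name `ols_line_fit` is hypothetical.

```python
import math

def ols_line_fit(x, y):
    """Ordinary least-squares fit of y = a + b*x (hypothetical helper).

    Returns (a, b, u_a, u_b, ssr, df).
    """
    n = len(x)
    x_mean = math.fsum(x) / n
    y_mean = math.fsum(y) / n
    s_xx = math.fsum((xi - x_mean) ** 2 for xi in x)
    s_xy = math.fsum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    b = s_xy / s_xx            # slope
    a = y_mean - b * x_mean    # intercept

    # Sum of squared residuals and residual variance on n - 2 degrees of freedom
    ssr = math.fsum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    s2 = ssr / df

    u_a = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / s_xx))
    u_b = math.sqrt(s2 / s_xx)
    return a, b, u_a, u_b, ssr, df

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [15.6, 17.5, 36.6, 43.8, 58.2, 61.6, 64.2, 70.4, 98.8]
a, b, u_a, u_b, ssr, df = ols_line_fit(x, y)
print(a, u_a, b, u_b, df)
```

The degrees of freedom are N - 2 because two parameters are estimated from the data.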

line_fit_wls(x, y, u_y, label=None)

Return a weighted least-squares straight-line fit.

New in version 1.2.

Parameters:
    x – sequence of stimulus data (independent variable)
    y – sequence of response data (dependent variable)
    u_y – sequence of uncertainties in the response data
    label – suffix to label the uncertain numbers a and b
Returns:	an object containing regression results
Return type:	`LineFitWLS`

Example:

>>> x = [1,2,3,4,5,6]
>>> y = [3.2, 4.3, 7.6, 8.6, 11.7, 12.8]
>>> u_y = [0.5,0.5,0.5,1.0,1.0,1.0]

>>> fit = type_a.line_fit_wls(x,y,u_y)
>>> fit.a_b
 InterceptSlope(a=ureal(0.8852320675105...,0.5297081435088...,inf),
 b=ureal(2.056962025316...,0.177892016741...,inf))
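
When the uncertainties u_y are taken as exact, the parameter uncertainties depend only on x and u_y, not on the residuals, which is consistent with the infinite degrees of freedom reported above. The following sketch applies the standard weighted least-squares formulas with weights 1/u_y**2. It uses only the standard library, and the function name `wls_line_fit` is hypothetical.

```python
import math

def wls_line_fit(x, y, u_y):
    """Weighted least-squares fit of y = a + b*x (hypothetical helper).

    The weights 1/u_y**2 are treated as exact.
    Returns (a, b, u_a, u_b).
    """
    w = [1.0 / u ** 2 for u in u_y]
    s = math.fsum(w)
    s_x = math.fsum(wi * xi for wi, xi in zip(w, x))
    s_y = math.fsum(wi * yi for wi, yi in zip(w, y))
    s_xx = math.fsum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = math.fsum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    delta = s * s_xx - s_x ** 2
    a = (s_xx * s_y - s_x * s_xy) / delta   # intercept
    b = (s * s_xy - s_x * s_y) / delta      # slope

    # Uncertainties follow directly from the weights
    u_a = math.sqrt(s_xx / delta)
    u_b = math.sqrt(s / delta)
    return a, b, u_a, u_b

x = [1, 2, 3, 4, 5, 6]
y = [3.2, 4.3, 7.6, 8.6, 11.7, 12.8]
u_y = [0.5, 0.5, 0.5, 1.0, 1.0, 1.0]
a, b, u_a, u_b = wls_line_fit(x, y, u_y)
print(a, u_a, b, u_b)
```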

line_fit_rwls(x, y, s_y, label=None)

Return a relative weighted least-squares straight-line fit.

New in version 1.2.

The s_y values are used to scale variability in the y data. It is assumed that the standard deviation of each y value is proportional to the corresponding s_y scale factor. The unknown common factor in the uncertainties is estimated from the residuals.

Parameters:
    x – sequence of stimulus data (independent variable)
    y – sequence of response data (dependent variable)
    s_y – sequence of scale factors
    label – suffix to label the uncertain numbers a and b
Returns:	an object containing regression results
Return type:	`LineFitRWLS`

Example:

>>> x = [1,2,3,4,5,6]
>>> y = [3.014,5.225,7.004,9.061,11.201,12.762]
>>> s_y = [0.2,0.2,0.2,0.4,0.4,0.4]
>>> fit = type_a.line_fit_rwls(x,y,s_y)
>>> a, b = fit.a_b
>>>
>>> print(fit)

Relative Weighted Least-Squares Results:

  Intercept: 1.14(12)
  Slope: 1.973(41)
  Correlation: -0.87
  Sum of the squared residuals: 1.3395217958...
  Number of points: 6
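
The estimation of the common factor from the residuals can be sketched as follows. The fit uses weights 1/s_y**2, and the parameter variances are then multiplied by the weighted sum of squared residuals divided by N - 2. This is an illustration of the standard formulas using only the standard library, not the library's code; the function name `rwls_line_fit` is hypothetical.

```python
import math

def rwls_line_fit(x, y, s_y):
    """Relative weighted least-squares fit of y = a + b*x (hypothetical helper).

    Returns (a, b, u_a, u_b, r_ab, ssr).
    """
    w = [1.0 / s ** 2 for s in s_y]
    s = math.fsum(w)
    s_x = math.fsum(wi * xi for wi, xi in zip(w, x))
    s_yy = math.fsum(wi * yi for wi, yi in zip(w, y))
    s_xx = math.fsum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = math.fsum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    delta = s * s_xx - s_x ** 2
    a = (s_xx * s_yy - s_x * s_xy) / delta
    b = (s * s_xy - s_x * s_yy) / delta

    # Weighted sum of squared residuals
    ssr = math.fsum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    # Estimate of the unknown common variance factor
    sigma2 = ssr / (len(x) - 2)

    u_a = math.sqrt(sigma2 * s_xx / delta)
    u_b = math.sqrt(sigma2 * s / delta)
    # Correlation between the intercept and slope estimates
    r_ab = -s_x / math.sqrt(s * s_xx)
    return a, b, u_a, u_b, r_ab, ssr

x = [1, 2, 3, 4, 5, 6]
y = [3.014, 5.225, 7.004, 9.061, 11.201, 12.762]
s_y = [0.2, 0.2, 0.2, 0.4, 0.4, 0.4]
a, b, u_a, u_b, r_ab, ssr = rwls_line_fit(x, y, s_y)
print(round(a, 2), round(u_a, 2), round(b, 3), round(u_b, 3), round(r_ab, 2))
```

Multiplying every element of s_y by the same constant leaves the intercept, slope and their uncertainties unchanged, since only the relative sizes of the scale factors matter.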

line_fit_wtls(x, y, u_x, u_y, a0_b0=None, r_xy=None, label=None)

Return a total least-squares straight-line fit.

New in version 1.2.

Parameters:
    x – sequence of independent-variable data
    y – sequence of dependent-variable data
    u_x – sequence of uncertainties in `x`
    u_y – sequence of uncertainties in `y`
    a0_b0 – a pair of initial estimates for the intercept and slope
    r_xy – correlation between x-y pairs
    label – suffix labeling the uncertain numbers a and b
Returns:	an object containing the fitting results
Return type:	`LineFitWTLS`

The optional argument a0_b0 can be used to provide a pair of initial estimates for the intercept and slope.

Based on paper by M Krystek and M Anton, Meas. Sci. Technol. 22 (2011) 035101 (9pp)

Example:

# Pearson-York test data see, e.g.,
# Lybanon, M. in Am. J. Phys 52 (1) 1984
>>> x=[0.0,0.9,1.8,2.6,3.3,4.4,5.2,6.1,6.5,7.4]
>>> wx=[1000.0,1000.0,500.0,800.0,200.0,80.0,60.0,20.0,1.8,1.0]

>>> y=[5.9,5.4,4.4,4.6,3.5,3.7,2.8,2.8,2.4,1.5]
>>> wy=[1.0,1.8,4.0,8.0,20.0,20.0,70.0,70.0,100.0,500.0]

# standard uncertainties required for weighting
>>> import math
>>> ux=[1./math.sqrt(wx_i) for wx_i in wx ]
>>> uy=[1./math.sqrt(wy_i) for wy_i in wy ]

>>> result = ta.line_fit_wtls(x,y,ux,uy)
>>> intercept, slope = result.a_b
>>> intercept
ureal(5.47991018...,0.29193349...,8)
>>> slope
ureal(-0.48053339...,0.057616740...,8)

class LineFitOLS(a, b, ssr, N)

Class to hold the results of an ordinary least-squares regression to data.

It can also be used to apply the results of a regression analysis.

New in version 1.2.

N: The number of points in the sample

a_b: Return the intercept a and slope b as a tuple of uncertain numbers

intercept: Return the intercept as an uncertain number.

slope: Return the slope as an uncertain number.

ssr

Sum of the squared residuals

The sum of the squared deviations between values predicted by the model and the actual data.

If weights are used during the fit, the squares of weighted deviations are summed.

x_from_y(yseq, x_label=None, y_label=None)

Estimate the stimulus x corresponding to the responses in yseq.

Parameters:
    yseq – a sequence of `y` observations
    x_label – a label for the return uncertain number x
    y_label – a label for the estimate of y based on `yseq`

Note

When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).

Example:

>>> x_data = [0.1, 0.1, 0.1, 0.3, 0.3, 0.3, 0.5, 0.5, 0.5,
...                 0.7, 0.7, 0.7, 0.9, 0.9, 0.9]
>>> y_data = [0.028, 0.029, 0.029, 0.084, 0.083, 0.081, 0.135, 0.131,
...                 0.133, 0.180, 0.181, 0.183, 0.215, 0.230, 0.216]

>>> fit = type_a.line_fit(x_data,y_data)

>>> x0 = fit.x_from_y( [0.0712, 0.0716] )
>>> x0
ureal(0.2601659751037...,0.01784461112558...,13.0)
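
The value and standard uncertainty in this example agree with the classical calibration formula, in which the mean of the new observations is inverted through the fitted line. The sketch below is an illustration using only the standard library; the library itself obtains the result by propagating uncertainty through uncertain numbers, and the helper name `ols_summary` is hypothetical.

```python
import math

def ols_summary(x, y):
    """Return (a, b, s2, y_mean, s_xx, n) for an OLS fit (hypothetical helper)."""
    n = len(x)
    x_mean = math.fsum(x) / n
    y_mean = math.fsum(y) / n
    s_xx = math.fsum((xi - x_mean) ** 2 for xi in x)
    s_xy = math.fsum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    ssr = math.fsum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ssr / (n - 2)  # residual variance
    return a, b, s2, y_mean, s_xx, n

x_data = [0.1, 0.1, 0.1, 0.3, 0.3, 0.3, 0.5, 0.5, 0.5,
          0.7, 0.7, 0.7, 0.9, 0.9, 0.9]
y_data = [0.028, 0.029, 0.029, 0.084, 0.083, 0.081, 0.135, 0.131,
          0.133, 0.180, 0.181, 0.183, 0.215, 0.230, 0.216]
a, b, s2, y_mean, s_xx, n = ols_summary(x_data, y_data)

# New observations of the response
y_obs = [0.0712, 0.0716]
m = len(y_obs)
y0 = math.fsum(y_obs) / m

# Invert the fitted line
x0 = (y0 - a) / b

# Classical standard uncertainty for the inverse prediction
u_x0 = math.sqrt(
    s2 / b ** 2 * (1.0 / m + 1.0 / n + (y0 - y_mean) ** 2 / (b ** 2 * s_xx))
)
print(x0, u_x0, n - 2)
```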

y_from_x(x, s_label=None, y_label=None)

Return an uncertain number y that predicts the response to x.

Parameters:
    x – a real number, or an uncertain real number
    s_label – a label for an elementary uncertain number associated with observation variability
    y_label – a label for the return uncertain number y

This is a prediction of a single future response y to a stimulus x

The variability in observations is based on residuals obtained during regression.

An uncertain real number can be used for x, in which case the associated uncertainty will also be propagated into y.

Note

When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
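
For an exactly known x, the textbook prediction formula for a single future response gives a feel for the size of the uncertainty involved. The sketch below applies it to the data from the line_fit() example, with x = 5.5 chosen for illustration. It uses only the standard library and is an assumption about how such a prediction is commonly evaluated, not a statement of what y_from_x() computes internally.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [15.6, 17.5, 36.6, 43.8, 58.2, 61.6, 64.2, 70.4, 98.8]

n = len(x)
x_mean = math.fsum(x) / n
y_mean = math.fsum(y) / n
s_xx = math.fsum((xi - x_mean) ** 2 for xi in x)
s_xy = math.fsum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

b = s_xy / s_xx
a = y_mean - b * x_mean

# Residual standard deviation on n - 2 degrees of freedom
ssr = math.fsum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(ssr / (n - 2))

x_new = 5.5  # stimulus value chosen for illustration
y_pred = a + b * x_new

# Uncertainty of the fitted line at x_new
u_line = s * math.sqrt(1.0 / n + (x_new - x_mean) ** 2 / s_xx)
# Uncertainty of a single future observation: the variability of one
# observation is added to the uncertainty of the fitted line
u_pred = s * math.sqrt(1.0 + 1.0 / n + (x_new - x_mean) ** 2 / s_xx)

print(y_pred, u_line, u_pred)
```

The prediction uncertainty is always larger than the residual standard deviation, because a single future observation carries its own variability in addition to the uncertainty of the fitted line.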

class LineFitRWLS(a, b, ssr, N)

Class to hold the results of a relative weighted least-squares regression. The weights provided normalise the variability of observations.

New in version 1.2.

N: The number of points in the sample

a_b: Return the intercept a and slope b as a tuple of uncertain numbers

intercept: Return the intercept as an uncertain number.

slope: Return the slope as an uncertain number.

ssr

Sum of the squared residuals

The sum of the squared deviations between values predicted by the model and the actual data.

If weights are used during the fit, the squares of weighted deviations are summed.

x_from_y(yseq, s_y, x_label=None, y_label=None)

Estimate the stimulus x corresponding to the responses in yseq.

Parameters:
    yseq – a sequence of further observations of `y`
    s_y – a scale factor for the uncertainty of the `yseq`
    x_label – a label for the return uncertain number x
    y_label – a label for the estimate of y based on `yseq`

Note

When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).

y_from_x(x, s_y, s_label=None, y_label=None)

Return an uncertain number y that predicts the response to x.

Parameters:
    x – a real number, or an uncertain real number
    s_y – a scale factor for the response uncertainty
    s_label – a label for an elementary uncertain number associated with observation variability
    y_label – a label for the return uncertain number y

Returns a single future response y predicted for a stimulus x.

Because there is different variability in the response to different stimuli, the scale factor s_y is required. It is assumed that the standard deviation in the y value is proportional to s_y.

An uncertain real number can be used for x, in which case the associated uncertainty is also propagated into y.

Note

When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).

class LineFitWLS(a, b, ssr, N)

Class to hold the results of a weighted least-squares regression. The weight factors provided are assumed to correspond exactly to the variability of observations.

New in version 1.2.

N: The number of points in the sample

a_b: Return the intercept a and slope b as a tuple of uncertain numbers

intercept: Return the intercept as an uncertain number.

slope: Return the slope as an uncertain number.

ssr

Sum of the squared residuals

The sum of the squared deviations between values predicted by the model and the actual data.

If weights are used during the fit, the squares of weighted deviations are summed.

x_from_y(yseq, u_yseq, x_label=None, y_label=None)

Estimate the stimulus x corresponding to the responses in yseq.

Parameters:
    yseq – a sequence of further observations of `y`
    u_yseq – the standard uncertainty of the `yseq` data
    x_label – a label for the return uncertain number x
    y_label – a label for the estimate of y based on `yseq`

The variations in yseq values are assumed to result from independent random effects.

Note

When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).

y_from_x(x, s_y, s_label=None, y_label=None)

Return an uncertain number y that predicts the response to x.

Parameters:
    x – a real number, or an uncertain real number
    s_y – response variability uncertainty
    s_label – a label for an elementary uncertain number associated with response variability
    y_label – a label for the return uncertain number y

Returns a single future response y predicted for a stimulus x.

The standard uncertainty s_y is used to create an additive component of uncertainty associated with variability in the y value.

An uncertain real number can be used for x, in which case the associated uncertainty is also propagated into y.

Note

When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).

class LineFitWTLS(a, b, ssr, N)

This object holds results from a TLS linear regression to data.

New in version 1.2.

N: The number of points in the sample

a_b: Return the intercept a and slope b as a tuple of uncertain numbers

intercept: Return the intercept as an uncertain number.

slope: Return the slope as an uncertain number.

ssr

Sum of the squared residuals

The sum of the squared deviations between values predicted by the model and the actual data.

If weights are used during the fit, the squares of weighted deviations are summed.

merge(a, b)

Combine the uncertainty components of a and b.

Parameters:
    a – an uncertain real or complex number
    b – an uncertain real or complex number
Returns:	an uncertain number that combines the uncertainty components of `a` and `b`

The values of a and b must be equal and the components of uncertainty associated with a and b must be distinct, otherwise a RuntimeError will be raised.

Use this function to combine results from type-A and type-B uncertainty analyses performed on a common sequence of data.

Note

Some judgement will be required as to when it is appropriate to merge uncertainty components.

There is a risk of ‘double-counting’ uncertainty if type-B components are contributing to the variability observed in the data, and therefore assessed in a type-A analysis.

Example:

# From Appendix H3 in the GUM

# Thermometer readings (degrees C)
t = (21.521,22.012,22.512,23.003,23.507,
    23.999,24.513,25.002,25.503,26.010,26.511)

# Observed differences with calibration standard (degrees C)
b = (-0.171,-0.169,-0.166,-0.159,-0.164,
    -0.165,-0.156,-0.157,-0.159,-0.161,-0.160)

# Arbitrary offset temperature (degrees C)
t_0 = 20.0

# Calculate the temperature relative to t_0
t_rel = [ t_k - t_0 for t_k in t ]

# A common systematic error in all differences
e_sys = ureal(0,0.01)

b_type_b = [ b_k + e_sys for b_k in b ]

# Type-A least-squares regression
y_1_a, y_2_a = type_a.line_fit(t_rel,b_type_b).a_b

# Type-B least-squares regression
y_1_b, y_2_b = type_b.line_fit(t_rel,b_type_b)

# `y_1` and `y_2` have uncertainty components
# related to the type-A analysis as well as the
# type-B systematic error
y_1 = type_a.merge(y_1_a,y_1_b)
y_2 = type_a.merge(y_2_a,y_2_b)