Evaluating type-A uncertainty
Type-A evaluation of uncertainty involves statistical analysis of data (in contrast to type-B evaluation, which uses some means other than statistical analysis).
The shorter name ta is defined as an alias for type_a and can be used to resolve the names of objects defined in this module.
Sample estimates
- estimate() returns an uncertain number defined from the statistics of a sample of data.
- multi_estimate_real() returns a sequence of related uncertain real numbers defined from the multivariate statistics calculated from a sample of data.
- multi_estimate_complex() returns a sequence of related uncertain complex numbers defined from the multivariate statistics of a sample of data.
- estimate_digitized() returns an uncertain number for the mean of a sample of digitized data.
- mean() returns the mean of a sample of data.
- standard_uncertainty() evaluates the standard uncertainty associated with the sample mean.
- standard_deviation() evaluates the standard deviation of a sample of data.
- variance_covariance_complex() evaluates the variance and covariance associated with the mean real component and mean imaginary component of the data.
Least squares regression
- line_fit() performs an ordinary least-squares straight-line fit to a sample of data.
- line_fit_wls() performs a weighted least-squares straight-line fit to a sample of data. The weights are assumed to be exact.
- line_fit_rwls() performs a weighted least-squares straight-line fit to a sample of data. The weights are only assumed to normalise the variability of observations.
- line_fit_wtls() performs a weighted total least-squares straight-line fit to a sample of data.
Merging uncertain components
- merge() combines results from a type-A and type-B analysis of the same data.
Module contents
-
estimate(seq, label=None)
Return an uncertain number for the mean of the data in seq.
Parameters:
- seq – a sequence of data
- label (str) – a label for the returned uncertain number
Return type: UncertainReal or UncertainComplex
The elements of seq may be real numbers, complex numbers, or uncertain real or complex numbers. Note that only the value of uncertain numbers will be used.
The function returns an UncertainReal when the mean of the data is real, and an UncertainComplex when the mean of the data is complex.
In a type-A evaluation, the sample mean provides an estimate of the quantity of interest. The uncertainty in this estimate is the standard deviation of the sample mean (or the sample covariance of the mean, in the complex case).
Examples:
>>> data = range(15)
>>> type_a.estimate(data)
ureal(7.0,1.1547005383792515,14)
>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> type_a.estimate(data)
ucomplex((1.059187840567141+0.9574410497332932j), u=[0.28881665310241805,0.2655555630050262], r=-0.3137404512459525, df=9)
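The arithmetic behind the real-valued example can be reproduced with the standard library alone. The sketch below is not GTC's implementation, and the helper name sample_estimate is made up for illustration. It assumes the textbook type-A formulas: the sample mean, the sample standard deviation (n - 1 denominator) divided by the square root of n, and n - 1 degrees of freedom.

```python
import math
import statistics


def sample_estimate(seq):
    """Return (mean, standard uncertainty of the mean, degrees of freedom).

    Hypothetical helper, not part of GTC: it mirrors the textbook
    type-A formulas for real-valued data only.
    """
    data = list(seq)
    n = len(data)
    mean = statistics.fmean(data)
    # Sample standard deviation (n - 1 denominator).
    s = statistics.stdev(data)
    # Standard uncertainty of the mean and its degrees of freedom.
    return mean, s / math.sqrt(n), n - 1


x, u, df = sample_estimate(range(15))
print(x, u, df)
```

The three numbers printed correspond to the value, standard uncertainty and degrees of freedom reported by type_a.estimate(range(15)) in the example above.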
-
estimate_digitized(seq, delta, label=None, truncate=False)
Return an uncertain number for the mean of digitized data in seq.
Parameters:
- seq (float, UncertainReal or UncertainComplex) – data
- delta (float) – digitization step size
- label (str) – label for uncertain number returned
- truncate (bool) – if True, truncation, rather than rounding, is assumed
Return type:
A sequence of data that has been formatted with fixed precision can completely conceal a small amount of variability in the original values, or merely obscure that variability.
This function recognises the possible interaction between truncation, or rounding, errors and random errors in the underlying data. It evaluates the mean of the data and the uncertainty in this mean.
Set the argument truncate to True if data have been truncated, instead of rounded.
See reference: R Willink, Metrologia, 44 (2007) 73-81
Examples:
# LSD = 0.0001, data varies between -0.0055 and -0.0057
>>> seq = (-0.0056,-0.0055,-0.0056,-0.0056,-0.0056,
... -0.0057,-0.0057,-0.0056,-0.0056,-0.0057,-0.0057)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.005627272727272...,1.9497827808661...e-05,10)
# LSD = 0.0001, data varies between -0.0056 and -0.0057
>>> seq = (-0.0056,-0.0056,-0.0056,-0.0056,-0.0056,
... -0.0057,-0.0057,-0.0056,-0.0056,-0.0057,-0.0057)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.005636363636363...,1.52120004824377...e-05,10)
# LSD = 0.0001, no spread in data values
>>> seq = (-0.0056,-0.0056,-0.0056,-0.0056,-0.0056,
... -0.0056,-0.0056,-0.0056,-0.0056,-0.0056,-0.0056)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.0056,2.8867513459481...e-05,10)
# LSD = 0.0001, no spread in data values, fewer points
>>> seq = (-0.0056,-0.0056,-0.0056)
>>> type_a.estimate_digitized(seq,0.0001)
ureal(-0.0056,3.2914029430219...e-05,2)
-
multi_estimate_real(seq_of_seq, labels=None)
Return a sequence of uncertain real numbers.
Parameters:
- seq_of_seq – a sequence of sequences of data
- labels – a sequence of str labels
Return type: a sequence of UncertainReal
The sequences in seq_of_seq must all be the same length. Each sequence contains a sample of data associated with a particular quantity. An uncertain number will be created for the quantity from sample statistics. The covariance between the different quantities will also be evaluated.
A sequence of elementary uncertain numbers is returned. These uncertain numbers are considered to be related, allowing degrees-of-freedom calculations to be performed on derived quantities.
Example:
# From Appendix H2 in the GUM
>>> V = [5.007,4.994,5.005,4.990,4.999]
>>> I = [19.663E-3,19.639E-3,19.640E-3,19.685E-3,19.678E-3]
>>> phi = [1.0456,1.0438,1.0468,1.0428,1.0433]
>>> v,i,p = type_a.multi_estimate_real((V,I,phi),labels=('V','I','phi'))
>>> v
ureal(4.999000...,0.0032093613071761...,4, label='V')
>>> i
ureal(0.019661,9.471008394041335...e-06,4, label='I')
>>> p
ureal(1.044460...,0.0007520638270785...,4, label='phi')
>>> r = v/i*cos(p)
>>> r
ureal(127.732169928102...,0.071071407396995...,4.0)
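The multivariate statistics involved can be illustrated in plain Python for two of the GUM H2 quantities. This is a sketch under stated assumptions, not GTC code: the helper names mean and covariance_of_means are invented here, and the covariance of two sample means is taken to be the sample covariance (n - 1 denominator) divided by n.

```python
import math

V = [5.007, 4.994, 5.005, 4.990, 4.999]
I = [19.663e-3, 19.639e-3, 19.640e-3, 19.685e-3, 19.678e-3]


def mean(seq):
    """Arithmetic mean of a sequence of real numbers."""
    return sum(seq) / len(seq)


def covariance_of_means(a, b):
    """Covariance between the sample means of two equal-length samples."""
    n = len(a)
    m_a, m_b = mean(a), mean(b)
    # Sample covariance of the observations (n - 1 denominator) ...
    s_ab = sum((x - m_a) * (y - m_b) for x, y in zip(a, b)) / (n - 1)
    # ... scaled by 1/n to obtain the covariance of the means.
    return s_ab / n


u_v = math.sqrt(covariance_of_means(V, V))
u_i = math.sqrt(covariance_of_means(I, I))
r_vi = covariance_of_means(V, I) / (u_v * u_i)
print(mean(V), u_v, u_i, r_vi)
```

The values of u_v and u_i correspond to the standard uncertainties of v and i in the example above, and r_vi is the correlation coefficient between the two estimates.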
-
multi_estimate_complex(seq_of_seq, labels=None)
Return a sequence of uncertain complex numbers.
Parameters:
- seq_of_seq – a sequence of sequences of data
- labels – a sequence of str labels
Return type: a sequence of UncertainComplex
The sequences in seq_of_seq must all be the same length. Each sequence contains data that is associated with a particular quantity. An uncertain number for that quantity will be created from sample statistics. The covariance between the different quantities will also be evaluated.
A sequence of elementary uncertain complex numbers is returned. These uncertain numbers are considered to be related, allowing degrees-of-freedom calculations to be performed on derived quantities.
Example:
# From Appendix H2 in the GUM
>>> I = [ complex(x) for x in (19.663E-3,19.639E-3,19.640E-3,19.685E-3,19.678E-3) ]
>>> V = [ complex(x) for x in (5.007,4.994,5.005,4.990,4.999)]
>>> P = [ complex(0,p) for p in (1.0456,1.0438,1.0468,1.0428,1.0433) ]
>>> v,i,p = type_a.multi_estimate_complex( (V,I,P) )
>>> get_correlation(v.real,i.real)
-0.355311219817512
>>> z = v/i*exp(p)
>>> z.real
ureal(127.732169928102...,0.071071407396995...,4.0)
>>> get_correlation(z.real,z.imag)
-0.588429784423515...
-
mean(seq, *args, **kwargs)
Return the arithmetic mean of data in seq.
Parameters:
- seq – sequence of data
If seq contains real or uncertain real numbers, a real number is returned.
If seq contains complex or uncertain complex numbers, a complex number is returned.
Example:
>>> data = range(15)
>>> type_a.mean(data)
7.0
-
standard_deviation(seq, mu=None)
Return the sample standard deviation.
Parameters:
- seq – sequence of data
- mu – the arithmetic mean of seq
If seq contains real or uncertain real numbers, the sample standard deviation is returned.
If seq contains complex or uncertain complex numbers, the standard deviation in the real and imaginary components is evaluated, as well as the correlation coefficient between the components. The results are returned in a pair of objects: a StandardDeviation namedtuple and a correlation coefficient.
Only the values of uncertain numbers are used in calculations.
Examples:
>>> data = range(15)
>>> type_a.standard_deviation(data)
4.47213595499958
>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> sd,r = type_a.standard_deviation(data)
>>> sd
StandardDeviation(real=0.913318449990377, imag=0.8397604244242309)
>>> r
-0.31374045124595246
-
standard_uncertainty(seq, mu=None)
Return the standard uncertainty associated with the sample mean.
Parameters:
- seq – sequence of data
- mu – the arithmetic mean of seq
Return type: float or StandardUncertainty
If seq contains real or uncertain real numbers, the standard uncertainty of the sample mean is returned.
If seq contains complex or uncertain complex numbers, the standard uncertainties of the real and imaginary components are evaluated, as well as the sample correlation coefficient. The standard uncertainties are returned in a StandardUncertainty namedtuple, together with the correlation coefficient.
Only the values of uncertain numbers are used in calculations.
Example:
>>> data = range(15)
>>> type_a.standard_uncertainty(data)
1.1547005383792515
>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> u,r = type_a.standard_uncertainty(data)
>>> u
StandardUncertainty(real=0.28881665310241805, imag=0.2655555630050262)
>>> u.real
0.28881665310241805
>>> r
-0.31374045124595246
-
variance_covariance_complex(seq, mu=None)
Return the sample variance-covariance matrix.
Parameters:
- seq – sequence of data
- mu – the arithmetic mean of seq
Returns: a 4-element sequence
If mu is None the mean will be evaluated by mean().
seq may contain numbers or uncertain numbers. Only the values of uncertain numbers are used in calculations.
Variance-covariance matrix elements are returned in a VarianceCovariance namedtuple; they can be accessed using the attributes .rr, .ri, .ir and .ii.
Example:
>>> data = [(0.91518731126816899+1.5213442955575518j),
... (0.96572684493613492-0.18547192979059401j),
... (0.23216598132006649+1.6951311687588568j),
... (2.1642786101267397+2.2024333895672563j),
... (1.1812532664590505+0.59062101107787357j),
... (1.2259264339405165+1.1499373179910186j),
... (-0.99422341300318684+1.7359338393131392j),
... (1.2122867690240853+0.32535154897909946j),
... (2.0122536479379196-0.23283009302603963j),
... (1.6770229536619197+0.77195994890476838j)]
>>> type_a.variance_covariance_complex(data)
VarianceCovariance(rr=0.8341505910928249, ri=-0.24062910264062262, ir=-0.24062910264062262, ii=0.7051975704291644)
>>> v = type_a.variance_covariance_complex(data)
>>> v[0]
0.8341505910928249
>>> v.rr
0.8341505910928249
>>> v.ii
0.7051975704291644
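How the four matrix elements relate to the complex data can be sketched without GTC. The function variance_covariance below is a hypothetical stand-in written for this illustration. It assumes sample variances and covariances of the real and imaginary components with an n - 1 denominator, which is consistent with the numbers in the example above.

```python
import math

data = [
    (0.91518731126816899 + 1.5213442955575518j),
    (0.96572684493613492 - 0.18547192979059401j),
    (0.23216598132006649 + 1.6951311687588568j),
    (2.1642786101267397 + 2.2024333895672563j),
    (1.1812532664590505 + 0.59062101107787357j),
    (1.2259264339405165 + 1.1499373179910186j),
    (-0.99422341300318684 + 1.7359338393131392j),
    (1.2122867690240853 + 0.32535154897909946j),
    (2.0122536479379196 - 0.23283009302603963j),
    (1.6770229536619197 + 0.77195994890476838j),
]


def variance_covariance(seq):
    """Return (rr, ri, ir, ii) for a sample of complex values.

    Hypothetical helper, not GTC's implementation.
    """
    n = len(seq)
    mu = sum(seq) / n
    d_re = [z.real - mu.real for z in seq]
    d_im = [z.imag - mu.imag for z in seq]
    rr = sum(d * d for d in d_re) / (n - 1)
    ii = sum(d * d for d in d_im) / (n - 1)
    ri = sum(p * q for p, q in zip(d_re, d_im)) / (n - 1)
    # The matrix is symmetric, so ir equals ri.
    return rr, ri, ri, ii


rr, ri, ir, ii = variance_covariance(data)
r = ri / math.sqrt(rr * ii)            # correlation between components
u_re = math.sqrt(rr / len(data))       # standard uncertainty, real part
u_im = math.sqrt(ii / len(data))       # standard uncertainty, imaginary part
print(rr, ri, ii, r, u_re, u_im)
```

The same quantities link the functions documented above: the square roots of rr and ii are the component standard deviations, dividing the variances by n gives the squared standard uncertainties of the mean, and r is the correlation coefficient.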
-
line_fit(x, y, label=None)
Return a least-squares straight-line fit to the data.
New in version 1.2.
Parameters:
- x – sequence of stimulus data (independent-variable)
- y – sequence of response data (dependent-variable)
- label – suffix to label the uncertain numbers a and b
Returns: an object containing regression results
Return type: LineFitOLS
Performs an ordinary least-squares regression of y on x.
Example:
>>> x = [1,2,3,4,5,6,7,8,9]
>>> y = [15.6,17.5,36.6,43.8,58.2,61.6,64.2,70.4,98.8]
>>> result = type_a.line_fit(x,y)
>>> a,b = result.a_b
>>> a
ureal(4.8138888888888...,4.8862063121833...,7)
>>> b
ureal(9.4083333333333...,0.8683016476563...,7)
>>> y_p = a + b*5.5
>>> dof(y_p)
7.0
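For readers who want to see where the intercept, slope and their uncertainties come from, the following sketch applies the textbook ordinary least-squares formulas to the same data. The function name ols_line is an assumption made for this illustration, and the code is not taken from GTC. The residual variance is estimated with n - 2 degrees of freedom.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [15.6, 17.5, 36.6, 43.8, 58.2, 61.6, 64.2, 70.4, 98.8]


def ols_line(x, y):
    """Return (a, b, u_a, u_b, df) for the model y = a + b*x.

    Hypothetical helper using the closed-form OLS solution.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Sum of squared residuals and residual variance.
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ssr / (n - 2)
    u_a = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / s_xx))
    u_b = math.sqrt(s2 / s_xx)
    return a, b, u_a, u_b, n - 2


a, b, u_a, u_b, df = ols_line(x, y)
print(a, u_a, b, u_b, df)
```

Unlike the uncertain numbers returned by line_fit(), these plain floats do not carry the correlation between a and b, so they cannot be used directly to propagate uncertainty into a prediction such as a + b*5.5.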
-
line_fit_wls(x, y, u_y, label=None)
Return a weighted least-squares straight-line fit.
New in version 1.2.
Parameters:
- x – sequence of stimulus data (independent-variable)
- y – sequence of response data (dependent-variable)
- u_y – sequence of uncertainties in the response data
- label – suffix to label the uncertain numbers a and b
Returns: an object containing regression results
Return type: LineFitWLS
Example:
>>> x = [1,2,3,4,5,6]
>>> y = [3.2, 4.3, 7.6, 8.6, 11.7, 12.8]
>>> u_y = [0.5,0.5,0.5,1.0,1.0,1.0]
>>> fit = type_a.line_fit_wls(x,y,u_y)
>>> fit.a_b
InterceptSlope(a=ureal(0.8852320675105...,0.5297081435088...,inf), b=ureal(2.056962025316...,0.177892016741...,inf))
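When the uncertainties u_y are taken as exact, the fit reduces to the standard weighted least-squares formulas with weights 1/u_y**2. The sketch below, with the invented helper name wls_line, shows that calculation in plain Python. It is an illustration under that assumption rather than GTC's own code.

```python
import math

x = [1, 2, 3, 4, 5, 6]
y = [3.2, 4.3, 7.6, 8.6, 11.7, 12.8]
u_y = [0.5, 0.5, 0.5, 1.0, 1.0, 1.0]


def wls_line(x, y, u_y):
    """Return (a, b, u_a, u_b) for y = a + b*x with exact weights.

    Hypothetical helper: each weight is the reciprocal variance of y.
    """
    w = [1.0 / u ** 2 for u in u_y]
    s = sum(w)
    s_x = sum(wi * xi for wi, xi in zip(w, x))
    s_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s * s_xx - s_x ** 2
    a = (s_xx * s_y - s_x * s_xy) / delta
    b = (s * s_xy - s_x * s_y) / delta
    # With exact weights the uncertainties do not depend on the residuals.
    return a, b, math.sqrt(s_xx / delta), math.sqrt(s / delta)


a, b, u_a, u_b = wls_line(x, y, u_y)
print(a, u_a, b, u_b)
```

Because nothing is estimated from the scatter of the data, the degrees of freedom are infinite, which is what the inf entries in the example output indicate.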
-
line_fit_rwls(x, y, s_y, label=None)
Return a relative weighted least-squares straight-line fit.
New in version 1.2.
The s_y values are used to scale variability in the y data. It is assumed that the standard deviation of each y value is proportional to the corresponding s_y scale factor. The unknown common factor in the uncertainties is estimated from the residuals.
Parameters:
- x – sequence of stimulus data (independent-variable)
- y – sequence of response data (dependent-variable)
- s_y – sequence of scale factors
- label – suffix to label the uncertain numbers a and b
Returns: an object containing regression results
Return type: LineFitRWLS
Example:
>>> x = [1,2,3,4,5,6]
>>> y = [3.014,5.225,7.004,9.061,11.201,12.762]
>>> s_y = [0.2,0.2,0.2,0.4,0.4,0.4]
>>> fit = type_a.line_fit_rwls(x,y,s_y)
>>> a, b = fit.a_b
>>>
>>> print(fit)
Relative Weighted Least-Squares Results:
Intercept: 1.14(12)
Slope: 1.973(41)
Correlation: -0.87
Sum of the squared residuals: 1.3395217958...
Number of points: 6
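The role of the unknown common factor can be made concrete with a short calculation. The sketch below is an assumption-laden illustration, not GTC code: rwls_line is a made-up name, the weights are 1/s_y**2, and the common factor is estimated as the weighted sum of squared residuals divided by n - 2.

```python
import math

x = [1, 2, 3, 4, 5, 6]
y = [3.014, 5.225, 7.004, 9.061, 11.201, 12.762]
s_y = [0.2, 0.2, 0.2, 0.4, 0.4, 0.4]


def rwls_line(x, y, s_y):
    """Return (a, b, u_a, u_b, r, ssr) for a relative weighted fit.

    Hypothetical helper: s_y only fixes the relative weighting.
    """
    n = len(x)
    w = [1.0 / s ** 2 for s in s_y]
    s = sum(w)
    s_x = sum(wi * xi for wi, xi in zip(w, x))
    s_w_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s * s_xx - s_x ** 2
    a = (s_xx * s_w_y - s_x * s_xy) / delta
    b = (s * s_xy - s_x * s_w_y) / delta
    # Weighted sum of squared residuals.
    ssr = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    # Estimate of the squared common factor from the residuals.
    sigma2 = ssr / (n - 2)
    u_a = math.sqrt(sigma2 * s_xx / delta)
    u_b = math.sqrt(sigma2 * s / delta)
    r = -s_x / math.sqrt(s * s_xx)  # correlation between a and b
    return a, b, u_a, u_b, r, ssr


a, b, u_a, u_b, r, ssr = rwls_line(x, y, s_y)
print(a, u_a, b, u_b, r, ssr)
```

The design difference from the exact-weights case is that the uncertainties of a and b are scaled by the factor estimated from the residuals, so multiplying every s_y by the same constant leaves the results unchanged.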
-
line_fit_wtls(x, y, u_x, u_y, a0_b0=None, r_xy=None, label=None)
Return a total least-squares straight-line fit.
New in version 1.2.
Parameters:
- x – sequence of independent-variable data
- y – sequence of dependent-variable data
- u_x – sequence of uncertainties in x
- u_y – sequence of uncertainties in y
- a0_b0 – a pair of initial estimates for the intercept and slope
- r_xy – correlation between x-y pairs
- label – suffix labeling the uncertain numbers a and b
Returns: an object containing the fitting results
Return type: LineFitWTLS
The optional argument a0_b0 can be used to provide a pair of initial estimates for the intercept and slope.
Based on the paper by M Krystek and M Anton, Meas. Sci. Technol. 22 (2011) 035101 (9pp)
Example:
# Pearson-York test data, see, e.g.,
# Lybanon, M. in Am. J. Phys 52 (1) 1984
>>> import math
>>> x=[0.0,0.9,1.8,2.6,3.3,4.4,5.2,6.1,6.5,7.4]
>>> wx=[1000.0,1000.0,500.0,800.0,200.0,80.0,60.0,20.0,1.8,1.0]
>>> y=[5.9,5.4,4.4,4.6,3.5,3.7,2.8,2.8,2.4,1.5]
>>> wy=[1.0,1.8,4.0,8.0,20.0,20.0,70.0,70.0,100.0,500.0]
# standard uncertainties required for weighting
>>> ux=[1./math.sqrt(wx_i) for wx_i in wx ]
>>> uy=[1./math.sqrt(wy_i) for wy_i in wy ]
>>> result = ta.line_fit_wtls(x,y,ux,uy)
>>> intercept, slope = result.a_b
>>> intercept
ureal(5.47991018...,0.29193349...,8)
>>> slope
ureal(-0.48053339...,0.057616740...,8)
-
class LineFitOLS(a, b, ssr, N)
Class to hold the results of an ordinary least-squares regression to data.
It can also be used to apply the results of a regression analysis.
New in version 1.2.
-
N
The number of points in the sample
-
a_b
Return the intercept a and slope b as a tuple of uncertain numbers
-
intercept
Return the intercept as an uncertain number.
-
slope
Return the slope as an uncertain number.
-
ssr
Sum of the squared residuals
The sum of the squared deviations between values predicted by the model and the actual data.
If weights are used during the fit, the squares of weighted deviations are summed.
-
x_from_y(yseq, x_label=None, y_label=None)
Estimate the stimulus x corresponding to the responses in yseq.
Parameters:
- yseq – a sequence of y observations
- x_label – a label for the returned uncertain number x
- y_label – a label for the estimate of y based on yseq
Note
When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
Example:
>>> x_data = [0.1, 0.1, 0.1, 0.3, 0.3, 0.3, 0.5, 0.5, 0.5,
... 0.7, 0.7, 0.7, 0.9, 0.9, 0.9]
>>> y_data = [0.028, 0.029, 0.029, 0.084, 0.083, 0.081, 0.135, 0.131,
... 0.133, 0.180, 0.181, 0.183, 0.215, 0.230, 0.216]
>>> fit = type_a.line_fit(x_data,y_data)
>>> x0 = fit.x_from_y( [0.0712, 0.0716] )
>>> x0
ureal(0.2601659751037...,0.01784461112558...,13.0)
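The inverse prediction in this example can be checked by hand with the classical calibration formula. The sketch below is an assumption, not the library's code: it averages the new responses, inverts the fitted line, and combines the residual variance of the fit with the variability of the new observations (taken to have the same variance as the calibration data).

```python
import math

x_data = [0.1, 0.1, 0.1, 0.3, 0.3, 0.3, 0.5, 0.5, 0.5,
          0.7, 0.7, 0.7, 0.9, 0.9, 0.9]
y_data = [0.028, 0.029, 0.029, 0.084, 0.083, 0.081, 0.135, 0.131,
          0.133, 0.180, 0.181, 0.183, 0.215, 0.230, 0.216]
y_new = [0.0712, 0.0716]

# Ordinary least-squares fit of y = a + b*x.
n = len(x_data)
x_bar = sum(x_data) / n
y_bar = sum(y_data) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x_data)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x_data, y_data))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual variance with n - 2 degrees of freedom.
ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x_data, y_data))
s2 = ssr / (n - 2)

# Invert the line at the mean of the new responses.
p = len(y_new)
y0 = sum(y_new) / p
x0 = (y0 - a) / b

# Classical standard uncertainty of the inverse prediction.
u_x0 = math.sqrt(
    s2 / b ** 2 * (1.0 / p + 1.0 / n + (y0 - y_bar) ** 2 / (b ** 2 * s_xx))
)
print(x0, u_x0, n - 2)
```

The three terms in the bracket account for the scatter of the new observations, the uncertainty of the fitted mean response, and the uncertainty of the slope.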
-
y_from_x(x, s_label=None, y_label=None)
Return an uncertain number y that predicts the response to x.
Parameters:
- x – a real number, or an uncertain real number
- s_label – a label for an elementary uncertain number associated with observation variability
- y_label – a label for the returned uncertain number y
This is a prediction of a single future response y to a stimulus x.
The variability in observations is based on residuals obtained during regression.
An uncertain real number can be used for x, in which case the associated uncertainty will also be propagated into y.
Note
When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
-
-
class LineFitRWLS(a, b, ssr, N)
Class to hold the results of a relative weighted least-squares regression. The weights provided normalise the variability of observations.
New in version 1.2.
-
N
The number of points in the sample
-
a_b
Return the intercept a and slope b as a tuple of uncertain numbers
-
intercept
Return the intercept as an uncertain number.
-
slope
Return the slope as an uncertain number.
-
ssr
Sum of the squared residuals
The sum of the squared deviations between values predicted by the model and the actual data.
If weights are used during the fit, the squares of weighted deviations are summed.
-
x_from_y(yseq, s_y, x_label=None, y_label=None)
Estimate the stimulus x corresponding to the responses in yseq.
Parameters:
- yseq – a sequence of further observations of y
- s_y – a scale factor for the uncertainty of the yseq data
- x_label – a label for the returned uncertain number x
- y_label – a label for the estimate of y based on yseq
Note
When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
-
y_from_x(x, s_y, s_label=None, y_label=None)
Return an uncertain number y that predicts the response to x.
Parameters:
- x – a real number, or an uncertain real number
- s_y – a scale factor for the response uncertainty
- s_label – a label for an elementary uncertain number associated with observation variability
- y_label – a label for the returned uncertain number y
Returns a single future response y predicted for a stimulus x.
Because there is different variability in the response to different stimuli, the scale factor s_y is required. It is assumed that the standard deviation in the y value is proportional to s_y.
An uncertain real number can be used for x, in which case the associated uncertainty is also propagated into y.
Note
When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
-
-
class LineFitWLS(a, b, ssr, N)
Class to hold the results of a weighted least-squares regression. The weight factors provided are assumed to correspond exactly to the variability of observations.
New in version 1.2.
-
N
The number of points in the sample
-
a_b
Return the intercept a and slope b as a tuple of uncertain numbers
-
intercept
Return the intercept as an uncertain number.
-
slope
Return the slope as an uncertain number.
-
ssr
Sum of the squared residuals
The sum of the squared deviations between values predicted by the model and the actual data.
If weights are used during the fit, the squares of weighted deviations are summed.
-
x_from_y(yseq, u_yseq, x_label=None, y_label=None)
Estimate the stimulus x corresponding to the responses in yseq.
Parameters:
- yseq – a sequence of further observations of y
- u_yseq – the standard uncertainty of the yseq data
- x_label – a label for the returned uncertain number x
- y_label – a label for the estimate of y based on yseq
The variations in yseq values are assumed to result from independent random effects.
Note
When x_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
-
y_from_x(x, s_y, s_label=None, y_label=None)
Return an uncertain number y that predicts the response to x.
Parameters:
- x – a real number, or an uncertain real number
- s_y – the standard uncertainty associated with variability in the response
- s_label – a label for an elementary uncertain number associated with response variability
- y_label – a label for the returned uncertain number y
Returns a single future response y predicted for a stimulus x.
The standard uncertainty s_y is used to create an additive component of uncertainty associated with variability in the y value.
An uncertain real number can be used for x, in which case the associated uncertainty is also propagated into y.
Note
When y_label is defined, the uncertain number returned will be declared an intermediate result (using result()).
-
-
class LineFitWTLS(a, b, ssr, N)
Class to hold the results of a weighted total least-squares (WTLS) straight-line regression to data.
New in version 1.2.
-
N
The number of points in the sample
-
a_b
Return the intercept a and slope b as a tuple of uncertain numbers
-
intercept
Return the intercept as an uncertain number.
-
slope
Return the slope as an uncertain number.
-
ssr
Sum of the squared residuals
The sum of the squared deviations between values predicted by the model and the actual data.
If weights are used during the fit, the squares of weighted deviations are summed.
-
-
merge(a, b)
Combine the uncertainty components of a and b.
Parameters:
- a – an uncertain real or complex number
- b – an uncertain real or complex number
Returns: an uncertain number that combines the uncertainty components of a and b
The values of a and b must be equal and the components of uncertainty associated with a and b must be distinct, otherwise a RuntimeError will be raised.
Use this function to combine results from type-A and type-B uncertainty analyses performed on a common sequence of data.
Note
Some judgement will be required as to when it is appropriate to merge uncertainty components.
There is a risk of ‘double-counting’ uncertainty if type-B components are contributing to the variability observed in the data, and therefore assessed in a type-A analysis.
Example:
# From Appendix H3 in the GUM
# Thermometer readings (degrees C)
t = (21.521,22.012,22.512,23.003,23.507,
     23.999,24.513,25.002,25.503,26.010,26.511)
# Observed differences with calibration standard (degrees C)
b = (-0.171,-0.169,-0.166,-0.159,-0.164,
     -0.165,-0.156,-0.157,-0.159,-0.161,-0.160)
# Arbitrary offset temperature (degrees C)
t_0 = 20.0
# Calculate the temperature relative to t_0
t_rel = [ t_k - t_0 for t_k in t ]
# A common systematic error in all differences
e_sys = ureal(0,0.01)
b_type_b = [ b_k + e_sys for b_k in b ]
# Type-A least-squares regression
y_1_a, y_2_a = type_a.line_fit(t_rel,b_type_b).a_b
# Type-B least-squares regression
y_1_b, y_2_b = type_b.line_fit(t_rel,b_type_b)
# `y_1` and `y_2` have uncertainty components
# related to the type-A analysis as well as the
# type-B systematic error
y_1 = type_a.merge(y_1_a,y_1_b)
y_2 = type_a.merge(y_2_a,y_2_b)
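As a rough numerical picture of what merging means for the combined standard uncertainty, the sketch below adds two distinct, independent components in quadrature. It is a simplification written for illustration only: merged_standard_uncertainty is a hypothetical function, independence of the two components is an assumption, and GTC's merge() works with the full set of uncertainty components rather than with two summary numbers.

```python
import math


def merged_standard_uncertainty(value_a, u_a, value_b, u_b):
    """Combine two independent standard uncertainties of one estimate.

    Hypothetical helper, not GTC's merge(): it only mimics the
    documented requirement that both values agree, then applies a
    root-sum-of-squares combination.
    """
    if not math.isclose(value_a, value_b):
        raise RuntimeError("the values of a and b must be equal")
    return value_a, math.hypot(u_a, u_b)


# Illustrative numbers only: a type-A component of 0.003 and a
# type-B component of 0.004 for the same estimate.
value, u = merged_standard_uncertainty(-0.1712, 0.003, -0.1712, 0.004)
print(value, u)
```

The quadrature sum also shows why double-counting matters: if the type-B effect already contributes to the scatter seen in the type-A analysis, adding it again inflates the result.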