Validity and reliability of test results
Factors affecting test results
A number of variables impact on the validity of laboratory results. Preanalytical
variables include physiological factors, patient preparation, specimen collection
and transport. Analytical variables include the precision and accuracy of the
test method and factors which may interfere with a particular assay, eg lipaemia,
in vitro haemolysis and medications. Postanalytical variables include data entry
and calculations by laboratory staff, result validation, interpretation of the result,
data transfer and the method used to report the results (electronic, paper or telephone).
Pathology laboratories control the impact of these variables, as far as possible,
by two processes:
Quality control
On each occasion that patient samples are tested, the laboratory also tests “standards”,
with known concentrations of analytes or cells, or known reactivity in the test system.
Although these standards are developed to be comparable to the test samples, they
are commercial products and have many unavoidable limitations eg the presentation
of a specific analyte in aqueous solution for comparison to protein-rich plasma samples
and the use of preserved cells for cell counts. This practice does, however assist
to achieve consistency in test results.
Quality assurance
Laboratories also participate in external quality assurance programs, in which the
results from each laboratory are compared to the results obtained by a group of laboratories.
National programs are in place in both Australia and New Zealand, and this process
is a major factor in achieving reasonable consistency between all laboratories and
performance levels close to the best obtainable. It has the added advantage that
it encourages uniformity of the units in which results are reported eg all Australasian
laboratories now report the prothrombin time, for patients on oral anticoagulants,
as an International Normalised Ratio (INR), avoiding the earlier confusion when such
results were variously reported as a prothrombin ratio or index. Quality assurance
also requires laboratories to attend to preanalytical and postanalytical variables,
although many of these factors may not be under their direct control.
Quality control and quality assurance help maintain both the accuracy and the consistency
of laboratory results but absolute accuracy is not technically possible. Variables
cannot be entirely avoided and the interpretation of any result must take these factors
into consideration.
Selection of the test(s)
Discretionary testing is the selection of a single test, or a small number of tests,
on the basis of the clinical findings. Profile testing is the ordering of tests as
“screening tests” and may involve a “battery” of tests. In a normal individual, the
greater the number of tests performed, the greater is the chance of finding an abnormal
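result. For example, if a 20-test profile is performed on a completely normal individual,
and each test independently has a 5% chance of falling outside its reference interval,
there is a 64% chance (1 − 0.95^20) that at least one test will be “abnormal”. These “abnormal”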
tests are usually of no clinical significance, but lead to unnecessary follow-up
testing and considerable patient anxiety.
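The 64% figure can be checked with a few lines of arithmetic. The sketch below assumes that each test's reference interval covers 95% of healthy people and that the tests are statistically independent. Both are simplifications, since many analytes are correlated. The function name is illustrative only.

```python
def chance_of_any_abnormal(n_tests, coverage=0.95):
    """Probability that a healthy person has at least one result outside
    the reference interval, assuming independent tests (an idealisation)."""
    return 1 - coverage ** n_tests


# Show how the chance of a spurious "abnormal" grows with profile size.
for n in (1, 5, 10, 20):
    print(f"{n:2d} tests: {chance_of_any_abnormal(n):.0%}")
```

Under these assumptions the 20-test line reproduces the 64% quoted above.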
The specimen
The laboratory result is dependent on the quality of the specimen which it receives.
An inadequate biopsy or a poor cervical smear results in an incomplete and possibly
inaccurate opinion; with an improperly collected urine specimen, a pathogen may be
masked by contamination with commensals; an incorrect volume of blood in a citrate
tube will give incorrect coagulation test results; refrigeration of whole blood
before testing can give a falsely elevated potassium level.
The request
Provision of appropriate clinical information is essential if the pathology laboratory
is to assess the results and their likely significance. Any difficulty in obtaining
the specimen should also be noted on the request form, as this may affect the test
result.
Quantitative test results and the reference interval
Reference intervals represent the test results which would be obtained in the normal
population and are based on the results obtained (mean ± 2 standard deviations)
on a series of normal “healthy” individuals. The levels quoted thus lie between the
2.5 and 97.5 centiles for the group from which they were derived. Age, gender, race
and test methodology are important variables, so the reference intervals quoted in
the literature may not be generally applicable. Wherever possible, laboratories establish
their own reference intervals, but this is not always feasible.
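As a rough illustration of the mean ± 2 standard deviations calculation, the sketch below derives an interval from a small set of values. The numbers are invented purely for illustration and are not real patient data. The function name is hypothetical, and the approach assumes the results are approximately normally distributed, which does not hold for every analyte.

```python
import statistics


def reference_interval(values, k=2.0):
    """Return (lower, upper) limits as mean -/+ k sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean - k * sd, mean + k * sd


# Invented results from ten "healthy" subjects (illustrative only).
healthy = [4.1, 4.4, 3.9, 4.2, 4.0, 4.3, 4.5, 3.8, 4.2, 4.1]
low, high = reference_interval(healthy)
print(f"reference interval: {low:.2f} to {high:.2f}")
```

In practice a far larger series of subjects is needed before the limits can be trusted.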
As the reference interval represents the 2.5 to 97.5 centiles, inevitably 5% of entirely
normal people will have test results outside the reference interval. Minor variations
should thus be interpreted with caution. The ability of a test to discriminate between
normal and abnormal individuals is described by its sensitivity and specificity.
The sensitivity and specificity of the test
Test sensitivity is defined as the percentage of people with a specific disease who
have an abnormal test result. Test specificity is defined as the percentage of people
without the disease who have a normal result. In the general population, specificity
should, ideally, equal 97.5%; thus 2.5% of unaffected (normal) individuals will have
a positive result and these can be considered as “false positives”.
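These two definitions translate directly into ratios of counts. The sketch below uses hypothetical counts, chosen only so that the specificity comes out at the 97.5% mentioned above. The function and variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people with the disease who have an abnormal result."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of people without the disease who have a normal result."""
    return true_neg / (true_neg + false_pos)


# Hypothetical study: 100 people with the disease, 1000 without it.
sens = sensitivity(true_pos=95, false_neg=5)
spec = specificity(true_neg=975, false_pos=25)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```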
Reliable data on sensitivity and specificity for particular tests have been difficult
to find in the literature, in part because it is often difficult to define a true
“reference population” and to generalise results beyond a particular study.
The predictive value of the test
The proportion of true positives to the total number of positives in the population
is the positive predictive value of the test; this represents the diagnostic value
of a positive result for the test. The relationship between the positive predictive
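value and the prevalence of the disease in the community is:
\[
\text{positive predictive value} = \frac{\text{sensitivity} \times \text{prevalence}}{\text{sensitivity} \times \text{prevalence} + (1 - \text{specificity}) \times (1 - \text{prevalence})}
\]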
Thus, the higher the disease prevalence, the greater the probability that a positive
result will be “correct”. For example, if a test with a sensitivity and specificity
of 95% is used to detect a disease with a prevalence of 1%, the positive predictive value
of the test is 16%: there is only about a 1 in 6 chance that a positive result is correct.
The predictive value of a test can be improved by selecting a population in which
the disease has a higher prevalence eg hepatitis C virus testing in “at risk”
groups. Thus a major factor in improving the positive predictive value of a test
is to limit the use of the test to those patients who, on clinical assessment, are
likely to have the disease in question ie to practise discretionary testing.
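The effect of prevalence can be made concrete with a short calculation based on Bayes' theorem: the positive predictive value is the number of true positives divided by all positives. The sketch below uses the 95% sensitivity and specificity of the worked example. The 10% and 50% prevalences are hypothetical values standing in for "at risk" or clinically selected groups, and the function name is illustrative.

```python
def positive_predictive_value(sens, spec, prevalence):
    """Probability that a person with a positive result has the disease."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test, applied to populations with increasing disease prevalence.
for prev in (0.01, 0.10, 0.50):
    ppv = positive_predictive_value(0.95, 0.95, prev)
    print(f"prevalence {prev:.0%}: positive predictive value {ppv:.0%}")
```

At 1% prevalence this gives the 16% quoted above; under the hypothetical higher prevalences the same test becomes far more informative.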
The negative predictive value of this test, with a sensitivity and specificity of
95%, used for a disease with a 1% prevalence, is over 99%, ie there is less than a 1% chance
that a negative result is false. Thus the test, applied to the general population, can
efficiently exclude the diagnosis but is extremely inefficient in confirming the
diagnosis. As previously noted, its efficiency as a diagnostic test can be markedly
increased by using it in a discretionary fashion, in high risk groups or in patients
with clinical features suggesting the disease in question; this improved diagnostic
efficiency does not significantly reduce its negative predictive value. It should
be noted that few tests achieve the specificity and sensitivity of the test used
in this example.
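The negative predictive value can be computed in the same way, as true negatives divided by all negatives. The sketch below again assumes 95% sensitivity and specificity. The 10% prevalence is a hypothetical figure for a higher-risk group, and the function name is illustrative.

```python
def negative_predictive_value(sens, spec, prevalence):
    """Probability that a person with a negative result is free of the disease."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


# General population (1%) versus a hypothetical higher-risk group (10%).
for prev in (0.01, 0.10):
    npv = negative_predictive_value(0.95, 0.95, prev)
    print(f"prevalence {prev:.0%}: negative predictive value {npv:.1%}")
```

Under these assumptions the value stays above 99% at both prevalences, consistent with the point that discretionary use costs little in negative predictive value.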
Pathology tests guide clinical decision-making and the clinician should have some
understanding of the factors which influence the reliability of a test for such decisions
to be valid. The clinician has an important part to play in the avoidance, or control,
of many of the preanalytical variables. The clinician also needs to have an understanding
of the sensitivity and specificity of tests and of their positive predictive value.
Profile testing, or “screening”, is an expensive process which, even with tests of
high sensitivity and specificity, has a limited positive predictive value. Highly
accurate test results may be entirely meaningless or misleading when used in this
fashion. False positive results lead to unnecessary and expensive follow-up testing
and patient anxiety; false negative results place the patient at risk. Diagnostic
tests must supplement, rather than be used as a substitute for, clinical skills;
careful clinical assessment followed by discretionary testing is cost-effective,
efficient and leads to improved patient outcomes.