A review, published in the BMJ, looks at antibody tests for the SARS-CoV-2 virus.
Prof Sanjeev Krishna, Professor of Molecular Parasitology and Medicine, St George’s, University of London, said:
“This is a detailed and thorough assessment of antibody tests that were developed and published in the first few months of the pandemic. The systematic nature of the analysis acts like a peer review process for many of the papers that have been posted as ‘preprints’, presenting results without having undergone formal peer review.
“The quality of studies examined varies widely, and this has already been analysed and reported in the recently published Cochrane review. Both reviews identify and make clear where standards should be improved in developing antibody tests, and cover a similar period of time. This review is not intended to assess whether immunity passports would be viable.
“Like most diagnostic tests that need to be developed with urgency, the first products may lack robustness through poor evaluation methodologies or poor test performance, but many tests can be improved with time. It is important not to dismiss them from the start, as antibody tests will take their place in the suite of diagnostic tests needed to understand and manage the COVID-19 pandemic, whether to identify those eligible for vaccine or plasma donation studies, or to understand the role of immunity passports. In the end, we will need high quality antibody tests to understand the nature of immunity to COVID-19 infections.”
Dr Alexander Edwards, Associate Professor in Biomedical Technology, Reading School of Pharmacy, University of Reading, said:
“This systematic review is extremely welcome – without careful and painstaking analysis of the flood of publications about antibody tests for COVID-19, we will struggle to make informed policy decisions and make use of new tools. By collating so many studies, we are starting to see a picture emerge, but it’s still a sketch, with important gaps where it’s not yet possible even to guess what the picture shows.
“The nature of a systematic review is that a very formal process selects papers that fit strict rules, and that means they may appear ‘out of date’ – so far, only publications up to 30th April are included, so there will be many more recent studies, some of which might have better design and more patients, samples and timepoints. But to me the most important element of this review is the clear and careful organisation of the evidence. This makes it much easier to understand new studies, as new pieces of the puzzle can now be fitted into the carefully assembled framework.
“Some important points are highlighted about the way that antibody tests are validated or evaluated. For example, the accuracy that any study publishes will depend a lot on the population used. Many studies are incomplete because they select only very specific groups of samples. But it takes time and a lot of resource to collect all the different types of sample needed to fully define the performance of an antibody test. We can expect more comprehensive studies to emerge.
“The biggest unanswered questions are not surprisingly the hardest to answer – how long do antibody responses last, and what do antibody tests tell us about immunity and how long an individual might be protected from re-infection?”
Prof Kevin McConway, Emeritus Professor of Applied Statistics, The Open University, said:
“Given the huge interest in testing for current or past COVID-19 infection, it isn’t really surprising that there have been two major reviews of studies of antibody tests within a week, this new one and the Cochrane review (https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD013652/full) that was published last week. In most respects, the two reviews come to similar conclusions. Despite the considerable number of different studies that they considered (40 studies for this new review, 54 in the Cochrane review), they concluded broadly that most of the studies they reviewed were at high risk of bias, and had other deficiencies that mean their conclusions are not always very reliable – and if these reviews are based on studies that are not always very good science, it must be the case that the conclusions from the reviews cannot be as clear as would be ideal. The researchers who did the reviews certainly recognise this, and one of the two main conclusions from the new review is that “Higher quality clinical studies assessing the diagnostic accuracy of serological tests for covid-19 are urgently needed.” I certainly agree with that. Both this review and the Cochrane review give guidance on what needs to be done better in future studies.
“There has been much discussion and speculation about possible uses of the results of antibody tests to provide “immunity passports”. The idea would be that an antibody test could show that someone has previously been infected, with a further assumption that this would indicate that they are immune from re-infection and could possibly be subject to fewer restrictions in what they do. This possibility has been largely discredited already, for example on the grounds that we still do not know how immune people are, if they have antibodies to the virus, and we do not know how long any immunity might last. But this review (and the Cochrane review) points out clearly that the evidence indicates that some antibody testing methods are definitely not accurate enough to come near proving that a person even has antibodies. An important issue is that the studies behind the reviews mostly did not measure the accuracy of the tests in the context of how they might be used to make individual clinical decisions, at the ‘point of care’. That is, usually the tests were not evaluated in contexts that are similar to how they might be used in practice to make decisions about individual people, so the evaluations are limited in what they can tell us about those practical uses.
“It may seem strange to evaluate a diagnostic test in a context that is different from the way it might be used in practice. But this illustrates the difficulty of assessing, or even defining, test accuracy. There are two ways that an antibody test can give a wrong result on an individual sample. It could declare that the sample contains antibodies when in fact the person was never infected – this is a false positive. Or it could declare that the sample contains no antibodies when in fact the person was previously infected – a false negative. So the test has to be tried out on samples from people who are definitely known to have been infected (which gives what is called the sensitivity of the test – the percentage of samples from infected people that give a positive result). It also has to be tried out on samples from people who were definitely not infected (which gives the specificity – the percentage of samples from uninfected people that give a negative result). It can often be easier to get samples to estimate both these quantities from sources like stored samples of blood serum, rather than samples taken in the same context in which the test would be used.
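(To put those definitions in concrete terms, here is a minimal Python sketch of the two quantities; the evaluation counts are hypothetical and not taken from either review.)

```python
# Sensitivity and specificity as defined above, computed from the four
# possible outcomes of a test evaluation. All counts are hypothetical.

true_positives = 90    # infected samples correctly testing positive
false_negatives = 10   # infected samples wrongly testing negative
true_negatives = 470   # uninfected samples correctly testing negative
false_positives = 30   # uninfected samples wrongly testing positive

# Sensitivity: proportion of samples from infected people that test positive.
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: proportion of samples from uninfected people that test negative.
specificity = true_negatives / (true_negatives + false_positives)

print(f"sensitivity = {sensitivity:.1%}")  # 90.0%
print(f"specificity = {specificity:.1%}")  # 94.0%
```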
“Both this review and the Cochrane review considered a wide range of different types of antibody test, including tests that require laboratory work as well as tests that might be made available for easy use at the point of care (or at home, for example), perhaps using a very small sample of whole blood from a finger prick. The second major conclusion from the new review is that currently available evidence “does not support the continued use of existing point-of-care serological tests.” This conclusion is largely based on their finding that the average sensitivity of lateral flow immunoassays (LFIA), the method most commonly used in commercially available point-of-care tests, is rather low, and considerably lower than the sensitivity of other testing methods that are generally used in laboratories. In fact that finding is not entirely robust, statistically, and it remains plausible that the difference between the LFIA tests and others could perhaps, just about, be explainable by random variation – but that again draws attention to a different problem. The possible level of random variation is quite high partly because the different studies that were reviewed gave results that vary a lot from one another, probably because they were done on different populations under different circumstances and with differing procedures and testing kits. This heterogeneity means that the pooling process carried out in the reviews to find ‘average’ results for different test types would, at best, give us a type of average figure for each of the two measures of accuracy (sensitivity and specificity), which might not be relevant to any one specific test. Neither this review nor the Cochrane review had enough data to allow clear specific comparisons between different test brands.
“One difference between this new review and the Cochrane review is that the Cochrane review concluded that the differences in sensitivity between different types of test were small, and that there was not convincing evidence of real differences in accuracy, whereas the new review draws particular attention to the lower performance of LFIA tests. This difference in emphasis will in part be due to different choices by the two review teams about what is important, out of the large and complicated set of different results they found. But it is also due to two differences in what was done. The Cochrane review distinguished between different types of LFIA, and reported mostly separately on the results of the different types, where they knew what type was used. They did find relatively low sensitivity in LFIA tests for which they did not know details of the underlying technology, and those results mostly came from a single study that compared a lot of different tests but did not report which test was which, because (it seems) of confidentiality agreements with the test manufacturers. The new review did not make this distinction between different LFIA types. Also, perhaps crucially, the new review included one particular new study, by Whitman et al., that also compared several different commercial LFIA testing kits, this time identifying the manufacturers. This study could not be included in the Cochrane review, because it appeared (as a preprint) two days after the cut-off date for finding studies for the Cochrane review. (I expect it will be included when the Cochrane review is revised – the original review points out that the revision is already under way.) The Whitman study increases the amount of evidence about LFIA tests, and makes it clearer that their sensitivity is on average worse than that of the lab tests (ELISA and CLIA), particularly for commercial tests.
“Both this review and the Cochrane review make the important point that some lack of accuracy in tests is less important when they are being used to measure the overall patterns of antibody levels in a population of people rather than providing specific results about individuals. If the sensitivity and specificity of a test are known, they can be used to adjust the results for the population. But that’s not really possible when dealing with individuals.
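(As an illustration of that population-level adjustment: one standard approach is the Rogan-Gladen correction, sketched below in Python. Neither review is quoted here as specifying this exact method, and the survey figures in the example are hypothetical.)

```python
# A sketch of adjusting an observed positive rate for known test error,
# using the standard Rogan-Gladen correction. Illustrative only; the
# reviews are not quoted as prescribing this particular estimator.

def adjusted_prevalence(apparent_prevalence: float,
                        sensitivity: float,
                        specificity: float) -> float:
    """Estimate true prevalence from the raw positive rate of an imperfect test."""
    return (apparent_prevalence + specificity - 1) / (sensitivity + specificity - 1)

# Hypothetical example: a survey finds 8% of samples positive using a test
# with 90% sensitivity and 95% specificity.
print(f"{adjusted_prevalence(0.08, 0.90, 0.95):.1%}")  # about 3.5%
```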
“One very useful piece of information provided in the new review (and there were similar calculations in the Cochrane review too) is that it shows what the test results might look like in populations where the overall prevalence of antibodies is at various different levels. The most recent data on antibodies from the Office for National Statistics infection survey found that 5.4% of individuals had antibodies to COVID-19, in a sample of people from the population of England. So the figures for 5% prevalence of (past) infection (in Table 7 of the new review paper) would best match what one might find with a person from England being tested for antibodies, if nothing more is known about whether they had been infected. With an ‘average’ LFIA test, out of 1000 such people tested, 33 would be correctly classified as infected (true positives), 17 would be incorrectly classified as uninfected (false negatives), 918 would be correctly classified as uninfected (true negatives), and 32 would be incorrectly classified as infected (false positives). But if all that is known about someone is the result of their antibody test, what do these numbers imply? If their test is positive, they might be a true positive, or they might be a false positive, but nobody knows which. Out of 1000 people there are 33 true positives and 32 false positives, so someone with a positive test result is pretty well equally likely to have really been infected or to have really been uninfected. That’s another important reason why using the test result to give this person an ‘immunity passport’ would be dangerous. This issue arises not so much because the test is not accurate, but more because there are far more uninfected people in the population than previously infected people, so even a small percentage of wrong results on uninfected people shows up as rather a lot of false positives. In fact, for the most accurate test type in this review, CLIA tests (a type of laboratory test), the issue still arises. According to the review, these tests have almost 98% sensitivity and also almost 98% specificity. That is, if we could know that the person had definitely been infected, they would have a 98% chance of getting a positive test result, and if we could know that the person had definitely not been infected, there would again be a 98% chance of getting a negative test result. That sounds (and is) pretty good, as these things go. But if we don’t know whether or not someone has been infected, and we only know their test result, the low percentage of people in the population who have been infected is still against us. With 5% prevalence, out of 1000 people tested, there would be 49 true positives and 21 false positives, so someone with a positive test would still have about a 1 in 3 chance of not actually having been infected.”
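(The arithmetic in that worked example can be checked directly. In the Python sketch below, the sensitivity and specificity values are approximations back-calculated from the counts quoted above, not figures taken verbatim from Table 7 of the review.)

```python
# Checking the worked example above: expected outcomes per 1,000 people
# tested at 5% prevalence. Sensitivity and specificity are approximate,
# back-calculated from the counts quoted in the text.

def outcomes(n: float, prevalence: float, sensitivity: float, specificity: float):
    infected = n * prevalence
    uninfected = n - infected
    tp = infected * sensitivity      # true positives
    fn = infected - tp               # false negatives
    tn = uninfected * specificity    # true negatives
    fp = uninfected - tn             # false positives
    ppv = tp / (tp + fp)             # chance that a positive result is genuine
    return round(tp), round(fn), round(tn), round(fp), round(ppv, 2)

# 'Average' LFIA: roughly 66% sensitivity, 96.6% specificity
print(outcomes(1000, 0.05, 0.66, 0.966))   # (33, 17, 918, 32, 0.51)

# CLIA: roughly 98% sensitivity, 97.8% specificity
print(outcomes(1000, 0.05, 0.98, 0.978))   # (49, 1, 929, 21, 0.7)
```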
Prof Jon Deeks, Professor of Biostatistics and head of the Biostatistics, Evidence Synthesis and Test Evaluation Research Group, University of Birmingham, said:
“The systematic review published in the BMJ by Bastos and colleagues uses similar methods to our Cochrane review1 of antibody tests published last week (Deeks and colleagues). Both reviews searched for studies available up to the end of April 2020, which were mainly undertaken in China in hospitalised patients. However, Bastos et al included fewer studies and less data (40 studies and 49 test evaluations, compared with 54 studies and 89 test evaluations in our review). Bastos et al reported the same important pattern of increasing sensitivity of the tests over time – which was the headline finding of our Cochrane review. However, the Bastos review has a headline finding that the sensitivity of point of care tests (lateral flow assays) is much lower (66%) than that of the laboratory-based ELISA (84%) or CLIA (98%) tests, which we did not report.
“This result needs to be interpreted with great caution as the studies that used point of care tests were also more likely to be those in which antibody responses were assessed in the first two weeks since onset of symptoms, when all tests are known to perform poorly. For example, they include a study from Italy where the tests were used on presentation in the Emergency Department, when sensitivity was only 18%. We found little difference between point of care and laboratory assays in our analysis as we adjusted for differences in week of testing at the same time as comparing the test technologies. The headline finding from Bastos et al does not account for differences in week and is thus misleading – Bastos et al do report the data by week in a later Table which shows more similar results from the different test technologies, in line with our findings.
“It would be unfortunate for a promising technology to be discarded based on the analytical quirk and erroneous conclusion evident in these data. We need to see studies which compare the accuracy of point of care tests with laboratory tests in the same patient samples before we can draw strong conclusions about the relative accuracy of these test types. It must also be remembered that point of care tests have benefits in terms of accessibility and speed over laboratory tests, which may make them preferable in some settings, even if they do not achieve exactly the same level of accuracy. Their development and evaluation are important and should continue.”
1 Deeks JJ, Dinnes J, Takwoingi Y, Davenport C, Spijker R, Taylor-Phillips S, Adriano A, Beese S, Dretzke J, Ferrante di Ruffano L, Harris IM, Price MJ, Dittrich S, Emperador D, Hooft L, Leeflang MMG, Van den Bruel A. Antibody tests for identification of current and past infection with SARS‐CoV‐2. Cochrane Database of Systematic Reviews 2020, Issue 6. Art. No.: CD013652. DOI: 10.1002/14651858.CD013652. (weblink https://doi.org/10.1002/14651858.CD013652)
‘Diagnostic accuracy of serological tests for covid-19: systematic review and meta-analysis’ by Mayara Lisboa Bastos et al. was published in the BMJ at 23:30 UK time on Wednesday 1 July 2020.
DOI: 10.1136/bmj.m2516
Declared interests
Prof Sanjeev Krishna: “Professor Sanjeev Krishna is funded by Wellcome Trust/DFID to work in a consortium of academic partners to develop serological and antigen based assays with Mologic Ltd.”
Dr Alexander Edwards: “I am director and co-founder of a company that develops antibody testing technology, although we are not offering any COVID-19 antibody test products.”
Prof Kevin McConway: “Prof McConway is a member of the SMC Advisory Committee, but his quote above is in his capacity as a professional statistician.”
Prof Jon Deeks: “My salary is fully funded by the University of Birmingham.
My team is supported by funding from NIHR (National Institute for Health Research) for a programme of work on biomarker and test evaluation through the NIHR Birmingham Biomedical Research Centre and project grants funded by the MRC (Medical Research Council).
I have received funding from the World Health Organisation for supporting development of the WHO Essential Diagnostics List.
I receive funding from the BMJ as their chief statistical advisor.”