Pappillon: The Blood Passport

"Following repeated requests for information Dr Michael Ashenden has written an easy-to-follow article that describes the concepts, haematology and statistics behind the passport concept. Dr Ashenden is currently a member of the UCI's Blood Passport expert panel responsible for interpreting cyclist's blood results on behalf of the UCI. [What follows are his words entirely.]

Background

The red fluid in our veins we generically refer to as ‘blood’ is a complex mixture of water, red blood cells, white blood cells, platelets, various proteins, hormones and nutrients. In terms of sport and blood doping, we are only interested in the red blood cells. Contained within the red blood cells is a protein called haemoglobin, which has the unique capacity to bind (and release) oxygen molecules. The red cells pick up oxygen as they pass through the lungs, and carry this oxygen through the circulation until it is offloaded – in this case oxygen is delivered to the exercising muscle.

In the lead up to the 1968 Mexico City Olympic Games (held at an altitude of 2300 m), exercise physiologists became aware that competing at altitude generally hindered endurance performance, since less oxygen was available to be delivered to exercising muscle. Shortly thereafter, it was correctly reasoned that increasing the number of red blood cells in circulation would have the opposite effect – namely to carry more oxygen to the exercising muscle and thereby enhance athletic performance. Since that time endurance sports such as distance running, cycling and cross country skiing have been stained by what has since become known as ‘blood doping’ (artificially increasing the number of red blood cells in circulation).

How do athletes blood dope?

During the 1970-80s, the only available means to blood dope was via transfusions. Although it might seem strange to us in today’s hyper-critical climate, initially blood doping was considered somewhat dubious although not banned outright in sport. In fact at the 1984 Los Angeles Olympic Games the US cycling team used systematic blood transfusions – either reinfusing their own blood (‘autologous’ transfusion) or the blood of relatives/friends (‘homologous’ blood transfusion) in a deliberate attempt to improve their performance. However the subsequent medical outcry after this practice was publicised in the local media led to the International Olympic Committee banning blood transfusions, even though there was no method available to detect this blood doping. Decades later a test was finally introduced to detect when athletes had used homologous transfusions (ironically it was an American cyclist, Tyler Hamilton, who was the first to be sanctioned for transfusion, exactly 20 years after the Los Angeles Olympics). Although the introduction of this test closed one transfusion door (homologous), the alternate door (autologous) remains wide open – as of today no test exists to detect when an athlete has used autologous blood transfusion. As illustrated by the Operacion Puerto affair, this practice is widespread in elite sport.

During the early 1990s yet another door to blood dope was opened for athletes – ‘EPO’ (or more correctly, recombinant human erythropoietin). EPO is a naturally occurring hormone produced mainly in the kidney. The hormone circulates in blood and targets the bone marrow, where it stimulates the production of red blood cells. In 1985 scientists successfully cloned the human gene that produces EPO, and four years later the American pharmaceutical company Amgen released their own recombinant product onto the market. The bone marrow behaves like a blind robot and releases extra red blood cells under the influence of EPO, regardless of whether the hormone originated from the kidney or a syringe. Athletes quickly realised that EPO injections were a quicker, neater and more convenient means to blood dope than either homologous or autologous transfusions. Sadly for those of us with a passion for true sporting contest, in the 1990s EPO tipped the sporting world upside down so that cynical doctors and drug gurus, rather than talent and training, came to dominate results.

How does sport seek to detect athletes who blood dope?

The first step taken by federations to counter the farcical performances generated by EPO-doped athletes was to collect blood samples and measure the red cell concentration possessed by the athlete. The rationale was that the concentration of red cells in athletes who had used EPO would be elevated beyond normal levels. Some federations such as cross country skiing (FIS), cycling (UCI) and biathlon (IBU) introduced upper limits whereby athletes would be prevented from competing if their blood concentrations were too high. The most well-known example of this strategy is the “50% haematocrit rule” that prevented cyclists from competing if the concentration of red blood cells exceeded 50%.

At the 2000 Sydney Olympic Games two new strategies to detect athletes using EPO were introduced. One was a urine-based test that was able to discriminate the naturally occurring EPO produced in the kidney from the synthetic EPO delivered via syringe. A decade later this test remains unsurpassed as the cornerstone of antidoping strategies. The second test introduced at Sydney was a blood-based test. Instead of relying on just haematocrit levels, the test also incorporated several other blood variables in an effort to minimise the risk of falsely accusing an athlete of EPO doping in circumstances where genetic, natural or permitted interventions led to increased concentrations of red blood cells.

Since 2000, the sophistication of blood testing has evolved to the point where federations are now conducting routine blood tests, entering these results into a database, and using these historical test results as a benchmark to evaluate the likelihood that an athlete has blood doped. This approach has become known as the ‘Blood Passport’.

What is the Blood Passport?

Although the bone marrow behaves like a blind robot to generate red blood cells when stimulated by EPO, the body has a negative feedback loop in place to protect against the concentration of red blood cells reaching excessive levels in the bloodstream. Sensors in the kidney detect when red cell levels are too high and suppress the production of EPO. Reduced EPO levels in the bloodstream lead to the bone marrow producing fewer red blood cells, thereby keeping the system in balance. All healthy individuals possess the same negative feedback mechanism; however, the level of red blood cells required to trigger the feedback loop differs between individuals. Consequently one person might have their feedback mechanism triggered when the concentration of red blood cells is 40%, whereas another person may have a different ‘set point’ of 47%. Rather than impose a single absolute threshold, say 50%, the Blood Passport seeks to establish an individual threshold that reflects each athlete’s natural set point level. These individually-tailored thresholds then replace the upper limit of 50%.

How does the Blood Passport work?

In theory the Passport is very simple. We must establish each athlete’s true value, and decide what tolerance around this true value we will allow for situations such as dehydration, exercise or exposure to altitude that we know can temporarily change the blood result.

As an example, one very simplistic approach would be to collect a single blood test from the athlete, and allow them to vary within 15 points (higher or lower) of whatever haematocrit we measured during the first blood test. Unfortunately this would do very little to deter athletes from blood doping, since it would allow athletes to increase their haematocrit from 40% up to 55% without penalty. The unfortunate athlete found to have a haematocrit of 56% might also argue that their baseline value of 40% was incorrect and should instead have been 42% (and consequently, they could argue that their second reading of 56% would still remain within the permitted 15-point tolerance).
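The fixed-tolerance rule described above can be sketched in a few lines of Python. The function name `within_tolerance` and its parameters are illustrative assumptions, not part of any federation's software; the 15-point tolerance and the 40%, 42% and 56% values are the ones used in the paragraph above.

```python
def within_tolerance(baseline: float, reading: float, tolerance: float = 15.0) -> bool:
    """Return True if reading lies within +/- tolerance points of baseline."""
    return abs(reading - baseline) <= tolerance


# Haematocrit values in percent, taken from the example in the text.
flagged_vs_40 = not within_tolerance(40, 56)  # 16 points above a 40% baseline
flagged_vs_42 = not within_tolerance(42, 56)  # 14 points above a 42% baseline

print("56% against a 40% baseline is flagged:", flagged_vs_40)
print("56% against a 42% baseline is flagged:", flagged_vs_42)
```

The same reading is flagged or cleared depending on which baseline is accepted, which is why a single-sample baseline is so easy to contest.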

The hypothetical “15 point” rule is a test we would describe as being highly ‘specific’ – it would be exceedingly rare to find an athlete who varied by 15 points who had not doped! But the test would be deemed to have low ‘sensitivity’ – it would not detect any of the athletes who doped but whose value remained within 15 points of their baseline value. Sliding the limit to within (say) 5 points of their baseline would increase sensitivity but decrease specificity (since most athletes who doped would be caught, but some athletes would naturally vary by 5 points without doping and would therefore be a ‘false positive’). Researchers and statisticians have thus worked very hard to develop a Blood Passport approach that has high sensitivity as well as high specificity.
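The trade-off between sensitivity and specificity can be made concrete with a small sketch. Everything below is an assumption for illustration: the function name, the data layout, and above all the toy numbers, which are invented and are not measurements from any study.

```python
def sensitivity_specificity(changes, doped, limit):
    """Score a rule that flags an athlete when |change| exceeds limit.

    changes: change from baseline for each athlete (haematocrit points)
    doped:   parallel list of booleans (True = athlete really doped)
    Returns (sensitivity, specificity).
    """
    flagged = [abs(c) > limit for c in changes]
    true_pos = sum(f and d for f, d in zip(flagged, doped))
    true_neg = sum((not f) and (not d) for f, d in zip(flagged, doped))
    n_doped = sum(doped)
    n_clean = len(doped) - n_doped
    return true_pos / n_doped, true_neg / n_clean


# Invented toy data for illustration only (five clean, five doped athletes).
changes = [2, 3, 6, 1, 4, 8, 12, 16, 7, 9]
doped = [False, False, False, False, False, True, True, True, True, True]

for limit in (15, 5):
    sens, spec = sensitivity_specificity(changes, doped, limit)
    print(f"limit={limit}: sensitivity={sens:.1f}, specificity={spec:.1f}")
```

With these made-up values the 15-point limit never flags a clean athlete but misses most of the doped ones, while the 5-point limit catches every doped athlete at the cost of one false positive.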

One strategy to increase sensitivity has been to abandon the inclusion of haematocrit in blood tests. Studies have shown that it is difficult for analysers to measure haematocrit precisely, and this value also changes if the blood sample is stored for more than a few hours since the red blood cells swell and thus distort their apparent prevalence in the bloodstream.

Haematology ‘101’

The preferred variable today in the Blood Passport is ‘haemoglobin concentration’. Each red blood cell contains a set amount of haemoglobin (which is the oxygen carrying protein that also gives blood its characteristic red colour). Even if the red cells swell between the time the sample is collected and analysed on the instrument, the amount of haemoglobin within the cell does not change and can be measured with a very high degree of precision by automated haematology analysers. Since transfusing blood or using EPO will increase the amount of haemoglobin in the bloodstream (each additional red blood cell adds to the amount of haemoglobin in circulation), measuring haemoglobin is analogous to measuring haematocrit but with several additional benefits.

The second variable used in the Blood Passport is the percentage of ‘reticulocytes’ in circulation. These are perhaps the most sensitive indicator of blood doping we have, so we pay close attention to how reticulocyte levels change in the bloodstream. To explain what reticulocytes are, it is necessary to revisit how red cells are produced in the bone marrow.

Circulating red blood cells are the only cells in the body that do not have a nucleus (the region of the cell that contains DNA and releases RNA into the cell cytoplasm). However this is not always the case – when red cells are first generated in the bone marrow they do contain a nucleus, but the nucleus is ‘extruded’ from the cell shortly before it is released from the bone marrow into circulation. For the first day or so in circulation, the red cell contains some remnant RNA left over from the nucleic activity present when the cell was maturing within the bone marrow. In order to detect these newly released red cells, a stain can be added to the blood sample that attaches only to RNA. Any cell that contains some of this RNA-bound stain is designated to be a ‘reticulocyte’, and sufficient RNA persists in these reticulocytes that they can be detected for about a day after they are first released from the bone marrow. Once all of the remnant RNA has vanished from the cell and can no longer be detected via the stain, we designate that the cell has progressed from being a reticulocyte into a ‘mature’ red blood cell.

Typically we find that about 1% of all red blood cells in circulation contain remnant RNA, so we regard 1% as the ‘baseline’ value for reticulocytes in healthy individuals. There have been many studies conducted where subjects have been treated with EPO and in those subjects we find that once treatment commences, the reticulocyte values do not change for the first 3-4 days after the first injection. This represents the period of time that the immature red blood cells undergo maturation in the bone marrow (i.e., extruding the nucleus). However a few days later the percentage of reticulocytes in circulation gradually increases up to 2%, 3% or even in some extreme cases 4%. The peak values roughly correspond to the dosage of EPO used – the higher the dosage of EPO injected the greater the subsequent peak in reticulocyte percentage. Unfortunately there is not a ‘one size fits all’ equation that equates EPO use with a specific reticulocyte percentage – athletes who vary their dosage, or use infrequent injections can have values anywhere between 1% and 4% when they are using EPO.

Another scenario where we might find elevated reticulocyte levels is in an athlete who has recently donated blood intended for storage and subsequent reinfusion. It takes several weeks for the body to replenish the blood that has been withdrawn, and during this period of time the reticulocyte values increase to 2-3% as the bone marrow is stimulated to release additional reticulocytes to replace the red cells that have been withdrawn. Again it is not possible to designate a specific reticulocyte percentage to reflect blood withdrawal – reticulocyte percentages will be higher if larger volumes of blood are withdrawn and only slightly elevated if smaller volumes of blood were taken. Additionally the reticulocyte levels gradually return to baseline levels (around 1%) during the several weeks it takes for the body to replenish lost cells – so a value of 2% might reflect a large volume of blood withdrawn several weeks before or a small volume of blood withdrawn just a few days earlier.

The ‘OFF’ score

The most enduring outcome from antidoping research leading up to the 2000 Sydney Olympic Games is surely the now well-known ‘OFF’ model. During controlled studies when subjects were administered EPO, it became clear to researchers that for several weeks after subjects came ‘off’ EPO they had a lower-than-normal percentage of reticulocytes, in tandem with a higher-than-normal haemoglobin concentration. This finding has been widely reproduced in other laboratories, and is interpreted to be the body’s biological response to elevated levels of red blood cells in circulation. The body does not have a mechanism to remove excess red blood cells, therefore the only avenue open to bring elevated blood levels back to the set point is by reducing the baseline rate of red cell production. As the bone marrow suppresses reticulocyte production, fewer cells are released into circulation, and thus the percentage of reticulocytes typically falls below 1%. In some extreme cases the reticulocyte levels fall below 0.2%; however, a more typical post-EPO finding is around 0.4%.

Interestingly, the body demonstrates the same kind of response after additional red blood cells are introduced into circulation via transfusion. However our research shows that post-transfusion the reticulocytes do not fall as dramatically as after EPO usage, and we typically find reticulocyte values in the range of 0.5-0.7% in subjects following blood transfusion.

Although it has been revamped somewhat since 2000, the current OFF model utilised in the Blood Passport combines both haemoglobin concentration and reticulocyte percentage within a single equation to yield the ‘OFF’ score. Specifically, we take the square root of the reticulocyte percentage (for a value of 4%, the square root of 4 is 2), multiply this by 60 and then subtract that value from the haemoglobin concentration in g/L. If an athlete in the midst of an extremely high-dose EPO treatment had a haemoglobin concentration of 180 g/L and a reticulocyte percentage of 4%, their OFF score would be 180 − (60 × 2) = 60. If we tested that same athlete 10 days after they stopped EPO their haemoglobin might be unchanged but reticulocyte levels may have dropped to 0.2%, giving them an OFF score of about 153 (180 − 60 × 0.447).
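The calculation described above is simple enough to write down directly. This is a minimal sketch; the function and parameter names are illustrative assumptions, while the formula and the two worked cases (180 g/L with 4% and with 0.2% reticulocytes) come from the text.

```python
import math


def off_score(haemoglobin_g_per_l: float, reticulocyte_percent: float) -> float:
    """OFF score = haemoglobin (g/L) - 60 * sqrt(reticulocyte %)."""
    return haemoglobin_g_per_l - 60.0 * math.sqrt(reticulocyte_percent)


on_epo = off_score(180, 4.0)     # during heavy EPO treatment
after_epo = off_score(180, 0.2)  # about 10 days after stopping

print(round(on_epo), round(after_epo))
```

Note that the reticulocyte value is entered as a percentage (4.0, not 0.04), which is what makes the square root of 4% come out as 2.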

The first application of the OFF model in sport stipulated that any athlete found with an OFF score in excess of 126 would be precluded from competition (the value varies somewhat depending on which instrument is used to measure reticulocytes, so some federations use a limit of 133). These limits were derived from measures taken on thousands of athletes and volunteer subjects. On average it was found that a healthy person’s OFF score was around 85. Some people were slightly higher, and some people slightly lower. Very few athletes who had not doped (less than 1 in 1000) had an OFF score in excess of 126. Consequently, federations imposed a limit of 126 (or 133), confident that the likelihood of a clean athlete exceeding this score was less than 1 in 1000 but knowing that athletes who had recently ceased using EPO often showed values well in excess of 126 for several weeks afterward.

So where does the Blood Passport fit in?

It did not take athletes long to realise that by decreasing their dosages of EPO, or reducing the volume of blood transfused, they could reduce the changes in haemoglobin and reticulocyte levels and thereby remain below the threshold limit. During the EPO treatment itself (reticulocyte levels elevated) OFF scores would be below 85 and after treatment ceased (reticulocyte percentage suppressed) they would be higher than 85, but at neither point would the absolute level be sufficiently extreme for the federation to take action. Similarly, in the case of blood transfusion, OFF scores would be decreased in the weeks after blood had been withdrawn (elevated reticulocytes) and increased in the weeks after blood had been reinfused (low reticulocytes). But the change in the OFF score over that time period is highly unusual in both scenarios, and searching for these suspicious changes in OFF score (or haemoglobin level) has become the mantra of antidoping researchers worldwide.
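The pattern described above, where every individual OFF score stays under the absolute limit while the profile as a whole swings widely, can be illustrated with a short sketch. The three blood results are invented for illustration and do not come from any real athlete; the limit of 126 and the OFF formula (haemoglobin minus 60 times the square root of the reticulocyte percentage) are the ones given in the text.

```python
import math

OFF_LIMIT = 126  # absolute limit quoted in the text (133 on some instruments)


def off_score(haemoglobin_g_per_l: float, reticulocyte_percent: float) -> float:
    """OFF score = haemoglobin (g/L) - 60 * sqrt(reticulocyte %)."""
    return haemoglobin_g_per_l - 60.0 * math.sqrt(reticulocyte_percent)


# Invented (haemoglobin g/L, reticulocyte %) results, for illustration only:
# a baseline test, a test during low-dose treatment, a test after stopping.
profile = [(145, 1.0), (150, 2.0), (155, 0.5)]

scores = [off_score(hb, ret) for hb, ret in profile]
over_limit = any(score > OFF_LIMIT for score in scores)
swing = max(scores) - min(scores)

print("scores:", [round(s, 1) for s in scores])
print("any score over the limit:", over_limit)
print("largest swing within the profile:", round(swing, 1))
```

An absolute threshold sees nothing wrong with this profile, whereas a method that looks at change within the athlete's own history has something to work with.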

Statistics 101

After evaluating the blood results of thousands of athletes, it became clear to researchers that on average athletes have a haemoglobin concentration close to 145 g/L, a reticulocyte value close to 1%, and an OFF score close to 85. Of course, these values vary somewhat due to natural biological factors, day-to-day circumstances such as dehydration/exercise, as well as the error associated with the instrument measuring these variables. Collectively we regard these factors as variations ‘within the subject’ since they represent the variability that we would find in a subject’s blood values if we tested their blood repeatedly over weeks/months.

As well, it became clear that some athletes have set point haemoglobin (and reticulocyte) values that are slightly above or below the average, so in addition to the ‘within subject’ variation there is also a difference between subjects that is not due to those natural fluctuations but instead represents a ‘permanent’ difference between individuals. We term these differences ‘between subject’ variations since they represent the differences we find between subjects despite allowing for the natural and instrument variations.

One of the first statistical approaches to incorporate the Blood Passport concept has become known as the ‘3G’ approach. After an initial blood sample is tested, the 3G model allows for both ‘within’ and ‘between’ subject variations to set a limit above and below the recorded value. By using an approach analogous to the earlier mentioned 1 in 1000 thresholds, the 3G model establishes how unusual the gap between first and second values is in terms of what would be expected in an athlete who had not doped. It does this by calculating the number of standard deviations (also known as ‘z-scores’) the second value is away from the first. A second value that was exactly the same as the first would have zero standard deviations difference from the first score, whereas a value that differed considerably from the first might be 3 standard deviations away.

The z-scores associated with a blood profile need to be interpreted with some caution, because the units cannot be interpreted in a linear fashion. Changes that are within 2 standard deviations of expected (z-score less than or equal to 2) are commonplace. However as z-scores approach and exceed 3 it becomes highly unlikely that such a change would be found in non-doped athletes. In other words, whether a z-score was 0, 1 or 2 would be immaterial to me, however my attention would certainly be drawn to a sample with a z-score of 3.0 (and a score of 4.0 would be astronomical)!

Because earlier research has established how much the blood values from clean athletes vary between two different tests, we can use the z-score to convey a likelihood that the variations found would have occurred in an athlete who had not doped. Thus a z-score of 3.09 conveys that there is only a 1 in 1000 likelihood of finding such a large difference in an athlete who had not doped.
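The link between a z-score and a likelihood can be checked with Python's standard library. This sketch assumes the variation in clean athletes follows a normal distribution and that only the upper tail is of interest; the function name is an illustrative assumption.

```python
from statistics import NormalDist


def one_sided_tail(z: float) -> float:
    """Probability that a standard normal value exceeds z."""
    return 1.0 - NormalDist().cdf(z)


for z in (1.0, 2.0, 3.09, 4.0):
    p = one_sided_tail(z)
    print(f"z = {z:>4}: about 1 in {1 / p:,.0f}")
```

Running it shows why the units are not linear: each extra standard deviation makes the result far rarer than the previous one did, and a z-score of 3.09 lands at roughly 1 in 1000.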

The 3G model requires a first sample against which to compare the current result. The third sample is compared with the average of the first two samples collected from the athlete. Each additional value is added to the database, and gradually over time the athlete accrues a ‘Passport’ of results belonging to them and against which all of their subsequent results are compared.
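The idea of comparing each new result with the average of the athlete's earlier results can be sketched as follows. This is not the actual 3G formula, which the text does not spell out; it is a simplified stand-in that assumes independent, normally distributed within-subject noise only. The within-subject standard deviation of 5 g/L and the haemoglobin results are invented for illustration.

```python
import math
from statistics import mean


def z_against_history(history, new_value, sd_within):
    """z-score of new_value against the mean of earlier results.

    Assumes independent, normally distributed within-subject noise with
    standard deviation sd_within (a hypothetical parameter here).
    """
    k = len(history)
    if k == 0:
        raise ValueError("need at least one earlier sample")
    # Variance of (new value - mean of k earlier values) is sd^2 * (1 + 1/k).
    return (new_value - mean(history)) / (sd_within * math.sqrt(1 + 1 / k))


SD_WITHIN = 5.0                # g/L, illustrative assumption only
haemoglobin = [146, 144, 160]  # invented results, g/L

passport = []
for value in haemoglobin:
    if passport:
        z = z_against_history(passport, value, SD_WITHIN)
        print(f"{value} g/L vs {len(passport)} earlier sample(s): z = {z:.2f}")
    passport.append(value)
```

As the history grows, the average becomes a steadier reference, so the same deviation produces a larger z-score than it would against a single earlier sample.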

A second model, commonly referred to as the ‘Bayesian approach’, is the model currently utilised by the UCI and also to be adopted by the World Anti-Doping Agency (WADA). In its current guise it utilises a modified Bayesian statistical method which closely resembles the 3G approach, but it has several potential advantages. First, the Bayesian approach does not need a baseline blood value against which to compare the second – instead, it assumes that the athlete belongs to the normal population and attributes the ‘population’ mean value to the athlete. Each time the athlete provides an additional sample the Bayesian model places greater emphasis on the athlete’s own value and less emphasis on the original population value. It generates tolerable limits in the same fashion as the 3G model – enabling the federations to gauge how unusual each blood value is in the context of previous results from that athlete (i.e., based on the same concept of ‘within’ and ‘between’ subject variations described earlier for the 3G approach).
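The shift in emphasis from the population value to the athlete's own values can be illustrated with a textbook normal-normal model. This is a sketch under stated assumptions, not the model used by the UCI or WADA: the function name, the two standard deviations and the blood results are all invented for illustration. Only the population mean of 145 g/L comes from the text.

```python
from statistics import mean


def expected_set_point(samples, pop_mean, sd_between, sd_within):
    """Posterior mean of an athlete's set point under a normal-normal model."""
    n = len(samples)
    if n == 0:
        return pop_mean  # no data yet: fall back on the population value
    # Weight given to the athlete's own average grows with every sample.
    weight = n * sd_between**2 / (n * sd_between**2 + sd_within**2)
    return weight * mean(samples) + (1 - weight) * pop_mean


POP_MEAN = 145.0  # g/L, population average quoted in the text
SD_BETWEEN = 8.0  # g/L, illustrative assumption
SD_WITHIN = 5.0   # g/L, illustrative assumption

results = [156, 158, 155, 157]  # invented haemoglobin results, g/L
for n in range(len(results) + 1):
    estimate = expected_set_point(results[:n], POP_MEAN, SD_BETWEEN, SD_WITHIN)
    print(f"after {n} sample(s): expected set point {estimate:.1f} g/L")
```

With no samples the estimate is simply the population mean; with each additional sample it moves closer to the athlete's own average.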

An added advantage of the Bayesian approach is the capacity to build in allowances for external factors that might influence the blood result – for example whether the athlete had been exposed to altitude, or belonged to a unique ethnic group. Currently there is insufficient background data to confidently make these allowances, but the Bayesian approach is expected to be refined and improved over time when and if this data becomes available.

Aside from setting tolerance limits, the Blood Passport approach also includes a ‘sequence’ step which provides important additional evidence whether a dataset is representative of a non-doped athlete. An athlete who deliberately uses small doses of EPO or transfusion will tend to have values that increase or decrease slightly over time, and this constant but unusual pattern of small change – despite not exceeding thresholds – can be detected by the specialised software used by federations to interpret blood profiles. We always compare back to what we would expect to see in a non-doped athlete, and the sequencing software yields a probability that the repeated small variations apparent in a data set would be found in a clean athlete. In other words, we don’t just look for unusual values that exceed limits, but unusual changes in those values that seem suspicious.

What happens when we find an unusual variation?

The WADA Code stipulates that federations can sanction athletes based on evidence gleaned from the Blood Passport, and the potential to sanction athletes displaying abnormal variations in their blood has been touted as an important application of this tool. However several steps are required before a federation will proceed to a sanction based on Blood Passport data.

In its current format, a blood profile would first need to exceed the 1 in 1000 threshold before it was highlighted as being unusual. This is largely a matter of convenience to ‘sieve’ out only the most unusual profiles from amongst the hundreds/thousands collected by a federation. However various expert panels have reached consensus that if a profile were to exceed this threshold it would be compelling evidence that the profile was abnormal. What this threshold does not reveal, however, is whether the abnormal profile was the result of doping, a medical condition or some other explanation.

To make this determination, the federations must convene a panel of experts who are required to interpret the blood profile to establish whether a non-doping circumstance is present. For example, some pathological conditions give rise to highly unusual blood profiles that may exceed a 1 in 1000 threshold but which can be diagnosed by medical specialists as being due to illness and not doping. By a process of exclusion, when no other explanation can be found for an abnormal profile the panel may deem the unusual result to be evidence that the athlete in question had doped.

In many instances the profile may not exceed the 1 in 1000 threshold, but experts may recognise tell-tale signatures as being characteristic of doping (for example the constant but small changes due to ‘microdosing’ with EPO or transfusion). In those situations, the federation may utilise this information to target test the athlete in question at times where EPO use might be predicted (such as in the week or two preceding a major competition). So in cases where the Blood Passport may not yield sufficient evidence to impose a sanction based on the blood values alone, the utilisation of intelligence gleaned from the Passport may lead to the federation conducting an out-of-competition test at a time when trace amounts of the banned substance are still in circulation – and thus indirectly lead to a sanction.

What’s the future for the Blood Passport?

In the short term, the evidence gleaned from blood profiles will enable federations to allocate their testing resources most heavily on those athletes with the most unusual blood profiles. For example, such information has already led the UCI to re-test samples collected from riders at times when their blood profile – in a historical context – suggested that they may have been using EPO. Although the stored sample had not been tested originally for the presence of EPO, re-testing the sample has found trace amounts of EPO which led to the rider being sanctioned.

One powerful application of the Blood Passport concept may well lie in the combination of results from different tests. As an example, an athlete with consistent, small fluctuations in blood values may be target tested several days after a recent injection of EPO. It could be that neither the urine profile nor the blood profile by itself exceeded its relevant threshold. However by combining the two ‘unusual’ outcomes (from independent tests) the cumulative evidence that an athlete doped might exceed the level of certainty required to impose a sanction. Moreover in the future, it is hoped that ‘forensic’ evidence from varied sources (blood tests, urine tests, hormone profiles, whereabouts information) could be combined in a manner that would be dramatically more sensitive than any single piece of evidence in isolation.
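One way to see how two independent, individually inconclusive results can add up is Fisher's method for combining probabilities. This is a standard statistical technique chosen here purely for illustration; the text does not say how federations would combine evidence, and it is an assumption that each test can be summarised as a single probability. The two input values are invented.

```python
import math


def fisher_combined_two(p1: float, p2: float) -> float:
    """Combine two independent p-values with Fisher's method.

    For two tests the statistic -2*ln(p1*p2) follows a chi-squared
    distribution with 4 degrees of freedom, whose upper tail has the
    closed form exp(-x/2) * (1 + x/2).
    """
    product = p1 * p2
    return product * (1.0 - math.log(product))


THRESHOLD = 0.001                # the 1 in 1000 level used elsewhere in the text
p_blood, p_urine = 0.005, 0.005  # invented values, each short of the threshold

combined = fisher_combined_two(p_blood, p_urine)
print(f"combined probability: {combined:.6f}")
print("below the 1 in 1000 threshold:", combined < THRESHOLD)
```

Neither input reaches the 1 in 1000 level alone, yet the combined figure does, which is the sense in which cumulative evidence can be stronger than its parts.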

However, key researchers who developed the mathematical models underlying the Passport approach agree that perhaps the most powerful application of the Passport lies in the realm of no-start penalties. Although this currently sits outside of the WADA Code (an athlete is deemed to have doped, or not doped, but nothing in between) it is hoped that federations will realise the deterrent effect this would pose and introduce no-start penalties under specific rules of sport.

It is to be expected that as athletes seek to evade detection by maintaining blood values within thresholds, the variations apparent in blood profiles collected by federations will become smaller and less extreme over time. A rule of sport, rather than the WADA Code, could be the key to counter this evolution. Athletes would quickly realise that even ‘modulated’ doping within Passport limits would be counterproductive if it led to them being ruled ineligible to compete at a major competition. The burden of evidence to exclude an athlete from competition is necessarily less than the level of evidence required to impose a two year sanction, and it is envisaged that this ‘interim’ step to counter profiles that were suspicious but remained within the threshold might form a crucial pillar of future Blood Passport strategies."