Life is much harder than randomized controlled trials

Take Home Message


Data from the STS/ACC TVT registry on a propensity-matched group of low-risk patients with tricuspid and bicuspid aortic valves were presented at EuroPCR 2021. The most important outcome, 1-year mortality, was 6 times higher in the TVT registry than in its randomized controlled trial counterpart (PARTNER 3). Re-hospitalization was 3 times higher than in the trial and even 2 times higher than in the surgical group of PARTNER 3 (11%). These data shed light on the real-world effectiveness of TAVI and on the use of risk prediction scores in real-life scenarios. The FDA approved the use of TAVI in low-risk patients based on data from the PARTNER 3 trial, with the STS score as a framework. If the STS score does not tell us what to expect in real life, should the FDA withdraw its approval until we really know what is happening out there?




At EuroPCR 2021, Dr Raj Makkar (interventional cardiologist at Cedars-Sinai Medical Center in Los Angeles, USA, and a principal investigator of the PARTNER trials) presented “Outcomes of transcatheter aortic valve replacement for bicuspid aortic valve stenosis in the low-surgical risk population”. The presentation triggered significant discussion online in the following days. Here, we take a closer look at why.

The study analysed data extracted from the STS/ACC Transcatheter Valve Therapy (TVT) registry on implants of Edwards’ balloon-expandable Sapien 3 or Sapien Ultra TAVI valves between June 2015 and October 2020. Of 159,661 procedures, 37,660 were in low-risk patients (defined as STS < 3%) and, of those, 3,243 were in patients with bicuspid aortic valves (BAV). Propensity score matching was performed, allowing comparison of 3,168 patients with BAV with matched patients with tricuspid aortic valves (TAV). Thus, the majority of the BAV patients were included in the final cohort (3,168/3,243 = 97.7%). The primary endpoints were 30-day and 1-year mortality.
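The cohort arithmetic above can be checked with a few lines of Python. This is only a sketch: the counts are those quoted in the text, while the variable names and the `percent` helper are our own.

```python
# Counts quoted above from the STS/ACC TVT registry analysis.
total_procedures = 159_661
low_risk = 37_660        # STS score < 3%
low_risk_bav = 3_243     # low-risk patients with a bicuspid aortic valve
matched_bav = 3_168      # BAV patients retained after propensity matching


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


low_risk_share = percent(low_risk, total_procedures)
bav_share = percent(low_risk_bav, low_risk)
matched_share = percent(matched_bav, low_risk_bav)

print(f"Low risk among all procedures: {low_risk_share}%")
print(f"BAV among low-risk patients:   {bav_share}%")
print(f"BAV patients retained:         {matched_share}%")
```

Roughly a quarter of all procedures in the registry were in low-risk patients, and fewer than one in ten of those involved a bicuspid valve.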

Selected baseline characteristics in BAV and TAV patients:

Age, years (SD) – 68.8 (8.7) and 68.7 (9.0).

BMI (kg/m2) – 30.1 (7.3) and 30.1 (6.5).

STS risk score – 1.7 (0.6) and 1.7 (0.7).

Prior CABG – 6.4% and 7%.

Chronic lung disease – 27.8% and 29.1%.

Porcelain aorta – 1.9% and 2%.

Estimated GFR (ml/min/1.73m2) – 74.5 (22.6) and 73.9 (22.6).

Outcomes at 1 year:

Mortality – 6.6% in TAV and 4.6% in BAV.

Stroke – 2.1% and 2.0%.

Pacemaker – 7.8% and 8.9%.

Aortic reintervention – 0.43% and 1.2%.

Any readmission – 23.3% and 23.4%.

Baseline characteristics and outcomes

Let’s compare baseline characteristics and outcomes in the TAV group with those reported at 1 year in PARTNER 3 (Tables 1 and 2).

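Table 1. Baseline characteristics

| | PM STS/ACC TVT registry (TAV) | PARTNER 3 |
|---|---|---|
| Age, years (SD) | 68.7 (9.0) | 73.3 (5.8) |
| BMI, kg/m2 (SD) | 30.1 (6.5) | 30.7 (5.5) |
| STS risk score (SD) | 1.7 (0.7) | 1.9 (0.7) |
| Prior CABG | 7% | |
| Chronic lung disease | 29.1% | |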


Table 2. Outcomes at 1 year: PM STS/ACC TVT registry versus PARTNER 3 (rows include aortic reintervention; table values not preserved).


The first thing to highlight is that the low-risk TAV group derives from a propensity match with low-risk BAV patients and therefore represents a selected group of TAV patients. Compared with the PARTNER 3 population, patients in the TVT registry were younger, had a lower STS risk score and had a higher prevalence of chronic lung disease (CLD). Assuming that the TVT registry and PARTNER 3 use the same definition of CLD, the percentage of patients with CLD in the low-risk group of the TVT registry is extremely high and unusual for a low-risk group.

The most important outcome, 1-year mortality, was 6 times higher in the TVT registry than in its randomized controlled trial counterpart. Re-hospitalization was 3 times higher than in that counterpart and even 2 times higher than in the surgical group of PARTNER 3 (11%).
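As a rough check on these multiples, the sketch below uses only figures quoted in this article: the registry rates of 6.6% for 1-year mortality in the TAV group and 23.3% for any readmission, and the 11% re-hospitalization rate in the PARTNER 3 surgical group. The PARTNER 3 TAVI rates are not restated in the text, so the code back-calculates what the "6 times" and "3 times" statements would imply. Those implied values are an inference from the stated multiples, not trial data.

```python
# Rates (%) quoted in this article.
tvt_mortality = 6.6           # 1-year mortality, TAV group, TVT registry
tvt_readmission = 23.3        # any readmission at 1 year, TVT registry
partner3_savr_rehosp = 11.0   # re-hospitalization, PARTNER 3 surgical group

# Ratio that can be computed directly from the quoted figures.
ratio_vs_surgery = tvt_readmission / partner3_savr_rehosp

# Implied comparator rates if the stated multiples are taken at face value.
# These are back-calculations, NOT values reported by PARTNER 3.
implied_tavi_mortality = tvt_mortality / 6
implied_tavi_rehosp = tvt_readmission / 3

print(f"Readmission vs PARTNER 3 surgery: {ratio_vs_surgery:.1f}x")
print(f"Implied PARTNER 3 TAVI mortality: {implied_tavi_mortality:.1f}%")
print(f"Implied PARTNER 3 TAVI re-hospitalization: {implied_tavi_rehosp:.1f}%")
```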

Despite the outcomes presented, the author (Dr. Makkar) concluded:

  • Rates of death and stroke at 30 days and 1 year were favorable in low-risk patients

These results were presented in a lengthy thread on Twitter by Dr Benoy Shah, a UK-based cardiologist and current President of the British Heart Valve Society (BHVS). He described the ‘Conclusions’ slide from the presentation as “a masterclass in spin”. Spin refers to the way authors bend their conclusions to highlight only the positive aspects of their findings; in other words, to describe only the tip of the iceberg. The problem with this practice is that patients die in the process and regulatory agencies may “buy” these conclusions.

Speaking to LACES, Dr Shah said, “A lot of online discussion between surgeons and cardiologists in the past 48 hours regarding this study has focused on whether these patients were truly low risk. Maybe they had co-morbidities like frailty that were not captured by the STS score. That may be possible, but it isn’t clear if they were turned down for surgery on these grounds or, in fact, on what grounds they proceeded to TAVI rather than surgery. The level of detail needed to understand the decision-making in each case isn’t provided in this analysis. If it is true that these patients would not have met the very stringent inclusion criteria used in PARTNER 3, then it was a huge own goal by the researchers to label this as a ‘low risk population’, as that is what everyone will understand by that term.”

Dr Shah continued, “It is vital we have randomized studies in this cohort. Patients with bicuspid valves were deliberately excluded from all the PARTNER trials. We have no RCT data to indicate TAVI as equivalent to SAVR in these patients. Yes, many patients have had TAVI if they were unsuitable for surgery, but equivalence in a patient otherwise amenable to an operation has not been shown. This must be undertaken. However, we know that the RCT inclusion criteria tend to be extremely strict, so it is highly likely that any future RCT may have significant limitations, as I suspect patients with even slightly complex anatomy for TAVI may be excluded.”

From the evidence presented so far, efficacy is not equal to effectiveness. Outcomes and score predictions in RCTs are not the same as in real life. Even though the TVT TAV group represents a selected group of TAV patients, their baseline characteristics do not suggest any additional risk factor that might increase their real risk. It is not logical to assume that higher frailty, which is not considered in the STS score, is the only explanation. Overall frailty in PARTNER 3 was 0, and in a group of 68-year-old patients from the TVT registry it seems implausible that frailty is a major factor explaining such unacceptable mortality in low-risk patients.

Dr Shah commented on this aspect also: “The real-world outcomes are often not as sparkling as those obtained in the highly controlled and monitored environment of a randomised trial. We know this from many past experiences. Nonetheless, the 1 year mortality of nearly 7% in a cohort of patients under the age of 70 should give pause for thought. These are not just numbers, not just percentages. These are people. People that may only have recently retired, people that may have hoped they had many years of life ahead of them. There are only two possible explanations for these data – either the patients are a lot more complex and unwell than the STS score indicates, or the real-world performance of TAVI (when not performed only in high volume centres of excellence like in RCTs) is much worse than in clinical trials. Either explanation requires further evaluation of the data. Let us hope we see this. Our patients deserve it.”


STS Score

The STS score underestimates risk in cardiac surgery and SAVR, a finding supported in the literature, and this under-performance is more pronounced when the score is applied in real-world centers. The cause is not only the absence of independent predictors that the score does not capture, such as frailty or hostile thorax (a limitation common to most risk scores developed for cardiac surgery), but also the weight given to the factors that the score does identify as independent predictors. These weights are the coefficients used in the prediction formula of the multivariate model, which yields the estimated risk of an eventual SAVR.
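To make the role of the coefficients concrete, here is a minimal sketch of how a multivariable logistic model turns risk factors into a predicted probability. The intercept, the factor names and the weights are invented for illustration and are not the STS coefficients. The point is only that a factor absent from the model (frailty, in this example) contributes nothing to the estimate, while the weight assigned to an included factor determines how much it moves the estimate.

```python
import math

# Hypothetical coefficients for illustration only; NOT the STS model.
INTERCEPT = -5.0
WEIGHTS = {
    "age_per_decade_over_60": 0.40,
    "chronic_lung_disease": 0.50,
    "prior_cabg": 0.60,
}


def predicted_risk(patient: dict) -> float:
    """Logistic model: risk = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(
        weight * patient.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Two hypothetical patients who differ only in frailty.
robust = {"age_per_decade_over_60": 0.9, "chronic_lung_disease": 1}
frail = dict(robust, frailty=1)  # frailty is not in WEIGHTS, so it is ignored

print(f"Robust patient: {predicted_risk(robust):.2%}")
print(f"Frail patient:  {predicted_risk(frail):.2%}")
```

Both hypothetical patients receive the same predicted risk, because the model has no term for the factor that separates them.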

As a consequence, the STS score underestimates the risk it is meant to predict and could distort the risk/benefit assessment of an eventual SAVR.

In addition, the literature has documented missing data and missing patients, which distort the observed-to-expected mortality (OM/EM) ratio so that risk is underestimated.
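The effect of missing records on the observed-to-expected ratio can be sketched as follows. The cohort is invented for illustration; the example only shows the mechanism by which unreported deaths pull the ratio down and make a score look better calibrated than it is.

```python
def oe_ratio(records):
    """Observed deaths divided by the sum of predicted risks (expected deaths)."""
    observed = sum(died for _, died in records)
    expected = sum(risk for risk, _ in records)
    return observed / expected


# Hypothetical cohort of (predicted risk, died) pairs: 100 patients at 2%
# predicted risk, of whom 4 died, so the score underestimates mortality.
full_cohort = [(0.02, 1)] * 4 + [(0.02, 0)] * 96

# The same cohort after the records of two deaths go missing.
incomplete_cohort = [(0.02, 1)] * 2 + [(0.02, 0)] * 96

print(f"O/E, complete data:   {oe_ratio(full_cohort):.2f}")
print(f"O/E, incomplete data: {oe_ratio(incomplete_cohort):.2f}")
```

With complete data the ratio is well above 1, which flags the underestimation. Once the two deaths are missing, the ratio falls close to 1 and the same score appears adequate.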

This underestimation of risk is much more significant when external and temporal validation of the STS score is attempted in real-world populations and in centers other than those in which it was developed, where its performance falls, chiefly through loss of calibration.

To this must be added the significant difference between an RCT, where a treatment is applied under controlled conditions, very often in elite centers with excellent results, and registries, which record uncontrolled treatments in real-world centers whose results evidently differ from those of the (generally elite) centers that participate in RCTs.

It is in this different scenario, that of real-world registries, that the STS score shows a greater loss of performance, underestimating the risk of an eventual SAVR.

Treating our patients on the basis of publications and trials is not the same as treating them on the basis of our real world.

Let’s remember that the FDA approved the use of TAVI in patients at low risk for death or major complications. The risk framework used by the FDA to approve its use was the STS score. If the argument to explain higher real-life mortality in low-risk patients after TAVI is the inadequacy of the STS score for its prediction, then the FDA will need to temporarily withdraw its authorization until a new score is constructed and compared with SAVR.