Study design and study population
The SPOVID pilot study was a randomized controlled intervention study (parallel group design) embedded in the POSTCOVE study, a prospective cohort study of 800 participants. The POSTCOVE study aimed to assess persistent symptoms, overall health status and (sub-)clinical markers for cardiovascular, metabolic and respiratory conditions in participants > 12 months after SARS-CoV-2 infection. Recruitment for the POSTCOVE cohort took place between September 2021 and May 2023. Eligible participants for POSTCOVE were residents of the City of Essen, Germany, aged 18–75 years, who had been registered by local health authorities as having tested positive for SARS-CoV-2 infection, with a date of first infection between February 2020 (the beginning of the first pandemic wave in Essen) and November 2020.
The SPOVID pilot study was explorative in nature; therefore, no formal sample size calculation was performed. Based on the assumption that approximately 10% of the POSTCOVE cohort would meet the specific inclusion criteria for the SPOVID pilot study (see below), it was assumed feasible to recruit 60 participants during the first ~9 months of the POSTCOVE recruitment period. This number of participants has been described as adequate for pilot studies that aim to assess potential feasibility problems with a moderate to high prevalence20.
From the POSTCOVE cohort, all participants aged between 18 and 70 years (n = 479) who had completed study examination by July 22, 2022, were then screened for at least one of the self-reported, on-going COVID-19-related symptoms: fatigue, concentration difficulties, breathing problems or headache (see Fig. 1). Of these, participants with sufficient German language proficiency (n = 215) were invited to participate in the SPOVID pilot intervention study. 66 (30.7%) individuals were willing to participate, gave informed consent and underwent their baseline examination (T0) at the Department of Cardiology and Angiology, Elisabeth Hospital in Essen, between May and August 2022. This time frame was chosen to enable all participants to begin the intervention during late spring or summer.
At T0, participants received a comprehensive internistic-cardiological health assessment to evaluate general physical resilience and identify potential clinical contraindications for participation in a structured physical intervention program. Further exclusion criteria included a current SARS-CoV-2 infection; being classified as a trained or higher-level athlete according to the participant classification framework proposed by McKay et al. (2022)21; chronic fatigue syndrome; pregnancy or breastfeeding; current in-patient treatment; or unwillingness to begin and consistently adhere to an exercise training program during the study period. If any contraindications were identified, participants were not randomized and were referred for further medical evaluation. Four participants were excluded due to cardiovascular problems. None of the included participants met diagnostic criteria for chronic fatigue syndrome, as none reported post-exertional malaise in particular and all expressed confidence in their ability to follow a regular training program.
Of the 66 participants who gave informed consent to participate in the SPOVID pilot study, 62 met all inclusion criteria and had no medical contraindications (Fig. 1). These participants were individually randomized in a 1:1 ratio to either the intervention or control group, stratified by sex and age group (≤ 60 years/> 60 years). Randomization was performed using the online service ALEA22, which employs the minimisation technique described by Pocock and Simon23 to minimize imbalance in the distribution of treatment assignments within the levels of each individual stratification factor.
A sport scientist blinded to the results of the medical examination entered the relevant participant information (sex and age) into the ALEA software tool to determine the random group allocation and then generated an enrolment form containing the participant’s data and allocation result. A total of 31 participants were allocated to the intervention group and 31 to the control group (Fig. 1).
All enrolment forms were securely stored in a folder accessible only to the sport scientist responsible for randomisation. While the sport scientist guiding the intervention and supervising participants, as well as the participants themselves and the data analyst, were aware of the group allocation results, the medical staff conducting the physical examination (including spiroergometry, body composition examination and face-to-face interviews) were blinded to group allocation. Participants were instructed not to reveal their group allocation during the follow-up examination (T1).
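The minimisation principle behind this allocation can be illustrated with a minimal sketch. It is not the ALEA implementation: the imbalance measure (sum of absolute marginal differences), the parameter `p_best` and the function name `minimisation_assign` are assumptions introduced here for illustration, since the configuration used in ALEA is not reported.

```python
import random

ARMS = ("intervention", "control")


def minimisation_assign(participant, allocated, rng, p_best=0.8):
    """Assign one participant by marginal-imbalance minimisation.

    participant: dict of stratification factors, e.g. {"sex": "f", "age": "<=60"}
    allocated:   list of (participant_dict, arm) tuples already randomized
    rng:         random.Random instance
    p_best:      probability of choosing the arm that minimises imbalance
                 (hypothetical value, not taken from the study)
    """
    scores = {}
    for candidate in ARMS:
        total = 0
        for factor, level in participant.items():
            # Count earlier participants with the same factor level, per arm.
            counts = {arm: 0 for arm in ARMS}
            for previous, arm in allocated:
                if previous[factor] == level:
                    counts[arm] += 1
            counts[candidate] += 1  # tentatively place the new participant
            total += abs(counts["intervention"] - counts["control"])
        scores[candidate] = total

    if scores[ARMS[0]] == scores[ARMS[1]]:
        return rng.choice(ARMS)  # tie: fall back to simple randomization
    best = min(ARMS, key=scores.get)
    other = ARMS[1] if best == ARMS[0] else ARMS[0]
    return best if rng.random() < p_best else other


# Usage with made-up participants: one woman aged <= 60 is already in the
# intervention group, so a second one is steered towards the control group.
rng = random.Random(1)
allocated = [({"sex": "f", "age": "<=60"}, "intervention")]
arm = minimisation_assign({"sex": "f", "age": "<=60"}, allocated, rng, p_best=1.0)
```

With `p_best=1.0` the procedure is fully deterministic whenever the two scores differ; values below 1 keep a random element in every allocation.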
T1 examination took place 12 weeks after baseline to assess the effects of the exercise intervention.
The intervention phase and all follow-up examinations were completed by December 2022.
A total of eight participants (n = 5 from the intervention group, n = 3 from the control group) dropped out due to the following reasons: time constraints related to training adherence or diary documentation (n = 2), personal reasons (n = 1), relocation (n = 1), inability to participate in the T1 examination (n = 2), injury unrelated to the intervention program (n = 1), and acute illness unrelated to the intervention program (n = 1). Consequently, 54 participants completed the T1 examination (Fig. 1).
The study was conducted in accordance with the guidelines and recommendations for ensuring Good Epidemiological Practice24, approved by the ethics committee of the University Duisburg-Essen (approval number: 22-10565-BO), and monitored throughout its course. The study was performed in accordance with the ethical principles of the Declaration of Helsinki. The study was registered with the German Ministry of Education and Science prior to its start (#FKZ 01EP2104A/B; https://www.gesundheitsforschung-bmbf.de/de/spovid-sport-long-covid-syndrom-14348.php; registration date: January 12, 2021).
Aerobic exercise training program
The intervention group underwent a 12-week, unsupervised aerobic exercise training program focused on low intensities aligning with zones two and three of a five-zone intensity scale25. Individual training zones, along with the corresponding heart rates, were determined based on ventilatory thresholds derived from an incremental exercise test. Training zones two and three and their corresponding exercise doses were positioned below the first ventilatory threshold and between the first and second ventilatory thresholds, respectively. Prior to training initiation, the exercise group received guidance on setting aerobic exercise intensity from the same specially trained sport scientist in a single two-hour session. Subsequently, participants were instructed to independently engage in running or walking-based aerobic exercise sessions three times a week, in accordance with their individual fitness levels and corresponding training zones. Although some participants reported engaging in regular physical activity prior to enrolling in the study (Table 1), the training program still represented an increase in systematically planned, training-related physical activity for all participants. To enhance engagement and training effectiveness, the program included a blend of steady-state and interval training, along with progressive weekly increases in both training volume and intensity. Every fourth week was designed as a recovery week with reduced training volume, establishing an undulatory loading scheme (Supplementary Table S1). Bi-weekly telephone check-ins were used to track progress throughout the training period.
Participants in the control group were asked to maintain their habitual physical activity patterns between baseline (T0) and follow-up (T1) and were offered the training intervention program after the T1 examination. All participants (intervention and control) documented their training adherence and additional physical activity using an online training diary (REGmon)26.
Parameters
At T0 and T1, standardised face-to-face interviews with a cardiologist were conducted to assess the following clinical parameters: persistent self-perceived symptoms > 12 months after SARS-CoV-2 infection (fatigue, concentration difficulties, headache, dyspnea), with the intensity of each symptom related to the SARS-CoV-2 infection rated on a 0–10 numeric scale (10 = highest intensity); SARS-CoV-2 re-infection prior to the T0 examination; and current physical performance rated on a 0–10 scale (0 representing “very restricted” and 10 representing “very powerful”). In addition, the following previous or current comorbidities and risk factors were collected, along with past surgeries, accidents, hospitalizations and present disability: cardiovascular, metabolic, neurological, mental, gastrointestinal, orthopaedic and other lung disease, asthma and/or allergies, arterial hypertension, diabetes mellitus, dyslipoproteinemia and smoking status (categorized as current, former or never).
At T1, prevalent diseases, accidents, surgeries and disabilities occurring between T0 and T1 were assessed.
At T0, a separate standardized face-to-face interview with a sport scientist collected data on prior sporting activities and post-infection exercise behaviour. Information was also collected on current health status (assessed on a 1–5 scale, 1 = “very good” to 5 = “poor”), quality of life (assessed on a 1–4 scale, 1 = “very bad” to 4 = “good”), satisfaction with life in general and with health (assessed on a 1–4 scale, 1 = “very dissatisfied” to 4 = “very satisfied”), and sleep quality in the past month (assessed on a 1–3 scale, 1 = “very good” to 3 = “very bad”). Further, depressive symptoms were assessed using the 15-item Centre for Epidemiologic Studies Depression Scale (CES-D) with a range of 0 to 45 points, higher scores indicating greater symptom burden27. Also, physical activity-related health competence (PAHCO) was rated using the PAHCO questionnaire consisting of 42 items (condensed into 10 first-order scales and additionally pooled into three second-order scores representing movement competence, control competence and self-regulation competence)28. Symptoms of depression and physical activity-related health competence were collected at T0 and T1 in standardized and validated self-administered questionnaires. Mean scores of the three second-order scores of the PAHCO questionnaire, with a range of 1 to 4, were used in the analysis, with higher values indicating higher competence.
The internistic-cardiological health check to determine general physical resilience and potential clinical contraindications included the following examinations: vital parameter status (heart rate, blood pressure, respiratory rate), auscultation of the heart, abdominal palpation, presence of edema or bowel sounds, NYHA classification to categorize heart failure, transthoracic echocardiography, 12-channel ECG and body plethysmography to assess pulmonary function. Laboratory analyses were performed to determine a complete blood count and several additional biomarkers such as troponin T, pro-BNP, e-CRP, D-dimer and 25-OH vitamin D status. Parameters of the bioelectrical impedance analysis and spiroergometry on the treadmill ergometer including exercise ECG were analysed to assess the influence of the exercise training program. Height, weight, body mass index, skeletal muscle mass and body fat were determined with a standard stadiometer and a bioelectrical impedance analysis system (InBody Deutschland, Eschborn, Germany). To determine peak oxygen consumption (\(\dot{V}\)O2peak), peak power output (Wpeak), peak heart rate (HRpeak), peak respiratory exchange ratio (RERpeak), as well as W and HR at ventilatory thresholds 1 (VT1) and 2 (VT2), W and HR at lactate thresholds, and heart rate recovery during the first 3 min after test cessation (HRR2), a stepwise incremental cycle ergometer test was conducted using a Cyclus 2 ergometer (RBM elektronik-automation GmbH, Leipzig, Germany). The test involved spiroergometry, exercise electrocardiography (ECG), and blood lactate diagnostics. Commencing with an initial resistance of 50 W, resistance increased by 25 W every 3 min until subjective exhaustion. Participants were free to cycle at their preferred pedal rate, as the Cyclus 2 ergometer maintains a constant power condition independent of pedal cadence.
Gas exchange data were continuously collected using a breath-by-breath gas collection system (Metalyzer 3B, Cortex Biophysik GmbH, Leipzig, Germany). Gas calibrations were performed before each test in accordance with the manufacturer’s instructions. A rolling average over 30 s was applied for respiratory data smoothing, and the highest 30-second rolling averages during the test were defined as \(\dot{V}\)O2peak and RERpeak. Wpeak was calculated as follows: Wpeak = Wf + (t/D) × P, where Wf represents the value of the last completed workload (W), t is the time (s) the last uncompleted workload was sustained, D is the duration (s) of each stage, and P is the power output difference between workloads (W). W and HR at VT1 and VT2 were determined visually by combining four methods: the ventilatory equivalent method, the excess carbon dioxide method, the V-slope method, and the end-tidal method. Two trained sports scientists independently and blindly assessed each participant’s graphic data, followed by a conference to reconcile any differences and arrive at a consensus for each threshold.
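As a worked illustration of the Wpeak formula, the following minimal sketch uses the stage duration (180 s) and increment (25 W) of the test protocol described above as defaults. The function name `peak_power` and the example values are hypothetical.

```python
def peak_power(last_completed_w, seconds_in_unfinished_stage,
               stage_duration_s=180, increment_w=25):
    """Wpeak = Wf + (t / D) * P for a stepwise incremental test."""
    if not 0 <= seconds_in_unfinished_stage <= stage_duration_s:
        raise ValueError("time in the unfinished stage must lie within one stage")
    fraction = seconds_in_unfinished_stage / stage_duration_s
    return last_completed_w + fraction * increment_w


# Hypothetical participant: completed the 150 W stage and stopped 90 s
# into the 175 W stage.
wpeak = peak_power(150, 90)  # 150 + (90 / 180) * 25 = 162.5 W
```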
HR was monitored and recorded during the test via a 12-lead ECG (custo cardio 100/ERG BT, custo med GmbH, Ottobrunn, Germany), and HRpeak as well as HRR after the test were determined from the data. Capillary whole-blood samples were obtained from the earlobe before the test, during the last 15 s of each stage, and at the point of exhaustion. These samples were analyzed for lactate (La) using 20-µL capillaries, hemolyzed in 1-mL microtest tubes, and subjected to amperometric-enzymatic analysis using the Biosen C-Line Sport (EKF-diagnostic GmbH, Barleben, Germany). From the resulting lactate values, W and HR at aerobic (2 mmol/l) and anaerobic lactate thresholds (4 mmol/l) were determined, following the methodologies outlined by Mader et al.29 and Heck et al.30.
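How a power output at a fixed lactate threshold can be read from stage-wise measurements is sketched below. Linear interpolation between adjacent stages is an assumption made here for illustration; the text above cites Mader et al. and Heck et al. without specifying the curve-fitting step. The function name and the lactate values are hypothetical.

```python
def power_at_lactate(stages, target_mmol_l):
    """Interpolate the power (W) at which blood lactate reaches a fixed value.

    stages: list of (power_w, lactate_mmol_l) tuples in test order.
    Returns None if the target is never bracketed by two adjacent stages.
    """
    for (w0, la0), (w1, la1) in zip(stages, stages[1:]):
        if la1 > la0 and la0 <= target_mmol_l <= la1:
            fraction = (target_mmol_l - la0) / (la1 - la0)
            return w0 + fraction * (w1 - w0)
    return None


# Hypothetical test: 25 W steps starting at 50 W.
stages = [(50, 1.0), (75, 1.25), (100, 1.5), (125, 2.5), (150, 4.5)]
w_aerobic = power_at_lactate(stages, 2.0)    # between 100 W and 125 W -> 112.5
w_anaerobic = power_at_lactate(stages, 4.0)  # between 125 W and 150 W -> 143.75
```

The same function can be applied to (heart rate, lactate) pairs to obtain HR at each threshold.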
Training diary
Participants from both the intervention and the control group maintained an online training diary (REGmon)26, which served as a tool for tracking their adherence to the training plan and for documenting any training completed outside the study conditions. The digital platform enabled participants to log both external and internal training loads, including the date of the training session, the type of activity (such as jogging, walking, cycling, swimming, or rowing), training volume, and training intensity. Training duration was documented as the time spent on each exercise session, while mean training intensity was rated by the participants following each training session using a 10-point category-ratio (CR-10) rating of perceived exertion (RPE) scale31,32,33. The internal training load for each training session was calculated using the session rating of perceived exertion (session-RPE) method32,34. This method involved multiplying the absolute training duration in minutes by the training intensity. The online training diary also served as a repository for recording significant events that could affect training progress. Participants documented occurrences such as injuries, illness, or other relevant circumstances that might influence their training progression.
Description of the intention-to-treat, per-protocol and as-treated analysis populations
In the ITT analysis, all participants with non-missing information (n = 54) were included according to their randomization (n = 26 in intervention group, n = 28 in control group) (Table 2). The PP population consisted exclusively of participants without major protocol deviation within their assigned group, defined as at least 27 documented training sessions within training zones two and three of the five-zone intensity scale25 for intervention group participants and fewer than 18 documented training sessions for control group participants. Three participants from the intervention group with 18–26 exercise sessions in training zones two and three were excluded because they met the definition of neither group. Likewise, two participants from the intervention group who stopped training for more than 6 weeks (of the 12-week training program) due to documented infection or injury, as well as six participants (n = 4 from the intervention group, n = 2 from the control group) with SARS-CoV-2 re-infection between T0 and T1, were excluded from the PP analysis. This resulted in a per-protocol population of 37 participants (n = 14 in intervention group, n = 23 in control group) (Table 2). The AT analysis classified participants according to the actual training they reported rather than the study group to which they were assigned, applying the same definitions of training volume and intensity for the intervention and control group as in the PP population. Accordingly, three participants randomized to the intervention group who reported fewer than 18 exercise sessions were assigned to the control group. Conversely, three participants randomized to the control group who reported at least 27 exercise sessions within training zones two or three were assigned to the intervention group. This resulted in an AT population of 43 participants (n = 17 in intervention group, n = 26 in control group) (Table 2).
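The session-RPE calculation described above amounts to a single multiplication per session. A minimal sketch follows, with hypothetical function names and diary entries; loads are expressed in arbitrary units (AU).

```python
def session_load(duration_min, rpe_cr10):
    """Internal training load (AU) = duration in minutes x CR-10 RPE."""
    if duration_min < 0 or not 0 <= rpe_cr10 <= 10:
        raise ValueError("duration must be >= 0 and RPE within 0-10")
    return duration_min * rpe_cr10


def total_load(sessions):
    """Sum the session loads of an iterable of (duration_min, rpe) pairs."""
    return sum(session_load(d, r) for d, r in sessions)


# Hypothetical week with three sessions.
week = [(45, 4), (30, 3), (60, 5)]
weekly_load = total_load(week)  # 180 + 90 + 300 = 570 AU
```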
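The classification rules for the PP and AT populations can be expressed as a minimal sketch. Two points are assumptions: a single session count per participant is used for both thresholds, and the training-stop and re-infection exclusions are applied to both populations, which is consistent with the reported totals but not stated explicitly. The class, field and function names are hypothetical, and the synthetic cohort only reproduces the reported group sizes; the individual session counts are placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    randomized_arm: str            # "intervention" or "control"
    zone23_sessions: int           # documented sessions in zones two/three
    long_training_stop: bool = False   # stop > 6 weeks (infection or injury)
    reinfected: bool = False           # SARS-CoV-2 re-infection, T0 to T1


def as_treated_arm(p: Participant) -> Optional[str]:
    """Arm according to reported training, or None if excluded."""
    if p.long_training_stop or p.reinfected:
        return None
    if p.zone23_sessions >= 27:
        return "intervention"
    if p.zone23_sessions < 18:
        return "control"
    return None  # 18-26 sessions: concordant with neither group


def per_protocol_arm(p: Participant) -> Optional[str]:
    """Randomized arm if the reported training matches it, else None."""
    arm = as_treated_arm(p)
    return arm if arm == p.randomized_arm else None


# Synthetic cohort matching the reported counts (session numbers invented).
cohort = (
    [Participant("intervention", 30)] * 14
    + [Participant("intervention", 10)] * 3
    + [Participant("intervention", 22)] * 3
    + [Participant("intervention", 12, long_training_stop=True)] * 2
    + [Participant("intervention", 20, reinfected=True)] * 4
    + [Participant("control", 5)] * 23
    + [Participant("control", 28)] * 3
    + [Participant("control", 5, reinfected=True)] * 2
)
pp_counts = Counter(per_protocol_arm(p) for p in cohort)
at_counts = Counter(as_treated_arm(p) for p in cohort)
```

Under these assumptions the counts come out as 14/23 (PP) and 17/26 (AT) for intervention/control, matching the numbers given in the text.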
Statistical analyses
Adherence to the intervention procedure was the primary feasibility outcome. This was a strictly explorative pilot study using descriptive analyses. No hypotheses were stated, no group differences were tested, no outcome efficacy cut-offs were provided, and no formal a priori feasibility thresholds were defined. Data were evaluated using intention-to-treat (ITT, n = 26 intervention and n = 28 control group), per-protocol (PP, n = 14 intervention and n = 23 control group) and as-treated (AT, n = 17 intervention and n = 26 control group) analysis (Table 2). As this study was designed as a feasibility study and there was no primary endpoint, all parameters that could potentially be affected by the training intervention were explored in separate statistical models to assess the direction and strength of group differences in parameter change between T0 and T1. In the ITT, PP and AT analyses, mean values and standard deviations (SD) were generated for the intervention and control group at T0 and T1. Scores and Likert-scale categorical parameters were treated as continuous parameters in the analyses. Furthermore, the difference of mean values (T1 minus T0) was calculated separately for the intervention and control group to assess the change in the respective parameter over time in each group. Additionally, linear regression models were fitted for each parameter, including a time-dependent dummy variable (1 = T1, 0 = T0), a treatment group dummy variable (1 = intervention group, 0 = control group) and an interaction term between the time-dependent and treatment group dummy variables, to calculate the difference in differences (DID) with corresponding 95% confidence interval (95%-CI). The DID indicates the difference in change of the intervention group compared to the control group over time.
Taking the different response scale widths (number of scale points ranged from 3 to 46 points) of the analysed parameters into account and for better comparability of the magnitude of effect sizes across parameters, effect parameters and confidence intervals were described per standard deviation (SD) by dividing the DID by the SD of the respective parameter at T0 in the ITT analysis population. Due to the different response scale directions (some descending from “very good” to “bad” and others ascending), a positive DID did not automatically indicate a stronger improvement in the intervention group. Therefore, the direction of the treatment effect was reported with a “+” in the last column of each table to make it easy to spot when the DID went in the direction of the hypothesis (i.e., stronger improvement in the intervention group). Clopper-Pearson 95% confidence intervals were calculated in order to assess whether the proportion of DID consistent with the hypothesis could be expected by chance35. Mean values and differences of mean values were calculated using SAS® software (v9.04)36, and (SD-scaled) differences in differences with 95%-CI were calculated in R (v4.1.2)37.
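The interaction coefficient of such a model can be illustrated with a minimal sketch. Because the model with two dummy variables and their interaction is saturated, the ordinary least squares estimate of the interaction equals the difference of the four cell means. The sketch assumes independent, homoscedastic errors and uses a normal quantile for the interval; neither choice is taken from the study, and the correlation between repeated measurements of the same participant is ignored. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def did_with_ci(ctrl_t0, ctrl_t1, int_t0, int_t1, level=0.95):
    """Difference in differences and approximate CI from four groups of values.

    Equivalent to the interaction coefficient of an OLS fit of
    y ~ time + group + time:group with 0/1 dummy variables.
    """
    cells = [ctrl_t0, ctrl_t1, int_t0, int_t1]
    means = [fmean(c) for c in cells]
    did = (means[3] - means[2]) - (means[1] - means[0])

    # Residual variance of the saturated model (4 estimated parameters).
    n_total = sum(len(c) for c in cells)
    rss = sum((y - m) ** 2 for c, m in zip(cells, means) for y in c)
    sigma2 = rss / (n_total - 4)

    # Standard error of a contrast of four independent cell means.
    se = sqrt(sigma2 * sum(1 / len(c) for c in cells))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation
    return did, (did - z * se, did + z * se)


# Hypothetical values of one parameter (higher = better).
did, ci = did_with_ci(
    ctrl_t0=[20, 22, 24], ctrl_t1=[21, 23, 25],
    int_t0=[20, 22, 24], int_t1=[24, 26, 28],
)
# did = (26 - 22) - (23 - 22) = 3
```

In small samples, a t-quantile with the residual degrees of freedom would give a somewhat wider interval than the normal approximation used here.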
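The SD scaling, the direction flag and the Clopper-Pearson interval can be sketched as follows. The exact interval is obtained here by bisection on the binomial distribution function rather than by the beta-quantile formula, which gives the same bounds numerically. All function names and example numbers are hypothetical.

```python
from math import comb


def sd_scaled(did, ci, sd_t0):
    """Express a DID and its CI per SD of the parameter at T0."""
    lo, hi = ci
    return did / sd_t0, (lo / sd_t0, hi / sd_t0)


def direction_flag(did, higher_is_better):
    """'+' if the DID points towards stronger improvement with intervention."""
    if did == 0:
        return "-"
    return "+" if (did > 0) == higher_is_better else "-"


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_p(m, n, target):
    """Find p with binom_cdf(m, n, p) == target (the CDF decreases in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(m, n, mid) > target:
            lo = mid  # CDF still too high, so p must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, level=0.95):
    """Exact two-sided CI for a proportion k / n."""
    alpha = 1 - level
    lower = 0.0 if k == 0 else _solve_p(k - 1, n, 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_p(k, n, alpha / 2)
    return lower, upper


# Hypothetical table: 15 of 20 parameters carry a '+'.
flags = ["+"] * 15 + ["-"] * 5
lower, upper = clopper_pearson(flags.count("+"), len(flags))
```

If the lower bound exceeds 0.5, the share of parameters pointing in the hypothesised direction is larger than expected under a coin-flip assumption for each parameter.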
Direction of difference in differences (DID)
The direction of the difference in differences (DID), calculated as (T1-T0 intervention group) – (T1-T0 control group), was indicated with a “+” when the direction of effect was consistent with the hypothesis: either a positive DID, with a stronger positive difference (T1-T0) in the intervention group, for parameters where higher values represent better health/wellbeing/fitness, or a negative DID, with a stronger negative difference (T1-T0) in the intervention group, for parameters where lower values represent better health/wellbeing/fitness. It was indicated with a “-” when the direction of effect met neither of these two conditions.
