Our Strengths
The capabilities below were developed by NCRI over a period of years for the analysis of extremism, disinformation, and coordinated influence operations. Their application to medical research is recent, but it rests on a structural parallel: the analytical problems are formally analogous, and the instruments translate cleanly.
Comprehensive data infrastructure
NCRI maintains the leading independent research corpus of Reddit data, ingesting the public record continuously and in real time.
- Coverage: thousands of health-relevant communities
- Depth: more than a decade of archives
- Includes: ME/CFS forums whose origins predate the pandemic by years
For investigations into chronic illness, where the disease course unfolds over years and the affected population is geographically dispersed, this coverage has no parallel in conventional cohort data.
Sentiment and forensic discourse analysis
NCRI scientists are among the most widely cited specialists in the empirical analysis of online discourse. Our protocols for sentiment measurement, narrative tracking, and forensic attribution have been deployed by federal agencies, adopted by major platforms, and reported in the national press.
The methods are domain-transferable. The procedures that identify a coordinated influence operation in a political dataset are formally similar to those required to detect a genuine therapeutic signal in a patient dataset.
When an intervention produces benefit in a definable Long Covid subpopulation, the surrounding discourse exhibits structured, measurable changes well in advance of any peer-reviewed result.
Large language models as research instruments
Modern language models permit the systematic processing of textual corpora at scales until recently infeasible. Hundreds of thousands of patient narratives can be converted into structured records of symptom timing, treatment exposure, and reported outcome.
We treat these models as scientific instruments and impose corresponding methodological discipline:
- Validation against annotated ground truth
- Explicit quantification of uncertainty
- Human adjudication of any decision bearing on a substantive conclusion
The output is structured data suitable for statistical inference.
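As an illustration only, the three disciplines listed above might be sketched as follows. The record schema, the 0.8 review threshold, and every function name here are assumptions made for the example, not a description of NCRI's production pipeline. The sketch scores model extractions against human annotations, attaches a Wilson score interval to the agreement rate, and flags low-confidence records for human adjudication.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Extraction:
    """One model-extracted label paired with a human annotation (hypothetical schema)."""
    post_id: str
    predicted: str     # label produced by the language model
    annotated: str     # label assigned by a human annotator (ground truth)
    confidence: float  # model-reported confidence in [0, 1]


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the interval is uninformative
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def validate(records: List[Extraction], review_threshold: float = 0.8) -> Dict:
    """Compare predictions to annotations and list records needing human review."""
    n = len(records)
    correct = sum(r.predicted == r.annotated for r in records)
    return {
        "n": n,
        "accuracy": correct / n if n else float("nan"),
        "ci95": wilson_interval(correct, n),
        # Assumed policy: anything below the threshold goes to a human adjudicator.
        "needs_review": [r.post_id for r in records if r.confidence < review_threshold],
    }


if __name__ == "__main__":
    sample = [
        Extraction("a1", "improved", "improved", 0.95),
        Extraction("a2", "worse", "worse", 0.90),
        Extraction("a3", "improved", "no_change", 0.55),
        Extraction("a4", "no_change", "no_change", 0.70),
    ]
    print(validate(sample))
```

With a small annotated sample the interval is wide, which is the point of reporting it: the agreement rate alone would overstate how well the instrument has been validated.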
Citation and network analysis
The graph-analytic methods we developed for narrative diffusion across social platforms apply equally to the scientific literature itself.
We can:
- Map citation structure
- Identify clusters of mutual amplification
- Locate findings that are systematically uncited
- Trace institutional and financial ties among authors and editorial boards
When a clinical guideline reflects the influence of a small, densely interconnected authorial group with overlapping conflicts of interest, the network topology renders this visible.
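A minimal sketch of three of the measurements listed above, using only the Python standard library. The input format (a mapping from paper ID to author set, plus a list of citing/cited pairs), the function names, and the toy data are all hypothetical; a real analysis would also need author disambiguation and the financial-tie data, which are not modeled here.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Papers = Dict[str, Set[str]]        # paper ID -> set of author names
Citations = List[Tuple[str, str]]   # (citing paper ID, cited paper ID)


def uncited_papers(papers: Papers, citations: Citations) -> List[str]:
    """Papers in the corpus that no other paper in the corpus cites."""
    cited = {target for _, target in citations}
    return sorted(p for p in papers if p not in cited)


def in_group_share(group: Iterable[str], papers: Papers,
                   citations: Citations) -> Optional[float]:
    """Of all citations made by the group's papers, the fraction that point
    to papers with at least one author from the same group."""
    members = set(group)
    outgoing = [(src, dst) for src, dst in citations if papers[src] & members]
    if not outgoing:
        return None  # the group cites nothing in this corpus
    internal = sum(1 for _, dst in outgoing if papers[dst] & members)
    return internal / len(outgoing)


def reciprocal_author_pairs(papers: Papers,
                            citations: Citations) -> List[Tuple[str, str]]:
    """Author pairs who cite each other, a crude proxy for mutual amplification."""
    cites = set()
    for src, dst in citations:
        for a in papers[src]:
            for b in papers[dst]:
                if a != b:  # ignore self-citation
                    cites.add((a, b))
    return sorted((a, b) for (a, b) in cites if a < b and (b, a) in cites)


if __name__ == "__main__":
    papers = {"p1": {"A", "B"}, "p2": {"B", "C"}, "p3": {"D"}, "p4": {"A"}}
    citations = [("p2", "p1"), ("p4", "p1"), ("p4", "p2"), ("p4", "p3")]
    print(uncited_papers(papers, citations))
    print(in_group_share({"A", "B", "C"}, papers, citations))
    print(reciprocal_author_pairs(papers, citations))
```

A high in-group share combined with many reciprocal pairs is only a prompt for closer inspection; small subfields can show the same pattern without any conflict of interest.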
Embedded patient perspective
NCRI Health was established by individuals living with the chronic conditions under study.
Membership in these communities is an epistemic asset, not a substitute for analytical rigor. It informs the calibration of our methods — indicating which patient-reported phenomena reflect substantive clinical reality, which are artifacts of forum dynamics, and where the most consequential disagreements within the patient population are located.
It tells us where to direct the instruments.