By Zoë Leanza
Sage Bionetworks
Launched in 2005, the AddNeuroMed (ANM) study produced a vast collection of data from more than 1,700 participants, enrolled at six sites across Europe, who provided blood and other samples to help develop and validate biomarkers for Alzheimer’s disease (AD). The research team used these specimens to generate proteomic, metabolomic, genomic, transcriptomic, and other data types for the cohort. The volume of data was remarkable. It was also disorganized.
Once described as a “data dump” with mixed modalities and missing documentation, the dataset was complex and disparate. Recognizing the importance of good data organization, Colin Birkenbihl meticulously pre-processed the ANM dataset into what is now ANMerge, with the support of Sir Simon Lovestone of the University of Oxford.
The new version of the dataset includes comprehensive multimodal data, with identifiers mapped to public resources and metadata standardized to align with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). Standardizing the dataset enables researchers to focus on discovery and validation.
“Researchers will be able to explore more complex approaches, such as machine learning and artificial intelligence,” said Birkenbihl, whose recent study demonstrates this value. A preprint of the ANMerge study is available via medRxiv. The AddNeuroMed and ANMerge datasets are available through the Synapse platform as a study adjacent to the AD Knowledge Portal. To request access, review the terms of use.
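For readers who want to work with the data programmatically once access has been granted, Synapse provides an official Python client, synapseclient. The sketch below shows a minimal download-and-load workflow; note that the Synapse ID shown is a placeholder, not the real ANMerge identifier, and the file format is assumed to be CSV, so both should be adjusted to match what is actually listed on the platform.

```python
import synapseclient
import pandas as pd

# Connect to Synapse. Credentials are read from ~/.synapseConfig,
# or can be passed directly to login() (e.g., a personal access token).
syn = synapseclient.Synapse()
syn.login()

# Hypothetical Synapse ID -- replace with the actual ANMerge entity ID
# shown on the dataset's Synapse page after access is approved.
ANMERGE_FILE_ID = "syn00000000"

# Download the file to the local Synapse cache and get its path.
entity = syn.get(ANMERGE_FILE_ID)

# Assuming the harmonized data ships as a CSV; swap in the appropriate
# reader (e.g., pd.read_excel) if the file is distributed differently.
df = pd.read_csv(entity.path)
print(df.shape)
print(df.head())
```

Because Synapse enforces the terms of use at download time, the syn.get() call will fail with a permissions error until access to the study has been approved, which keeps the governance step in the loop even for scripted workflows.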