Harmonizing Data Sharing Efforts towards a Federated Data Model
Alejandro Sweet-Cordero, MD, University of California, San Francisco
Richard Gorlick, MD, MD Anderson Cancer Center
A significant barrier to progress in studying pediatric cancer and in designing novel precision medicine clinical trials is that it is not currently possible to visualize all research data in one place. The data are held in individual labs and in many separate databases, some of which are difficult to access because of procedural, regulatory, reporting, and standardization issues. To maximize the success of pediatric cancer research, barriers to data access and sharing must be minimized. Through its support of pediatric cancer research, ALSF makes possible much-needed advances in the field, and the Crazy 8 Initiative is accelerating this work. As one important example, fusion-negative sarcomas are a diverse and understudied subset of pediatric cancers. To date, several different groups have performed comprehensive sequencing of osteosarcoma (OS) and embryonal rhabdomyosarcoma (ERMS). To address this significant problem, our collaborators, Drs. Sweet-Cordero and Gorlick, propose to build a public resource that will allow visualization and sharing of all available ERMS and OS sequencing data. Because we have experience creating a similar public resource for the uniform processing and sharing of pediatric RNA sequencing data, we propose to: assist in harmonizing useful clinical metadata; offer standardization approaches for data dictionaries in the ERMS and OS public resource; link the data from the resource with other data from the same donors in other repositories and databases; and develop language for access and use agreements that are minimally burdensome for researchers.