Create an AD Portal Dataset for a Manuscript or Publication

The Alzheimer’s Disease Knowledge Portal (AD Portal) showcases the work of the AMP-AD consortium and related NIA programs. The Sage AD Data Coordination Center (AD-DCC) is funded to support consortium members in fulfilling journal and funder requirements to make published data, code, and analysis results Findable, Accessible, Interoperable, and Reusable (FAIR). This document describes how to use Synapse Datasets to group content hosted in the AD Portal, and link that Dataset to a Data Availability section in a manuscript through a persistent identifier.

When planning a manuscript based on data hosted in the AD Portal, contact the AD-DCC through our service desk as early in the process as possible. Please give us the Synapse username(s) of anyone who should have edit permission on your Dataset (e.g., manuscript coauthors or collaborators), and let us know if you have any analysis files related to your manuscript* that are not already in the AD Portal and that you would like to include in the Dataset. The AD-DCC will create:
1. An empty Dataset in the AD Portal backend in Synapse with edit permission for you and any requested collaborators
2. If you have analysis files, a Synapse folder labeled YourLastname_YourFirstname in the “Manuscript-Related Analysis Files” section of the AD Portal Synapse project. Your directory will be private, and you will have edit/delete permission to it.
Upload any additional files and analysis outputs to your designated Synapse folder (the folder the AD-DCC created for your analysis files). You can upload with the Synapse web UI or in bulk with one of the programmatic clients. You can annotate these files to describe them further, but this is not required. We recommend creating sub-folders within your designated folder for different projects, because you will use the same designated folder for any future manuscripts. Once your files are uploaded, the AD-DCC will verify that they do not contain any sensitive data and make them public so they can be included in a Dataset. Note that any human individual-level data is treated as controlled access. Let the AD-DCC know if your files contain such data so that proper access controls can be applied.
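For bulk uploads, a minimal sketch using the Python client (synapseclient) might look like the following. It assumes the client's Synapse, File, and Folder classes and its syn.store call, and that login credentials are already configured on your machine. The folder ID syn00000000 and the local directory my_manuscript_outputs are placeholders, not real values. Local sub-directories are mirrored as Synapse sub-folders, in line with the recommendation above.

```python
from pathlib import Path


def plan_uploads(local_root):
    """List (relative sub-folder, file path) pairs for every file under local_root."""
    root = Path(local_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            plan.append((path.parent.relative_to(root).as_posix(), path))
    return plan


def upload_plan(plan, designated_folder_id):
    """Mirror the planned layout under the designated Synapse folder."""
    # Imported here so plan_uploads() can be used without the client installed.
    import synapseclient
    from synapseclient import File, Folder

    syn = synapseclient.Synapse()
    syn.login()  # assumes credentials are already configured locally

    folder_ids = {}  # relative sub-folder path -> Synapse folder ID
    for subfolder, path in plan:
        parent_id = designated_folder_id
        key = ""
        for part in Path(subfolder).parts:  # empty for files at the top level
            key = f"{key}/{part}"
            if key not in folder_ids:
                folder = syn.store(Folder(name=part, parent=parent_id))
                folder_ids[key] = folder.id
            parent_id = folder_ids[key]
        syn.store(File(str(path), parent=parent_id))


# Hypothetical usage; replace the placeholders before running:
# upload_plan(plan_uploads("my_manuscript_outputs"), "syn00000000")
```

Reviewing the output of plan_uploads before calling upload_plan is a cheap way to confirm that only the intended files will be sent.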
Go to your new Dataset and add items. You can add any files that you have access to from anywhere in Synapse, and you can specify the exact version of those files you used in your analysis if a file has multiple versions. For AD Portal Datasets, you should add:
1. All raw or processed data files used in the analysis
2. The individual, biospecimen, and appropriate assay metadata files from all AD Portal studies with data in your Dataset
3. Any analysis, results, or additional files from your personal folder in the “Manuscript-Related Analysis Files” section (the folder the AD-DCC created for you).
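Items can also be added with the Python client rather than the web UI. The sketch below assumes the client's Dataset class and its add_items method, where each item is a dictionary with entityId and versionNumber keys, and it assumes that syn.get returns a Dataset object for a Dataset ID. All Synapse IDs and version numbers shown are placeholders, not real AD Portal files.

```python
def build_dataset_items(file_versions):
    """Turn {synapse_id: version} into the item dictionaries a Dataset expects.

    Pinning a version records exactly which file version was analyzed.
    """
    items = []
    for synapse_id, version in sorted(file_versions.items()):
        if not synapse_id.startswith("syn"):
            raise ValueError(f"Not a Synapse ID: {synapse_id!r}")
        items.append({"entityId": synapse_id, "versionNumber": int(version)})
    return items


def add_items_to_dataset(dataset_id, file_versions):
    """Add pinned file versions to an existing Dataset and save the draft."""
    # Imported here so build_dataset_items() works without the client installed.
    import synapseclient

    syn = synapseclient.Synapse()
    syn.login()  # assumes credentials are already configured locally
    dataset = syn.get(dataset_id)  # assumed to return a synapseclient Dataset
    dataset.add_items(build_dataset_items(file_versions))
    return syn.store(dataset)


# Placeholder IDs and versions, one per category listed above:
example_files = {
    "syn00000001": 2,  # processed data file
    "syn00000002": 1,  # individual metadata file
    "syn00000003": 1,  # analysis output from your designated folder
}
print(build_dataset_items(example_files))
```

Keeping the ID-to-version mapping in a script also leaves a record of which file versions went into the analysis.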
Go to "Dataset Tools" and then "Edit Dataset Wiki" to customize the wiki information for your dataset (example here). For AD Portal Datasets, please include:
1. Manuscript title
2. Author list
3. Manuscript preprint URL or published URL
4. The Synapse username for a “Dataset contact”: someone whom data users can contact if they have questions about your Dataset. To add it, type ‘@’ and search for the name.
5. Any relevant acknowledgement statements and/or publications that secondary data users should reference in further publications. Data acknowledgement statements can be found on the Study Details pages for each study in the AD Portal.
6. Optional: Any other information you would like to include to help contextualize your dataset or analysis. See here for information on adding links, tables, images, and more to your wiki.
Edit the Dataset schema to customize the columns displayed with the Dataset. You can add, remove, or re-order any columns you want. A good default is the “Add Existing Annotations” option, which includes all of the annotations that we apply to files in the AD Portal.
Once the wiki details and Dataset items are finalized, create a stable Dataset version. Please provide an informative comment when you create a stable version – e.g., “Dataset for manuscript submission April 2022” or “Finalized analysis for Smith et al. 2022 publication”. If you need to make additional changes after you create a stable version, go back to the draft version, make changes, and create a new stable version with a new informative comment.
From your stable version, go to "Dataset Tools" --> "Create a DOI" and mint the DOI for the stable Dataset version. DOI metadata becomes public information, so make sure to enter manuscript-relevant information. See this example. Then add the DOI to the following manuscript Data Availability statement and provide the complete statement with your manuscript submission:

<data, analysis output, tools (describe content)> are available via the AD Knowledge Portal (https://adknowledgeportal.org). The AD Knowledge Portal is a platform for accessing data, analyses, and tools generated by the Accelerating Medicines Partnership (AMP-AD) Target Discovery Program and other National Institute on Aging (NIA)-supported programs to enable open-science practices and accelerate translational learning. The data, analyses and tools are shared early in the research cycle without a publication embargo on secondary use. Data is available for general research use according to the following requirements for data access and data attribution (https://adknowledgeportal.synapse.org/Data%20Access).
For access to content described in this manuscript see: <dataset DOI>
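If you prepare statements for several manuscripts, filling in the two placeholders can be scripted. The sketch below copies the statement text above into a template; the function name and the example DOI are hypothetical and only illustrate the substitution.

```python
STATEMENT_TEMPLATE = (
    "{content} are available via the AD Knowledge Portal "
    "(https://adknowledgeportal.org). The AD Knowledge Portal is a platform "
    "for accessing data, analyses, and tools generated by the Accelerating "
    "Medicines Partnership (AMP-AD) Target Discovery Program and other "
    "National Institute on Aging (NIA)-supported programs to enable "
    "open-science practices and accelerate translational learning. The data, "
    "analyses and tools are shared early in the research cycle without a "
    "publication embargo on secondary use. Data is available for general "
    "research use according to the following requirements for data access "
    "and data attribution "
    "(https://adknowledgeportal.synapse.org/Data%20Access).\n"
    "For access to content described in this manuscript see: {doi}"
)


def build_statement(content, dataset_doi):
    """Fill the content description and Dataset DOI into the statement."""
    content = content.strip()
    dataset_doi = dataset_doi.strip()
    if not content or not dataset_doi:
        raise ValueError("Both a content description and a DOI are required.")
    return STATEMENT_TEMPLATE.format(content=content, doi=dataset_doi)


# Hypothetical values; use the DOI minted for your stable Dataset version.
print(build_statement("Data and analysis outputs",
                      "https://doi.org/10.7303/syn00000000.1"))
```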

Contact the AD-DCC data liaison if you have questions! We are happy to help.

Notes:

Manuscript-Related Analysis Files that you upload will not be curated by the DCC and can only be uploaded if they meet open-access governance criteria. These can include:

Any files that contain summary-level data, such as differential expression results, summary statistics, or model parameters or estimates (must NOT contain individual-level data such as individual identifiers, sequence data, gene counts, or clinical data)
Code or processing scripts
Other manuscript-related files, such as supplementary tables, protocols, images, or diagrams, that help contextualize your analysis and that do not contain any identifying or sensitive information
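Before uploading, a quick local check can catch obvious cases of individual-level data. The sketch below is a heuristic only and does not replace the AD-DCC's review. It scans the header row of CSV and TSV files for column names that suggest per-individual records. The column names individualID and specimenID are assumptions about how identifiers might be labeled in your files, so adjust the set as needed.

```python
import csv
from pathlib import Path

# Assumed column names that suggest individual-level data; adjust to your files.
SUSPECT_COLUMNS = {"individualid", "specimenid"}


def flag_individual_level_files(folder):
    """Return {relative file path: suspect columns} for delimited files in folder."""
    root = Path(folder)
    flagged = {}
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in {".csv", ".tsv"}:
            continue
        delimiter = "\t" if suffix == ".tsv" else ","
        with open(path, newline="", encoding="utf-8") as handle:
            # Read only the first row; an empty file yields an empty header.
            header = next(csv.reader(handle, delimiter=delimiter), [])
        hits = sorted(c for c in header if c.strip().lower() in SUSPECT_COLUMNS)
        if hits:
            flagged[path.relative_to(root).as_posix()] = hits
    return flagged


# Hypothetical usage with a placeholder directory name:
# print(flag_individual_level_files("my_manuscript_outputs"))
```

An empty result does not prove a file is safe to share: identifiers can appear under other column names or in other file formats.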