Manifest CSV¶
The manifest is a CSV file with file locations and metadata used to bulk upload and download files in Synapse. It is the standard manifest format used by Project.sync_from_synapse, Project.sync_to_synapse, Folder.sync_from_synapse, Folder.sync_to_synapse, the Synapse UI download cart, and the synapse get-download-list CLI command.
Replaces legacy TSV manifest
This CSV manifest replaces the legacy TSV manifest (SYNAPSE_METADATA_MANIFEST.tsv) produced by synapseutils.syncFromSynapse. The syncFromSynapse and syncToSynapse utility functions are deprecated and will be removed in v5.0.0. Use Project.sync_from_synapse / Folder.sync_from_synapse and Project.sync_to_synapse / Folder.sync_to_synapse instead. See the legacy TSV manifest documentation for details on the old format.
Manifest file format¶
The manifest is a comma-separated values (CSV) file with one row per file and one column per attribute of that file. The minimum required columns for uploading are path and parentId, where path is the local file path and parentId is the Synapse ID of the project or folder to which the file is uploaded.
Values that contain commas are automatically quoted (e.g., "hello, world"). This is handled by the standard CSV format using QUOTE_MINIMAL quoting.
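As an illustration of the quoting behavior, the following minimal sketch writes two manifest rows with Python's standard csv module. The paths, Synapse IDs, and the description annotation column are hypothetical values chosen for the example, and the sketch makes no claim about how the Synapse client writes manifests internally.

```python
import csv
import io

# Hypothetical rows: the paths, IDs, and the "description" column are made up.
rows = [
    {"path": "/path/to/file1.txt", "parentId": "syn1235", "description": "hello, world"},
    {"path": "/path/to/file2.txt", "parentId": "syn1235", "description": "no commas here"},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["path", "parentId", "description"],
    quoting=csv.QUOTE_MINIMAL,  # quote only the values that need quoting
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(rows)

manifest_text = buffer.getvalue()
print(manifest_text)
```

Only the value that contains a comma is wrapped in double quotes; the other values are written unchanged.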
Required fields for upload¶
| Field | Meaning | Example |
|---|---|---|
| path | local file path or URL | /path/to/local/file.txt |
| parentId | Synapse ID of parent | syn1235 |
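Before uploading, it can be useful to check that a manifest has both required columns. The helper below is a hypothetical sketch, not part of the Synapse client, and it assumes that column names are matched exactly and case-sensitively.

```python
import csv
import io

REQUIRED_COLUMNS = ("path", "parentId")


def missing_required_columns(manifest_text: str) -> list[str]:
    """Return the required upload columns that are absent from the header."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    header = reader.fieldnames or []
    return [column for column in REQUIRED_COLUMNS if column not in header]


# A manifest that has parentId but no path column (hypothetical content).
incomplete = "ID,name,parentId\nsyn2345,Example_file,syn1235\n"
print(missing_required_columns(incomplete))
```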
Column renamed from legacy TSV format
The legacy TSV manifest used a column named parent. The CSV manifest uses parentId instead, which is consistent with the Synapse REST API field name. If you are migrating an existing TSV manifest to CSV, rename the parent column to parentId.
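The rename can be scripted. The following sketch converts tab-separated manifest text to CSV and renames the parent column; the function name is hypothetical, and the sketch handles only the delimiter and the column rename. Any other differences between the legacy and CSV formats are not addressed here.

```python
import csv
import io


def tsv_manifest_to_csv(tsv_text: str) -> str:
    """Convert legacy TSV manifest text to CSV, renaming 'parent' to 'parentId'."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return ""
    header = ["parentId" if column == "parent" else column for column in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])  # data rows are copied unchanged
    return out.getvalue()


# Hypothetical legacy manifest with one annotation column.
legacy = "path\tparent\tannot1\n/path/file1.txt\tsyn1243\taaaa, bbbb\n"
converted = tsv_manifest_to_csv(legacy)
print(converted)
```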
Standard fields¶
These columns are recognized by sync_to_synapse and have specific meaning. Any of them may be present in the manifest, but only path and parentId are required for upload.
| Field | Meaning | Example |
|---|---|---|
| path | local file path or URL | /path/to/local/file.txt |
| parentId | Synapse ID of parent container | syn1235 |
| ID | Synapse entity ID | syn2345 |
| name | name of file in Synapse | Example_file |
| synapseStore | whether to upload the file | True |
| contentType | content type of file, overriding the default | text/html |
| forceVersion | whether to update version | False |
| activityName | name of activity in provenance | Ran normalization |
| activityDescription | text description of what was done | Ran algorithm xyz with parameters... |
| used | list of items used to generate file | syn1235;/path/to_local/file.txt |
| executed | list of items executed | https://github.org/;/path/to_local/code.py |
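The used and executed examples above hold several items in one cell, separated by semicolons. The sketch below splits such a cell into a list. It assumes the semicolon is the only separator, as in the examples, and split_provenance is a hypothetical helper rather than a function of the Synapse client.

```python
def split_provenance(cell: str) -> list[str]:
    """Split a 'used' or 'executed' cell into its individual items."""
    # Assumption: items are separated by ';' as in the table examples above.
    return [item.strip() for item in cell.split(";") if item.strip()]


used_items = split_provenance("syn1235;/path/to_local/file.txt")
print(used_items)
```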
Metadata fields (ignored during upload)¶
These columns are present in manifests generated by the Synapse UI download cart and synapse get-download-list CLI. They are ignored by sync_to_synapse and are not treated as annotations.
| Field | Meaning |
|---|---|
| error | any error in downloading file |
| versionNumber | version of the file |
| dataFileSizeBytes | size of the file in bytes |
| createdBy | user who created the file |
| createdOn | date the file was created |
| modifiedBy | user who last modified |
| modifiedOn | date last modified |
| synapseURL | URL to the file in Synapse |
| dataFileMD5Hex | MD5 hash of the file |
Annotations¶
Any column that is not one of the standard or metadata fields described above is interpreted as an annotation on the file.
Adding annotations to each row:
| path | parentId | annot1 | annot2 | annot3 | annot4 | annot5 |
|---|---|---|---|---|---|---|
| /path/file1.txt | syn1243 | bar | 3.1415 | "aaaa, bbbb" | "[14,27,30]" | "Annotation, with a comma" |
| /path/file2.txt | syn12433 | baz | 2.71 | value_1 | "[1,2,3]" | test 123 |
| /path/file3.txt | syn12455 | zzz | 3.52 | value_3 | "[42,56,77]" | a single annotation |
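To see which columns of a manifest would be treated as annotations, the header can be compared against the two field lists on this page. This is a sketch under the assumption that column names are matched exactly and case-sensitively; annotation_columns is a hypothetical helper, and the client's own logic may differ.

```python
# Field names copied from the "Standard fields" and "Metadata fields" tables.
STANDARD_FIELDS = {
    "path", "parentId", "ID", "name", "synapseStore", "contentType",
    "forceVersion", "activityName", "activityDescription", "used", "executed",
}
METADATA_FIELDS = {
    "error", "versionNumber", "dataFileSizeBytes", "createdBy", "createdOn",
    "modifiedBy", "modifiedOn", "synapseURL", "dataFileMD5Hex",
}


def annotation_columns(header: list[str]) -> list[str]:
    """Return the columns that are neither standard nor metadata fields."""
    return [
        column for column in header
        if column not in STANDARD_FIELDS and column not in METADATA_FIELDS
    ]


header = ["path", "parentId", "annot1", "createdBy", "collection_date"]
print(annotation_columns(header))
```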
Multiple values of annotations per key¶
Multiple values for a single annotation are supported, but use them sparingly because they make the data harder to manage.
A multi-value annotation is written as a comma-separated list surrounded by brackets [].
Because the manifest is a CSV file, multi-value annotations that contain commas are automatically quoted. For example, [a,b,c] will appear in the CSV as "[a,b,c]".
This is an annotation with 3 values:
| path | parentId | annot1 |
|---|---|---|
| /path/file1.txt | syn1243 | "[a,b,c]" |
This is an annotation with 1 value (no brackets):
| path | parentId | annot1 |
|---|---|---|
| /path/file1.txt | syn1243 | my sentence with commas |
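The difference between the two cases can be sketched as follows. The function is a simplified, hypothetical parser for illustration only: it treats a cell wrapped in brackets as a list and anything else as a single value, keeps every value as a string, and does not handle values that themselves contain commas or quotes. It is not the parser used by the Synapse client.

```python
import csv
import io


def parse_annotation(cell: str) -> list[str]:
    """Return the values of an annotation cell as a list of strings."""
    text = cell.strip()
    if len(text) >= 2 and text.startswith("[") and text.endswith("]"):
        inner = text[1:-1]
        # Bracketed cell: split the contents on commas.
        return [value.strip() for value in inner.split(",")] if inner.strip() else []
    # No brackets: the whole cell is one value.
    return [text]


manifest_text = 'path,parentId,annot1\n/path/file1.txt,syn1243,"[a,b,c]"\n'
row = next(csv.DictReader(io.StringIO(manifest_text)))
print(parse_annotation(row["annot1"]))
print(parse_annotation("my sentence with commas"))
```

The csv module removes the surrounding double quotes when reading, so the parser sees [a,b,c] rather than "[a,b,c]".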
Dates in the manifest file¶
Dates in a generated manifest are always written in ISO 8601 format, in UTC and without milliseconds. For example: 2023-12-20T16:55:08Z.
Other ISO 8601 formats are recognized on input. For example, a datetime with an explicit UTC offset such as 2023-12-20 23:55:08-07:00 is a valid value. However, sync_from_synapse always writes dates in the UTC format shown above.
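The conversion from an offset datetime to the UTC form can be sketched with the standard library. The function name is hypothetical, and treating a value without any offset as UTC is an assumption made for this example, not documented client behavior.

```python
from datetime import datetime, timezone


def to_manifest_date(value: str) -> str:
    """Normalize an ISO 8601 datetime string to UTC without milliseconds."""
    if value.endswith("Z"):
        # Older Python versions do not accept a trailing 'Z' in fromisoformat.
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption for this sketch: values without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    utc = parsed.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_manifest_date("2023-12-20 23:55:08-07:00"))  # 2023-12-21T06:55:08Z
```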
Manifest sources¶
The CSV manifest format is shared across multiple tools:
| Source | Filename | Format |
|---|---|---|
| Project.sync_from_synapse / Folder.sync_from_synapse | manifest.csv | CSV |
| Synapse UI download cart | manifest.csv | CSV |
| CLI synapse get-download-list | manifest_\<timestamp>.csv | CSV |
A manifest generated by any of these sources can be used as input to sync_to_synapse, provided the path column is present with valid local file paths. Manifests from the Synapse UI and CLI do not include a path column by default, so users must add it before uploading.
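Adding the path column can be scripted. The sketch below assumes that the manifest has a name column and that each file was saved under that name in a single download directory. Both are assumptions for illustration, and add_path_column and the directory are hypothetical.

```python
import csv
import io
import os


def add_path_column(manifest_text: str, download_dir: str) -> str:
    """Return manifest text with a 'path' column built from the 'name' column."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    existing = [column for column in (reader.fieldnames or []) if column != "path"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["path"] + existing,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Assumption: the local file name matches the 'name' column.
        row["path"] = os.path.join(download_dir, row["name"])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical manifest without a path column.
cart_manifest = "ID,name,parentId,versionNumber\nsyn5001,file1.txt,syn1243,1\n"
with_paths = add_path_column(cart_manifest, "/data/downloads")
print(with_paths)
```

Check the resulting paths against the files on disk before uploading, since a wrong path would point the upload at a file that does not exist.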
Example manifest file¶
| path | parentId | ID | name | annot1 | annot2 | collection_date | used | executed |
|---|---|---|---|---|---|---|---|---|
| /path/file1.txt | syn1243 | syn5001 | file1.txt | bar | 3.1415 | 2023-12-04T07:00:00Z | syn124;/path/file2.txt | https://github.org/foo/bar |
| /path/file2.txt | syn12433 | syn5002 | file2.txt | baz | 2.71 | 2001-01-01T08:00:00Z | https://github.org/foo/baz | |
| /path/file3.txt | syn12455 | syn5003 | file3.txt | zzz | 3.52 | 2023-12-04T07:00:00Z | https://github.org/foo/zzz | |
References¶
- Project.sync_from_synapse
- Project.sync_to_synapse
- Folder.sync_from_synapse
- Folder.sync_to_synapse
- Manifest TSV (legacy)
- Managing custom metadata at scale