Manifest CSV

The manifest is a CSV file with file locations and metadata used to bulk upload and download files in Synapse. It is the standard manifest format used by Project.sync_from_synapse, Project.sync_to_synapse, Folder.sync_from_synapse, Folder.sync_to_synapse, the Synapse UI download cart, and the synapse get-download-list CLI command.

Replaces legacy TSV manifest

This CSV manifest replaces the legacy TSV manifest (SYNAPSE_METADATA_MANIFEST.tsv) produced by synapseutils.syncFromSynapse. The syncFromSynapse and syncToSynapse utility functions are deprecated and will be removed in v5.0.0. Use Project.sync_from_synapse / Folder.sync_from_synapse and Project.sync_to_synapse / Folder.sync_to_synapse instead. See the legacy TSV manifest documentation for details on the old format.

Manifest file format

The manifest is a comma-separated values (CSV) file with one row per file and one column per attribute of that file. The minimum required columns for uploading are path and parentId, where path is the local file path and parentId is the Synapse ID of the project or folder that the file is uploaded to.

Values that contain commas are automatically quoted (e.g., "hello, world"). The manifest follows the standard CSV format with QUOTE_MINIMAL quoting, so only values that need quoting are quoted.
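The quoting behavior can be reproduced with Python's standard csv module. The following is a minimal sketch, not the client's own writer: the helper name write_manifest and the annotation column note are hypothetical and chosen only for illustration.

```python
import csv
import io


def write_manifest(rows, fieldnames):
    """Return manifest CSV text for a list of row dictionaries."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_MINIMAL,  # quote only values that need it
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# "note" is a made-up annotation column whose value contains a comma.
rows = [
    {
        "path": "/path/to/local/file.txt",
        "parentId": "syn1235",
        "note": "hello, world",
    },
]
manifest_text = write_manifest(rows, ["path", "parentId", "note"])
print(manifest_text)
```

Only the value containing a comma is wrapped in quotes; the path and the Synapse ID are written as-is.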

Required fields for upload

| Field | Meaning | Example |
| --- | --- | --- |
| path | local file path or URL | /path/to/local/file.txt |
| parentId | Synapse ID of parent | syn1235 |

Column renamed from legacy TSV format

The legacy TSV manifest used a column named parent. The CSV manifest uses parentId instead, which is consistent with the Synapse REST API field name. If you are migrating an existing TSV manifest to CSV, rename the parent column to parentId.
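One way to perform this rename is sketched below using only the standard library. The function name tsv_manifest_to_csv is hypothetical, and the sketch assumes the column rename and the delimiter are the only changes needed; any other differences between the two formats are not handled here.

```python
import csv
import io


def tsv_manifest_to_csv(tsv_text):
    """Convert legacy TSV manifest text to CSV text, renaming parent to parentId."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    old_names = reader.fieldnames or []
    new_names = ["parentId" if name == "parent" else name for name in old_names]

    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(new_names)
    for row in reader:
        # Keep the original column order; only the header name changes.
        writer.writerow([row[name] for name in old_names])
    return out.getvalue()


legacy = "path\tparent\tannot1\n/path/file1.txt\tsyn1243\taaaa, bbbb\n"
print(tsv_manifest_to_csv(legacy))
```

Values that contain commas, which needed no quoting in the tab-separated file, are quoted automatically on output.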

Standard fields

These columns are recognized by sync_to_synapse and have specific meaning. Any of these columns may be present in the manifest but only path and parentId are required for upload.

| Field | Meaning | Example |
| --- | --- | --- |
| path | local file path or URL | /path/to/local/file.txt |
| parentId | Synapse ID of parent container | syn1235 |
| ID | Synapse entity ID | syn2345 |
| name | name of file in Synapse | Example_file |
| synapseStore | whether to upload the file | True |
| contentType | content type of the file, overriding the default | text/html |
| forceVersion | whether to update version | False |
| activityName | name of activity in provenance | Ran normalization |
| activityDescription | text description of what was done | Ran algorithm xyz with parameters... |
| used | list of items used to generate file | syn1235;/path/to_local/file.txt |
| executed | list of items executed | https://github.org/;/path/to_local/code.py |

Metadata fields (ignored during upload)

These columns are present in manifests generated by the Synapse UI download cart and synapse get-download-list CLI. They are ignored by sync_to_synapse and are not treated as annotations.

| Field | Meaning |
| --- | --- |
| error | any error in downloading file |
| versionNumber | version of the file |
| dataFileSizeBytes | size of the file in bytes |
| createdBy | user who created the file |
| createdOn | date the file was created |
| modifiedBy | user who last modified |
| modifiedOn | date last modified |
| synapseURL | URL to the file in Synapse |
| dataFileMD5Hex | MD5 hash of the file |

Annotations

Any columns that are not in the standard or metadata fields described above will be interpreted as annotations of the file.

Adding annotations to each row:

| path | parentId | annot1 | annot2 | annot3 | annot4 | annot5 |
| --- | --- | --- | --- | --- | --- | --- |
| /path/file1.txt | syn1243 | bar | 3.1415 | "aaaa, bbbb" | "[14,27,30]" | "Annotation, with a comma" |
| /path/file2.txt | syn12433 | baz | 2.71 | value_1 | "[1,2,3]" | test 123 |
| /path/file3.txt | syn12455 | zzz | 3.52 | value_3 | "[42,56,77]" | a single annotation |
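The rule that every unrecognized column becomes an annotation can be sketched as a simple filter over the header row. This is an illustration rather than the client's implementation: the constant and function names are hypothetical, and the sketch assumes column names are matched exactly, including case.

```python
# Column names copied from the standard and metadata field tables.
STANDARD_FIELDS = {
    "path", "parentId", "ID", "name", "synapseStore", "contentType",
    "forceVersion", "activityName", "activityDescription", "used", "executed",
}
METADATA_FIELDS = {
    "error", "versionNumber", "dataFileSizeBytes", "createdBy", "createdOn",
    "modifiedBy", "modifiedOn", "synapseURL", "dataFileMD5Hex",
}


def annotation_columns(header):
    """Return the header columns that would be treated as annotations."""
    reserved = STANDARD_FIELDS | METADATA_FIELDS
    return [column for column in header if column not in reserved]


header = ["path", "parentId", "annot1", "createdOn", "annot2"]
print(annotation_columns(header))
```

Running such a check on a manifest before uploading is a quick way to spot a misspelled standard column, which would otherwise end up as an annotation.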

Multiple values of annotations per key

Multiple values for a single annotation are supported, but use them sparingly because they make the data harder to manage.

A multi-value annotation is written as a comma-separated list surrounded by brackets ([]).

Because the manifest is a CSV file, multi-value annotations that contain commas are automatically quoted. For example, [a,b,c] will appear in the CSV as "[a,b,c]".

This is an annotation with 3 values:

| path | parentId | annot1 |
| --- | --- | --- |
| /path/file1.txt | syn1243 | "[a,b,c]" |

This is an annotation with 1 value (no brackets):

| path | parentId | annot1 |
| --- | --- | --- |
| /path/file1.txt | syn1243 | my sentence with commas |
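The difference between the two cases can be illustrated with a small parser for a single cell, applied after the CSV reader has already removed the outer quotes. This is a sketch under stated assumptions, not the client's parser: the function name parse_annotation_cell is hypothetical, items are kept as strings, and quoted items inside the brackets are not handled.

```python
import csv
import io


def parse_annotation_cell(cell):
    """Return the list of values held by one annotation cell."""
    text = cell.strip()
    if text.startswith("[") and text.endswith("]"):
        # Bracketed cell: split the inner text on commas.
        inner = text[1:-1]
        return [item.strip() for item in inner.split(",") if item.strip()]
    # No brackets: the whole cell is one value, commas included.
    return [text]


sample = (
    "path,parentId,annot1\n"
    '/path/file1.txt,syn1243,"[a,b,c]"\n'
    '/path/file2.txt,syn1243,"one value, with commas"\n'
)
for row in csv.DictReader(io.StringIO(sample)):
    print(row["path"], parse_annotation_cell(row["annot1"]))
```

The first row yields three values and the second yields one, because only the brackets mark a cell as a list.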

Dates in the manifest file

Dates in the manifest file are always written in ISO 8601 format, in UTC, without milliseconds. For example: 2023-12-20T16:55:08Z.

Dates written in other formats specified by ISO 8601 are also recognized. For example, a datetime with an explicit timezone offset such as 2023-12-20 23:55:08-07:00 is a valid datetime. However, sync_from_synapse always writes dates in the UTC format described above.
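The normalization to UTC can be sketched with the standard datetime module. The function name to_manifest_date is hypothetical, and treating a value without an offset as UTC is an assumption of this sketch, not documented client behavior.

```python
from datetime import datetime, timezone


def to_manifest_date(value):
    """Convert an ISO 8601 datetime string to UTC without fractional seconds."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption of this sketch: a value without an offset is already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_manifest_date("2023-12-20 23:55:08-07:00"))  # 2023-12-21T06:55:08Z
```

Note that datetime.fromisoformat accepts a trailing Z only on Python 3.11 and later; on older versions, replace the Z with +00:00 before parsing.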

Manifest sources

The CSV manifest format is shared across multiple tools:

| Source | Filename | Format |
| --- | --- | --- |
| Project.sync_from_synapse / Folder.sync_from_synapse | manifest.csv | CSV |
| Synapse UI download cart | manifest.csv | CSV |
| CLI synapse get-download-list | manifest_<timestamp>.csv | CSV |

A manifest generated by any of these sources can be used as input to sync_to_synapse, provided the path column is present with valid local file paths. Manifests from the Synapse UI and CLI do not include a path column by default, so users must add it before uploading.
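Adding the missing path column can be done with the standard csv module, as in the sketch below. It rests on assumptions that may not hold for your data: the manifest has a name column, each file was saved under that name in a single local directory, and the function name add_path_column is hypothetical.

```python
import csv
import io
import os


def add_path_column(manifest_text, download_dir):
    """Return manifest CSV text with a leading path column added."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    other_names = [name for name in (reader.fieldnames or []) if name != "path"]
    fieldnames = ["path"] + other_names

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Assumption: the local file is named after the "name" column.
        row["path"] = os.path.join(download_dir, row["name"])
        writer.writerow(row)
    return out.getvalue()


downloaded = "ID,name,parentId\nsyn5001,file1.txt,syn1243\n"
print(add_path_column(downloaded, "/data/downloads"))
```

Before uploading, it is worth checking that every generated path exists on disk, since the upload requires valid local file paths.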

Example manifest file

| path | parentId | ID | name | annot1 | annot2 | collection_date | used | executed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /path/file1.txt | syn1243 | syn5001 | file1.txt | bar | 3.1415 | 2023-12-04T07:00:00Z | syn124;/path/file2.txt | https://github.org/foo/bar |
| /path/file2.txt | syn12433 | syn5002 | file2.txt | baz | 2.71 | 2001-01-01T08:00:00Z | | https://github.org/foo/baz |
| /path/file3.txt | syn12455 | syn5003 | file3.txt | zzz | 3.52 | 2023-12-04T07:00:00Z | | https://github.org/foo/zzz |
