Accessing Metadata

taleinav · October 6, 2023, 6:12pm

For those of us looking for TSV data files: The raw datasets contain the subject, specimen, and experimental data.

Is there a simple way to get gene, cell_type, and protein as TSV files? It seems I may be able to get them from https://www.cmi-pb.org/api/gene, https://www.cmi-pb.org/api/cell_type, and https://www.cmi-pb.org/api/protein as JSON files, but TSV would be preferred, and I wanted to make sure these were the correct up-to-date datasets.

As a (possibly related) follow-up question, can you explain how to use the APIs to pull these types of data, and also what the categories Get, Post, Delete, and Patch mean?

Pramod · October 10, 2023, 6:42pm

General/long Answer:

Swagger (now known as OpenAPI) is a specification for describing and building APIs. It provides a standard, language-agnostic interface to RESTful APIs. CMI-PB uses a Swagger-based API to organize its endpoints and provide data access. There are multiple ways to download data from the CMI-PB Swagger-based API.

1) Table-specific URIs: We recommend using the CMI-PB API through the uniform resource identifier (URI) format, i.e., https://www.cmi-pb.org/api/<table_name>, to download data for each table. The database schema is available on the UNDERSTAND THE DATA - CMI-PB Blog page. This method only provides data in JSON format.

2) Using the curl command: The following command returns all rows in the ‘subject’ view that we created and saves them to subject.csv. Documentation on querying/filtering is here (Tables and Views — PostgREST 7.0.1 documentation).

curl -X 'GET' \
  'https://www.cmi-pb.org:443/api/v4/subject' \
  -H 'accept: text/csv' \
  -H 'Range-Unit: items' > subject.csv
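
Since the question was about gene, cell_type, and protein, here is a minimal sketch that wraps the same curl call in a small function so it can be reused per table. It assumes those three tables are served under the same /api/v4/ prefix as ‘subject’ and that the server returns CSV for them as well; neither is confirmed in this thread. The names BASE_URL, table_url, and fetch_table_csv are hypothetical helpers, not part of the CMI-PB API.

```shell
#!/bin/sh
# Assumption: gene, cell_type, and protein live under the same
# /api/v4/ prefix as 'subject'. Adjust BASE_URL if they do not.
BASE_URL='https://www.cmi-pb.org:443/api/v4'

# Hypothetical helper: print the URL for one table or view.
table_url() {
  printf '%s/%s\n' "$BASE_URL" "$1"
}

# Hypothetical helper: download one table as CSV into <table>.csv.
# -sS hides the progress meter but still shows errors;
# -f makes curl exit non-zero on an HTTP error response.
fetch_table_csv() {
  curl -sS -f -X 'GET' "$(table_url "$1")" \
    -H 'accept: text/csv' \
    -H 'Range-Unit: items' \
    -o "$1.csv"
}
```

To download the three tables, one could then run: for t in gene cell_type protein; do fetch_table_csv "$t"; done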

3) Swagger UI: Many APIs described with Swagger provide a Swagger UI, a visual interface where you can see all available endpoints, try them out, and see their responses. You could use this UI to fetch data by hitting the relevant endpoints.

Steps with an example:

Locate the Endpoint: Once you’re on the Swagger UI, you’ll see a list of available endpoints, organized by HTTP method (GET, POST, PUT, etc.). GET reads data, while POST, PATCH, and DELETE create, modify, and remove records. We recommend using only the GET method; the other methods are for internal purposes only. Find the endpoint from which you want to download data. Let’s say you’re looking for an endpoint that fetches subject details.
Click on the Endpoint: When you click on an endpoint in Swagger UI, it expands to show additional details, including the parameters the endpoint accepts, the expected response format, etc.
Enter Parameters (if any): If the endpoint accepts any parameters (e.g., a filter such as biological_sex=eq.Male, or a limit), you’ll see input fields where you can enter them. Fill in the necessary details.
Execute the Request: There should be a “Try it out” button or similar. Click on this button. This will execute the API request right from the browser, and Swagger UI will display the request URL, the response body, response code, and headers.
View/Download the Data: Once you execute the request:
A) The response will be displayed in the Swagger UI itself.
B) You can view the data directly in the browser in JSON or TSV format.

This will return all rows in the ‘subject’ view that we created. Documentation on querying/filtering is here (Tables and Views — PostgREST 7.0.1 documentation)
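
The same kind of filter can also be applied from the command line. This is a rough sketch, assuming the /api/v4/ endpoint accepts standard PostgREST query parameters (column=eq.value filters and limit) and returns CSV when asked. The names filtered_url and fetch_filtered_csv are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
BASE_URL='https://www.cmi-pb.org:443/api/v4'

# Hypothetical helper: join a table name and a PostgREST query string.
filtered_url() {
  printf '%s/%s?%s\n' "$BASE_URL" "$1" "$2"
}

# Hypothetical helper: fetch the filtered rows as CSV on standard output.
# -f makes curl exit non-zero on an HTTP error response.
fetch_filtered_csv() {
  curl -sS -f -X 'GET' "$(filtered_url "$1" "$2")" \
    -H 'accept: text/csv'
}
```

For example, fetch_filtered_csv subject 'biological_sex=eq.Male&limit=5' > male_subjects.csv would request up to five rows where biological_sex equals Male. Keep the query string in single quotes so the shell does not interpret the & character.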

Pramod · October 10, 2023, 7:11pm

The short answer:
We are working on providing all metadata files in TSV format from the download repo here. We will keep you posted.

For now, we recommend using the second option (Using Curl command).