Shouldn’t the ensembl_gene, ensembl_transcript, and ensembl_translation tables also have a chromosome column? I see that each of them has a seq_region_start and a seq_region_end column but no chromosome column. I suspect some people will need it for various applications, or simply for plotting.
Thank you for bringing this point up for discussion. Actually, we have already included the common metadata in the metadata tables (Gene, Transcript and Protein). The current database contains the ensembl_gene, ensembl_transcript, and ensembl_translation tables, which are due to be upgraded to the gene, transcript, and protein tables, respectively.
On a semi-related note, if I compare the chr, start and end information in the gene table with the Ensembl website, I find slightly different values. Ensembl is using GRCh38, but I’m not sure which reference we are using. Here is an example:
Thanks Pramod. I think this would be really important to document, since some people will want to Google or look up these IDs and would run into the same discrepancies. I can definitely move forward with the tasks; I just wanted to point this out.
Thanks @joreyna for raising this point. We revisited the choice between GRCh38 and GRCh37. As you also pointed out, GRCh38 is the obvious choice since it is the default option at Ensembl.
We will be using GRCh38 from now on, and the tables (gene, transcript and protein) will be available for access very soon. I will post updates here.
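For anyone who wants to check the coordinate discrepancy themselves in the meantime, here is a minimal sketch in Python. It assumes a row from the gene table can be read as a dict with chr, start and end keys (the exact column names in our tables may differ), and it uses the public Ensembl REST lookup endpoint, where the main server serves GRCh38 and the grch37 server serves GRCh37. The helper names are made up for illustration and are not part of our codebase.

```python
import json
import urllib.request

# Public Ensembl REST servers: the main one serves GRCh38,
# the dedicated archive server serves GRCh37.
REST_SERVERS = {
    "GRCh38": "https://rest.ensembl.org",
    "GRCh37": "https://grch37.rest.ensembl.org",
}


def lookup_url(stable_id, assembly="GRCh38"):
    """Build the lookup URL for an Ensembl stable ID on the given assembly."""
    base = REST_SERVERS[assembly]
    return f"{base}/lookup/id/{stable_id}?content-type=application/json"


def fetch_coords(stable_id, assembly="GRCh38"):
    """Fetch chromosome, start and end from Ensembl (needs network access)."""
    with urllib.request.urlopen(lookup_url(stable_id, assembly)) as resp:
        rec = json.load(resp)
    return {"chr": rec["seq_region_name"], "start": rec["start"], "end": rec["end"]}


def coord_diff(local, remote):
    """Return the fields where the local row disagrees with Ensembl.

    Values are compared as strings so that chromosome "13" and 13 match.
    """
    return {
        key: (local[key], remote[key])
        for key in ("chr", "start", "end")
        if str(local[key]) != str(remote[key])
    }
```

To use it, call fetch_coords with the same stable ID once per assembly and pass each result to coord_diff together with the row from our gene table. If the diff is empty against one assembly and not the other, that tells you which reference the table was built on.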