Hi, is the raw sequencing data for the new asexual/sexual assemblies available in SRA?
Can you tell me how I could retrieve the nucleotide sequence for a gene ID (e.g. HB.14.06c) via your website? These gene IDs are specific to S. mediterranea.
By chance, do you mean HB.14.6d instead of HB.14.6c? This sequence can be found by searching at NCBI. You can also find it by searching for HB.14.6d in the SmedSxl_v31 genome browser. We typically use the corresponding smednr sequence, which can be retrieved by clicking on the desired smednr feature in the browser.
Hello SmedGD team,
Thank you for such a great resource. I have a question about the unigenes. As far as I understand, they are generated by combining different assemblies, but can I be sure that these genes represent real proteins that this organism produces? Also, Smed Unigene SMU15026064, for example, does not have any transcripts to support it. What does that mean for this unigene?
Yes, the unigenes were made by combining different assemblies of transcripts generated from RNA-Seq experiments. Since they are derived from experimental data, it is very likely that these genes are expressed in the organism. No transcript is listed as support for this unigene because the unigene does not align to the genome assembly. This happens occasionally and does not mean that it is not a real gene. No genome assembly is perfect, and this one is no exception. The list of supporting transcripts is generated by identifying transcripts that align to the same genomic locus as the unigene. We do this so that, as we align more transcript sequences to the genome assembly, we can add support for the unigenes.
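To illustrate the "same genomic locus" idea, here is a minimal sketch in Python. It is not the SmedGD pipeline: the `Alignment` record, the contig names, the coordinates, and the simple overlap rule are all assumptions made for illustration. It shows why a unigene that does not align to the assembly ends up with an empty support list.

```python
from typing import List, NamedTuple, Optional


class Alignment(NamedTuple):
    """Hypothetical record of one sequence aligned to the genome assembly."""
    seq_id: str   # transcript or unigene ID
    contig: str   # contig/scaffold name in the assembly
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive


def supporting_transcripts(unigene: Optional[Alignment],
                           transcripts: List[Alignment]) -> List[str]:
    """Return IDs of transcripts whose alignment overlaps the unigene's locus.

    Assumption: "same locus" means same contig and overlapping coordinates.
    """
    if unigene is None:
        # The unigene did not align, so there is no locus to compare against.
        return []
    return [
        t.seq_id
        for t in transcripts
        if t.contig == unigene.contig
        and t.start <= unigene.end
        and t.end >= unigene.start
    ]
```

Passing `None` for an unaligned unigene returns an empty list, which mirrors the "no supporting transcripts" display without implying anything about whether the gene is real.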
Hi, which annotation set was used in the recent “Set1 and MLL1/2 Target Distinct Sets of Functionally Different Genomic Loci In Vivo” Cell Reports paper? Looking at the data in GEO, the geneID has the format “SMED30010847”.
Sequences with the SMED3 prefix can be found in the smednr data set (smed_20140614), which can be downloaded from our download page.
I can't access the NT sequences for the IDs SMU15033498 and SMU15030872 (in fact, I can only get the AA sequences). What can I do?
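Once the data set is downloaded, specific IDs can be pulled out with a few lines of Python. This is only a sketch. It assumes the download is a FASTA file and that the sequence ID is the first whitespace-delimited token of each header line. Neither detail is confirmed above, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_by_ids(lines: Iterable[str], wanted: Iterable[str]) -> Dict[str, str]:
    """Return {id: sequence} for records whose ID is in `wanted`.

    Assumption: the ID is the first token of the header line.
    """
    wanted_set = set(wanted)
    selected = {}
    for header, seq in read_fasta(lines):
        tokens = header.split()
        if tokens and tokens[0] in wanted_set:
            selected[tokens[0]] = seq
    return selected
```

For a file on disk, an open file handle can be passed directly as `lines`, for example `select_by_ids(open(path), ["SMED30010847"])`, where `path` is wherever the downloaded file was saved.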
Thanks a lot
I am looking into this and will get back to you.
I have this sorted out. It turns out that the two Smed Unigene IDs you listed do not align to the SmedSxl_v3.1 genome assembly. The sequence retrieval tool only had access to Smed Unigene sequences that aligned to the genome. I have corrected this. Now, even if a Smed Unigene does not align to SmedSxl_v3.1, its nucleotide sequence will be available.
Your Smed Unigene is awesome, but it's a bit painful to search for each annotated gene individually to find its putative homolog (I have more than 1000 hits from MS experiments). Would it be possible to provide a single file that connects the SMU name of each gene to its annotation?
Thanks a lot.
We have made a tab-delimited file available on our downloads page. Please let me know if this has all the SMUnigene annotation information you are looking for.
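For a long list of hits, the tab-delimited file can be loaded once and queried in bulk. The sketch below is a hedged example. The column layout of the file is not described here, so it assumes the SMU ID is in the first column (`id_column=0`) and keeps the remaining fields as-is. The function names are hypothetical.

```python
import csv
from typing import Dict, Iterable, List, Optional, Tuple


def load_annotations(handle: Iterable[str], id_column: int = 0) -> Dict[str, List[str]]:
    """Map each ID to its full row from a tab-delimited annotation file.

    Assumption: the ID sits in column `id_column`; lines starting with '#'
    are treated as comments and skipped.
    """
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) <= id_column or row[0].startswith("#"):
            continue
        table[row[id_column]] = row
    return table


def annotate_hits(hit_ids: Iterable[str],
                  table: Dict[str, List[str]]) -> List[Tuple[str, Optional[List[str]]]]:
    """Pair each hit ID with its annotation row, or None if it is not in the file."""
    return [(hit, table.get(hit)) for hit in hit_ids]
```

Hits that come back paired with `None` are the ones missing from the file, which makes gaps in the annotation easy to spot.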
Smed Unigene Sequence Retrieval from the Unigene pages does not appear to link to nt or aa sequences. Thanks!
I had been using this website with the unique mk4 numbers that denote specific sequences. I would enter a sequence number (for example mk4.001927.02.01), but the current version of the database does not seem to support that method of identifying sequences. Is there a way I could still access these data?
The MAKER search functionality has been repaired. Sorry for any inconvenience.
In the download section, the two links below “Smed Unigenes” seem to point to the same file…
This appears to be fixed. Let us know if you have any more issues or concerns.