Summary
The method described under "Lv Other assemblies" was applied to these datasets:
Accession | Sample |
SRR531952 | embryo 0h |
SRR531949 | embryo 10h |
SRR531860 | embryo 18h |
SRR531853 | embryo 24h |
SRR531948 | embryo 24h |
SRR532074 | embryo 30h |
SRR531956 | embryo 40h |
SRR531964 | embryo 48h |
SRR531954 | embryo 56h |
SRR531996 | embryo 64h |
SRR531950 | embryo 72h |
SRR532151 | larva four-arm stage |
SRR532055 | larva vestibular invagination stage |
SRR532143 | larva pentagonal disc stage |
SRR533746 | larva tube-foot protrusion stage |
SRR531957 | post-metamorphosis |
SRR531953 | adult tissue coelomocyte |
SRR531955 | adult tissue gut |
SRR532046 | adult tissue radial nerve |
SRR532121 | adult tissue testes |
SRR531958 | adult tissue ovary |
The resulting Sp_evigene_mRNA and Sp_evigene_pep databases contain 80545 entries. The BUSCO scores for the peptides are:
C (complete) | S (single-copy) | D (duplicated) | F (fragmented) | M (missing) |
97.3% | 92.3% | 5.0% | 0.3% | 2.4% |
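The BUSCO categories are related by simple arithmetic: complete (C) is the sum of single-copy (S) and duplicated (D), and C, fragmented (F) and missing (M) together account for all BUSCOs searched. The following is a minimal Python sketch that checks a row of percentages against those two identities. The function name and the rounding tolerance are illustrative assumptions, not part of BUSCO itself.

```python
def busco_consistent(c, s, d, f, m, tol=0.15):
    """Check C = S + D and C + F + M = 100, allowing for one-decimal rounding.

    The tolerance `tol` is an assumed allowance for rounding in the report.
    """
    return abs(c - (s + d)) <= tol and abs(c + f + m - 100.0) <= tol


# Evigene peptide scores from the table above.
evigene = dict(c=97.3, s=92.3, d=5.0, f=0.3, m=2.4)
print(busco_consistent(**evigene))
```

A check like this is mainly useful for catching transcription errors when scores are copied by hand into a table.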
Summary
The 'best-transcript-set'-from-many-libraries method was used with Trinity 2.8.3. The resulting Sp_common_mRNA and Sp_common_pep databases contain the best common representation of each gene, as determined from Trinity 2.8.3 runs over many RNA-seq libraries. There are 29742 entries in each file, or very roughly one representative per gene. In brief, the method attempts to find the most commonly used ends and to group together splicing variants of the same gene, so that a single representative can be selected from each group. This produces about 2.7X fewer sequences than Evigene (29742 versus 80545), mostly by removing near duplicates.
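As a rough illustration of picking "the most commonly used ends" within a group, the toy Python sketch below counts how often each (start, end) pair is seen for a gene group across libraries and keeps the most frequent pair. It is not the pipeline's actual code: the data, the grouping by a ready-made gene label, the coordinates, and the function name `pick_representatives` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (gene_group, library, start, end).
# Real grouping of splicing variants is far more involved; this is toy data.
observations = [
    ("geneA", "lib1", 10, 950),
    ("geneA", "lib2", 10, 950),
    ("geneA", "lib3", 42, 950),
    ("geneB", "lib1", 5, 600),
    ("geneB", "lib2", 5, 640),
    ("geneB", "lib3", 5, 640),
]


def pick_representatives(obs):
    """For each gene group, return the (start, end) pair observed most often."""
    by_group = defaultdict(Counter)
    for group, _library, start, end in obs:
        by_group[group][(start, end)] += 1
    # most_common(1) yields the single highest-count (pair, count) entry.
    return {g: counts.most_common(1)[0][0] for g, counts in by_group.items()}


print(pick_representatives(observations))
```

Keeping one entry per group is what reduces the sequence count; here six observations collapse to two representatives.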
The transcriptome has NCBI accession number GHFM01000000. A cutoff parameter of 60% was used, chosen on the basis of this table of BUSCO (peptide) scores:
Cutoff (%) | C (complete) | S (single-copy) | D (duplicated) | F (fragmented) | M (missing) |
100 | 98.5% | 92.3% | 6.2% | 0.2% | 1.3% |
90 | 98.6% | 93.7% | 4.9% | 0.2% | 1.2% |
80 | 98.6% | 93.8% | 4.8% | 0.2% | 1.2% |
60 | 98.6% | 93.8% | 4.8% | 0.2% | 1.2% |
50 | 98.3% | 93.5% | 4.8% | 0.2% | 1.5% |
40 | 97.6% | 93.1% | 4.5% | 0.2% | 2.2% |
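The comparison in this table can be sketched in a few lines of Python. The ranking rule used here (highest complete score, then highest single-copy score) is an assumption for illustration; the page does not state the exact criterion, nor why 60 was preferred over 80, which scores identically.

```python
# BUSCO peptide scores by cutoff, copied from the table above: (C, S, D, F, M).
scores = {
    100: (98.5, 92.3, 6.2, 0.2, 1.3),
    90: (98.6, 93.7, 4.9, 0.2, 1.2),
    80: (98.6, 93.8, 4.8, 0.2, 1.2),
    60: (98.6, 93.8, 4.8, 0.2, 1.2),
    50: (98.3, 93.5, 4.8, 0.2, 1.5),
    40: (97.6, 93.1, 4.5, 0.2, 2.2),
}


def best_cutoffs(table):
    """Return the cutoffs with the highest complete (C), then single-copy (S), score."""
    top = max((c, s) for c, s, *_ in table.values())
    return sorted(k for k, (c, s, *_) in table.items() if (c, s) == top)


print(best_cutoffs(scores))
```

Under this assumed rule the cutoffs 60 and 80 tie, so the chosen value of 60 sits within the best-scoring range rather than being a unique optimum.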