Welcome to ASeeDB!

User Guide


ASeeDB is a database for analyzing the alternative splicing of cancer genes across multiple species, together with their evolution, expression and other gene function information. So far the database covers six species (including human, mouse, rat, platypus and zebrafish). In the near future we plan to add more species and extend ASeeDB to all known genes.

This page introduces new users to ASeeDB and describes the basic retrieval procedures.

1 Database Generation

2 Search

Fig1

To look up a specific gene in our database, enter its gene name (e.g. ACTA2), Ensembl gene ID (e.g. ENSG00000107796) or NCBI UniGene ID (e.g. 23373) in the first form shown above (Fig 1a). You can also restrict the search to a single species using the species form.
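As a rough illustration of how the three accepted query types differ, the sketch below guesses the kind of identifier from its shape. The function name `classify_query` and the patterns are assumptions made for this example, not the database's actual implementation: an Ensembl gene ID is assumed to be a prefix starting with ENS, the letter G and 11 digits; purely numeric input is treated as an NCBI UniGene ID; anything else is treated as a gene name.

```python
import re

# Assumed pattern: "ENS" + optional species letters + "G" + 11 digits,
# e.g. ENSG00000107796 for a human gene.
ENSEMBL_GENE_RE = re.compile(r"^ENS[A-Z]*G\d{11}$")


def classify_query(query: str) -> str:
    """Guess which kind of identifier the user typed (hypothetical helper)."""
    text = query.strip()
    if ENSEMBL_GENE_RE.match(text.upper()):
        return "ensembl_gene_id"
    if text.isdigit():
        return "unigene_id"
    return "gene_name"


# The three example queries given in this guide.
for example in ("ACTA2", "ENSG00000107796", "23373"):
    print(example, "->", classify_query(example))
```

A real search form would more likely try the query against each identifier column in turn; guessing from the shape is only a cheap first step.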

3 Search Result

Fig2

After you click the Submit button, you will see a result page like the one shown above (Fig 2). If you selected all species on the search page, the result page lists all orthologous genes (note that we include only orthologous genes marked 1 to 1 in the Ensembl orthologues database). If you selected one of the species we provide, you will get only the gene for that species.

Each gene listed in the picture above links to detailed information about that gene. Click the link of the gene you are interested in to open its specific result page.
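The 1 to 1 restriction mentioned above can be pictured as a simple filter over homology records. The sketch below is only an illustration: the record layout and the helper name `one_to_one_orthologues` are assumptions, the gene names are made-up placeholders, and the label `ortholog_one2one` is the homology type Ensembl uses for 1 to 1 orthologues.

```python
def one_to_one_orthologues(homologies):
    """Keep only homology records typed as one-to-one orthologues."""
    return [h for h in homologies if h["type"] == "ortholog_one2one"]


# Made-up placeholder records, not real database content.
records = [
    {"human_gene": "geneA", "species": "mouse", "type": "ortholog_one2one"},
    {"human_gene": "geneB", "species": "mouse", "type": "ortholog_one2many"},
    {"human_gene": "geneA", "species": "rat", "type": "ortholog_one2one"},
]

for record in one_to_one_orthologues(records):
    print(record["human_gene"], record["species"])
```

Under this filter, a gene with several copies in another species is dropped, which is one reason a gene found elsewhere may be missing from the result list.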

4 Result

After you click a gene link on the search result page, a gene-specific result page is displayed. Because it contains a lot of information, we explain it part by part.

a) Gene basic information

Fig3

This part shows the basic information about your query gene (e.g. gene name, gene ID, organism, genomic location, strand, transcript). All the words in green are web links. For example, the genomic sequence links to a page containing the gene's genomic sequence in FASTA format, the Ensembl transcript ID links to a page with a transcript table showing the transcript's exon information, and the Ensembl protein ID links to the protein sequence, in FASTA format, translated from the corresponding transcript in the form.

b) Exon Region

Fig4

Exon regions are defined by identifying clusters of overlapping exons and combining each cluster into a 'block' of exon content (Fig 4). This makes it convenient to compare the exons of different species at the gene level.
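The clustering of overlapping exons can be sketched as a standard interval merge. This is a minimal sketch, not the code used to build the database: it assumes exons are given as inclusive (start, end) coordinates on the same strand of one gene, that exons which merely touch without overlapping stay separate, and the coordinates are made up.

```python
def exon_regions(exons):
    """Merge overlapping (start, end) exons into exon regions."""
    regions = []
    for start, end in sorted(exons):
        if regions and start <= regions[-1][1]:
            # Overlaps the current region: extend its end if needed.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            # No overlap: this exon opens a new region.
            regions.append([start, end])
    return [tuple(region) for region in regions]


# Made-up exon coordinates pooled from several transcripts of one gene.
exons = [(100, 200), (150, 260), (400, 500), (450, 480), (700, 800)]
print(exon_regions(exons))  # [(100, 260), (400, 500), (700, 800)]
```

Numbering the merged blocks from the start of the gene then gives each exon region an index that can be compared between species.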

Fig5

This exon region table shows your query gene ID and the number of exon regions in the query gene (Fig 5).

c) Exon Region Annotation

Fig6a

Fig6b

This part shows Exon Region annotation: we analysed the AS type of each exon region, searched for exonic splicing enhancers (ESE), and searched for domains with InterProScan. We hope this part provides comprehensive annotation of each Exon Region (Fig 6a). We also visualize the results so that users can study each exon region more intuitively and focus on exon regions with functions (Fig 6b).

d) Exon Region Alignment

Fig7 a

Fig7 b

This part shows the relationship between the exons (exon regions) of your query gene in all six species, which may suggest how the gene evolved.

We use the NCBI BLAST tool (ftp://ftp.ncbi.nih.gov/blast) to align sequences against the orthologous genes provided by Ensembl or NCBI, and from the alignments we derive the relationship between the exons of genes in different species. The table above shows the relationship between exon regions in the six species (Fig 7a). In the table, each ID in green (for example 'ENSG00000114626') links to the original Ensembl exon IDs that make up the exon region. -1 stands for an intron (e-value < 1e-3), 0 stands for no aligned exon (e-value > 1e-3), and any other number is the Exon Region number (e-value < 1e-3).

To show the relationship between exons in orthologous genes more clearly, we draw a graph from the data in the table above (Fig 7b). In the graph, a green block stands for an exon region in the gene, and a grey block stands for an intron sequence corresponding to an exon region in the orthologous genes.
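The cell values of the alignment table described above (-1, 0 or an Exon Region number) can be reproduced with a small decision function. The sketch below is an assumption about how such a value could be derived from a best BLAST hit, not the pipeline's actual code: the `hit` dictionary layout and the name `relation_code` are hypothetical, and an e-value exactly equal to 1e-3, which the guide does not cover, is treated here as no alignment.

```python
E_VALUE_CUTOFF = 1e-3  # threshold quoted in this guide


def relation_code(hit):
    """Return the table value for one exon region in one orthologous gene.

    `hit` is None when BLAST reported nothing; otherwise it is a dict with
    'evalue', 'target' ('exon' or 'intron') and 'region' (Exon Region number).
    """
    if hit is None or hit["evalue"] >= E_VALUE_CUTOFF:
        return 0  # no aligned exon
    if hit["target"] == "intron":
        return -1  # aligns to intron sequence in the orthologue
    return hit["region"]  # number of the aligned Exon Region


# Made-up hits for illustration.
print(relation_code({"evalue": 1e-20, "target": "exon", "region": 4}))
print(relation_code({"evalue": 1e-8, "target": "intron", "region": None}))
print(relation_code({"evalue": 0.5, "target": "exon", "region": 2}))
print(relation_code(None))
```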

e) Expression Information

1) Data

Tissue      Species  Total reads (paired-end)  Read length (bp)
brain       Mouse    19,037,324                100
cerebellum  Mouse    10,062,700                115
brain       Rat      12,285,034                115
cerebellum  Rat      10,870,015                114

Fig8 a

We use data from the NCBI SRA database (http://www.ncbi.nlm.nih.gov/sra/) to calculate gene, transcript and exon region expression (Fig 8a). In addition, we sequenced brain and cerebellum samples ourselves: the mouse and rat (brain and cerebellum) high-throughput sequencing data were generated on the standard Illumina/Solexa sequencing platform at the China National Human Genome Center (Shanghai). These data are described above (Fig 8a).

2) Gene Expression

Fig8 b

This histogram shows the expression of the gene in a variety of tissues, and the table gives the exact FPKM in each tissue, making it easy to compare the gene's expression across tissues (Fig 8b).
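For readers unfamiliar with the unit, FPKM means fragments per kilobase of transcript per million mapped fragments. The sketch below shows the usual definition only; the guide does not state which software computed the values in ASeeDB, and the function name and the example numbers are made up for illustration.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of feature per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2,000 bp feature
# in a library of 20 million mapped fragments.
print(fpkm(500, 2000, 20_000_000))  # 12.5
```

Because the value is normalized by both feature length and library size, FPKM values of the same gene can be compared between tissues sequenced to different depths.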

3) Transcript Expression

Fig8 c

This line plot shows the expression of transcripts in a variety of tissues, and the table gives each transcript's FPKM in each tissue. Transcripts in grey have no protein product according to the Ensembl transcript annotation. Clicking a legend entry hides the corresponding line, which is useful when the gene has many transcripts. From this part users can see which transcript is dominant in a specific tissue, which may be helpful in experiments (Fig 8c).
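If you copy the FPKM table out of the page, finding the dominant transcript of each tissue is a simple maximum per column. The sketch below assumes the table has been loaded into a nested dictionary; the transcript names, the FPKM values and the helper name `dominant_transcripts` are made up for illustration.

```python
def dominant_transcripts(fpkm_table):
    """Map each tissue to the transcript with the highest FPKM in it."""
    tissues = {tissue for values in fpkm_table.values() for tissue in values}
    result = {}
    for tissue in sorted(tissues):
        # A transcript with no value for a tissue is treated as FPKM 0.
        result[tissue] = max(
            fpkm_table, key=lambda tx: fpkm_table[tx].get(tissue, 0.0)
        )
    return result


# Made-up FPKM values for two transcripts in two tissues.
table = {
    "transcript_1": {"brain": 12.0, "cerebellum": 3.5},
    "transcript_2": {"brain": 4.2, "cerebellum": 9.8},
}
print(dominant_transcripts(table))
# {'brain': 'transcript_1', 'cerebellum': 'transcript_2'}
```

When two transcripts tie, this sketch reports whichever appears first in the table.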

4) Exon Region Expression

This line plot shows the expression of exon regions in a variety of tissues, and the table gives each exon region's FPKM in each tissue (Fig 8d).

Notice:
You may find that some genes of a species are present in NCBI but not in our database. This can happen for the following reasons:
  1. Our data are based on Ensembl release 68. More genes may become available as the data are updated.
  2. Our database only includes orthologous genes that have a 1 to 1 relationship with human genes in Ensembl.