Title: | annotate the gene symbols for probes in expression array |
---|---|
Description: | We curated 147 expression arrays from 3 species (human, mouse, rat) and 3 companies (Affymetrix, Illumina, Agilent) by aligning the FASTA sequences of all probes of each platform to their corresponding reference genome, and then annotating them to genes. |
Authors: | Yujia Xiang && Jianming Zeng |
Maintainer: | The package maintainer <[email protected]> |
License: | Artistic-2.0 |
Version: | 0.1.0 |
Built: | 2024-10-27 04:20:23 UTC |
Source: | https://github.com/jmzeng1314/annoprobe |
annoGene returns a data.frame of gene information or writes it to a file (CSV or HTML format). The user supplies a list of genes to annotate, given as "ENSEMBL" or "SYMBOL" identifiers.
annoGene(IDs, ID_type, species = "human", out_file)
IDs |
a list of genes |
ID_type |
the type of input IDs, should be "ENSEMBL" or "SYMBOL" |
species |
one of human, mouse, or rat; default: human |
out_file |
the output filename; should end in ".csv" or ".html" |
a data.frame whose columns contain gene symbols, biotypes, Ensembl IDs and the positions of the genes
IDs <- c("DDX11L1", "MIR6859-1", "OR4G4P", "OR4F5")
ID_type = "SYMBOL"
annoGene(IDs, ID_type)
annoGene(IDs, ID_type,out_file ='tmp.html')
annoGene(IDs, ID_type,out_file ='tmp.csv')
Shows how a gene or a list of genes differs between two groups. A boxplot or a heatmap is drawn. This is just a wrapper around ggpubr and pheatmap.
check_diff_genes(gene, genes_expr, group_list)
gene |
A vector containing all gene IDs of interest. Gene IDs should be gene symbols. |
genes_expr |
An expression matrix; the rownames should be gene symbols. |
group_list |
A vector containing the group information of each sample in the expression matrix |
A figure: a boxplot or a heatmap
attach(GSE95166)
check_diff_genes('NKILA',genes_expr,group_list )
x=DEG$logFC
names(x)=rownames(DEG)
cg=c(names(head(sort(x),100)), names(tail(sort(x),100)))
check_diff_genes(cg,genes_expr,group_list )
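The gene selection in the example above (the most down- and up-regulated genes by logFC) can be tried on toy data without loading any dataset. This is only a sketch: the gene names and logFC values are made up for illustration, and `n` stands in for the 100 used above.

```r
# Toy named vector of log fold changes; names and values are placeholders.
x <- c(g1 = -2.1, g2 = 0.3, g3 = 1.8, g4 = -0.5, g5 = 2.4, g6 = 0.1)

n <- 2                      # number of genes to take from each tail
sorted <- sort(x)           # ascending: most down-regulated first

# Names of the n lowest and n highest logFC genes, as in the example above.
cg <- c(names(head(sorted, n)), names(tail(sorted, n)))
print(cg)
```

The resulting vector of gene symbols is what would be passed as the first argument of check_diff_genes.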
Checks whether the input GPL is in our platform list
checkGPL(GPL = NULL)
GPL |
GPL(GEO platform) number, eg: GPL570 |
returns a boolean value
checkGPL('GPL570')
checkGPL('GPL15314')
checkGPL('GPL10558')
deg_heatmap will draw a heatmap for you.
deg_heatmap(deg, genes_expr, group_list, topn = 20)
deg |
the result from limma. |
genes_expr |
the expression matrix |
group_list |
a vector containing the group information of each sample |
topn |
the number of genes in heatmap, default:20 |
a ggplot2 style figure.
attach(GSE27533)
deg_heatmap(DEG,genes_expr,group_list)
deg_volcano will draw a volcano plot for you.
deg_volcano(need_deg, style = 1, p_thred = 0.05, logFC_thred = 1)
need_deg |
should have 3 columns: gene, logFC, p.value (or p.adjust) |
style |
you can try 1 or 2, default: 1 |
p_thred |
default:0.05 |
logFC_thred |
default:1 |
a ggplot2 style figure.
deg=GSE27533$DEG
need_deg=data.frame(symbols=rownames(deg), logFC=deg$logFC, p=deg$P.Value)
deg_volcano(need_deg,1)
deg_volcano(need_deg,2)
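To make the role of p_thred and logFC_thred concrete, here is a minimal base R sketch of how such thresholds are commonly applied to a 3-column table like need_deg. It assumes the usual volcano-plot convention (significant and above the logFC threshold is "up", significant and below the negative threshold is "down"); `classify_deg` is a hypothetical helper and the toy values are made up, so this is not necessarily how deg_volcano labels genes internally.

```r
# Hypothetical helper: label each gene using the two thresholds.
# Assumes column 2 is logFC and column 3 is the p-value, as described above.
classify_deg <- function(need_deg, p_thred = 0.05, logFC_thred = 1) {
  logFC <- need_deg[[2]]
  p <- need_deg[[3]]
  change <- rep("stable", nrow(need_deg))
  change[p < p_thred & logFC > logFC_thred] <- "up"
  change[p < p_thred & logFC < -logFC_thred] <- "down"
  factor(change, levels = c("down", "stable", "up"))
}

# Toy input with placeholder gene names and values.
need_deg <- data.frame(
  symbols = c("g1", "g2", "g3", "g4"),
  logFC   = c(2.0, -1.5, 0.2, 3.0),
  p       = c(0.001, 0.01, 0.5, 0.2)
)

status <- classify_deg(need_deg)
print(table(status))
```

Note that g4 has a large logFC but fails the p-value threshold, so under this convention it stays "stable".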
filterEM will annotate the probes in the expression matrix and remove duplicated gene symbols.
Because many probes can map to the same gene, only the probe with the maximum value is kept for each gene.
filterEM(probes_expr, probe2gene)
probes_expr |
an expression matrix whose rownames are the probes listed in probe2gene; each column is a sample |
probe2gene |
the first column is probes and the second column is corresponding gene symbols |
an expression matrix from which duplicated gene symbols have been filtered out
attach(GSE95166)
head(probes_expr)
head(probe2gene)
genes_expr <- filterEM(probes_expr,probe2gene)
head(genes_expr)
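The idea of collapsing several probes to one row per gene can be sketched in base R. This is an illustration under stated assumptions, not the package's code: `keep_max_probe` is a hypothetical name, "max value" is taken here to mean the probe with the highest median across samples, and the toy matrix is made up.

```r
# Hypothetical helper: keep one probe per gene symbol.
# Assumption: the probe with the highest median expression is the one kept.
keep_max_probe <- function(probes_expr, probe2gene) {
  # Use only probes that appear in both the matrix and the mapping table.
  ids <- probe2gene[probe2gene[[1]] %in% rownames(probes_expr), , drop = FALSE]
  expr <- probes_expr[as.character(ids[[1]]), , drop = FALSE]

  # Order probes from highest to lowest median expression.
  ord <- order(apply(expr, 1, median), decreasing = TRUE)
  expr <- expr[ord, , drop = FALSE]
  symbols <- as.character(ids[[2]])[ord]

  # The first occurrence of each symbol is now its highest-median probe.
  keep <- !duplicated(symbols)
  out <- expr[keep, , drop = FALSE]
  rownames(out) <- symbols[keep]
  out
}

# Toy data: probes p1 and p2 both map to gene A.
probes_expr <- matrix(
  c(1, 2,
    5, 6,
    3, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3"), c("s1", "s2"))
)
probe2gene <- data.frame(
  probe_id = c("p1", "p2", "p3"),
  symbol   = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

genes <- keep_max_probe(probes_expr, probe2gene)
print(genes)
```

Sorting first and then dropping duplicated symbols avoids an explicit per-gene loop; a different summary (mean, maximum) could be substituted in the `apply` call.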
geoChina will download the expression matrix and phenotype data in ExpressionSet format from a cloud server in mainland China. It is an alternative to the getGEO function from the GEOquery package.
geoChina('gse1009') is the same as eSet=getGEO('gse1009', getGPL = F)
geoChina(gse = "GSE2546", mirror = "tercent")
gse |
input GSE id, such as GSE1009, GSE2546, default:GSE2546 |
a list of ExpressionSet, which contains the expression matrix and phenotype data
geoChina()
geoChina('gse1009')
geoChina('GSE1009')
Get the list of all GPLs in our package
getGPLList returns the checklist of all GPL numbers stored in the package.
getGPLList()
a data.frame containing the GPL and the name of each array.
idmap returns probe annotations for the input GPL.
idmap(gpl = "GPL570", type = "bioc", mirror = "tercent")
gpl |
GPL (GEO platform) number, e.g. GPL570 |
type |
source of the stored probe annotation, one of "pipe", "bioc", "soft"; default: "bioc" |
probe annotations
ids=idmap('GPL570')
ids=idmap('GPL570',type='soft')
ids=idmap('GPL18084',type='pipe')
Print GPL information
printGPLInfo(GPL = NULL)
GPL |
GPL(GEO platform) number, eg: GPL570 |
prints detailed information about the input GEO platform
printGPLInfo('GPL93')