RPFdb stands for Ribosome Profiling (RPF) Database, a comprehensive resource of ribosome profiling data. Ribosome profiling, also known as ribosome footprinting, was first introduced by Nicholas Ingolia and Jonathan Weissman. A good review of RPF can be found here. The current version, updated on July 4, 2015, contains 777 samples from 82 studies across eight species. Please check this website often, as we will frequently add newly published studies.

The Browse page provides: (1) meta information for each study, including the study abstract, tissue or cell source, treatment used in the RPF experiment, and reference genome used for alignment; (2) the top 200 most translated genes, together with a downloadable RPKM table for the whole study; (3) plots of overall statistics for each sample, including the number and fraction of mapped and unmapped reads, and RPKM statistics for different genomic regions (5’UTR, CDS, 3’UTR and whole gene).
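For readers unfamiliar with RPKM (reads per kilobase per million mapped reads), the following is a minimal sketch of the standard definition. It is not taken from the RPFdb pipeline, whose exact normalization is not described here; the function name and the example numbers are illustrative assumptions.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of region per million mapped reads.

    ASSUMPTION: this is the textbook definition, not necessarily the exact
    normalization used by RPFdb.
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2,000 bp CDS, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

The same function applies to any region (5’UTR, CDS, 3’UTR or whole gene), provided the read count and length refer to that region.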


The Search page provides two ways to query the database: (1) search by gene; (2) search by study.

(1) Search gene: after you select a species and enter a gene name or Ensembl ID in the search box of the search page (which also appears on the home page), the output shows the genomic information for this gene from all samples of the selected species, including the RPKM of the gene and the RPKM of its 5’UTR, CDS and 3’UTR regions. The JBrowse icon provides a hyperlink to a genome browser, which queries and visualizes context-specific RPF data.

(2) Search study: users can search studies by keyword. This feature is useful for retrieving datasets from the curated studies in the database.


The Download page also has a search function so that users can quickly find the datasets they are interested in for downloading. In addition, we support an Application Programming Interface (API), which allows developers to obtain analysis results from RPFdb using an HTTP client. For example, the search result for gene THI1 in Arabidopsis can be returned by requesting the corresponding API URL. Both gene symbols and Ensembl IDs can be used as query keys.
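As an illustration of how an HTTP client might call such an API, here is a minimal Python sketch. The base URL and the query parameter names (`species`, `gene`) are placeholders assumed for this example only; the real endpoint is not given in this text and should be substituted before use.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: placeholder endpoint, not the real RPFdb API address.
API_BASE = "http://example.org/rpfdb/api"


def build_gene_query_url(species, gene, base_url=API_BASE):
    """Build a query URL for one gene in one species.

    ASSUMPTION: the parameter names 'species' and 'gene' are hypothetical.
    The gene argument may be a gene symbol or an Ensembl ID.
    """
    query = urlencode({"species": species, "gene": gene})
    return f"{base_url}?{query}"


def fetch_gene_result(species, gene, base_url=API_BASE, timeout=30):
    """Send the request and return the response body as text."""
    url = build_gene_query_url(species, gene, base_url)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL only; call fetch_gene_result once the real endpoint is known.
    print(build_gene_query_url("Arabidopsis", "THI1"))
```

Separating URL construction from the network call makes it easy to check the request before sending it, and to swap in the actual endpoint and parameter names.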

Data processing

The software used in the above pipeline is: (1) SRAToolkit v2.4.3; (2) FastQC v0.11.2; (3) STAR v2.4.0i; (4) HTSeq v0.6.1p1; (5) BEDTools v2.19.1. The reference genomes used for alignment for each species are listed as follows:

Species Reference Link


If you have any questions or comments, please contact us.

Zhi Xie
Laboratory of Systems Biology, Sun Yat-sen University, Guangzhou 510060, China

Jian Ren
School of Life Sciences; Cancer Center, Sun Yat-sen University, Guangzhou 510060, China