The Institute for Systems Biology RepeatMasker

    Welcome!

    RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. The output of the program is a detailed annotation of the repeats present in the query sequence, as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines, including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam ( a profile HMM library derived from Repbase sequences ) and Repbase, a service of the Genetic Information Research Institute.
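
    As a rough illustration of the default masking behaviour described above, the following Python sketch replaces annotated intervals with Ns. It is not RepeatMasker's implementation: the function names, the 0-based end-exclusive coordinates, and the toy sequence are all assumptions made for the example.

```python
def mask_repeats(sequence, repeats, mask_char="N"):
    """Return a copy of `sequence` with each annotated interval masked.

    `repeats` is an iterable of (start, end) pairs. Coordinates are assumed
    to be 0-based and end-exclusive (a convention chosen for this sketch).
    """
    masked = list(sequence)
    for start, end in repeats:
        # Clamp to the sequence bounds so a stray annotation cannot
        # raise an IndexError or wrap around via negative indexing.
        start = max(0, start)
        end = min(len(masked), end)
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)


def masked_fraction(masked, mask_char="N"):
    """Fraction of positions in `masked` equal to `mask_char`."""
    return masked.count(mask_char) / len(masked) if masked else 0.0


if __name__ == "__main__":
    # Hypothetical query: unique flanks around a 9-base simple repeat.
    query = "ACGTACGA" + "T" * 9 + "ACGT"
    annotations = [(8, 17)]  # hypothetical annotation of the T run
    result = mask_repeats(query, annotations)
    print(result)
    print(f"{masked_fraction(result):.1%} of the query is masked")
```

    The second helper mirrors the kind of summary figure quoted above (the share of a sequence that ends up masked), computed here on the toy input only.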

    Latest News

    If you would like to keep up with news and announcements relating to RepeatMasker, you can either follow us on Twitter or subscribe to our low-volume, announcement-only mailing list: RepeatMasker Announcements List.

    New RepeatModeler and RECON Released
    Thursday, May 29, 2014
    We released a new version of the RepeatModeler de-novo repeat identification and library building software suite. This version supports parallel BLAST searches and greatly speeds up the analysis on multiprocessor systems. The new version also adds the ability to restart a crashed RepeatModeler run where it left off. We are also releasing a new version of RECON which fixes a buffer overrun bug ( reported by Stephen Ficklin ). The RepeatModeler release is available here: www.repeatmasker.org/RepeatModeler.html, and our RECON release can be downloaded from here: RECON-1.0.8.tar.gz.
    Updated Pre-Masked Genomes And Landscapes
    Thursday, May 29, 2014
    We have expanded the "Genome Analysis and Downloads" page at the RepeatMasker website, adding an additional 30 species. RepeatMasker 4.0.5 ( db20140131 ) has been run on all of these species ( 67 existing and new ) and the repeat landscape graphs have been updated. In addition to displaying the standard repeat landscape, we now provide additional summary statistics of the RepeatMasker run and a pie chart of the repetitive fraction of the genome. The page is listed as "Genome Analysis and Downloads" under the Services menu at the top left of the main site.
    RepeatMasker 4.0.5 and New RepeatMasker Libraries Released
    Wednesday, February 5, 2014
    RepeatMasker 4.0.5 is now available for download. Enhancements include: DupMasker support for RMBlast, the Kimura divergence is now calculated for each alignment and placed in the *.align files, and we now make available our software for drawing repeat landscapes (util/createRepeatLandscape.pl). The new release is available here: www.repeatmasker.org/RMDownload.html.

    An updated set of RepeatMasker libraries ( 20140131 ) is also available for download from GIRI at: http://www.girinst.org. Additions include improvements and expansion of eutherian to mammalian-wide ancestral repeats, addition of a very detailed set of mouse-specific LINE subfamilies from the Boissinot lab, and new or much expanded libraries created by GIRI for an oddball selection of species, including strawberry, oyster, anole lizard, painted turtle, sea lamprey, acorn worm, and a few fungi.
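
    For readers unfamiliar with the Kimura divergence mentioned in the 4.0.5 release note, the sketch below computes the standard Kimura two-parameter distance, K = -1/2 ln((1 - 2p - q) * sqrt(1 - 2q)), where p and q are the proportions of transitions and transversions among compared sites. It is an illustration under stated assumptions only: the function name is hypothetical, gaps and ambiguous bases are simply skipped, and RepeatMasker's own calculation may differ in details (for example in how CpG sites are treated).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_divergence(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Both sequences must have the same length. Columns containing a gap
    or a non-ACGT character are ignored (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        # The formula is undefined once the sequences are saturated.
        raise ValueError("divergence too high for the Kimura model")
    return -0.5 * math.log(term1 * math.sqrt(term2))


if __name__ == "__main__":
    consensus = "ACGTACGTAC"
    copy = "ACGTGCGTAC"  # one A->G transition in ten sites
    print(f"K = {kimura_divergence(consensus, copy):.4f}")
```

    Unlike a raw mismatch percentage, the distance corrects for multiple substitutions at the same site and weights transitions and transversions separately, which is why it is a common choice for plotting repeat landscapes.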

    RepeatMasker 4.0.3 Maintenance Update
    Thursday, June 20, 2013
    Today we released RepeatMasker 4.0.3. This is a maintenance update which fixes several minor issues in the 4.x releases including: a problem running RM on species names which contain parentheses in the NCBI taxonomy database, missing ID values in rare circumstances, and a problem with Alu refinement when provided with very long sequence names. The new release is available here: www.repeatmasker.org/RMDownload.html.

    NOTE: Dfam users will want to update their HMMER distribution to the recently released v3.1b1 available at: http://hmmer.janelia.org/

    RepeatMasker 4.0.2 Maintenance Update And New Library Release
    Monday, April 29, 2013
    Today we released RepeatMasker 4.0.2. This is a maintenance update which fixes several problems in 4.0.0/4.0.1. Notably, there were issues with human Alu refinement, short input sequences producing "FastaDB::substr - Error index out of bounds!" errors, and an issue with overlapping annotations not being merged. We have also released a new RepeatMasker library ( rm-20130422 ) which includes updates from Repbase as well as four new genome libraries: gibbon (Nomascus leucogenys), American alligator (Alligator mississippiensis), saltwater crocodile (Crocodylus porosus), and gharial (Gavialis gangeticus). The new release is available here: www.repeatmasker.org/RMDownload.html.
    RepeatMasker 4.0.1 Maintenance Update
    Friday, February 22, 2013
    Today we released RepeatMasker 4.0.1. This is a maintenance update which fixes problems observed by some of our users. Notably, this fixes error messages produced by the configure script, problems using the older wublast program with RepeatMasker, empty classname columns when custom libraries are used, and noisy Perl warnings. Also included in this release are an updated taxonomy database and an expanded repeat protein database. The new release is available here: www.repeatmasker.org/RMDownload.html.
    RepeatModeler 1.0.7 - Update
    Tuesday, January 15, 2013
    Today we released RepeatModeler 1.0.7. This version adds support for the newly released RepeatMasker 4.0 package and the RMBlast 2.2.27+ search engine. The release is available here: www.repeatmasker.org/RepeatModeler.html.
    RepeatMasker 4.0
    Thursday, January 10, 2013
    Today we released RepeatMasker 4.0, adding support for the new nhmmer program and the new profile HMM database of transposable elements, Dfam. Other changes include: a new alignment file format for improved cross referencing of database/annotation identifiers, adoption of TRF for simple repeat identification, improved SINE subfamily refinement, and plenty of bug fixes. The new release is available here: www.repeatmasker.org/RMDownload.html.
    NCBI Releases BLAST+/RMBlast 2.2.27
    Friday, September 14, 2012
    In collaboration with NCBI we now have a synchronized release of the RMBlast and NCBI BLAST+ tools. NCBI now hosts the source code and pre-compiled binaries for RMBlast, allowing us to support a more diverse set of hardware/software platforms. Please see our RMBlast page for details on how to install the new release with RepeatMasker and RepeatModeler. Special thanks to George Coulouris at NCBI for all his assistance in getting this distribution system set up.
    Dfam: A Database for Profile HMMs of Transposable Elements
    Thursday, September 13, 2012
    The first version of a transposable element profile HMM database was released this month. This represents a major improvement in the characterization of these interesting sequences. Profile methods are known to improve sensitivity over single-sequence search, with profile HMMs in particular leveraging the additional information content in position-specific residue and indel variability. Until very recently the use of DNA/DNA profile HMMs to conduct large-scale genomic searches was impractical. Advances by the HMMER3 development team at HHMI Janelia Farm have made genome-scale searches of profile HMMs feasible and enabled the development of this new community resource. A new version of RepeatMasker which uses Dfam and nhmmer will be released in the next few weeks. This work is a collaboration between HHMI Janelia Farm, GIRI ( Genetic Information Research Institute, Repbase ), and the Institute for Systems Biology.

    The official announcement of the resource: http://selab.janelia.org/people/eddys/blog/?p=675

    The database website: http://dfam.janelia.org

    [Previous News]

    Links
    - RepeatMasker makes use of Repbase which is a service of the Genetic Information Research Institute. Repbase is a comprehensive database of repetitive element consensus sequences.
    - RepeatMasker also makes use of the new Dfam database of repeat profile hidden Markov models. Used in conjunction with the new nhmmer program ( part of the HMMER suite ), this greatly improves the sensitivity of RepeatMasker while maintaining runtimes similar to the cross_match search engine.
    - Data and computational resources for the Pre-Masked Genomes page are provided courtesy of the UCSC Genome Bioinformatics group.

    Institute for Systems Biology
    This server is made possible by funding from the National Human Genome Research Institute (NHGRI grant # R01 HG002939).