Most of our tools and software are hosted on GitHub: https://github.com/Shao-Group
- DCJ-SAT is an exact, fast algorithm to compute the DCJ distance between two genomes with duplicate genes, using a SAT formulation. It is available at github.
- Beaver is an assembler for single-cell RNA-seq data, featuring accurate assembly at single-cell resolution. It is available at github.
- EquiRep implements an algorithm to detect tandem repeats from error-prone sequences, available at github.
- TENNIS is an evolution-based model that can predict isoforms missing from an annotation, available at TENNIS.
- SubseqHash2 implements an improved algorithm to find the smallest subsequence as a seed. It is about 10-50 times faster than SubseqHash while preserving its high accuracy. It is available at github.
- Aletsch is a meta-assembler (i.e., one that assembles a set of RNA-seq samples or cells together), available at github.
- Anchorage is an assembler for synthetic long reads (SLR) with anchors, available at github.
- lsb-learn implements an approach to learn LSB functions, available at github.
- TERRACE is an assembler for circular RNAs, available at github.
- SubseqHash implements a new seeding algorithm for sequencing data with high error rates. The tool is available at github.
- Scallop2 is an improved transcript assembler that is optimized for paired-end RNA-seq data and multi-end RNA-seq data (such as Smart-seq3 data). Software is available at github.
- Altai is an allele-specific transcript assembler. Software is available at github.
- rnabridge-denovo implements an efficient algorithm to reconstruct the full sequences of fragments given paired-end RNA-seq reads. Software is available at github.
- rnabridge-align implements an efficient algorithm to reconstruct the alignments of fragments given the alignments of paired-end RNA-seq reads. Software is available at github.
- Scallop-LR is a reference-based transcriptome assembler for PacBio Iso-Seq data. Software is available at github. The manuscript is published in Genome Biology.
- Scallop is an accurate reference-based transcriptome assembler. Software is available at github. Scallop has been published in Nature Biotechnology. A podcast about Scallop (thanks to Roman Cheplyaka for the interview) is available at both bioinformatics.chat and iTunes.
- DeepBound presents a new framework to identify boundaries of expressed transcripts from RNA-seq alignments using convolutional neural networks. Software is available at github. DeepBound has been published in Bioinformatics.
- Catfish implements an efficient algorithm for the flow decomposition problem, the abstract mathematical formulation of transcript assembly. Software is available at github. Catfish has been published in IEEE/ACM Transactions on Computational Biology and Bioinformatics.
- SQUID presents an algorithm to identify transcriptomic structural variations from RNA-seq alignments. Software is available at github. SQUID has been published in Genome Biology.
- GREDU (Genome REarrangements with DUplications) is a software package that implements fast and exact algorithms for five edit distance problems between pairs of genomes with duplicate genes. The released source code is available at github. A reference manual is also available.