Annotating changes among reference genomes
December 12, 2025
This repository contains scripts to process FASTA sequences relative to custom BED files from different reference genomes in order to compare nucleotide changes among them.
Pairwise sequence alignment & homopolymer detection
Tool requirements: UCSC command line tools (download at https://hgdownload.soe.ucsc.edu/admin/exe/) and the Needleman-Wunsch implementation from the noporpoise/seq-align repository on GitHub (https://github.com/noporpoise/seq-align).
Input requirements:
- two BED files in the format chr start end ID, with coordinates from two different reference genomes. The two files should share the same ID column and have the same number of rows. Ideally, the second BED file is obtained from the first with the liftOver binary from the UCSC command line tools, using the appropriate chain file (more at https://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#Liftover).
- two .2bit files, one for each reference genome (can be downloaded from https://hgdownload.gi.ucsc.edu/downloads.html)
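The input preparation described above could be sketched roughly as follows. The function names `lift_bed`, `check_bed_pair` and `extract_fasta`, and all file names in the usage comments, are hypothetical and not part of this repository. The use of `twoBitToFa` is likewise an assumption about how sequences might be pulled from the .2bit files. liftOver writes intervals it cannot map to a separate file, so the lifted BED can end up with fewer rows than the original. That makes a row and ID check worth running before the analysis.

```shell
# Hypothetical helpers for illustration only; not part of this repository.

# Lift a BED file to the other assembly with UCSC liftOver.
# Arguments: input BED, chain file, output BED, file for unmapped intervals.
lift_bed() {
    liftOver "$1" "$2" "$3" "$4"
}

# Return 0 only if both BED files have the same number of non-empty rows
# and the same IDs in column 4 (compared after sorting, so row order is ignored).
check_bed_pair() {
    rows_a=$(awk 'NF {n++} END {print n+0}' "$1")
    rows_b=$(awk 'NF {n++} END {print n+0}' "$2")
    if [ "$rows_a" -ne "$rows_b" ]; then
        echo "row count differs: $rows_a vs $rows_b" >&2
        return 1
    fi
    ids_a=$(awk 'NF {print $4}' "$1" | sort)
    ids_b=$(awk 'NF {print $4}' "$2" | sort)
    if [ "$ids_a" != "$ids_b" ]; then
        echo "ID columns differ" >&2
        return 1
    fi
    return 0
}

# Assumed extraction step: write one FASTA record per BED interval,
# reading the sequence from a .2bit file with UCSC twoBitToFa.
# Arguments: BED file, .2bit file, output FASTA.
extract_fasta() {
    twoBitToFa -bed="$1" "$2" "$3"
}

# Example usage (file names are placeholders):
#   lift_bed regions.genomeA.bed AToB.over.chain.gz regions.genomeB.bed unmapped.bed
#   check_bed_pair regions.genomeA.bed regions.genomeB.bed || echo "fix inputs first"
#   extract_fasta regions.genomeA.bed genomeA.2bit regions.genomeA.fa
```

If the check fails after a liftOver run, one option is to drop the unmapped IDs from the first BED file so that both files pair up again.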
In the file settings.sh, set your custom paths for tools, annotation and output, then run the analysis:
./code/run_seq_analysis.sh