haplotag.md

June 20, 2025

Haplotagging command

This command tags (assigns) each read in the BAM file to one haplotype in the phased SNP/SV VCF; that is, reads are tagged as HP:i:1 or HP:i:2. In addition, the haplotype block of each read is stored in the PS tag. The phased VCF can also be generated by other programs, as long as the PS or HP tags are encoded. The user can specify --log to additionally output a plain-text file containing the haplotype tag of each read, so the results can be inspected without parsing the BAM.
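To make the tags concrete, here is a minimal Python sketch (not part of the tool) that pulls the HP and PS tags out of one SAM-format alignment line, such as a line printed by `samtools view`. It assumes the standard SAM layout of 11 mandatory tab-separated columns followed by optional `TAG:TYPE:VALUE` fields. The function name `read_haplotype_tags` and the sample record are made up for illustration.

```python
from typing import Optional, Tuple


def read_haplotype_tags(sam_line: str) -> Tuple[str, Optional[int], Optional[int]]:
    """Return (read_name, HP, PS) for one SAM alignment line.

    HP or PS is None when the read does not carry that tag (an untagged read).
    """
    fields = sam_line.rstrip("\n").split("\t")
    read_name = fields[0]
    hp = ps = None
    # Optional fields start after the 11 mandatory SAM columns.
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        if tag == "HP":
            hp = int(value)
        elif tag == "PS":
            ps = int(value)
    return read_name, hp, ps


# Made-up record for illustration only; not real output of the tool.
example = "\t".join([
    "read_001", "0", "chr1", "10468", "60", "8M", "*", "0", "0",
    "ACGTACGT", "IIIIIIII", "NM:i:0", "HP:i:1", "PS:i:10468",
])
print(read_haplotype_tags(example))
```

A read that was left untagged simply has no HP field, so the function returns `None` for it rather than raising an error.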

longphase-s haplotag \
-r reference.fasta \
-s phased_snp.vcf \
--sv-file phased_sv.vcf \
-b alignment.bam \
-t 8 \
-o tagged_bam_prefix
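When the same command has to be run over several BAM files, it can help to assemble the argument list in a script. The following is a small Python sketch under that assumption: the helper `build_haplotag_command` is hypothetical, it uses only the flags shown in the command above, and it only builds the command without executing it.

```python
import shlex
from typing import List, Optional


def build_haplotag_command(
    reference: str,
    snp_vcf: str,
    bam: str,
    out_prefix: str,
    sv_vcf: Optional[str] = None,
    threads: int = 1,
) -> List[str]:
    """Assemble the argument list for one haplotag run (does not execute it)."""
    cmd = ["longphase-s", "haplotag", "-r", reference, "-s", snp_vcf]
    # The SV VCF is optional, so the flag is only added when a file is given.
    if sv_vcf is not None:
        cmd += ["--sv-file", sv_vcf]
    cmd += ["-b", bam, "-t", str(threads), "-o", out_prefix]
    return cmd


cmd = build_haplotag_command(
    "reference.fasta", "phased_snp.vcf", "alignment.bam",
    "tagged_bam_prefix", sv_vcf="phased_sv.vcf", threads=8,
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list keeps file names with spaces intact, and `shlex.join` is used here only to print a copy-pasteable form.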

The complete list of haplotagging parameters

Usage:  haplotag [OPTION] ... READSFILE
      --help                          display this help and exit.

require arguments:
      -s, --snp-file=NAME             input SNP vcf file.
      -b, --bam-file=NAME             input bam file.
      -r, --reference=NAME            reference fasta.
optional arguments:
      --tagSupplementary              tag supplementary alignment. default:false
      --sv-file=NAME                  input phased SV vcf file.
      --mod-file=NAME                 input a modified VCF file (produced by longphase modcall and processed by longphase phase).
      -q, --qualityThreshold=Num      not tag alignment if the mapping quality less than threshold. default:1
      -p, --percentageThreshold=Num   the alignment will be tagged according to the haplotype corresponding to most alleles.
                                      if the alignment has no obvious corresponding haplotype, it will not be tagged. default:0.6
      -t, --threads=Num               number of thread. default:1
      -o, --out-prefix=NAME           prefix of phasing result. default:result
      --region=REGION                 tagging include only reads/variants overlapping those regions. default:(all regions)
                                      input format:chrom (consider entire chromosome)
                                                   chrom:start (consider region from this start to end of chromosome)
                                                   chrom:start-end
      --log                           an additional log file records the result of each read. default:false
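The interplay of -q and -p can be illustrated with a short Python sketch. This is not the tool's implementation: it assumes that each alignment has already been reduced to two counts (alleles supporting haplotype 1 and haplotype 2), that the -p threshold is compared against the fraction of alleles supporting the majority haplotype, and that ties are left untagged. The function name `choose_haplotype` is hypothetical.

```python
from typing import Optional


def choose_haplotype(
    hp1_alleles: int,
    hp2_alleles: int,
    mapq: int,
    quality_threshold: int = 1,          # -q default from the help text
    percentage_threshold: float = 0.6,   # -p default from the help text
) -> Optional[int]:
    """Return 1 or 2 for the chosen haplotype, or None to leave the read untagged."""
    # -q: do not tag alignments whose mapping quality is below the threshold.
    if mapq < quality_threshold:
        return None
    total = hp1_alleles + hp2_alleles
    # No informative alleles, or a tie: no obvious haplotype (assumption).
    if total == 0 or hp1_alleles == hp2_alleles:
        return None
    if hp1_alleles > hp2_alleles:
        best_hp, best_count = 1, hp1_alleles
    else:
        best_hp, best_count = 2, hp2_alleles
    # -p: ASSUMPTION, the majority fraction must reach the threshold.
    if best_count / total < percentage_threshold:
        return None
    return best_hp


print(choose_haplotype(7, 3, mapq=60))   # 70% of alleles support haplotype 1
print(choose_haplotype(11, 9, mapq=60))  # 55% support: below the 0.6 default
print(choose_haplotype(7, 3, mapq=0))    # mapping quality below -q
```

Under these assumptions, raising -p trades fewer tagged reads for more confident assignments, while lowering it tags more reads at the cost of more ambiguous ones.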