CHM13-issues

December 2, 2022

CHM13 human reference genome issue tracking

For any downstream analysis, please use the following files:

  • Possible consensus or mis-assembly issue: <ver.>_issues.bed
  • Het sites: <ver.>/chm13.draft_<ver.>.curated_sv.20210612.vcf, <ver.>/chm13.draft_<ver.>.hets_combined.20210615.bed
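One possible downstream use of these files is to drop calls that fall inside a flagged region. The sketch below is a minimal illustration of that idea, not the method used by the authors. It assumes the regions have already been loaded as (chrom, start, end) tuples with 0-based, half-open coordinates as in BED; the helper names `build_index` and `in_issue` and all coordinates are hypothetical.

```python
import bisect
from collections import defaultdict

def build_index(intervals):
    """Group intervals by chromosome, sort them, and merge overlaps."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = [list(ivs[0])]
        for start, end in ivs[1:]:
            if start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def in_issue(index, chrom, pos):
    """Return True if 0-based position `pos` lies in a flagged interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Rightmost interval whose start is <= pos, if any.
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]

# Made-up regions and call positions, for illustration only.
issues = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
calls = [("chr1", 99), ("chr1", 100), ("chr1", 299),
         ("chr1", 300), ("chr2", 15), ("chr3", 5)]

index = build_index(issues)
kept = [c for c in calls if not in_issue(index, *c)]
print(kept)
```

Merging first keeps the lookup to a single binary search per call. For real analyses an established interval tool may be a better fit than hand-written code.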

Releases

  • 2022-12-02 Issues added for v2.0. X and Y were simultaneously used in T2T-HG002XYv2.7, and issues found on the Y are appended to v1.1_issues.bed. Note that the sequencing data used is from HG002.
  • 2021-10-13 Het regions lifted over from v1.0 to v1.1
  • 2021-06-23 Updating 3 additional issues and adding error k-mers in v1.0 and v1.1
  • 2021-06-15 Validated het SVs and clusters of heterozygous sites in v1.0 assembly
  • 2021-04-28 Issues track for HiFi and ONT read alignments from Winnowmap 2.01
  • 2021-03-08 Combined low coverage and clipped regions
  • 2021-02-23 Low coverage regions for HiFi, CLR, and ONT read alignments

Issues.bed file format

| Label | Description | R,G,B | Color |
|---|---|---|---|
| Low | Low coverage | 204,0,0 | red |
| Low_Qual | Low coverage from lower consensus quality | 204,0,0 | red |
| Error_Kmer | K-mers identified as errors from the Illumina-HiFi hybrid 21-mers | 0,0,0 | black |
| Collapse | Approximate region containing sequence collapse | 204,0,0 | red |
| Chimeric_Hap | Chimeric consensus of two haplotypes | 204,0,0 | red |
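As a rough illustration of how the labels above might be consumed, the sketch below tallies the number of regions and bases per label. It assumes the issues file is tab-separated in the standard BED9 layout, with the label in column 4 and the R,G,B value in column 9; that layout is an assumption and is not confirmed here. The helper name `summarize_issues` is hypothetical and the sample records are made up.

```python
from collections import defaultdict

def summarize_issues(lines):
    """Return {label: (region_count, total_bases)} for BED-like lines.

    Assumes chrom, start, end in columns 1-3 and the issue label in
    column 4 (an assumption about the file layout).
    """
    summary = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # no label column; ignore the record
        start, end, label = int(fields[1]), int(fields[2]), fields[3]
        summary[label][0] += 1
        summary[label][1] += end - start  # BED intervals are half-open
    return {label: tuple(counts) for label, counts in summary.items()}

# Made-up records for illustration only; not real CHM13 coordinates.
example = [
    "chr1\t100\t200\tLow\t0\t.\t100\t200\t204,0,0",
    "chr1\t500\t650\tCollapse\t0\t.\t500\t650\t204,0,0",
    "chr2\t10\t31\tError_Kmer\t0\t.\t10\t31\t0,0,0",
    "chr2\t900\t1000\tLow\t0\t.\t900\t1000\t204,0,0",
]

issue_summary = summarize_issues(example)
for label, (n_regions, n_bases) in sorted(issue_summary.items()):
    print(f"{label}\t{n_regions}\t{n_bases}")
```

To run this on a real file, pass an open file handle instead of the `example` list, since the function only iterates over lines.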

Methods

Brief descriptions are provided for

More details on the polishing and evaluation methods applied to CHM13 are available in T2T-Polish. For the methods used for polishing and evaluating the Y, see this preprint.

Citation

Please cite the papers below if you use any of the materials posted in this GitHub repository: