Abstract: Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of its reference genome still contains a significant number of missing segments. Here, we report a high-quality and almost complete Col-0 genome assembly with only two gaps (named Col-XJTU), produced by combining Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate, with consensus quality (QV) scores > 60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except for the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble due to the presence of tens of thousands of CEN180 satellite repeats. Using cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of the centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, will serve as a valuable reference for better understanding the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features of plants.
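To put the QV figures above in context, the sketch below converts a consensus quality score into a per-base error rate, assuming the standard Phred-scale definition QV = -10 log10(error rate). The abstract does not state how QV was computed, so this definition is an assumption, and the function names are hypothetical and used only for illustration.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value.

    Assumes QV = -10 * log10(error_rate); this is the conventional
    definition, not one confirmed by the abstract.
    """
    return 10 ** (-qv / 10)


def errors_per_megabase(qv: float) -> float:
    """Expected number of erroneous bases per 1,000,000 bases at a given QV."""
    return qv_to_error_rate(qv) * 1_000_000


# QV values quoted in the abstract: the lower and upper ends of the
# per-chromosome ranges reported for Col-XJTU and for TAIR10.1.
reported_qv = {
    "Col-XJTU (low)": 62,
    "Col-XJTU (high)": 68,
    "TAIR10.1 (low)": 45,
    "TAIR10.1 (high)": 52,
}

for label, qv in reported_qv.items():
    print(f"{label}: QV {qv} -> {errors_per_megabase(qv):.3f} expected errors per Mb")
```

Under this definition, a QV of 60 corresponds to roughly one expected error per million bases, so each additional 10 QV points means a tenfold lower error rate.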