A Sequence Distance Graph framework for genome assembly and analysis

The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple de Bruijn graph module, and can import graphs in the Graphical Fragment Assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data, and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, along with two example analyses: scaffolding a short-read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at https://github.com/bioinfologics/sdg
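As an illustration of the kind of graph input mentioned above, the following is a minimal sketch of reading GFA1 segment (S) and link (L) records into plain Python structures. It does not use SDG or its Python API; the function name `parse_gfa` and the example records are hypothetical and chosen only for this sketch.

```python
from collections import defaultdict


def parse_gfa(text):
    """Parse GFA1 segment (S) and link (L) lines (hypothetical helper, not part of SDG)."""
    segments = {}               # segment name -> sequence
    links = defaultdict(list)   # (from_name, from_orient) -> [(to_name, to_orient, overlap)]
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if fields[0] == "S":
            # S <name> <sequence> [optional tags...]
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap CIGAR>
            src, src_orient, dst, dst_orient, overlap = fields[1:6]
            links[(src, src_orient)].append((dst, dst_orient, overlap))
        # other record types (H, P, ...) are ignored in this sketch
    return segments, dict(links)


# Made-up example: two segments joined by a 3 bp overlap.
example = "\n".join([
    "H\tVN:Z:1.0",
    "S\tseq1\tACGTAC",
    "S\tseq2\tTACGGA",
    "L\tseq1\t+\tseq2\t+\t3M",
])
segments, links = parse_gfa(example)
```

Here `links` is keyed by segment name and orientation, so the neighbours of the forward end of `seq1` can be looked up with `links[("seq1", "+")]`. A real importer would also need to handle paths, optional tags and reverse-complement links.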

Additional Info

Field: Value
  • Name: Yanes, Luis, Type: Corresponding Author
  • Name: Garcia Accinelli, Gonzalo, Type: Author
  • Name: Wright, Jonathan, Type: Author
  • Name: Ward, Ben J., Type: Author
  • Name: Clavijo, Bernardo J., Type: Author
Maintainer Email:
Article Host Type: publisher
Article Is Open Access: true
Article License Type: cc-by
Article Version Type: publishedVersion
Citation Report: https://scite.ai/reports/10.12688/f1000research.20233.1
DFW Organisation: EI
DFW Work Package: 4
DOI: 10.12688/f1000research.20233.1
Date Last Updated: 2022-09-15T13:48:51.699579
Evidence: oa journal (via doaj)
Funder Code(s):
Journal Is Open Access: true
Open Access Status: gold
PDF URL: https://f1000research.com/articles/8-1490/v1/pdf
Publisher URL: https://doi.org/10.12688/f1000research.20233.1