Bioinformatics pipeline to identify RH regions


To identify these RH regions, we developed a custom pipeline, which uses the output files generated by the MAPS pipeline. The first step of the pipeline breaks large scaffolds in the reference into 10 kb bins, so that a small RH region does not cause every mutation on a large scaffold to be called as RH. This was particularly important for the large 3B pseudomolecule. Next, for each bin in each individual, the pipeline calculates a score based on the criteria described in the previous section and in SI Appendix, Table S11. Intervals with a score of 12.5 or higher are tagged as RH regions in the database, and users are warned in the JBrowse viewer when they are in an RH region. Using this bioinformatics pipeline, we identified 69,651 SNPs in RH regions in the tetraploid population (1.7%) and 38,626 SNPs in RH regions in the hexaploid population (0.6%) at HetMC5/HomMC3 (Table 1).
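
The binning and thresholding steps can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the function names, the (individual, scaffold, position) record layout, and the 1-based coordinate convention are assumptions made for illustration. The scoring criteria themselves (SI Appendix, Table S11) are not reproduced here, so the scorer is passed in as a hypothetical `score_bin` callable, and the one used in the example is a toy placeholder. Only the 10 kb bin size and the 12.5 cutoff come from the text.

```python
from collections import defaultdict

BIN_SIZE = 10_000     # 10 kb bins, as stated in the text
RH_THRESHOLD = 12.5   # score cutoff, as stated in the text


def bin_index(position, bin_size=BIN_SIZE):
    """Return the 0-based bin index for a 1-based position (assumed convention)."""
    return (position - 1) // bin_size


def group_by_bin(mutations, bin_size=BIN_SIZE):
    """Group mutation positions by (individual, scaffold, bin index).

    `mutations` is an iterable of (individual, scaffold, position) tuples;
    this record layout is hypothetical, not the MAPS output format.
    """
    bins = defaultdict(list)
    for individual, scaffold, position in mutations:
        key = (individual, scaffold, bin_index(position, bin_size))
        bins[key].append(position)
    return dict(bins)


def tag_rh_bins(bins, score_bin, threshold=RH_THRESHOLD):
    """Return the bin keys whose score is at or above the threshold.

    `score_bin` is a caller-supplied placeholder for the real criteria.
    """
    return {key for key, positions in bins.items()
            if score_bin(positions) >= threshold}


if __name__ == "__main__":
    # Toy data and a toy scorer (5 points per mutation); both are illustrative only.
    example = [
        ("Line1", "3B", 5),
        ("Line1", "3B", 9_000),
        ("Line1", "3B", 10_000),
        ("Line1", "3B", 10_001),
        ("Line2", "3B", 5),
    ]
    binned = group_by_bin(example)
    tagged = tag_rh_bins(binned, score_bin=lambda positions: 5.0 * len(positions))
    print(sorted(tagged))
```

Keying each bin by individual as well as by scaffold reflects the statement that scores are calculated for each bin in each individual, so a dense bin in one line does not tag the same interval in another.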

