
This function converts Roary's output files into a pagoo R6 class object. It takes the path to the "gene_presence_absence.csv" file and, optionally but recommended, the paths to the gff input files, and returns an object of class PgR6MS (or PgR6M if the gffs argument is left empty).

Usage

roary_2_pagoo(gene_presence_absence_csv, gffs, sep = "__", paralog_sep = "\t")

Arguments

gene_presence_absence_csv

character. Path to the "gene_presence_absence.csv" file. (Do not confuse it with the file of the same name that has the .Rtab extension.)

gffs

A character vector with paths to the original gff files used as Roary's input. Typically the return value of list.files(). This parameter is optional, but highly recommended if you want to manipulate sequences.

sep

character. Default: "__" (two underscores). See PgR6MS for a more detailed description of this argument.

paralog_sep

character. A gene separator for cases where the clusters have in-paralogs. (Default: "\t" - tab).
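
To illustrate how the two separators relate, here is a minimal base-R sketch. It assumes, following the PgR6MS documentation, that sep joins an organism name and a gene name into a unique gene identifier, and that paralog_sep delimits several gene names within a single cell of the Roary table. The organism and gene names are made up, and the code is not pagoo's internal implementation.

```r
# Hypothetical cell from a Roary table: two in-paralogs from one organism,
# separated by a tab (the default paralog_sep).
cell <- "geneA_00012\tgeneA_00458"
org <- "organism1"   # made-up organism name
sep <- "__"          # default value of the sep argument
paralog_sep <- "\t"  # default value of the paralog_sep argument

# Split the cell into individual gene names.
genes <- strsplit(cell, split = paralog_sep, fixed = TRUE)[[1]]

# Build one identifier per gene by pasting organism and gene name with sep
# (an assumption about how identifiers are formed, for illustration only).
gids <- paste(org, genes, sep = sep)
print(gids)
```

If the gene names in your gff files already contain two consecutive underscores, choosing a different sep avoids ambiguous identifiers.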

Value

A pagoo R6 class object: either PgR6M, if the gffs argument is left empty, or PgR6MS, if paths to gff files are provided.

References

Andrew J. Page, Carla A. Cummins, Martin Hunt, Vanessa K. Wong, Sandra Reuter, Matthew T. G. Holden, Maria Fookes, Daniel Falush, Jacqueline A. Keane, Julian Parkhill, "Roary: Rapid large-scale prokaryote pan genome analysis", Bioinformatics, 2015;31(22):3691-3693

Examples

if (FALSE) {
gffs <- list.files(path = "path/to/gffs/",
                   pattern = "[.]gff$",
                   full.names = TRUE)
gpa_csv <- "path/to/gene_presence_absence.csv"

library(pagoo)
pg <- roary_2_pagoo(gene_presence_absence_csv = gpa_csv,
                    gffs = gffs)
}
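
Before calling roary_2_pagoo(), it can help to verify the inputs, since the .csv and .Rtab files share a base name and are easy to mix up. The helper below is a hypothetical sketch in base R, not part of pagoo; the name check_roary_inputs and its error messages are assumptions made for illustration.

```r
# Hypothetical helper (not part of pagoo): basic sanity checks on the inputs.
check_roary_inputs <- function(gene_presence_absence_csv, gffs = character(0)) {
  # The function expects the .csv table, not the .Rtab file of the same name.
  if (!grepl("[.]csv$", gene_presence_absence_csv, ignore.case = TRUE)) {
    stop("Expected a .csv file, got: ", gene_presence_absence_csv)
  }
  if (!file.exists(gene_presence_absence_csv)) {
    stop("File not found: ", gene_presence_absence_csv)
  }
  # Report any gff paths that do not point to an existing file.
  missing_gffs <- gffs[!file.exists(gffs)]
  if (length(missing_gffs) > 0) {
    stop("gff file(s) not found: ", paste(missing_gffs, collapse = ", "))
  }
  invisible(TRUE)
}
```

With the variables from the example above, this would be called as check_roary_inputs(gpa_csv, gffs) before building the pagoo object.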