PgR6 class with methods and sequences. End users should use pagoo
instead, since it is easier to understand.
Inherits: PgR6M
Super classes
pagoo::PgR6 -> pagoo::PgR6M -> PgR6MS
Active bindings
sequences
A DNAStringSetList with the set of sequences grouped by cluster. Each group is accessible as if it were a list. All Biostrings methods are available.
core_sequences
Like $sequences, but only showing core sequences.
cloud_sequences
Like $sequences, but only showing cloud sequences, as defined above.
shell_sequences
Like $sequences, but only showing shell sequences, as defined above.
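As a rough picture of the structure behind $sequences, the base R sketch below groups named sequences by cluster with split(). A plain list of character vectors stands in for the DNAStringSetList; the gids, cluster names, and sequences are invented, and no pagoo or Biostrings code is called.

```r
# Invented sequences, named by gid (org__gene).
seqs <- c(
  orgA__g1 = "ATGAAA", orgB__g1 = "ATGAAG",
  orgA__g2 = "ATGCCC", orgB__g2 = "ATGCCT"
)
# Cluster assignment of each sequence, in the same order (also invented).
cluster <- c("clu1", "clu1", "clu2", "clu2")

# One list element per cluster, analogous to the grouping in $sequences.
by_cluster <- split(seqs, cluster)

# Each group can be accessed as in a list.
print(by_cluster[["clu1"]])
print(lengths(by_cluster))
```

In the real object each group would be a Biostrings sequence set rather than a character vector, so Biostrings methods apply to it directly.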
Methods
Inherited methods
pagoo::PgR6$add_metadata()
pagoo::PgR6$drop()
pagoo::PgR6$recover()
pagoo::PgR6$save_pangenomeRDS()
pagoo::PgR6$write_pangenome()
pagoo::PgR6M$cg_exp_decay_fit()
pagoo::PgR6M$dist()
pagoo::PgR6M$gg_barplot()
pagoo::PgR6M$gg_binmap()
pagoo::PgR6M$gg_curves()
pagoo::PgR6M$gg_dist()
pagoo::PgR6M$gg_pca()
pagoo::PgR6M$gg_pie()
pagoo::PgR6M$pan_pca()
pagoo::PgR6M$pg_power_law_fit()
pagoo::PgR6M$rarefact()
pagoo::PgR6M$runShinyApp()
Method new()
Create a PgR6MS object.
Usage
PgR6MS$new(
data,
org_meta,
cluster_meta,
core_level = 95,
sep = "__",
DF,
group_meta,
sequences,
verbose = TRUE
)
Arguments
data
A data.frame or DataFrame containing at least the following columns: gene (gene name), org (name of the organism the gene belongs to), and cluster (group of orthologs the gene belongs to). More columns can be added as metadata for each gene.
org_meta
(optional) A data.frame or DataFrame containing additional metadata for organisms. This data.frame must have a column named "org" with valid organism names (that is, they must match those provided in data, column org); additional columns will be used as metadata. Each row should correspond to one organism.
cluster_meta
(optional) A data.frame or DataFrame containing additional metadata for clusters. This data.frame must have a column named "cluster" with valid cluster names (that is, they must match those provided in data, column cluster); additional columns will be used as metadata. Each row should correspond to one cluster.
core_level
The initial core_level, that is, the percentage of organisms a cluster must be present in to be considered part of the core genome. Must be a number between 85 and 100 (default: 95). You can change it later through the $core_level field once the object has been created.
sep
A separator. The default is '__' (two underscores). It is used to create a unique gid (gene identifier) for each gene. gids are created by pasting org to gene, separated by sep.
DF
Deprecated. Use data instead.
group_meta
Deprecated. Use cluster_meta instead.
sequences
Can accept: 1) a named list of named character vectors, where the names of the list are organism names and the names of each character vector are gene names; or 2) a named list of DNAStringSetList objects (same requirements as (1), but with BStringSet names as gene names); or 3) a DNAStringSetList (same requirements as (2), but the DNAStringSetList names are organism names).
verbose
logical. Whether to display progress messages when loading the class.
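To make the argument descriptions concrete, the following base R sketch builds a toy data table and mimics two behaviours described above: pasting org and gene with sep to form a gid, and comparing the percentage of organisms in which a cluster occurs against core_level. The organism, gene, and cluster names are invented, the use of >= for the threshold is an assumption, and the code only approximates what the class may do internally; it does not call pagoo.

```r
# Toy input with the three required columns (all values are made up).
data <- data.frame(
  gene    = c("g1", "g2", "g1", "g2", "g1", "g3"),
  org     = c("orgA", "orgA", "orgB", "orgB", "orgC", "orgC"),
  cluster = c("clu1", "clu2", "clu1", "clu2", "clu1", "clu3"),
  stringsAsFactors = FALSE
)

sep <- "__"
core_level <- 95

# A gid is described as org pasted to gene, separated by sep.
data$gid <- paste(data$org, data$gene, sep = sep)

# Percentage of organisms in which each cluster is present.
n_org <- length(unique(data$org))
orgs_per_cluster <- tapply(data$org, data$cluster, function(x) length(unique(x)))
pct_present <- 100 * orgs_per_cluster / n_org

# Clusters meeting the core_level threshold (>= is an assumption here).
core_clusters <- names(pct_present)[pct_present >= core_level]

print(data$gid)
print(round(pct_present, 1))
print(core_clusters)
```

With three organisms, only a cluster present in all of them reaches 95%, which is why lowering core_level matters more as the number of organisms grows.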
Method core_seqs_4_phylo()
A field for obtaining core gene sequences is available (see below), but when building a phylogeny from these sets it is useful to: 1) be able to extract just one sequence per organism from each cluster, in case paralogues are present, and 2) fill gaps with empty sequences in case the core_level was set below 100%, which allows more genes (some not present in 100% of organisms) to be incorporated into the phylogeny. That is the purpose of this special function.
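The two steps described above can be sketched in base R on plain character vectors. This is only an illustration of the idea, not the method's actual implementation or signature: the toy sequences, the pick_one_per_org() helper, and the choice of keeping the first paralogue are all assumptions.

```r
# Toy core cluster: orgA has two paralogues, orgC lacks the gene.
# Names follow the org__gene gid convention; all values are invented.
cluster_seqs <- c(
  orgA__g1 = "ATGAAA",
  orgA__g7 = "ATGAAC",
  orgB__g1 = "ATGAAG"
)
all_orgs <- c("orgA", "orgB", "orgC")

# Hypothetical helper: keep one sequence per organism and fill gaps
# with empty strings for organisms missing from the cluster.
pick_one_per_org <- function(seqs, orgs, sep = "__") {
  # Recover the organism name from each gid.
  seq_org <- vapply(strsplit(names(seqs), sep, fixed = TRUE), `[`, character(1), 1)
  out <- setNames(character(length(orgs)), orgs)  # empty string by default
  for (o in orgs) {
    hits <- seqs[seq_org == o]
    if (length(hits) > 0) out[[o]] <- hits[[1]]   # first paralogue (assumption)
  }
  out
}

phylo_input <- pick_one_per_org(cluster_seqs, all_orgs)
print(phylo_input)
```

Every organism ends up with exactly one entry per cluster, so the per-cluster sets line up across organisms and can be aligned and concatenated downstream.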