PgR6 class with Methods and Sequences. End users should use pagoo
instead of this class, since it is easier to understand.
Inherits: PgR6M
Super classes
pagoo::PgR6
-> pagoo::PgR6M
-> PgR6MS
Active bindings
sequences
A DNAStringSetList with the set of sequences grouped by cluster. Each group is accessible as if it were a list. All Biostrings methods are available.
core_sequences
Like $sequences, but only showing core sequences.
cloud_sequences
Like $sequences, but only showing cloud sequences, as defined above.
shell_sequences
Like $sequences, but only showing shell sequences, as defined above.
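As a rough illustration of what "grouped by cluster" and "accessible as if it were a list" mean, the sketch below uses only base R: a named character vector stands in for the sequences, and split() stands in for the grouping. The data, the cluster names, and the variable names are all made up for this example; the real field holds a DNAStringSetList, not a plain list.

```r
# Toy sequences named by gene identifier (hypothetical data).
seqs <- c(
  orgA__g1 = "ATGAAA",
  orgA__g2 = "ATGCCC",
  orgB__g1 = "ATGAAG",
  orgB__g2 = "ATGCCA"
)

# Hypothetical cluster assignment, one entry per sequence above.
cluster <- c("clu1", "clu2", "clu1", "clu2")

# Group the sequences by cluster: a plain-list analogue of the
# cluster-wise grouping described for $sequences.
by_cluster <- split(seqs, cluster)

# Each group is reached like a list element.
by_cluster[["clu1"]]

# Number of sequences in each cluster.
lengths(by_cluster)
```

With a PgR6MS object stored in a variable called pg (a hypothetical name), the analogous access would be pg$sequences[["clu1"]].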
Methods
Inherited methods
pagoo::PgR6$add_metadata()
pagoo::PgR6$drop()
pagoo::PgR6$recover()
pagoo::PgR6$save_pangenomeRDS()
pagoo::PgR6$write_pangenome()
pagoo::PgR6M$cg_exp_decay_fit()
pagoo::PgR6M$dist()
pagoo::PgR6M$gg_barplot()
pagoo::PgR6M$gg_binmap()
pagoo::PgR6M$gg_curves()
pagoo::PgR6M$gg_dist()
pagoo::PgR6M$gg_pca()
pagoo::PgR6M$gg_pie()
pagoo::PgR6M$pan_pca()
pagoo::PgR6M$pg_power_law_fit()
pagoo::PgR6M$rarefact()
pagoo::PgR6M$runShinyApp()
Method new()
Create a PgR6MS object.
Usage
PgR6MS$new(
  data,
  org_meta,
  cluster_meta,
  core_level = 95,
  sep = "__",
  DF,
  group_meta,
  sequences,
  verbose = TRUE
)
Arguments
data
A data.frame or DataFrame containing at least the following columns: gene (gene name), org (name of the organism to which the gene belongs), and cluster (group of orthologs to which the gene belongs). More columns can be added as metadata for each gene.
org_meta
(optional) A data.frame or DataFrame containing additional metadata for organisms. This data.frame must have a column named "org" with valid organism names (that is, they should match those provided in data, column org); additional columns will be used as metadata. Each row should correspond to one organism.
cluster_meta
(optional) A data.frame or DataFrame containing additional metadata for clusters. This data.frame must have a column named "cluster" with valid cluster names (that is, they should match those provided in data, column cluster); additional columns will be used as metadata. Each row should correspond to one cluster.
core_level
The initial core_level, that is, the percentage of organisms a cluster must be present in to be considered part of the core genome. Must be a number between 85 and 100 (default: 95). You can change it later through the $core_level field once the object has been created.
sep
A separator. By default it is '__' (two underscores). It is used to create a unique gid (gene identifier) for each gene. gids are created by pasting org to gene, separated by sep.
DF
Deprecated. Use data instead.
group_meta
Deprecated. Use cluster_meta instead.
sequences
Can accept: 1) a named list of named character vectors, where the names of the list are organism names and the names of each character vector are gene names; or 2) a named list of DNAStringSetList objects (same requirements as (1), but with BStringSet names as gene names); or 3) a DNAStringSetList (same requirements as (2), but the DNAStringSetList names are organism names).
verbose
logical. Whether to display progress messages when loading the class.
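The arguments above can be illustrated with a small base R sketch. It builds a toy data input with the three required columns, derives each gid by pasting org to gene with sep, and then works out which clusters reach a given core_level. All values are invented for the example, and the `>=` comparison against the percentage is an assumption: how the class rounds or applies the threshold internally is not stated here.

```r
# Toy input with the three required columns: gene, org, and cluster.
data <- data.frame(
  gene    = c("g1", "g2", "g1", "g2", "g1"),
  org     = c("orgA", "orgA", "orgB", "orgB", "orgC"),
  cluster = c("clu1", "clu2", "clu1", "clu2", "clu1"),
  stringsAsFactors = FALSE
)

# gid: org and gene pasted together, separated by sep.
sep <- "__"
data$gid <- paste(data$org, data$gene, sep = sep)

# Percentage of organisms in which each cluster is present.
n_org <- length(unique(data$org))
orgs_per_cluster <- tapply(data$org, data$cluster, function(x) length(unique(x)))
pct_present <- 100 * orgs_per_cluster / n_org

# Assumption for this sketch: a cluster is core when its percentage
# is greater than or equal to core_level.
core_level <- 95
core_clusters <- names(pct_present)[pct_present >= core_level]
```

Here clu1 is present in all three organisms and clu2 in two of them, so only clu1 passes a core_level of 95.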
Method core_seqs_4_phylo()
A field for obtaining core gene sequences is available (see $core_sequences), but when building a phylogeny from these sets it is useful to: 1) be able to extract just one sequence per organism from each cluster, in case paralogues are present, and 2) fill gaps with empty sequences in case core_level was set below 100%, which allows more genes (some not present in 100% of organisms) to be incorporated into the phylogeny. That is the purpose of this special method.
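The two steps described above can be sketched in base R. This is not the method's implementation, only an illustration of the idea: the input list, the organism names, and the helper one_per_org() are hypothetical, and keeping the first sequence found for each organism is an assumption about how a paralogue might be chosen.

```r
# Hypothetical core clusters: each is a character vector of sequences
# named by organism. clu1 has a paralogue in orgA; clu2 is absent from orgC.
core <- list(
  clu1 = c(orgA = "ATGAAA", orgA = "ATGAAC", orgB = "ATGAAG", orgC = "ATGAAT"),
  clu2 = c(orgA = "ATGCCC", orgB = "ATGCCA")
)
orgs <- c("orgA", "orgB", "orgC")

# Hypothetical helper: return exactly one sequence per organism.
one_per_org <- function(x, orgs) {
  # Keep the first sequence found for each organism (drops extra paralogues).
  x <- x[!duplicated(names(x))]
  # Look up every organism; organisms missing from the cluster yield NA.
  out <- unname(x[orgs])
  # Fill the gaps with empty sequences.
  out[is.na(out)] <- ""
  names(out) <- orgs
  out
}

phylo_sets <- lapply(core, one_per_org, orgs = orgs)
```

Every element of phylo_sets then has one entry per organism, in the same order, which is the shape needed to align each cluster and concatenate the alignments.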