Title: | Manipulating DNA Sequences and Estimating Unambiguous Haplotype Network with Statistical Parsimony |
---|---|
Description: | Provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting indel coding methods (only the simple indel coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes from a user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks. |
Authors: | Caner Aktas |
Maintainer: | Caner Aktas <[email protected]> |
License: | GPL-2 |
Version: | 1.1.3.1 |
Built: | 2025-02-14 05:11:58 UTC |
Source: | https://github.com/cnrakt/haplotypes |
This package provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting indel coding methods (only the simple indel coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes from a user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.
Caner Aktas, [email protected]
    ## Read example FASTA file.
    f<-system.file("example.fas",package="haplotypes")
    # invalid character 'N' was replaced with '?' with a warning message
    x<-read.fas(file=f)
    # an object of class 'Dna'
    x

    ## or load DNA Sequence data set.
    data("dna.obj")
    x<-dna.obj
    ## Not run: 
    x
    ## End(Not run)

    ## Compute an absolute pairwise character difference matrix from DNA sequences.
    # coding gaps using simple indel coding method
    d<- distance(x,indels="sic")
    ## Not run: 
    d
    ## End(Not run)

    ## Infer haplotypes using the 'Dna' object.
    # coding gaps using simple indel coding method
    h<-haplotype(x,indels="s")
    ## Not run: 
    h
    ## End(Not run)

    ## Conduct statistical parsimony analysis with 95% connection limit.
    # algorithmic method
    ## Not run: 
    p<-parsimnet(x,prob=.95)
    p
    # plot network
    plot(p)
    ## End(Not run)

    ## Plotting pie charts on the statistical parsimony network
    ## Not run: 
    data("dna.obj")
    x<-dna.obj
    h<-haplotypes::haplotype(x)
    ## Statistical parsimony with 95% connection limit
    p<-parsimnet(x)
    # randomly generated populations
    pop<-c("pop1","pop2","pop3","pop4","pop5","pop6","pop7","pop8")
    set.seed(5)
    pops<-sample(pop,nrow(x),replace=TRUE)
    # Plotting with default parameters.
    pieplot(p,h,1, pops)
    ## End(Not run)
Dna
Operators acting on sequence matrix to extract or replace parts.
    ## S4 method for signature 'Dna'
    x[i, j, ..., drop = FALSE]
    ## S4 replacement method for signature 'Dna'
    x[i, j] <- value
x | an object of class Dna. |
i, j | elements to extract or replace. |
... | Additional arguments. In this context, ... is used primarily for as.matrix, which is a boolean. If as.matrix is TRUE (default), the function returns a matrix. If drop is also TRUE, and the subset has either a single row or column, the function will return a vector instead. If as.matrix is FALSE, the function returns an object of class Dna. |
drop | boolean; if TRUE and a single row or column is selected, the function returns a vector instead of a matrix. This is only applicable when as.matrix is TRUE. |
value | a character vector or a character matrix. |
The S4 method dispatch mechanism matches arguments based on those specified in the signature of the corresponding generic function. However, for some generics that include '...' in their signature, additional arguments can be incorporated in specific methods. Notably, the '[' function does not follow this pattern and restricts the arguments to those defined in its signature. In this context, the 'as.matrix' argument is not in the signature of the generic '[', so it is included within '...'. Then, within the body of the function, we check whether 'as.matrix' has been provided in the actual arguments when the function is called. If 'as.matrix' is not specified, the function defaults to 'TRUE', preserving the behavior of previous versions of the method.
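The mechanism described above can be sketched with a toy class. This is only an illustration under stated assumptions, not the package's source: the class `Toy` is a hypothetical stand-in for `Dna`, and the only thing it mirrors from the text is the lookup of `as.matrix` inside `...` with a fallback to `TRUE`.

```r
library(methods)

# Hypothetical stand-in for the 'Dna' class (assumption, not package code).
setClass("Toy", representation(sequence = "matrix"))

setMethod("[", "Toy", function(x, i, j, ..., drop = FALSE) {
  dots <- list(...)
  # 'as.matrix' is not a formal argument of the '[' generic, so look for it
  # in '...' and fall back to TRUE when the caller did not supply it.
  as.mat <- if ("as.matrix" %in% names(dots)) dots[["as.matrix"]] else TRUE
  m <- x@sequence[i, j, drop = FALSE]
  if (!as.mat) return(new("Toy", sequence = m))
  # 'drop' only matters when a matrix (not an object) is returned.
  if (drop) m[, , drop = TRUE] else m
})

toy <- new("Toy", sequence = matrix(c("A", "C", "G", "T", "-", "?"), 2, 3))
toy[1:2, 1:2]                     # character matrix
toy[1, 1:3, drop = TRUE]          # character vector
toy[1:2, 1:2, as.matrix = FALSE]  # object of class 'Toy'
```

Reading the flag from `...` keeps the method compatible with the fixed signature of `[` while still letting callers opt out of the matrix return type.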
returns an object of class matrix, vector or Dna.
signature(x = "Dna", i = "ANY", j = "ANY", drop = "ANY")
Caner Aktas, [email protected]
    data("dna.obj")
    x<-dna.obj

    ## Extract parts.
    # a matrix object
    x[1:2,1:3]
    # a Dna object, as.dna(x[1:2,1:3]) gives the same result
    x[1:2,1:3,as.matrix=FALSE]
    # a vector object
    x[1,1:4,drop=TRUE]

    ## Replace parts.
    # "G" "C"
    x[1,1:2]
    x[1,1:2]<-c("A","T")
    x[1,1:2]
Combines two Dna objects
Combines two Dna objects.
    ## S4 method for signature 'Dna'
    append(x,values)
x | an object of class Dna. |
values | an object of class Dna. |
an object of class Dna.
signature(x = "Dna", values= "Dna")
combines two Dna objects.
    data("dna.obj")
    x<-dna.obj
    y<-dna.obj
    nrow(x)

    ## Combining two 'Dna' objects.
    z<- append(x,y)
    nrow(z)
Coerces a Dna object to a data.frame
Coerces an object to a data.frame.
    ## S4 method for signature 'Dna'
    as.data.frame(x)
x | an object of class Dna. |
returns a data frame.
signature(x = "Dna")
coerces a Dna object to a data.frame.
    data("dna.obj")
    x<-dna.obj
    x<-as.dna(x[1:4,1:6])

    ## Coercing a 'Dna' object to a data.frame.
    df<-as.data.frame(x)
    df
    # TRUE
    is.data.frame(df)

    ## Not run: 
    # gives the same result
    df<-as.data.frame(x@sequence)
    df
    ## End(Not run)
Coerces an object to a Dna object
Coerces an object that contains DNA sequences to an object of class Dna.
    ## S4 method for signature 'matrix'
    as.dna(x)
    ## S4 method for signature 'data.frame'
    as.dna(x)
    ## S4 method for signature 'list'
    as.dna(x)
    ## S4 method for signature 'character'
    as.dna(x)
    ## S4 method for signature 'Haplotype'
    as.dna(x)
    ## S4 method for signature 'DNAbin'
    as.dna(x)
    ## S4 method for signature 'phyDat'
    as.dna(x)
x | a matrix, a data.frame, a list, a character vector, or an object of class Haplotype, DNAbin or phyDat. |
Elements of the list must be vectors. Each element of the list contains a single DNA sequence. If the sequence lengths differ, the longest sequence determines the number of columns, and gaps are appended to the end of the shorter sequences in the matrix stored in the slot sequence. Sequence length information is stored in the slot seqlengths.
Valid characters for the slot sequence are "A","C","G","T","a","c","g","t","-","?". During the conversion of the object to the class Dna, integers 0,1,2,3,4,5 or characters "0","1","2","3","4","5" are converted to "?","A","C","G","T","-", respectively. Invalid characters are replaced with "?" with a warning message.
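The conversion rules above can be mimicked in base R. The sketch below is not the package's implementation: `recode_seq` and `pad_sequences` are hypothetical helpers that only reproduce the documented behaviour (digit recoding, replacement of invalid characters with a warning, and end-padding with gaps).

```r
# Digits "0".."5" map to these symbols, as listed in the text above.
codes <- c("0" = "?", "1" = "A", "2" = "C", "3" = "G", "4" = "T", "5" = "-")
valid <- c("A", "C", "G", "T", "a", "c", "g", "t", "-", "?")

# Hypothetical helper: recode one sequence and replace invalid characters.
recode_seq <- function(s) {
  s <- as.character(s)
  hit <- s %in% names(codes)
  s[hit] <- unname(codes[s[hit]])
  bad <- !(s %in% valid)
  if (any(bad)) warning("invalid characters replaced with '?'")
  s[bad] <- "?"
  s
}

# Hypothetical helper: pad shorter sequences with gaps at the end.
pad_sequences <- function(lst) {
  lens <- lengths(lst)
  rows <- lapply(lst, function(s) c(recode_seq(s), rep("-", max(lens) - length(s))))
  list(sequence = do.call(rbind, rows), seqlengths = lens)
}

x <- list(seq1 = c("?", "A", "C", "g", "t", "-", "0", "1"),
          seq2 = c("?", "A", "C", "g", "t", "-", "0", "1", "2"),
          seq3 = c("?", "A", "C", "g", "t", "-", "0", "1", "2", "3"))
res <- pad_sequences(x)
res$seqlengths  # original lengths of the three sequences
res$sequence    # 3 x 10 character matrix, shorter rows padded with "-"
```

Keeping the original lengths next to the padded matrix is what makes it possible to tell padding gaps apart from gaps that were present in the input.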
an object of class Dna.
signature(x = "matrix")
coerces matrix to a Dna object.
signature(x = "data.frame")
coerces data.frame to a Dna object.
signature(x = "list")
coerces list to a Dna object.
signature(x = "character")
coerces characters to a Dna object.
signature(x = "Haplotype")
coerces a Haplotype object to a Dna object.
signature(x = "DNAbin")
coerces a DNAbin object to a Dna object.
signature(x = "phyDat")
coerces a phyDat object to a Dna object.
    ## Coercing a matrix to a 'Dna' object.
    # all valid characters
    x<-matrix(c("?","A","C","g","t","-","0","1","2","3","4","5"),4,6)
    rownames(x)<-c("seq1","seq2","seq3","seq4")
    dna.obj<-as.dna(x)
    dna.obj
    # the sequence matrix
    dna.obj@sequence

    ## Not run: 
    # includes invalid characters
    x<-matrix(c("X","y","*","?","t","-","0","1","2","3","4","5"),4,6)
    rownames(x)<-c("seq1","seq2","seq3","seq4")
    dna.obj<-as.dna(x)
    dna.obj
    dna.obj@sequence

    # all valid integers
    x<-matrix(c(0,1,2,3,4,5,0,1,2,3,4,5),4,6)
    rownames(x)<-c("seq1","seq2","seq3","seq4")
    dna.obj<-as.dna(x)
    dna.obj
    dna.obj@sequence

    ## Coercing a data.frame to a 'Dna' object.
    x<-data.frame(matrix(c("?","A","C","g","t","-","0","1","2","3","4","5"),4,6))
    rownames(x)<-c("seq1","seq2","seq3","seq4")
    dna.obj<-as.dna(x)
    dna.obj
    dna.obj@sequence

    ## Coercing a list to a 'Dna' object.
    seq1<-c("?","A","C","g","t","-","0","1")
    seq2<-c("?","A","C","g","t","-","0","1","2")
    seq3<-c("?","A","C","g","t","-","0","1","2","3")
    x<-list(seq1=seq1,seq2=seq2,seq3=seq3)
    dna.obj<-as.dna(x)
    # sequence lengths differ
    dna.obj@seqlengths
    dna.obj@sequence

    ## Coercing a character vector to a Dna object.
    seq<-c("?","A","C","g","t","-","0","1")
    x<-as.dna(seq)
    x

    ## Coercing a Haplotype object to a Dna object.
    data("dna.obj")
    x<-dna.obj
    h<-haplotype(x)
    # DNA Sequences of unique haplotypes
    dna.obj<-as.dna(h)
    dna.obj

    d<-distance(x)
    # if 'Haplotype' object does not contain 'DNA' Sequences
    h<-haplotype(d)
    # returns an error
    as.dna(h)

    ## Coercing a DNAbin object to a Dna object.
    require(ape)
    data(woodmouse)
    x<-as.dna(woodmouse)
    x
    ## End(Not run)
Coerces a Dna object to a DNAbin object
This function coerces a Dna object to a DNAbin {ape} object.
    ## S4 method for signature 'Dna'
    as.DNAbin(x, endgaps=TRUE)
x | an object of class Dna. |
endgaps | boolean; gaps at the end of the sequences are included if this is TRUE. |
an object of class DNAbin.
signature(x = "Dna")
coerces a Dna object to a DNAbin object.
    ## Coercing a Dna object to a DNAbin object.
    data("dna.obj")
    x<-dna.obj
    dBin<-as.DNAbin(x)
    dBin
    # gaps at the end removed
    dBin<-as.DNAbin(x, endgaps=FALSE)
    dBin
as.list in the Package haplotypes
Coerces an object to a list.
    ## S4 method for signature 'Dna'
    as.list(x)
    ## S4 method for signature 'Haplotype'
    as.list(x)
    ## S4 method for signature 'Parsimnet'
    as.list(x)
x | an object of class Dna, Haplotype or Parsimnet. |
If x is a Dna object, elements of the list are character vectors that contain the DNA sequences, each of length equal to the corresponding value in the slot seqlengths. If x is a Haplotype or Parsimnet object, slots are converted to list elements.
returns a list.
signature(x = "Dna")
coerces an object of class Dna to a list.
signature(x = "Haplotype")
coerces an object of class Haplotype to a list.
signature(x = "Parsimnet")
coerces an object of class Parsimnet to a list.
    data("dna.obj")

    ## Coercing a 'Dna' object to a list.
    x<-dna.obj[1:3,as.matrix=FALSE]
    as.list(x)

    ## Not run: 
    ## Coercing a 'Haplotype' object to a list.
    x<-dna.obj
    h<-haplotype(x)
    as.list(h)

    ## Coercing a 'Parsimnet' object to a list.
    x<-dna.obj
    p<-parsimnet(x)
    as.list(p)
    ## End(Not run)
as.matrix in the Package haplotypes
Coerces an object to a matrix.
    ## S4 method for signature 'Dna'
    as.matrix(x)
x | an object of class Dna. |
returns a character matrix.
signature(x = "Dna")
coerces an object of class Dna to a matrix.
    data("dna.obj")

    ## Coercing a 'Dna' object to a matrix.
    x<-dna.obj[1:4,1:6,as.matrix=FALSE]
    x
    as.matrix(x)

    ## Not run: 
    # gives the same result
    dna.obj[1:4,1:6,as.matrix=TRUE]
    ## End(Not run)
Coerces a Parsimnet object to a network object
This function coerces a Parsimnet object to a network {network} object.
    ## S4 method for signature 'Parsimnet'
    as.network(x,net=1,...)
x | an object of class Parsimnet. |
net | a numeric vector of length one indicating which network to convert. |
... | additional arguments to function |
an object of class network.
signature(x = "Parsimnet")
coerces a Parsimnet object to a network object.
    ## Coercing a Parsimnet object to a network object.
    data("dna.obj")
    x<-dna.obj
    p<-parsimnet(x)
    n<-as.network(p)

    # Fourth network (with only two edges)
    p<-parsimnet(x,prob=.99)
    n<-as.network(p,net=4)
Coerces a Parsimnet object to a networx object
This function coerces a Parsimnet object to a networx {phangorn} object.
    ## S4 method for signature 'Parsimnet'
    as.networx(x,net=1,...)
x | an object of class Parsimnet. |
net | a numeric vector of length one indicating which network to convert. |
... | additional arguments to |
an object of class networx.
signature(x = "Parsimnet")
coerces a Parsimnet object to a networx object.
    ## Coercing a Parsimnet object to a networx object.
    data("dna.obj")
    x<-dna.obj
    p<-parsimnet(x)
    nx<-as.networx(p)
    plot(nx, "2D")
Coerces a Dna object to a numeric matrix
Converts a character matrix to a numeric matrix.
    ## S4 method for signature 'Dna'
    as.numeric(x)
x | an object of class Dna. |
Function as.numeric() coerces the character matrix in the slot sequence to a numeric matrix.
Lower or upper case characters "?","A","C","G","T","-" are converted to integers 0,1,2,3,4,5, respectively.
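The character-to-integer mapping above can be reproduced on a plain character matrix with a lookup vector. This is a base R sketch rather than the package's method; `to_codes` is a hypothetical helper name.

```r
# Lookup table taken from the mapping described above.
lookup <- c("?" = 0L, "A" = 1L, "C" = 2L, "G" = 3L, "T" = 4L, "-" = 5L)

# Hypothetical helper operating on a plain character matrix.
to_codes <- function(m) {
  # toupper() makes the lookup case-insensitive, as in the description.
  out <- unname(lookup[as.vector(toupper(m))])
  dim(out) <- dim(m)
  dimnames(out) <- dimnames(m)
  out
}

m <- matrix(c("?", "A", "C", "g", "t", "-"), nrow = 2,
            dimnames = list(c("seq1", "seq2"), NULL))
to_codes(m)
```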
returns a numeric matrix.
signature(x = "Dna")
coerces a Dna object to a numeric matrix.
    x<-matrix(c("?","A","C","g","t","-","0","1","2","3","4","5"),4,6)
    rownames(x)<-c("seq1","seq2","seq3","seq4")
    x<-as.dna(x)
    # original character matrix
    as.matrix(x)

    ## Coercing a 'Dna' object to a numeric matrix.
    # numeric matrix
    as.numeric(x)
Coerces a Dna object to a phyDat object
This function coerces a Dna object to a phyDat {phangorn} object.
    ## S4 method for signature 'Dna'
    as.phyDat(x, indels="sic",...)
x | an object of class Dna. |
indels | the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See also ‘Details’. |
... | additional arguments to |
Available indel coding methods:
sic: Treating gaps as a missing character and coding them separately following the simple indel coding method.
5th: Treating gaps as a fifth state character.
missing: Treating gaps as a missing character.
an object of class phyDat.
signature(x = "Dna")
coerces a Dna object to a phyDat object.
    data("dna.obj")
    x<-dna.obj

    ## Coercing a Dna object to a phyDat object.
    # Simple indel coding.
    phyd<-as.phyDat(x)
    phyd
    # Gaps as 5th state characters.
    phyd<-as.phyDat(x,indels="5")
    phyd
    # Gaps as missing characters.
    phyd<-as.phyDat(x,indels="m")
    phyd
Calculates base composition of a Dna object.
    ## S4 method for signature 'Dna'
    basecomp(x)
x | an object of class Dna. |
a matrix with sequences as rows, DNA bases as columns and frequencies as entries.
signature(x = "Dna")
calculates base composition of a Dna object.
    data("dna.obj")
    x <-dna.obj

    ## Calculating base compositions.
    basecomp(x)
Methods for generating a single bootstrap replicate.
    ## S4 method for signature 'Dna'
    boot.dna(x,replacement=TRUE)
x | an object of class Dna. |
replacement | boolean; whether the sampling is done with replacement or without replacement. |
an object of class Dna.
signature(x = "Dna")
generates a single bootstrap replicate from a Dna object.
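As a rough illustration of what a single bootstrap replicate of an alignment looks like, the base R sketch below resamples the columns (sites) of a character matrix. That the resampling unit is the alignment column is an assumption based on common phylogenetic practice; the text above does not state it, and `boot_sites` is a hypothetical helper, not the package's code.

```r
# Hypothetical helper: resample alignment columns (assumed resampling unit).
boot_sites <- function(m, replacement = TRUE) {
  idx <- sample(ncol(m), ncol(m), replace = replacement)
  m[, idx, drop = FALSE]
}

set.seed(1)
m <- matrix(sample(c("A", "C", "G", "T"), 24, replace = TRUE), nrow = 3,
            dimnames = list(paste0("seq", 1:3), paste0("site", 1:8)))
b_with    <- boot_sites(m)                       # sites may repeat
b_without <- boot_sites(m, replacement = FALSE)  # a permutation of the sites
colnames(b_with)
colnames(b_without)
```

Under this assumption, sampling without replacement only reorders the sites, so every original column appears exactly once.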
Caner Aktas, [email protected]
    data("dna.obj")
    x<-dna.obj

    ## Generating a bootstrap replicate.
    # with replacement
    bxr<-boot.dna(x)
    image(bxr)
    # without replacement
    bx<-boot.dna(x,replacement=FALSE)
    image(bx)
Dna object
Computes and returns an absolute pairwise character difference matrix from DNA sequences.
    ## S4 method for signature 'Dna'
    distance(x,subset=NULL,indels="sic")
x | an object of class Dna. |
subset | a vector of integers in the range [1,nrow(x)], specifying which sequence(s) are used in the distance calculation. Only distances between the selected sequence(s) and the rest of the sequences are calculated. If it is NULL, all comparisons are done. |
indels | the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See also ‘Details’. |
Available indel coding methods:
sic: Treating gaps as a missing character and coding them separately following the simple indel coding method.
5th: Treating gaps as a fifth state character.
missing: Treating gaps as a missing character.
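To make the difference between the "5th" and "missing" options concrete, the following base R sketch counts absolute character differences between two aligned sequences. It is a simplified illustration and not the package's algorithm: `pair_diff` is a hypothetical helper, skipping "?" in every comparison is an assumption, and the simple indel coding ("sic") option is deliberately left out because it requires coding indel events separately.

```r
# Hypothetical helper: absolute number of differing sites between two sequences.
pair_diff <- function(a, b, indels = c("5th", "missing")) {
  indels <- match.arg(indels)
  a <- toupper(a)
  b <- toupper(b)
  # Assumption: sites with "?" in either sequence are never counted.
  skip <- a == "?" | b == "?"
  # With "missing", gaps are ignored too; with "5th", "-" is an ordinary state.
  if (indels == "missing") skip <- skip | a == "-" | b == "-"
  sum(a != b & !skip)
}

s1 <- c("A", "C", "-", "G", "?")
s2 <- c("A", "T", "G", "G", "C")
pair_diff(s1, s2, indels = "5th")      # substitution plus the gap site
pair_diff(s1, s2, indels = "missing")  # substitution only
```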
returns an object of class dist.
signature(x = "Dna")
Computes and returns an absolute pairwise character difference matrix from Dna objects.
Caner Aktas, [email protected]
Giribet, G. and Wheeler, W.C. (1999) On gaps. Molecular Phylogenetics and Evolution 13, 132-143.
Simmons, M., Ochoterena, H. (2000) Gaps as characters in sequence-based phylogenetic analyses. Systematic Biology 49, 369-381.
indelcoder and subs
    data("dna.obj")
    x<-dna.obj[4:7,13:22,as.matrix=FALSE]

    ## Simple indel coding.
    distance(x,indels="s")
    ## Gaps as 5th state characters.
    distance(x,indels="5")
    ## Gaps as missing characters.
    distance(x,indels="m")

    ## Not run: 
    ## Using 'subset'.
    x<-dna.obj[4:10,13:22,as.matrix=FALSE]
    distance(x, NULL)
    distance(x, subset=c(1))
    distance(x, subset=c(2,4))
    ## End(Not run)
"Dna"
in the Package haplotypes
S4 class to hold DNA sequence data.
Objects can be created by calls of the form new("Dna", sequence, seqlengths, seqnames); however, reading a FASTA file with the read.fas function or coercing matrix, data.frame or list objects to a Dna object with the as.dna methods is preferable.
sequence: Object of class "matrix" containing DNA sequence data, rows represent sequences and columns represent sites. See also ‘Note’.
seqlengths: Object of class "numeric" containing the length of each DNA sequence.
seqnames: Object of class "character" containing the name of each DNA sequence.
signature(x = "Dna", i = "ANY", j = "ANY"): extracts part of a DNA sequence as an object of class matrix.
signature(x = "Dna", i = "ANY", j = "ANY", value = "ANY"): replaces part of a Dna sequence with an object of class "matrix", "numeric" or "character".
signature(x = "Dna", value = "ANY"): combines two Dna objects.
signature(x = "Dna"): coerces an object of class Dna to a data.frame.
signature(x = "Dna"): coerces an object of class Dna to a list; elements of the list are character vectors that contain the DNA sequences, each of length equal to the corresponding value in the slot seqlengths.
signature(x = "Dna"): coerces an object of class Dna to a matrix.
signature(x = "Dna"): coerces an object of class Dna to a numeric matrix.
signature(x = "Dna"): coerces an object of class Dna to a DNAbin object.
signature(x = "Dna"): coerces an object of class Dna to a phyDat object.
signature(x = "Dna"): calculates base composition of a Dna object.
signature(x = "Dna"): generates a single bootstrap replicate.
signature(x = "Dna"): computes and returns an absolute pairwise character difference matrix from DNA sequences.
signature(x = "Dna"): infers haplotypes from DNA sequences.
signature(x = "Dna"): displays DNA sequences.
signature(x = "Dna"): supports simple indel coding method.
signature(x = "Dna"): returns the longest sequence length.
signature(x = "Dna", value = "ANY"): combines two Dna objects.
signature(x = "Dna"): gets the names of a Dna object.
signature(x = "Dna"): sets the names of a Dna object.
signature(x = "Dna"): returns the longest sequence length.
signature(x = "Dna"): returns the total sequence number.
signature(x = "Dna"): calculates pairwise Nei's average number of differences between populations.
signature(x = "Dna"): calculates pairwise PhiST between populations.
signature(x = "Dna"): estimates genealogies using statistical parsimony.
signature(x = "Dna"): displays information about DNA polymorphisms of two sequences; indels and base substitutions, respectively.
signature(object = "Dna"): returns a vector containing the minimum and maximum length of DNA sequences.
signature(object = "Dna"): removes alignment gaps.
signature(object = "Dna"): retrieves the row names of a DNA sequence matrix.
signature(object = "Dna"): sets the row names of a DNA sequence matrix.
signature(object = "Dna"): displays a Dna object briefly.
signature(x = "Dna"): displays information about base substitutions.
signature(x = "Dna"): translates characters in the DNA sequence matrix from upper to lower case.
signature(x = "Dna"): translates characters in the DNA sequence matrix from lower to upper case.
signature(x = "Dna"): returns a list with duplicate DNA sequences removed.
Valid characters for the slot sequence are "A","C","G","T","a","c","g","t","-","?". Numeric entries (integers) between 0-5 will be converted to "?","A","C","G","T","-", respectively. Invalid characters will be replaced with "?" with a warning message.
Caner Aktas, [email protected]
An example object of the class Dna.
    data(dna.obj)
dna.obj contains a Dna object.
    data(dna.obj)
    dna.obj
Function for creating a matrix with haplotypes as rows, grouping factor (populations, species, etc.) as columns and abundance as entries.
    ## S4 method for signature 'Haplotype'
    grouping(x,factors)
x | an object of class Haplotype. |
factors | a vector or factor giving the grouping variable (populations, species, etc.), with one element per individual. |
a list with two components:
hapmat: a matrix with haplotypes as rows, levels of the grouping factor (populations, species, etc.) as columns and abundance as entries.
hapvec: a vector giving the haplotype identities of individuals.
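The relationship between the two components can be pictured with base R's `table()`: cross-tabulating a vector of haplotype identities against the grouping factor gives an abundance matrix of the shape described above. The haplotype identities below are made-up values for illustration, and this is not how the package necessarily builds `hapmat`.

```r
populations <- c("pop1", "pop1", "pop2", "pop3", "pop3", "pop3")
# Hypothetical haplotype identities for the six individuals (made-up values).
hapvec <- c(1, 1, 2, 3, 3, 4)

# Haplotypes as rows, grouping levels as columns, abundance as entries.
hapmat <- table(haplotype = hapvec, population = populations)
hapmat
```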
signature(x = "Haplotype")
Caner Aktas, [email protected]
    data("dna.obj")
    x<-dna.obj[1:6,,as.matrix=FALSE]
    # inferring haplotypes from DNA sequences
    h<-haplotype(x)

    ## Grouping haplotypes.
    # character vector 'populations' is a grouping factor.
    populations<-c("pop1","pop1","pop2","pop3","pop3","pop3")
    # length of the argument 'factors' is equal to the number of sequences
    g<-grouping(h,factors=populations)
    g
"Haplotype"
in the Package haplotypes
S4 class to store haplotype information.
Objects can be created by calls of the form new("Haplotype", haplist, hapind, uniquehapind, sequence, d, freq, nhap); however, use the function haplotype instead.
haplist: Object of class "list", containing the names of individuals that share the same haplotype.
hapind: Object of class "list", containing the index of individuals that share the same haplotype.
uniquehapind: Object of class "numeric", containing the index of the first occurrence of unique haplotypes.
sequence: Object of class "matrix", if present, giving the DNA sequence matrix of unique haplotypes.
d: Object of class "matrix", giving the absolute pairwise character difference matrix of unique haplotypes.
freq: Object of class "numeric", giving the haplotype frequencies.
nhap: Object of class "numeric", giving the total number of haplotypes.
signature(x = "Haplotype"): if the Haplotype object contains DNA sequences, coerces an object of class Haplotype to an object of class Dna, else returns an error message.
signature(x = "Haplotype"): assigns slots of a Haplotype object to list elements.
signature(x = "Haplotype"): creates a matrix with haplotypes as rows, grouping factor (populations, species, etc.) as columns and abundance as entries.
signature(x = "Haplotype"): reorders haplotypes according to the ordering factor.
signature(x = "Haplotype"): returns the number of haplotypes.
signature(x = "Parsimnet", y = "Haplotype"): plots pie charts on the statistical parsimony network.
signature(x = "Parsimnet", y = "Haplotype"): adds legends to pie charts produced using pieplot.
signature(object = "Haplotype"): displays the object briefly.
Caner Aktas, [email protected]
haplotype in the package haplotypes
Collapses identical DNA sequences into haplotypes or infers haplotypes using a user-provided absolute pairwise character difference matrix.
    ## S4 method for signature 'Dna'
    haplotype(x,indels="sic")
    ## S4 method for signature 'dist'
    haplotype(x)
    ## S4 method for signature 'matrix'
    haplotype(x)
x | an object of class Dna, dist or matrix. |
indels | the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See |
haplotype returns an object of class Haplotype; as.list-methods can be used to coerce the object to a list.
signature(x = "Dna")
Inferring haplotypes from DNA sequences.
signature(x = "dist")
Inferring haplotypes using an absolute pairwise character difference matrix (dist object).
signature(x = "matrix")
Inferring haplotypes using an absolute pairwise character difference matrix.
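The idea of collapsing identical sequences can be sketched in base R on a plain character matrix. This is a deliberately simplified illustration, not the package's method: it treats two sequences as the same haplotype only when their strings match exactly, whereas the package works from absolute pairwise character differences under the chosen indel coding. `collapse_identical` is a hypothetical helper whose output names merely echo the slots of the Haplotype class.

```r
# Hypothetical helper: group rows of a character matrix that are identical.
collapse_identical <- function(m) {
  keys <- unname(apply(m, 1, paste, collapse = ""))
  first <- !duplicated(keys)
  hap <- match(keys, keys[first])  # haplotype number of each sequence
  hapind <- unname(split(seq_along(keys), hap))
  list(hapind = hapind,               # indices of sequences per haplotype
       uniquehapind = which(first),   # first occurrence of each haplotype
       freq = lengths(hapind),        # haplotype frequencies
       nhap = sum(first))             # number of haplotypes
}

m <- rbind(s1 = c("A", "C", "G"),
           s2 = c("A", "C", "G"),
           s3 = c("A", "T", "G"),
           s4 = c("A", "C", "G"))
h <- collapse_identical(m)
h$hapind
h$freq
```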
Caner Aktas, [email protected]
    data("dna.obj")
    x<-dna.obj[1:6,,as.matrix=FALSE]

    ## Inferring haplotypes using 'Dna' object.
    # coding gaps using simple indel coding method
    h<-haplotype(x,indels="sic")
    h
    # giving DNA sequences of haplotypes
    as.dna(h)

    ## Not run: 
    ## Slots of an object Haplotype
    h@haplist       #haplotype list (names)
    h@hapind        #haplotype list (index)
    h@uniquehapind  #getting index of the first occurrence of haplotypes
    h@sequence      #DNA sequences of haplotypes
    h@d             #distance matrix of haplotypes
    h@freq          #haplotype frequencies
    h@nhap          #total number of haplotypes
    ## End(Not run)

    ## Inferring haplotypes using dist object.
    d<-distance(x)
    h<-haplotype(d)
    h
    ## Not run: 
    # returns an error message
    as.dna(h)
    ## End(Not run)

    ## Inferring haplotypes using distance matrix.
    d<-as.matrix(distance(x))
    h<-haplotype(d)
    h
    ## Not run: 
    # returns an error message
    as.dna(h)
    ## End(Not run)
Reorders haplotypes according to the ordering factor.
## S4 method for signature 'Haplotype'
hapreord(x,order=c(1:x@nhap))
x |
an object of class Haplotype. |
order |
a vector giving the order of haplotypes, with one element per haplotype. |
returns an object of class Haplotype.
signature(x = "Haplotype")
Reorders haplotypes.
Caner Aktas, [email protected]
data("dna.obj")
x<-dna.obj[1:6,,as.matrix=FALSE]
# inferring haplotypes from DNA sequences
h<-haplotype(x)

## Reordering haplotypes.
# length of the argument 'order' is equal to the number of haplotypes
rh<-hapreord(h,order=c(4,3,1,2))
rh
This function returns the list of homoplastic indels and substitutions.
## S4 method for signature 'Dna'
homopoly(x,indels="sic",...)
x |
an object of class Dna. |
indels |
the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See |
... |
additional arguments to |
a list with the following components:
indels |
a character vector giving the sitewise Consistency Index of homoplastic indels; the names of the vector give the sites of the homoplastic indels. |
subs |
a character vector giving the sitewise Consistency Index of homoplastic substitutions; the names of the vector give the sites of the substitutions. |
signature(x = "Dna")
Caner Aktas, [email protected]
data("dna.obj")

### Method for signature 'Dna'.
x<-dna.obj
homopoly(x)
Displays an image of DNA sequences.
## S4 method for signature 'Dna'
image(x,all=FALSE,fifth=TRUE,
col=c("#BFBFBF","#0B99FD","#FD0B0B","#11A808","#F5FD0B","#F8F8FF"),
chars=TRUE,cex=1,show.names=TRUE,show.sites=TRUE,xlab="",ylab="",...)
x |
an object of class Dna. |
all |
boolean; should entire sequence be displayed or only the polymorphic sites? |
fifth |
boolean; if all==FALSE, should gaps be displayed? |
col |
an integer or character vector for the colors. By default it is blue for "A", red for "C", green for "G", yellow for "T", white for "-", and grey for "?". |
chars |
boolean; should characters be displayed on image? |
cex |
a numeric vector of expansion factor for characters. |
show.names |
boolean; should sequence names be displayed on the left side? |
show.sites |
boolean; should site labels be displayed on the bottom side? |
xlab |
a title for the x axis. |
ylab |
a title for the y axis. |
... |
additional arguments to |
signature(x = "Dna")
Display an image of Dna objects
Caner Aktas, [email protected].
data("dna.obj")
x<-dna.obj

## Display only polymorphic sites without gaps
image(x,all=FALSE,fifth=FALSE,show.names=TRUE,cex=0.6)

## Display only polymorphic sites with gaps
image(x,all=FALSE,fifth=TRUE,show.names=TRUE,cex=0.6)

## Not run:
## Display entire sequences
image(x,all=TRUE,show.names=TRUE,cex=0.6)
## End(Not run)
Function for coding gaps separately. Only the simple indel coding method is available in the current version.
## S4 method for signature 'Dna'
indelcoder(x)
x |
an object of class Dna. |
a list with two components:
indels: a matrix giving the indel positions (beginnings and ends) and lengths.
codematrix: a binary matrix giving the indel codings. Missing values are denoted by -1.
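The two components can be pictured with a small base R sketch of simple indel coding in the sense of Simmons and Ochoterena (2000). This is an illustration under stated assumptions rather than the package's code: the function names `gap_runs` and `sic_sketch` are hypothetical, sequences are passed as plain character strings of equal length, each gap with distinct termini becomes one presence/absence character, and a sequence whose longer gap fully covers a coded gap is scored as missing (-1). How indelcoder treats leading or trailing gaps is not described here, so the sketch makes no attempt to mirror that.

```r
# Locate runs of "-" in one aligned sequence; returns a matrix with columns start, end.
gap_runs <- function(s) {
  r <- rle(strsplit(s, "")[[1]] == "-")
  end <- cumsum(r$lengths)
  start <- end - r$lengths + 1
  cbind(start = start[r$values], end = end[r$values])
}

# Hypothetical sketch of simple indel coding on a character vector of aligned sequences.
sic_sketch <- function(seqs) {
  runs <- lapply(unname(seqs), gap_runs)
  indels <- unique(do.call(rbind, runs))
  colnames(indels) <- c("start", "end")
  indels <- indels[order(indels[, "start"], indels[, "end"]), , drop = FALSE]
  code <- matrix(0L, nrow = length(seqs), ncol = nrow(indels))
  for (i in seq_along(seqs)) {
    g <- runs[[i]]
    for (j in seq_len(nrow(indels))) {
      s <- indels[j, "start"]
      e <- indels[j, "end"]
      same <- any(g[, "start"] == s & g[, "end"] == e)     # identical gap present
      covers <- any(g[, "start"] <= s & g[, "end"] >= e)   # a gap spans this one
      code[i, j] <- if (same) 1L else if (covers) -1L else 0L
    }
  }
  list(indels = cbind(indels, length = indels[, "end"] - indels[, "start"] + 1),
       codematrix = code)
}

res <- sic_sketch(c("AC--GT", "ACTTGT", "A----T"))
res$indels
res$codematrix
```

In this toy alignment the third sequence carries a gap spanning positions 2 to 5, so the shorter gap at positions 3 to 4 is scored as missing for it rather than as absent.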
signature(x = "Dna")
Function for coding gaps separately.
Caner Aktas, [email protected]
Simmons, M., Ochoterena, H. (2000) Gaps as characters in sequence-based phylogenetic analyses. Systematic Biology 49, 369-381.
data("dna.obj")
x<-dna.obj

## Simple indel coding.
indelcoder(x)
length in the package haplotypes
Methods for function length.
## S4 method for signature 'Dna'
length(x)
## S4 method for signature 'Haplotype'
length(x)
## S4 method for signature 'Parsimnet'
length(x)
x |
returns a non-negative integer vector.
signature(x = "Dna")
returns the longest sequence length.
signature(x = "Haplotype")
returns the number of haplotypes.
signature(x = "Parsimnet")
returns the length of network(s).
data("dna.obj")
x<-dna.obj

## Longest sequence length
length(x)

## Total number of haplotypes
h<-haplotype(x)
length(h)

## Length of network(s)
p<-parsimnet(x,prob=.95)
# length of the network
length(p)
p<-parsimnet(x,prob=.99)
# length of the networks
length(p)
Dna object or Parsimnet object
Function to get or set sequence names of a Dna object or names of networks in a Parsimnet object.
## S4 method for signature 'Dna'
names(x)
## S4 method for signature 'Parsimnet'
names(x)
## S4 replacement method for signature 'Dna'
names(x)<-value
## S4 replacement method for signature 'Parsimnet'
names(x)<-value
x |
|
value |
a character vector of the same length as the number of sequences or networks. |
signature(x = "Dna")
Function to get or set the sequence names of a Dna object.
signature(x = "Parsimnet")
Function to get or set the names of networks in a Parsimnet object.
data("dna.obj")
x<-dna.obj
x<-as.dna(x[1:4,1:6])

## Getting sequence names.
names(x)

## Setting sequence names.
names(x)<-c("u","v","z","y")
names(x)

x<-dna.obj
p<-parsimnet(x,prob=.99)

## Getting network names in Parsimnet object.
names(p)

## Setting network names.
names(p)<-c("a","b","c","d","f","g")
names(p)
ncol returns the number of columns present in a matrix.
## S4 method for signature 'Dna'
ncol(x)
x |
an object of class Dna. |
an integer of length one.
signature(x = "Dna")
ncol returns the number of columns present in the sequence matrix (length of the longest DNA sequence).
data("dna.obj")
x<-dna.obj

## Giving the length of the longest sequence.
ncol(x)
# gives the same result
length(x)
nrow returns the number of rows present in a matrix.
## S4 method for signature 'Dna'
nrow(x)
x |
an object of class Dna. |
an integer of length one.
signature(x = "Dna")
nrow returns the number of rows present in the sequence matrix (number of sequences).
data("dna.obj")
x<-dna.obj

## Giving the number of sequences.
nrow(x)
This function computes Nei's raw number of nucleotide differences between pairs of populations.
## S4 method for signature 'Dna'
pairnei(x,populations,indels="sic",nperm=99,
subset=NULL,showprogbar=FALSE)
## S4 method for signature 'dist'
pairnei(x,populations,nperm=99,
subset=NULL,showprogbar=FALSE)
## S4 method for signature 'matrix'
pairnei(x,populations,nperm=99,
subset=NULL,showprogbar=FALSE)
x |
an object of class Dna, dist or matrix. |
populations |
a vector giving the populations, with one element per individual. |
indels |
the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See |
nperm |
the number of permutations. Set this to 0 to skip the permutation procedure. |
subset |
a vector of integers in the range [1, |
showprogbar |
boolean; should the progress bar be displayed? |
The null distribution of pairwise Nei's differences under the hypothesis of no difference between the populations is obtained by permuting individuals between populations.
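The quantities involved can be sketched in base R from any pairwise difference matrix. The code below is an illustration, not the package's implementation: `nei_raw` and `perm_pvalue` are hypothetical names, the between-population value is taken to be the mean difference over all pairs with one individual in each population, the permutation shuffles labels only among individuals of the two populations being compared, and the p-value formula (count + 1) / (nperm + 1) is an assumption.

```r
# Hypothetical sketch: mean pairwise differences within (diagonal) and
# between (below diagonal) populations, from a distance matrix or dist object.
nei_raw <- function(d, populations) {
  d <- as.matrix(d)
  pops <- sort(unique(as.character(populations)))
  out <- matrix(NA_real_, length(pops), length(pops), dimnames = list(pops, pops))
  for (a in seq_along(pops)) {
    for (b in seq_len(a)) {
      ia <- which(populations == pops[a])
      ib <- which(populations == pops[b])
      block <- d[ia, ib, drop = FALSE]
      out[a, b] <- if (a != b) {
        mean(block)                      # all pairs across the two populations
      } else if (length(ia) > 1) {
        mean(block[lower.tri(block)])    # distinct pairs within one population
      } else {
        NA_real_                         # undefined for a single individual
      }
    }
  }
  out
}

# Hypothetical permutation test for one pair of populations a and b.
perm_pvalue <- function(d, populations, a, b, nperm = 99) {
  d <- as.matrix(d)
  keep <- which(populations %in% c(a, b))
  labels <- as.character(populations[keep])
  stat <- function(lab) mean(d[keep[lab == a], keep[lab == b], drop = FALSE])
  obs <- stat(labels)
  perm <- replicate(nperm, stat(sample(labels)))   # shuffle individuals between a and b
  (sum(perm >= obs) + 1) / (nperm + 1)
}

# Toy data: four individuals, two per population.
d <- matrix(c(0, 1, 4, 5,
              1, 0, 6, 7,
              4, 6, 0, 3,
              5, 7, 3, 0), nrow = 4, byrow = TRUE)
populations <- c("a", "a", "b", "b")
nei_raw(d, populations)
set.seed(1)
perm_pvalue(d, populations, "a", "b", nperm = 99)
```

The layout of the result mirrors the documented neidist component: within-population averages on the diagonal and between-population averages below it.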
a list with the following components:
neidist |
a matrix giving the average number of pairwise Nei's (D) differences between populations (below diagonal elements) and average number of pairwise differences within populations (diagonal elements). |
p |
a matrix giving the p-values, or NULL if permutation test is not performed. |
signature(x = "Dna")
signature(x = "dist")
signature(x = "matrix")
Caner Aktas, [email protected]
Nei, M. and Li, W. H. (1979) Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences of the United States of America 76, 5269-5273.
data("dna.obj")

### Method for signature 'Dna'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")

## skip permutation testing
pn<-pairnei(x, populations, nperm=0)
pn
# Between populations
as.dist(pn$neidist)
# Within populations
diag(pn$neidist)

## Gaps as missing characters.
pn<-pairnei(x, populations, indels="m", nperm=0)
pn

## using subset, third population against others
pn<-pairnei(x, populations, nperm=0,subset=c(3))
pn

## Not run:
## 999 permutations.
pn<-pairnei(x, populations, nperm=999, showprogbar=TRUE)
pn

## random populations
x<-dna.obj
populations<-sample(1:4,nrow(x),replace=TRUE)
pn<-pairnei(x, populations, nperm=999, showprogbar=TRUE)
pn

## populations based on clusters
x<-dna.obj
d<-distance(x)
hc<-hclust(d,method="ward.D")
populations<-cutree(hc,4)
pn<-pairnei(x, populations, nperm=999, showprogbar=TRUE)
pn
## End(Not run)

### Method for signature 'dist'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")
d<-distance(x)
pn<-pairnei(d, populations,nperm=0)
pn

### Method for signature 'matrix'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")
d<-as.matrix(distance(x))
pn<-pairnei(d, populations,nperm=0)
pn
This function calculates pairwise PhiST between populations using the AMOVA framework.
## S4 method for signature 'Dna'
pairPhiST(x,populations,indels="sic",nperm=99,
negatives=FALSE,subset=NULL,showprogbar=TRUE)
## S4 method for signature 'dist'
pairPhiST(x,populations,nperm=99,
negatives=FALSE,subset=NULL,showprogbar=TRUE)
## S4 method for signature 'matrix'
pairPhiST(x,populations,nperm=99,
negatives=FALSE,subset=NULL,showprogbar=TRUE)
x |
an object of class Dna, dist or matrix. |
populations |
a vector giving the populations, with one element per individual. |
indels |
the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See |
nperm |
the number of permutations. Set this to 0 to skip the permutation procedure. |
negatives |
boolean; if it is FALSE all negative PhiST values are replaced with zero. |
subset |
a vector of integers in the range [1, |
showprogbar |
boolean; should the progress bar be displayed? |
The null distribution of pairwise PhiST under the hypothesis of no difference between the populations is obtained by permuting individuals between populations.
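For orientation, the AMOVA variance components behind PhiST (Excoffier et al. 1992) can be written out in base R for a single level of population structure. This is a sketch under assumptions, not the code used by pairPhiST: `phist_sketch` is a hypothetical name, and its first argument `d2` is treated as a matrix of squared distances. Whether and how the package squares the pairwise character differences before the analysis is not stated in this manual, so that choice is left to the caller.

```r
# Hypothetical sketch: PhiST from a matrix of squared distances (d2) and a
# population label for each individual, following the AMOVA sums of squares.
phist_sketch <- function(d2, populations, negatives = FALSE) {
  d2 <- as.matrix(d2)
  populations <- factor(populations)
  N <- nrow(d2)                          # number of individuals
  P <- nlevels(populations)              # number of populations
  n <- as.vector(table(populations))     # population sizes
  # sum of squared deviations for a block: sum over pairs divided by block size
  ssd <- function(m) sum(m[lower.tri(m)]) / nrow(m)
  ssd_total <- ssd(d2)
  ssd_within <- sum(sapply(levels(populations), function(p) {
    idx <- which(populations == p)
    ssd(d2[idx, idx, drop = FALSE])
  }))
  msd_among <- (ssd_total - ssd_within) / (P - 1)
  msd_within <- ssd_within / (N - P)
  n0 <- (N - sum(n^2) / N) / (P - 1)     # average population size coefficient
  sigma2_a <- (msd_among - msd_within) / n0
  phi <- sigma2_a / (sigma2_a + msd_within)
  # mimic the documented 'negatives' switch: negative values become zero
  if (!negatives && !is.na(phi) && phi < 0) 0 else phi
}

# Toy data: two populations of two individuals, no variation within populations.
d2 <- matrix(1, 4, 4)
d2[1:2, 1:2] <- 0
d2[3:4, 3:4] <- 0
phist_sketch(d2, c("a", "a", "b", "b"))
```

When all variation lies between the two populations, as in the toy data, the within-population component is zero and the statistic reaches its maximum of 1.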
a list with the following components:
PhiST |
a matrix giving the PhiST values between populations. |
p |
a matrix giving the p-values, or NULL if permutation test is not performed. |
signature(x = "Dna")
signature(x = "dist")
signature(x = "matrix")
The internal code Statphi is taken from package ade4 version 1.7-8 without any modification, author Sandrine Pavoine. Function amova from package pegas is used internally to estimate variance components, author Emmanuel Paradis.
Caner Aktas, [email protected]
Excoffier, L., Smouse, P.E. and Quattro, J.M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics, 131, 479-491.
data("dna.obj")

### Method for signature 'Dna'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")

## skip permutation testing
pst<-pairPhiST(x, populations, nperm=0)
pst

## allow negative PhiST values
pst<-pairPhiST(x, populations, nperm=0, negatives=TRUE)
pst

## Gaps as missing characters.
pst<-pairPhiST(x, populations, indels="m", nperm=0, negatives=TRUE)
pst

## using subset, second population against others
pst<-pairPhiST(x, populations, nperm=0,subset=c(2))
pst

## Not run:
## 999 permutations.
pst<-pairPhiST(x, populations, nperm=999,showprogbar=TRUE)
pst

## random populations
x<-dna.obj
populations<-sample(1:4,nrow(x),replace=TRUE)
pst<-pairPhiST(x, populations, nperm=999,showprogbar=TRUE)
pst

## populations based on clusters
x<-dna.obj
d<-distance(x)
hc<-hclust(d,method="ward.D")
populations<-cutree(hc,4)
pst<-pairPhiST(x, populations, nperm=999,showprogbar=TRUE)
pst
## End(Not run)

### Method for signature 'dist'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")
d<-distance(x)
pst<-pairPhiST(d, populations, nperm=0)
pst

### Method for signature 'matrix'.
x<-dna.obj
x<-dna.obj[c(1,20,21,26,27,28,30,3,4,7,13,14,15,16,23,24,25),,as.matrix=FALSE]
populations<-c("pop1","pop1","pop1","pop1","pop1","pop1","pop1","pop2",
"pop2","pop2","pop2","pop2","pop3","pop4","pop3","pop4","pop4")
d<-as.matrix(distance(x))
pst<-pairPhiST(d, populations, nperm=0)
pst
"Parsimnet"
in the Package haplotypes
S4 class to store statistical parsimony networks and additional information.
Objects can be created by calls of the form new("Parsimnet", d, tempProbs, conlimit, prob, nhap, rowindex); however, use the function parsimnet instead.
d: Object of class "list" containing the geodesic distance matrix of haplotypes and intermediates for each network.
tempProbs: Object of class "numeric" giving the probabilities of parsimony for mutational steps beyond the connection limit.
conlimit: Object of class "numeric" giving the number of maximum connection steps at connection limit.
prob: Object of class "numeric" giving the user defined connection limit.
nhap: Object of class "numeric" giving the number of haplotypes in each network.
rowindex: Object of class "list" containing vectors giving the index of haplotypes in each network.
signature(x = "Parsimnet"): assigns slots of an object Parsimnet to list elements.
signature(x = "Parsimnet"): coerces Parsimnet object to network {network} object.
signature(x = "Parsimnet"): coerces Parsimnet object to networx {phangorn} object.
signature(x = "Parsimnet"): returns the length of network(s).
signature(x = "Parsimnet"): gets names of networks in Parsimnet object.
signature(x = "Parsimnet"): sets names of networks in Parsimnet object.
signature(x = "Parsimnet"): plots statistical parsimony networks.
signature(x = "Parsimnet", y = "Haplotype"): plots pie charts on statistical parsimony networks.
signature(x = "Parsimnet", y = "Haplotype"): adds legends to pie charts produced using pieplot.
signature(x = "Parsimnet"): gets names of vertices in networks.
signature(x = "Parsimnet"): gets names of vertices in networks.
signature(object = "Parsimnet"): displays the object briefly.
Caner Aktas, [email protected]
Function for estimating gene genealogies from DNA sequences or user provided absolute pairwise character difference matrix using statistical parsimony.
## S4 method for signature 'Dna'
parsimnet(x,indels="sic",prob=.95)
## S4 method for signature 'dist'
parsimnet(x,seqlength,prob=.95)
## S4 method for signature 'matrix'
parsimnet(x,seqlength,prob=.95)
x |
an object of class Dna, dist or matrix. |
indels |
the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See |
seqlength |
an integer of length one giving the sequence length information (number of characters). |
prob |
a numeric vector of length one in the range [0.01, 0.99] giving the probability of parsimony as defined in Templeton et al. (1992). In order to set maximum connection steps to Inf (to connect all the haplotypes in a single network), set the probability to NULL. |
The network estimation method implemented in the parsimnet function finds one of the most parsimonious networks (or sub-networks, if the connection between haplotypes exceeds the parsimony limit). This is an implementation of the TCS method proposed in Templeton et al. (1992) and Clement et al. (2002). The parsimnet function generates an unambiguous haplotype network without loops. If more than one best network is found (resulting in ambiguous connections), only the network with the lowest average all-pairs distance is returned. Loops may occur only if they are present in the initial haplotype distance matrix.
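The tie-breaking rule for equally parsimonious networks can be illustrated with a few lines of base R. The sketch assumes each candidate network is summarised by its matrix of geodesic (all-pairs) distances, in the spirit of slot d of a Parsimnet object; the function name `pick_network` and the candidate matrices are hypothetical, and the package's internal search is not reproduced here.

```r
# Hypothetical sketch: among candidate networks, keep the one whose geodesic
# distance matrix has the lowest average all-pairs distance.
pick_network <- function(candidates) {
  avg <- sapply(candidates, function(m) mean(m[lower.tri(m)]))
  candidates[[which.min(avg)]]
}

# Two toy candidates connecting haplotypes A, B and C.
labs <- c("A", "B", "C")
m1 <- matrix(c(0, 1, 2,
               1, 0, 1,
               2, 1, 0), nrow = 3, byrow = TRUE, dimnames = list(labs, labs))
m2 <- matrix(c(0, 1, 3,
               1, 0, 2,
               3, 2, 0), nrow = 3, byrow = TRUE, dimnames = list(labs, labs))
pick_network(list(m1, m2))
```

Here the first candidate is kept because its average all-pairs distance (4/3) is lower than that of the second (2).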
S4 methods for signature 'Dna', 'matrix' or 'dist' return an object of class Parsimnet.
signature(x = "Dna")
estimating gene genealogies from DNA sequences.
signature(x = "dist")
estimating gene genealogies from distance matrix (dist object).
signature(x = "matrix")
estimating gene genealogies from distance matrix.
Duplicate names in the final distance matrices in slot d are renamed without warning. An internal function .TempletonProb is taken from package pegas version 0.6 without any modification, authors Emmanuel Paradis, Klaus Schliep.
Caner Aktas, [email protected].
Clement, M., Q. Snell, P. Walker, D. Posada, and K. A. Crandall (2002) TCS: Estimating Gene Genealogies in First IEEE International Workshop on High Performance Computational Biology (HiCOMB)
Templeton, A. R., Crandall, K. A. and Sing, C. F. (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics, 132, 619-635.
network, plot-methods and pieplot-methods
## Not run:
data("dna.obj")
x<-dna.obj

### Method for signature 'Dna'.
## statistical parsimony with 95% connection limit
p<-parsimnet(x)
p
plot(p)

## statistical parsimony with 99% connection limit
p<-parsimnet(x,prob=.99)
p
# plot the first network
plot(p,net=1)

## statistical parsimony with 99% connection limit
# indels are coded as missing
p<-parsimnet(x,indels="m",prob=.99)
p
plot(p)

# statistical parsimony without connection limit.
p<-parsimnet(x,prob=NULL)
p
plot(p)
# plot the first network
plot(p,net=1)

### Method for signature 'dist'.
d<-distance(x)
seqlength<-length(x)
## statistical parsimony with 95% connection limit
p<-parsimnet(d,seqlength)
p
plot(p)

### Method for signature 'matrix'.
d<-as.matrix(distance(x))
seqlength<-length(x)
## statistical parsimony with 95% connection limit
p<-parsimnet(d,seqlength)
p
plot(p)
## End(Not run)
This function can be used to add legends to pie charts produced using pieplot.
## S4 method for signature 'Parsimnet,Haplotype'
pielegend(p,h,net=1,factors,...)
p |
a Parsimnet object. |
h |
a Haplotype object. |
net |
a numeric vector of length one indicating which network to plot. |
factors |
a vector or factor giving the grouping variable (populations, species, etc.), with one element per individual. |
... |
arguments to be passed to |
This method calls legend {graphics}; some default parameters are changed:
an integer or character vector for the edge colors. By default, it is rainbow.
an integer or character vector for the filling colors. By default, it is rainbow.
a numeric or character vector to appear in the legend. By default, it is the levels of the grouping factor for haplotypes.
position of the legend. Default is set to "topright".
See ‘legend {graphics}’.
signature(p = "Parsimnet", h = "Haplotype")
Caner Aktas, [email protected].
plot,Parsimnet-method, floating.pie, plot.default, plot.network.default and legend.
data("dna.obj")
x<-dna.obj
h<-haplotypes::haplotype(x)

### Statistical parsimony with 95% connection limit
p<-parsimnet(x)

# randomly generated populations
pop<-c("pop1","pop2","pop3","pop4","pop5","pop6","pop7","pop8")
set.seed(5)
pops<-sample(pop,nrow(x),replace=TRUE)

## Plotting with default parameters.
pieplot(p,h,1, pops)

## Add legend with default parameters.
pielegend(p,h,1,pops)

## Change colors for the populations.
# 8 colors for 8 populations
cols<-colors()[c(30,369,552,558,538,642,142,91)]
pieplot(p,h,1, pops,col=cols)
pielegend(p,h,1,pops,col=cols)
Plotting pie charts on the statistical parsimony network.
## S4 method for signature 'Parsimnet,Haplotype'
pieplot(x,y,net=1,factors,
coord=NULL,inter.labels=FALSE,interactive=FALSE,rex=1,...)
x |
a Parsimnet object. |
y |
a Haplotype object. |
net |
a numeric vector of length one indicating which network to plot. |
factors |
a vector or factor giving the grouping variable (populations, species, etc.), with one element per individual. |
coord |
a matrix that contains user specified coordinates of the vertices, or NULL to generate vertex layouts with |
inter.labels |
boolean; should vertex labels of intermediate haplotypes be displayed? |
interactive |
boolean; should vertices be interactively adjusted? |
rex |
expansion factor for the pie radius. |
... |
arguments to be passed to |
This method calls floating.pie {plotrix}, network.vertex {network}, and plot.default, lines, and text {graphics}. This method also uses some internal structures of plot.network.default from package network. The following additional arguments can be passed to these functions:
the vertex placement algorithm. Default is set to "fruchtermanreingold".
amount to pad the plotting range; useful if labels are being clipped. Default is set to 1.
boolean; should vertex labels be displayed?
a vector of vertex labels. By default, the rownames of the distance matrix (rownames(p@d[[net]])) are used. If inter.labels==FALSE only haplotype labels are displayed.
character expansion factor for labels. Default is set to 0.75.
an integer or character vector for the label colors. By default, it is 1 (black).
position at which labels should be placed relative to vertices. 0 and 6 result in labels which are placed away from the center of the plotting region; 1, 2, 3, and 4 result in labels being placed below, to the left of, above, and to the right of vertices, respectively; and label.pos 5 or greater than 6 results in labels which are plotted with no offset (i.e., at the vertex positions). Default is set to 0.
amount to pad the labels. This setting is available only if the labels are plotted with offset relative to vertex positions. Default is set to 1.
a numeric vector of expansion factors for intermediate vertices (only). By default it is (0.5)*min(radius). Use 'radius' to specify the size of pie charts.
the colors of the pie sectors (i.e., colors for populations), by default rainbow.
an integer or character vector for the intermediate vertex colors. By default, it is 1 (black).
an integer or character vector for the edge colors. By default, it is 1 (black).
a numeric vector giving the line width of the edges. By default, it is 1.
a numeric vector of length one specifying the line type for the edges. By default, it is 1.
the number of lines forming a pie circle. By default, it is 200.
a numeric vector of length p@nhap[net] for the radius of drawn pie charts. Useful for specifying the radius independent of the haplotype frequencies. Default is (0.8*(haplotype frequencies)*rex)/max(haplotype frequencies).
number of polygon sides for vertices. Default is set to 50.
x axis label.
y axis label.
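The default sizing rules quoted in the argument list above can be checked by hand in base R. The helper names `pie_radius` and `inter_vertex_cex` are hypothetical and simply restate the two documented formulas; the frequencies are made-up values for illustration.

```r
# Default pie radius as documented: (0.8 * frequencies * rex) / max(frequencies)
pie_radius <- function(freq, rex = 1) (0.8 * freq * rex) / max(freq)

# Default expansion factor for intermediate vertices as documented: 0.5 * min(radius)
inter_vertex_cex <- function(radius) 0.5 * min(radius)

freq <- c(4, 2, 1)                 # made-up haplotype frequencies
r1 <- pie_radius(freq)             # default rex = 1
r2 <- pie_radius(freq, rex = 2)    # doubling rex doubles every radius
r1
r2
inter_vertex_cex(r1)
```

Because the intermediate vertex size is derived from the smallest radius, increasing rex enlarges both the pie charts and the intermediate vertices unless vertex.cex is set explicitly, which matches the "Expanding pie charts and intermediate vertices" example.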
A two-column matrix containing the vertex positions as x,y coordinates.
signature(x = "Parsimnet", y = "Haplotype")
Some internal structures of plot.network.default are taken from package network with modifications, author Carter T. Butts.
Caner Aktas, [email protected].
plot,Parsimnet-method, floating.pie, plot.default and plot.network.default
data("dna.obj")
x<-dna.obj
h<-haplotypes::haplotype(x)

### Statistical parsimony with 95% connection limit
p<-parsimnet(x)

# randomly generated populations
pop<-c("pop1","pop2","pop3","pop4","pop5","pop6","pop7","pop8")
set.seed(5)
pops<-sample(pop,nrow(x),replace=TRUE)

## Plotting with default parameters.
pieplot(p,h,1, pops)

## Change colors for the populations.
# 8 colors for 8 populations
cols<-colors()[c(30,369,552,558,538,642,142,91)]
pieplot(p,h,1, pops,col=cols)

## Expanding pie charts and intermediate vertices.
pieplot(p,h,1, pops,rex=2)

## Adjusting intermediate vertex sizes.
pieplot(p,h,1, pops, vertex.cex=rep(0.2, nrow(p@d[[1]])-p@nhap))

## Expanding pie charts and intermediate vertices, adjusting intermediate vertex sizes.
pieplot(p,h,1, pops,rex=2, vertex.cex=rep(0.1, nrow(p@d[[1]])-p@nhap))

## Adjusting radius of pie charts.
pieplot(p,h,1, pops,radius=rep(1, p@nhap))

## Not run:
## Interactively adjusting vertex positions.
pieplot(p,h,1, pops, interactive=TRUE)
## End(Not run)

### Multiple networks with 99% connection limit.
p<-parsimnet(x,prob=.99)

## Plotting first network with default parameters.
pieplot(p,h,1, pops)

## Change colors for the populations.
# 8 colors for 8 populations
cols<-colors()[c(30,369,552,558,538,642,142,91)]
pieplot(p,h,1, pops,col=cols)
plot
in the package haplotypes
Plots statistical parsimony networks.
## S4 method for signature 'Parsimnet'
plot(x,y,net=1,inter.labels=FALSE,...)
x | an object of class Parsimnet. |
y | not used (needed for compatibility with the generic plot function). |
net | a numeric vector of length one indicating which network to plot. |
inter.labels | boolean; should vertex labels of intermediate haplotypes be displayed? |
... | additional arguments passed to plot.network.default. |
These methods call plot.network.default from package network. Some default parameters are changed:
label: a vector of vertex labels. By default, the row names of the distance matrices in slot d are used. If inter.labels==FALSE, only haplotype labels are displayed.
usearrows: boolean; should arrows (rather than line segments) be used to indicate edges? Default is FALSE.
mode: the vertex placement algorithm. Default is "kamadakawai".
pad: amount to pad the plotting range; useful if labels are being clipped. Default is 1.
label.cex: character expansion factor for label text. Default is 0.75.
vertex.cex: a numeric vector of expansion factors for vertices. By default, it is 0.8 for haplotypes and 0.5 for intermediates.
vertex.col: an integer or character vector of vertex colors. By default, it is 2 (red) for haplotypes and 4 (blue) for intermediates.
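The per-vertex defaults for sizes and colors can be rebuilt as plain vectors, which is a convenient starting point for customizing them. The sketch below is base R only; the vertex counts are hypothetical, and it assumes (as the package examples suggest) that haplotype vertices come before intermediate vertices.

```r
# Hypothetical network: 9 haplotypes plus 3 inferred intermediates.
nhap <- 9
n_vertices <- 12
n_inter <- n_vertices - nhap

# Default expansion factors: 0.8 for haplotypes, 0.5 for intermediates.
vertex_cex <- c(rep(0.8, nhap), rep(0.5, n_inter))

# Default colors: 2 (red) for haplotypes, 4 (blue) for intermediates.
vertex_col <- c(rep(2, nhap), rep(4, n_inter))

print(vertex_cex)
print(vertex_col)
```

With a real Parsimnet object p holding a single network, the same counts would come from p@nhap and nrow(p@d[[1]]), and the vectors could be passed as the vertex.cex and vertex.col arguments of plot.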
A two-column matrix containing the vertex positions as x,y coordinates.
signature(x = "Parsimnet",y = "missing")
Plots Parsimnet objects.
Caner Aktas, [email protected].
parsimnet, plot.default and plot.network.default
## Not run:
data("dna.obj")
x<-dna.obj

### Method for signature 'Parsimnet'.

## Statistical parsimony with 95% connection limit
p<-parsimnet(x)
p

## Plotting with default parameters.
plot(p)

## Displaying vertex labels of intermediate haplotypes.
plot(p, inter.labels=TRUE)

## Interactively adjusting vertex positions.
plot(p, interactive=TRUE)

## Interactively adjusting and saving vertex positions.
p<-parsimnet(x)
#saving vertex positions as x,y coordinates.
coo<-plot(p,interactive=TRUE)
#reuse saved coordinates.
plot(p,coord=coo)

## Adjusting vertex sizes.
plot(p, vertex.cex=c(rep(3,nrow(p@d[[1]]))))
# different sizes for haplotypes and intermediates
plot(p, vertex.cex=c(rep(3,p@nhap),rep(1,c(nrow(p@d[[1]])-p@nhap))))

## Adjusting vertex colors
# different color for haplotypes and intermediates
plot(p, vertex.col=c(rep("magenta",p@nhap),rep("deepskyblue3",c(nrow(p@d[[1]])-p@nhap))))

## Statistical parsimony with 99% connection limit
p<-parsimnet(x,prob=.99)
p
#plot the first network
plot(p,net=1)
#plot the second network
plot(p,net=2)
#plot the third network. It is a single vertex.
plot(p,net=3)
## End(Not run)
This function displays the polymorphic sites (base substitutions and indels) between the two sequences.
## S4 method for signature 'Dna'
polymorp(x,pair,indels="sic")
x | an object of class Dna. |
pair | a vector of two integers in the range [1,nrow(x)], specifying the sequence pair. |
indels | the indel coding method to be used. This must be one of "sic", "5th" or "missing". Any unambiguous substring can be given. See indelcoder. |
a list with two components:
indels: a list of matrices of the indel regions if indels=="sic". The component names of the list give the positions of the indels.
subst: a list of matrices of the base substitutions. If indels=="5th", each gap is treated as a base substitution. The component names of the list give the positions of the base substitutions.
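To make the difference between the "5th" and "missing" gap treatments concrete, here is a minimal base R sketch that compares two aligned sequences. The helper pair_differences is hypothetical and is not part of the haplotypes package; it also assumes that "?" is never counted as a difference, and it does not attempt the simple indel coding ("sic") method.

```r
# Hypothetical helper, not part of the haplotypes package.
# Returns the alignment positions at which two sequences differ.
pair_differences <- function(a, b, indels = c("5th", "missing")) {
  indels <- match.arg(indels)
  stopifnot(length(a) == length(b))
  differs <- a != b
  # Assumption: "?" marks missing data and never counts as a difference.
  differs <- differs & a != "?" & b != "?"
  if (indels == "missing") {
    # Gaps are treated as missing, so gap-vs-base sites are ignored.
    differs <- differs & a != "-" & b != "-"
  }
  which(differs)
}

seq1 <- c("A", "C", "-", "G", "T", "?")
seq2 <- c("A", "T", "-", "-", "T", "A")

# Gap as a fifth state: the gap-vs-base site at position 4 counts.
fifth_sites <- pair_differences(seq1, seq2, "5th")
# Gap as missing: only the C/T substitution at position 2 remains.
missing_sites <- pair_differences(seq1, seq2, "missing")

print(fifth_sites)
print(missing_sites)
```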
signature(x = "Dna")
Showing base substitutions and indels between the two sequences.
Caner Aktas, [email protected]
indelcoder and subs
data("dna.obj")
x<-dna.obj

## Showing base substitutions and indels between seq1 and seq6.
# gaps are coded following the simple indel coding method
polymorp(x,c(1,6),indels="s")
# gaps are coded as a fifth state character
polymorp(x,c(1,6),indels="5")
# gaps are treated as missing character
polymorp(x,c(1,6),indels="m")
range returns the lengths of the shortest and longest DNA sequences.
## S4 method for signature 'Dna'
range(x)
x | an object of class Dna. |
an integer vector of length two.
signature(x = "Dna")
range
data("dna.obj")
x <-dna.obj

## shortest and longest DNA sequence lengths
range(x)
Read DNA sequences from a file in FASTA Format.
read.fas(file)
file | the name of the file from which the sequences in FASTA format are to be read. If it does not contain an absolute path, the file name is relative to the current working directory. |
read.fas returns an object of class Dna.
By default, valid characters for the class Dna are "A","C","G","T","a","c","g","t","-","?". Numeric entries (integers) from 0 to 5 are converted to "?","A","C","G","T","-", respectively. Invalid characters are replaced with "?" with a warning message.
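The character rules above can be illustrated with a small base R sketch. The helper normalize_chars is hypothetical and only mimics the described mapping; it is not the code used by read.fas.

```r
# Characters accepted as-is, as listed in the text.
valid <- c("A", "C", "G", "T", "a", "c", "g", "t", "-", "?")

# Numeric codes 0-5 and the characters they are converted to.
num_map <- c("0" = "?", "1" = "A", "2" = "C", "3" = "G", "4" = "T", "5" = "-")

# Hypothetical helper, not part of the haplotypes package.
normalize_chars <- function(s) {
  s <- as.character(s)
  is_num <- s %in% names(num_map)
  s[is_num] <- unname(num_map[s[is_num]])
  bad <- !(s %in% valid)
  if (any(bad)) {
    warning("invalid characters replaced with '?': ",
            paste(unique(s[bad]), collapse = ", "))
    s[bad] <- "?"
  }
  s
}

# "N" is invalid, "3" maps to "G", and "0" maps to "?".
res <- withCallingHandlers(
  normalize_chars(c("A", "N", "3", "0", "-", "t")),
  warning = function(w) {
    message("warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(res)
```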
Caner Aktas, [email protected]
##Reading example file.
f<-system.file("example.fas",package="haplotypes")

# invalid character 'N' was replaced with '?' with a warning message
x<-read.fas(file=f)

# an object of class 'Dna'
x
Removes gaps ("-") from a Dna object.
## S4 method for signature 'Dna'
remove.gaps(x,entire.col=FALSE)
x | an object of class Dna. |
entire.col | boolean; entire columns with gaps are removed if this is TRUE. See also ‘Details’. |
If entire.col==TRUE, the alignment is preserved. If it is FALSE, end gaps are introduced into the sequence matrix.
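The two modes can be contrasted on a plain character matrix. This base R sketch is only an illustration of the behavior described here, under the assumption that "end gaps" means padding each shortened sequence with "-" on the right; strip_and_pad is a hypothetical helper, not the package implementation.

```r
# A toy alignment: rows are sequences, columns are aligned sites.
aln <- rbind(
  seq1 = c("A", "-", "C", "G"),
  seq2 = c("A", "T", "C", "-"),
  seq3 = c("A", "T", "C", "G")
)

# Analogue of entire.col=TRUE: drop every column that contains a gap.
# Remaining columns stay aligned with each other.
keep <- colSums(aln == "-") == 0
col_removed <- aln[, keep, drop = FALSE]

# Analogue of entire.col=FALSE: strip gaps within each sequence, then
# pad the shorter sequences with end gaps so they fit one matrix.
strip_and_pad <- function(m) {
  stripped <- lapply(seq_len(nrow(m)), function(i) m[i, m[i, ] != "-"])
  width <- max(lengths(stripped))
  out <- t(vapply(stripped,
                  function(s) c(s, rep("-", width - length(s))),
                  character(width)))
  rownames(out) <- rownames(m)
  out
}
gap_removed <- strip_and_pad(aln)

print(col_removed)
print(gap_removed)
```

In the second result the sites of seq1 have shifted left, so columns no longer correspond to the same alignment positions across sequences.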
an object of class Dna.
signature(x = "Dna")
Caner Aktas, [email protected]
data("dna.obj")

## original data
x<-dna.obj
range(x)
x@seqlengths

## Only gaps '-' are removed from sequences.
x<-remove.gaps(dna.obj, entire.col=FALSE)
range(x)
x@seqlengths

## entire columns with gaps are removed.
x<-remove.gaps(dna.obj, entire.col=TRUE)
range(x)
x@seqlengths
Function to get or set row names of a sequence matrix in a Dna object or distance matrix (or matrices) in a Parsimnet object.
## S4 method for signature 'Dna'
rownames(x)
## S4 method for signature 'Parsimnet'
rownames(x)
## S4 replacement method for signature 'Dna'
rownames(x)<-value
## S4 replacement method for signature 'Parsimnet'
rownames(x)<-value
x | an object of class Dna or Parsimnet. |
value | a character vector of the same length as the number of sequences, or a list of the same length as the number of networks containing the vertex names for each network. See ‘Examples’. |
signature(x = "Dna")
signature(x = "Parsimnet")
data("dna.obj")
x<-dna.obj

### Method for signature 'Dna'.
## Getting sequence names.
rownames(x)

## Setting sequence names.
rownames(x)<-c(1:nrow(x))
rownames(x)

### Method for signature 'Parsimnet'.
x<-dna.obj

##single network
p<-parsimnet(x)

##Getting vertex names
rownames(p)

## Setting vertex names.
rownames(p)<-list(c(1:nrow(p@d[[1]])))
rownames(p)
plot(p)

## Multiple networks with 99% connection limit.
p<-parsimnet(x,prob=.99)

## Getting vertex names
rownames(p)

## Setting vertex names.
rownames(p)<-list(1:9, 10, 11,12:13,14,15:16)
rownames(p)
show
in the package haplotypes
Show objects of classes Dna, Haplotype, and Parsimnet.
signature(object = "Dna")
displays the Dna object briefly: the total number of DNA sequences, the names of the first six sequences (if nrow(x)>=6), the lengths of the shortest and longest sequences, and the names of the slots.
signature(object = "Haplotype")
displays the Haplotype object briefly: the list of individuals that share the same haplotypes, the total number of haplotypes, and the names of the slots.
signature(object = "Parsimnet")
displays the Parsimnet object briefly: the total number of networks, the maximum connection steps at the chosen probability, the total number of haplotypes in each network, the total number of intermediates in each network, the total network length (score) of each network, and the names of the slots.
This function displays all base substitutions. If fifth==TRUE, each gap is treated as a fifth state character.
## S4 method for signature 'Dna'
subs(x,fifth=FALSE)
x | an object of class Dna. |
fifth | boolean; should gaps be treated as a fifth state character? |
a list with three components:
subsmat: a sequence matrix showing substitutions.
subs: a list of matrices of the substitutions.
subsmnum: total number of substitutions.
signature(x = "Dna")
Caner Aktas, [email protected].
data("dna.obj")
x<-dna.obj

## Base substitutions.
subs(x)

## Gaps are treated as a fifth state character.
subs(x,fifth=TRUE)
Convert sequence characters in a Dna object from upper to lower case or vice versa.
## S4 method for signature 'Dna'
tolower(x)
## S4 method for signature 'Dna'
toupper(x)
x | an object of class Dna. |
an object of class Dna.
signature(x = "Dna")
## Coercing a list to a 'Dna' object.
seq1<-c("?","A","C","g","t","-","0","1")
seq2<-c("?","A","C","g","t","-","0","1","2")
seq3<-c("?","A","C","g","t","-","0","1","2","3")
x<-list(seq1=seq1,seq2=seq2,seq3=seq3)
dna.obj<-as.dna(x)

#characters in Dna object
table(as.matrix(dna.obj))

##all lower case
lowc<-tolower(dna.obj)
#characters
table(as.matrix(lowc))

##all upper case
upc<-toupper(dna.obj)
#characters
table(as.matrix(upc))
unique returns a list with duplicate sequences removed.
## S4 method for signature 'Dna'
unique(x,gaps=FALSE)
x | an object of class Dna. |
gaps | boolean; gaps are removed if this is FALSE. |
This function behaves somewhat similarly to haplotype; however, indels and missing characters are not taken into account.
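The effect of the gaps argument can be sketched in base R by comparing sequences as plain strings. This is an illustration of the described behavior with hypothetical sequences, not the package's implementation; collapse_seq is a hypothetical helper.

```r
# Hypothetical sequences; s1 and s2 differ only by a gap.
seqs <- list(
  s1 = c("A", "C", "-", "G"),
  s2 = c("A", "C", "G"),
  s3 = c("A", "-", "C", "T")
)

# Hypothetical helper: optionally drop gaps, then join into one string.
collapse_seq <- function(s, gaps = FALSE) {
  if (!gaps) s <- s[s != "-"]
  paste(s, collapse = "")
}

# Analogue of gaps=FALSE: gaps are removed before comparing,
# so s1 and s2 become identical.
keys_nogap <- vapply(seqs, collapse_seq, character(1))
uniq_nogap <- unique(keys_nogap)

# Analogue of gaps=TRUE: gaps are kept, so all three stay distinct.
keys_gap <- vapply(seqs, collapse_seq, character(1), gaps = TRUE)
uniq_gap <- unique(keys_gap)

print(uniq_nogap)
print(uniq_gap)
```

Because this comparison is purely character-based, a "?" is treated like any other character, which is one way such a result can differ from haplotype inference.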
signature(x = "Dna")
data("dna.obj")
x<-dna.obj[1:6,,as.matrix=FALSE]

##gaps removed.
unique(x)

##gaps not removed.
unique(x,gaps=TRUE)

##unique vs. haplotype
#unique returns 5 unique sequences.
unique(x)
length(unique(x))

#haplotype returns 4 unique haplotypes with simple indel coding.
h<-haplotype(x)
as.list(as.dna(h))
length(h)

#haplotype returns 3 unique haplotypes with gaps as missing.
h<-haplotype(x, indels="m")
as.list(as.dna(h))
length(h)