Title: | Linkage Map Construction using the MSTmap Algorithm |
---|---|
Description: | Functions for Accurate and Speedy linkage map construction, manipulation and diagnosis of Doubled Haploid, Backcross and Recombinant Inbred 'R/qtl' objects. This includes extremely fast linkage map clustering and optimal marker ordering using 'MSTmap' (see Wu et al., 2008). |
Authors: | Julian Taylor [aut, cre], David Butler [aut] |
Maintainer: | Julian Taylor <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-8 |
Built: | 2025-03-05 03:04:12 UTC |
Source: | https://github.com/drj001/asmap |
Additional functions for linkage map construction and manipulation of R/qtl objects. This includes extremely fast linkage map clustering and marker ordering using MSTmap (see Wu et al., 2008).
Package: | ASMap |
Type: | Package |
Version: | 1.0-4 |
Date: | 2018-10-24 |
License: | GPL 2 |
Welcome to the ASMap package!
One of the fundamental reasons this package exists is to utilize and implement the source code for the Minimum Spanning Tree algorithm derived in Wu et al. (2008) (reference below) for linkage map construction. The algorithm is lightning quick at linkage group clustering and optimal marker ordering and can handle large numbers of markers.
The package contains two very efficient functions, mstmap.data.frame and mstmap.cross, that provide users with a highly flexible set of linkage map construction methods using the MSTmap algorithm. mstmap.data.frame constructs a linkage map from a data frame of genetic marker data and will use the entire contents of the object to form linkage groups and optimally order markers within each linkage group. mstmap.cross is a linkage map construction function for qtl package objects and can be used to construct linkage maps in a flexible number of ways. See ?mstmap.cross for complete details.
To complement the computationally efficient linkage map construction functions, the package also contains functions pullCross and pushCross that allow the pulling/pushing of markers of different types to and from the linkage map. This system gives users the ability to initially pull markers aside that are not needed for immediate construction and push them back later if required. There are also functions for fast numerical and graphical diagnosis of unconstructed and constructed linkage maps. Specifically, there is an improved heatMap that graphically displays pairwise recombination fractions and LOD scores with separate legends for each. profileGen can be used to simultaneously profile multiple statistics such as recombination counts and double recombination counts for individual lines across the constructed linkage map. profileMark allows simultaneous graphical visualization of marker or interval statistics profiles across the genome or subsetted for a predefined set of linkage groups. Graphical identification and orientation of linkage groups using reference linkage maps can be conducted using alignCross. All of these graphical functions utilize the power of the advanced graphics package lattice to provide seamless multiple displays.
Other miscellaneous utilities for qtl objects include:
mergeCross: Merging of linkage groups
breakCross: Breaking of linkage groups
combineMap: Combining linkage maps
quickEst: Very quick estimation of genetic map distances
genClones: Reporting genotype clones
fixClones: Consensus genotypes for clonal groups
A comprehensive vignette showcasing the package is now available! It contains detailed explanations of the functions in the package and how they can be used to perform efficient map construction. There is a fully worked example that involves pre-construction diagnostics, linkage map construction and post-construction diagnostics. This example also shows how functions of the package can be used for post linkage map construction techniques such as fine mapping and combining linkage maps. The vignette has been succinctly summarised in the Journal of Statistical Software publication Taylor and Butler (2017) referenced below.
Julian Taylor, Dave Butler, Timothy Close, Yonghui Wu, Stefano Lonardi
Maintainer: Julian Taylor <[email protected]>
Wu, Y., Bhat, P., Close, T.J., Lonardi, S. (2008) Efficient and Accurate Construction of Genetic Linkage Maps from Minimum Spanning Tree of a Graph. PLoS Genetics, 4(10).
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
A graphical tool for identity and alignment of linkage groups in qtl cross objects using reference maps.
alignCross(object, chr, maps, ...)
object | A qtl cross |
chr | A character string of linkage group names or a logical vector equal to the length of the number of linkage groups (see |
maps | A named list of qtl cross objects or data.frame objects (see Details). |
... | Other arguments to be passed to the high level lattice plot. |
If any list elements of maps are qtl "cross" objects then marker names, linkage group identity and genetic distance information are extracted. List elements of maps that are data.frame objects must explicitly contain named columns "marker", "ref.chr" and "ref.dist", otherwise an error will be produced.
For each linkage group determined by chr, the contents of the listed maps are checked for matching markers in object. For each chr and reference map combination, a scatter plot of the object genetic distances against the reference distances is displayed with reference linkage group names as the plotting character. If a linkage group is in correct orientation the overall slope of the scatter plot should be positive. If a linkage group requires inverting then the overall slope should be negative.
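The slope rule above can be checked numerically as well as visually. The sketch below is not part of ASMap: `orientation()` is a hypothetical helper and the distances are made up. It fits a least-squares line of map distance on reference distance and reads off the sign of the slope.

```r
# Hypothetical helper (not an ASMap function): classify orientation from the
# sign of the least-squares slope of map distance on reference distance.
orientation <- function(dist, ref.dist) {
  slope <- unname(coef(lm(dist ~ ref.dist))[2])
  if (slope >= 0) "correct" else "inverted"
}

ref     <- c(0, 5.2, 11.8, 20.1, 33.4)  # made-up reference positions (cM)
forward <- c(0, 4.9, 12.5, 19.7, 31.0)  # same marker order as the reference
flipped <- max(forward) - forward       # same group, flipped end to end

orientation(forward, ref)
orientation(flipped, ref)
```

A group flagged as inverted here corresponds to a panel with a downward trend in the alignCross display.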
A lattice panel plot is displayed with panels labelled by a combination of chr and the maps used as a reference. A data frame of these results is also invisibly returned.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
chrl <- sample(c(TRUE,FALSE), 23, replace = TRUE)
mapDH1 <- subset(mapDH, chr = chrl)
alignCross(mapDH, maps = list(DH = mapDH1), layout = c(3,5), col = 1:7)
Breaks linkage groups of a qtl cross object from a user specified list.
breakCross(cross, split = NULL, suffix = "numeric", sep = ".")
cross | An qtl |
split | A list named by the linkage groups required for splitting and containing marker names immediately preceding where the splits are to be made (see Details). |
suffix | This can be a vector of character strings containing |
sep | The character separator to be used to separate the linkage group name and the suffix. |
The splitting of any linkage group only needs to be defined by the markers immediately preceding where the splits are to be made. Multiple splits in the one linkage group are possible as well as splitting across multiple linkage groups with one call.
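To make the "marker immediately preceding the split" convention concrete, here is a minimal base R sketch on a named vector of map positions. `split_after()` is a hypothetical helper and the marker names and positions are made up; it only mimics the naming behaviour of suffix = "numeric" and sep = "." and says nothing about how breakCross treats genetic distances internally.

```r
# Hypothetical helper (not an ASMap function): split a named vector of map
# positions after each marker listed in `markers`.
split_after <- function(map, markers, name = "4A", sep = ".") {
  cut <- match(markers, names(map))
  stopifnot(!anyNA(cut))                       # every marker must exist
  # a new piece starts at the position following each cut marker
  grp <- findInterval(seq_along(map), sort(cut) + 1) + 1
  pieces <- split(map, grp)
  names(pieces) <- paste(name, seq_along(pieces), sep = sep)
  pieces
}

map <- c(m1 = 0, m2 = 3.1, m3 = 7.4, m4 = 30.2, m5 = 34.0)  # made-up map
pieces <- split_after(map, "m3")  # m3 is the last marker of the first piece
names(pieces)
```

Passing several marker names produces several splits in one call, in the same spirit as multiple entries in the split argument.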
The cross object is returned with identical class structure as the inputted cross object. The "geno" element will contain separate linkage groups for the user defined splits.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
mapDH1 <- breakCross(mapDH, split = list("4A" = "4A.m.8"))
pull.map(mapDH1)[["4A.1"]]
pull.map(mapDH1)[["4A.2"]]

## manually choose suffix
mapDH1 <- breakCross(mapDH, split = list("4A" = "4A.m.8"),
                     suffix = list("4A" = c("4AA","4AB")))
Combine map information, marker data and phenotype data from multiple qtl cross objects
combineMap(..., id = "Genotype", keep.all = TRUE, merge.by = "genotype")
... | An unlimited set of arguments with each argument defining a qtl cross object. All qtl objects can have any class structure but it must be identical across objects (see Details for more information). |
id | The name of the common column in the |
keep.all | A logical value determining whether all genotypes should be kept in the final linkage map regardless of their absence in some linkage maps (see Details). Default is TRUE. |
merge.by | A character string. If "genotype" then combining of maps occurs by common genotypes and if "marker" combining of maps occurs by common markers. Default is "genotype" (see Details for more information). |
This function combines linkage maps from multiple qtl cross objects by merging marker data and map information as well as phenotypic data if present. The function performs some initial checks before proceeding with the combining. Firstly, all qtl cross objects must have the same class structure and have a column in the pheno element of the object named by the argument id. The symbol ";" should be avoided in marker names as this is reserved for string manipulation within the function.
If merge.by = "genotype" then the combining occurs sequentially across linkage maps based on common genotype names. If keep.all = TRUE then the marker set and phenotypic data are "padded out" when genotype names are not shared between maps. If keep.all = FALSE then the marker set and phenotype data are shrunk to only include genotypes that are shared among all linkage maps. Marker names must be unique across the set of linkage maps. Non-matching genotype names between linkage maps will expand the final marker data and phenotypic data so it is prudent to check genotype names are correct in each of the linkage maps before combining.
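The padding and shrinking behaviour is analogous to an outer versus inner join on the genotype column. The following sketch uses base R `merge()` on two made-up data frames purely as an analogy; it is not how combineMap is implemented, and the marker names `mA` and `mB` are hypothetical.

```r
# Made-up marker scores for two maps, one row per genotype
map1 <- data.frame(Genotype = c("G1", "G2", "G3"), mA = c("AA", "BB", "AA"))
map2 <- data.frame(Genotype = c("G2", "G3", "G4"), mB = c("BB", "AA", "BB"))

# Analogue of keep.all = TRUE: every genotype kept, gaps padded with NA
padded <- merge(map1, map2, by = "Genotype", all = TRUE)

# Analogue of keep.all = FALSE: only genotypes present in both maps
shrunk <- merge(map1, map2, by = "Genotype", all = FALSE)

padded
shrunk
```

A misspelt genotype name in one map would show up in the padded result as an extra row that is mostly NA, which is why checking names before combining is worthwhile.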
If merge.by = "marker" then the combining occurs sequentially across linkage maps based on common markers. If keep.all = TRUE then the marker set is "padded out" when marker names are not shared between maps. If keep.all = FALSE then the marker set is shrunk to only include markers that are shared among all linkage maps. Genotypes must be unique across the set of linkage maps. It should be noted that this function does not use a consensus map algorithm to determine chromosome identification and genetic distances of common markers. These are both calculated using the first instance of the marker's appearance across the sequential maps. This makes it ideal for potentially pushing additional genotypes into an established map.
For both merge.by types, if a linkage group name is shared across linkage maps then the marker data from the shared linkage group in each of the maps will be merged. If maps share the same linkage group names and do not require merging, the duplicate linkage group names in one of the linkage maps will need to be altered before combining. As a final process, markers are ordered within linkage groups according to distances supplied in each of the linkage maps.
It should also be noted that this function does not re-construct the final linkage map after combining the set of linkage maps. For efficient linkage map reconstruction of a combined qtl object see mstmap.cross().
A single R/qtl cross object is returned with identical class structure as the inputted cross objects.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
breakCross and mergeCross
data(mapDH, package = "ASMap")

## create copy of mapDH with some different linkage groups
## and change marker names so they are unique
mapDH1 <- mapDH
names(mapDH1$geno)[5:14] <- paste("L",1:10, sep = "")
mapDH1$geno <- lapply(mapDH1$geno, function(el){
    nam <- paste(names(el$map), "A", sep = "")
    names(el$map) <- dimnames(el$data)[[2]] <- nam
    el})
mapDHc <- combineMap(mapDH, mapDH1)
nmar(mapDHc)
Consensus genotypes for clonal genotype groups of an R/qtl object.
fixClones(object, gc, id = "Genotype", consensus = TRUE)
object | An qtl |
gc | A data frame of genotype clone information usually from a call to |
id | Character string defining the column of |
consensus | A logical value. If |
This function provides a very efficient way of dealing with genotype clones in a genetic marker set. This function can be used at any stage of the map construction process as it retains linkage group and marker position information.
The gc argument needs to be a data frame of clone information and is easily obtained from a call to genClones. If this function is not used then the data frame must contain at least three columns with the first two columns named "G1" and "G2" containing the pairs of genotypes that are clones and a "group" column that indicates the clonal group the pairs of genotypes belong to.
If consensus = TRUE then the function will intelligently collapse the alleles for each marker to form a consensus genotype. Specifically, the allele value will remain unchanged when there are observed allele values across all genotypes in the clone group. For cases where there are missing alleles for some but not all of the genotypes, the consensus genotype will be given the common allele value from the genotypes that contained observed allele values. If there is more than one unique allele value across the genotypes for any marker then it is set to missing.
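The collapsing rule can be written down in a few lines of base R. This is a sketch of the rule as described above, not the fixClones source: `consensus_allele()` is a hypothetical helper, the clone group is made up, and treating an all-missing marker as missing is an assumption.

```r
# Hypothetical helper: consensus call for one marker across a clone group
consensus_allele <- function(x) {
  obs <- unique(x[!is.na(x)])
  # a single observed allele value is kept; conflicts become missing;
  # all-missing markers stay missing (assumption, not stated in the text)
  if (length(obs) == 1) obs else NA_character_
}

# Made-up clone group: two genotypes scored at four markers
clones <- rbind(G1 = c("AA", NA,   "AA", NA),
                G2 = c("AA", "BB", "BB", NA))
colnames(clones) <- paste0("m", 1:4)

consensus <- apply(clones, 2, consensus_allele)
consensus
```

Here m1 agrees across the group, m2 is filled from the one genotype that was scored, and m3 is set to missing because the two clones disagree.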
The cross object is returned with identical class structure as the inputted cross object.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
comparegeno and genClones
data(mapDH, package = "ASMap")
gc <- genClones(mapDH)
mapDHf <- fixClones(mapDH, gc$cgd, consensus = TRUE)
Find and report genotype clones for qtl objects.
genClones(object, chr, tol = 0.9, id = "Genotype")
object | An qtl |
chr | A character string of linkage group names. |
tol | Pairs of genotypes with a proportion of matching alleles above this tolerance will be returned. |
id | Character string defining the column of |
This function extends the functionality of comparegeno in the qtl package by providing breakdown statistics for the pairs of genotypes that have a proportion of matching alleles above tol.
A list is returned with the matrix from comparegeno as an element cgm and the breakdown statistics for returned genotype pairs in cgd. Specifically, the statistics contain a "group" column which determines the clonal group the pair of genotypes belongs to.
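For intuition about the tol threshold, the sketch below computes a proportion of matching alleles for every pair of genotypes in a small made-up matrix. `prop_match()` is a hypothetical helper, and computing the proportion only over markers scored in both genotypes is an assumption made here; consult the comparegeno help for the exact definition used by qtl.

```r
# Hypothetical helper: proportion of matching alleles between two genotypes,
# using only markers scored in both (assumption)
prop_match <- function(g1, g2) {
  typed <- !is.na(g1) & !is.na(g2)
  mean(g1[typed] == g2[typed])
}

# Made-up scores: three lines by ten markers
geno <- rbind(
  L1 = c("AA", "BB", "AA", "BB", "AA", "BB", "AA", "BB", "AA", "BB"),
  L2 = c("AA", "BB", "AA", "BB", "AA", "BB", "AA", "BB", "AA", NA),
  L3 = c("BB", "BB", "AA", "AA", "AA", "BB", "BB", "BB", "AA", "BB"))

tol <- 0.9                               # default in the usage above
pairs <- t(combn(rownames(geno), 2))     # all unordered pairs of lines
prop <- apply(pairs, 1, function(p) prop_match(geno[p[1], ], geno[p[2], ]))

# pairs whose proportion of matching alleles exceeds tol
data.frame(G1 = pairs[, 1], G2 = pairs[, 2], prop = prop)[prop > tol, ]
```

Only the L1 and L2 pair is reported, since those two lines agree at every marker scored in both.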
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
comparegeno and fixClones
data(mapDH, package = "ASMap")
gc <- genClones(mapDH)
Heat map of the estimated pairwise recombination fractions and LOD linkage between markers that provides extended functionality of Broman's qtl package plotRF function.
heatMap(x, chr, mark, what = c("both", "lod", "rf"), lmax = 12, rmin = 0,
        markDiagonal = FALSE,
        color = rev(colorRampPalette(brewer.pal(11,"Spectral"))(256)), ...)
x | A |
chr | A character string of linkage group names to subset the cross object. |
mark | An argument to subset linkage groups further into marker subsets. This can be a single numerical vector of marker positions which will subset all linkage groups in the same manner. Or it may be a list of numerical vectors named by the linkage group names with which to subset the linkage groups separately. |
what | A character string of either |
lmax | The threshold LOD score to be implemented. Scores above this threshold will be plotted at the same colour. |
rmin | The threshold recombination fraction to be implemented. Recombination fractions below this threshold will be plotted at the same colour. |
markDiagonal | Logical value. If |
color | The colour spectrum used to display the heat map. The default is a |
... | There are additional features available through this argument that can be used to customize the heatmap (see Details). |
This function is a rewrite of Broman's qtl package function plotRF that provides extended functionality. When what = "lod" is chosen the pairwise LOD linkage between markers is displayed on the heat map with a legend on the right hand side spanning zero to lmax across the color spectrum. If what = "rf" the pairwise estimated recombination fractions are displayed on the heat map with a legend on the right hand side spanning rmin to 0.5 across the color spectrum. The legend also extends past 0.5 to display estimated recombination fractions between 0.5 and one through a colour spectrum of the maximum color value to white. This functionality gives users the ability to detect markers that may be problematic or possibly out of phase. For what = "both" the pairwise LOD linkage is displayed on the lower triangle of the heat map and the pairwise estimated recombination fractions are displayed on the upper triangle. If this option is chosen, legends are displayed for both components of the heat map.
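The effect of the lmax and rmin thresholds on what gets coloured can be pictured as a simple clamp. The sketch below uses made-up values and base R only; it illustrates the stated behaviour (values beyond a threshold share one colour) and is an assumption about the effect, not a copy of the heatMap internals.

```r
# Made-up pairwise values for illustration
lod <- c(0.4, 8.3, 12, 41.7)
rf  <- c(0.01, 0.12, 0.48, 0.63)

lmax <- 12    # default in the usage above
rmin <- 0.1   # non-default, chosen so the clamp is visible

lod_plot <- pmin(lod, lmax)  # scores above lmax map to the same colour
rf_plot  <- pmax(rf, rmin)   # fractions below rmin map to the same colour

lod_plot
rf_plot
```

Raising lmax, as in heatMap(test1, lmax = 30), therefore stretches the colour scale so that very strong linkage is distinguished from moderately strong linkage.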
The default colour spectrum is the diverging palette "Spectral" from the RColorBrewer package. This diverging palette provides an aesthetically pleasing colour spectrum for the diagnosis of pairwise linkage between markers. Specifically, the palette displays weak linkage and/or low recombination between markers as blue or "cool" areas and strong linkage and/or recombination between markers as red or "hot" areas.
Much of the extra functionality of this function comes from the use of image.plot in the fields package. This function allows the partitioning of the plotting region into a bigplot region for the heat map and a smallplot region for the legend. This is called twice when what = "both". The size of the regions can be manipulated by passing the bigplot or smallplot arguments to the function but it is advised to use the defaults. Further manipulation of the heat map can be achieved by passing other arguments of the function image.plot. Users should consult the help file for image.plot for more details. It should be noted that the argument legend.args needs to be avoided as it is used in this function.
A heat map is displayed on the current plotting device.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## bulking linkage groups and reconstructing entire linkage map
test1 <- mstmap(mapDH, bychr = FALSE, dist.fun = "kosambi", trace = FALSE)

## plot heat map of result
heatMap(test1, lmax = 30)
A constructed linkage map for a backcross barley population in the form of a constructed qtl object.
data(mapBC)
This data relates to a fully constructed linkage map of 3019 markers genotyped on 300 individuals spanning the 7 linkage groups of the barley genome. The map was constructed using the MSTmap algorithm integrated in mstmap.cross with genetic distances estimated using the "kosambi" mapping function. The data is in qtl format with a class structure c("bc","cross"). See read.cross() documentation for more details on the format of this object.
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapBC, package = "ASMap")
An unconstructed marker set for a backcross barley population in the form of a qtl object.
data(mapBCu)
This data relates to an unconstructed version of mapBC and consists of 3023 markers genotyped on 326 individuals with markers randomly assorted on one large linkage group. The data is in qtl format with a class structure c("bc","cross"). See read.cross() documentation for more details on the format of this object. This data set forms the basis of the worked example in Chapter 3 of the vignette (see vignette("ASMap") for complete details).
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapBCu, package = "ASMap")
A constructed linkage map for a doubled haploid wheat population in the form of a constructed qtl object.
data(mapDH)
This data relates to a fully constructed linkage map of 599 markers genotyped on 218 individuals. The linkage map consists of 23 linkage groups spanning the whole genome. 584 markers are from the original map with an additional 12 co-located markers and 3 slightly distorted markers. The map was constructed using the MSTmap algorithm integrated in mstmap.cross with genetic distances estimated using the "kosambi" mapping function. The data is in qtl format with a class structure c("bc","cross"). See read.cross documentation for more details on the format of this object.
data(mapDH, package = "ASMap")
An unconstructed marker set for a doubled haploid wheat population in the form of a data frame.
data(mapDHf)
This data is the unconstructed version of mapDH and consists of 599 markers genotyped on 218 individuals. 584 markers are from the original map with an additional 12 co-located markers and 3 slightly distorted markers. The data is in a data.frame format with genotypes in columns and randomly assorted markers in rows. See mstmap.data.frame documentation for more details on the format of this object.
data(mapDHf, package = "ASMap")
Simulated constructed linkage map for a self pollinated F2 barley population in the form of a qtl object.
data(mapF2)
This data relates to a fully constructed linkage map of 700 simulated markers genotyped on 250 individuals. The map consists of 7 linkage groups, each containing 100 markers spanning an approximate linkage group length of 200 cM. The map was constructed using mstmap.cross from the ASMap package and map distances were estimated using the "kosambi" mapping function. The data is in R/qtl format with a class structure c("bcsft","cross").
data(mapF2, package = "ASMap")
Merges linkage groups of a qtl cross object from a user specified list.
mergeCross(cross, merge = NULL, gap = 5)
cross | An qtl |
merge | A list with elements containing the linkage groups to be merged with each element named by the proposed linkage group name (see Examples). |
gap | The cM gap to put between the merged map elements in the complete linkage group. |
This merging function allows you to perform multiple merges of two or more linkage groups in one call. Users should ensure linkage group names are correct and that proposed linkage group names do not already exist.
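As a rough picture of what the gap argument does to map positions, the sketch below joins two made-up linkage groups end to end with a 5 cM gap. `merge_groups()` is a hypothetical helper written for this illustration; the exact offsetting used inside mergeCross is not shown in this documentation, so treat the arithmetic as an assumption.

```r
# Hypothetical helper: append each group after the previous one, leaving
# `gap` cM between the last marker of one and the first of the next
merge_groups <- function(maps, gap = 5) {
  out <- maps[[1]]
  for (m in maps[-1]) {
    out <- c(out, m - min(m) + max(out) + gap)
  }
  out
}

a <- c(m1 = 0, m2 = 4, m3 = 10)  # made-up positions (cM)
b <- c(m4 = 0, m5 = 6)

merged <- merge_groups(list(a, b), gap = 5)
merged
```

In this toy case the first marker of the second group lands 5 cM past the end of the first group and the spacing within each group is unchanged.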
The cross object is returned with identical class structure as the inputted cross object. The "geno" element should now contain merged linkage groups for the user defined merges.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
mapDH1 <- breakCross(mapDH, split = list("4A" = "4A.m.8"))
pull.map(mapDH1)[["4A.1"]]
pull.map(mapDH1)[["4A.2"]]
mapDH2 <- mergeCross(mapDH1, merge = list("4A" = c("4A.1","4A.2")))
pull.map(mapDH2)[["4A"]]
Extremely fast linkage map construction for qtl objects using the source code for MSTmap (see Wu et al., 2008). The construction includes linkage group clustering, marker ordering and genetic distance calculations.
## S3 method for class 'cross'
mstmap(object, chr, id = "Genotype", bychr = TRUE, suffix = "numeric",
       anchor = FALSE, dist.fun = "kosambi", objective.fun = "COUNT",
       p.value = 1e-06, noMap.dist = 15, noMap.size = 0, miss.thresh = 1,
       mvest.bc = FALSE, detectBadData = FALSE, return.imputed = FALSE,
       trace = FALSE, ...)
object | A |
chr | A character string of linkage group names that require re-construction and/or optimal ordering of the markers they contain (see Details). |
id | The name of the column in |
bychr | Logical value. For a given set of linkage groups defined by |
suffix | Character string either |
anchor | Logical value. The MSTmap algorithm does not respect the inputted marker order of the linkage map required for construction. For a given set of linkage groups defined by |
dist.fun | Character string defining the distance function used for calculation of genetic distances. Options are "kosambi" and "haldane". Default is "kosambi". |
objective.fun | Character string defining the objective function to be used when constructing the map. Options are |
p.value | Numerical value to specify the threshold to use when clustering markers. Defaults to 1e-06. |
noMap.dist | Numerical value to specify the smallest genetic distance a set of isolated markers can appear distinct from other linked markers. Isolated markers will appear in their own linkage groups and will be of size specified by |
noMap.size | Numerical value to specify the maximum size of isolated marker linkage groups that have been identified using |
miss.thresh | Numerical value to specify the threshold proportion of missing marker scores allowable in each of the markers. Markers above this threshold will not be included in the linkage map. Default is 1. |
mvest.bc | Logical value. If |
detectBadData | Logical value. If |
return.imputed | Logical value. If |
trace | An automatic tracing facility. If |
... | Currently ignored. |
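The two options for dist.fun refer to the standard Kosambi and Haldane mapping functions, which convert a recombination fraction r into a genetic distance. The sketch below writes out the textbook formulas in centiMorgans; the function names are chosen for this illustration and the code is not taken from the package source.

```r
# Textbook mapping functions: recombination fraction r (0 <= r < 0.5) to cM
kosambi <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))
haldane <- function(r) -50 * log(1 - 2 * r)

r <- c(0.01, 0.1, 0.3)
round(cbind(r, kosambi = kosambi(r), haldane = haldane(r)), 2)
```

For small r the two functions nearly agree; as r grows, the Haldane distance becomes noticeably larger than the Kosambi distance, so the choice mainly affects sparsely covered regions.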
The qtl cross object needs to inherit one of the allowable classes "bc", "dh", "riself" or "bcsft". This provides a safeguard against attempts to construct a map for more complex populations that can exist in qtl. Users should be aware that when doubled haploid populations are read in using read.cross() from the qtl package they inherit the class "bc". Users can apply the class "dh" by simply changing the class of the object. For the purpose of linkage map construction the classes "bc" and "dh" will provide equivalent results.
MSTmap supports "RILn" populations, where n is the number of generations of selfing. Markers in these populations are required to be fully informative, i.e. contain 3 distinct allele types such as AA, BB for parental homozygotes and AB for phase unknown heterozygotes. If read.cross is used to import the "RILn" population the resultant object will initially be given a class "f2". The level of selfing would then have to be encoded into the object by applying one of the two conversion functions available in the qtl package. For a population that has been generated by selfing n times the conversion function convert2bcsft can be used by setting the arguments F.gen = n and BC.gen = 0. Populations that are genuine advanced RILs can be converted using the convert2riself function.
This method function is designed to be an "all-in-one" function that will allow you to construct linkage maps extremely fast in multiple different ways from the supplied cross object. Initially, the map can be kept complete or a subset of selected linkage groups can be chosen using the chr argument. Setting bychr = FALSE will bulk the marker information for the selected linkage groups and, if necessary, form new linkage groups and optimise the marker order within each. Setting bychr = TRUE will ensure that markers are optimally ordered within each linkage group. This will also break linkage groups depending on the p-value given in the call (see below for details of the use of p.value). If the linkage map was initially subsetted, the linkage groups not involved in the subset are returned to ensure the map is complete.
The algorithm allows an adjustment of the p.value threshold for clustering of markers to distinct linkage groups (see Wu et al., 2008). The appropriate threshold is highly dependent on the number of individuals in the population. As the number of individuals increases the p.value threshold should be decreased accordingly. This may require some trial and error to achieve desired results.
When bychr = TRUE, established linkage groups may also split depending on the p.value given. To prevent this the p.value threshold may be increased to a desired value or the splitting may be prevented altogether by supplying a value greater than one to this argument.
If mvest.bc = TRUE and the population type is "bc","dh","riself" then missing values are imputed before markers are clustered into linkage groups. This is only a simple imputation that places a 0.5 probability of the missing observation being one allele or the other and is used to assist the clustering algorithm when there are known to be high numbers of missing observations between pairs of markers.
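The simple pre-clustering imputation can be pictured as replacing each missing score with a probability of 0.5. The sketch below uses a made-up vector of scores for one marker and encodes each score as the probability of the A allele; it illustrates the idea only and is not the code MSTmap runs.

```r
# Made-up scores for one marker across six individuals
scores <- c("A", "B", NA, "A", NA, "B")

# Probability of the A allele: 1 for A, 0 for B, 0.5 where the score is missing
pA <- ifelse(is.na(scores), 0.5, as.numeric(scores == "A"))
pA
```

Because every missing score receives the same value, this step carries no information about neighbouring markers, which is why it is described as a simple aid to clustering rather than a final imputation.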
It should be highlighted that for population types "bc","dh","riself", imputation of missing values occurs regardless of the value of mvest.bc. This is achieved using an EM algorithm that is tightly coupled with marker ordering (see Wu et al., 2008). Initially a marker order is obtained omitting missing marker scores and then imputation is performed based on the underlying recombinant probabilities of the flanking markers with the markers containing the missing value. The recombinant probabilities are then recomputed and an update of the pairwise distances is calculated. The ordering algorithm is then run again and the complete process is repeated until convergence. Note, the imputed probability matrix for the linkage map being constructed is returned if return.imputed = TRUE.
For populations "bc","dh","riself", if detectBadData = TRUE the marker ordering algorithm also includes the detection of genotyping errors. For any individual genotype, the detection method is based on a weighted Euclidean metric (see Wu et al., 2008) that is a function of the recombination probabilities of all the markers with the marker containing the suspicious observation. Any genotyping errors detected are set to missing and the missing values are then imputed as part of the marker ordering algorithm. Note, the detection of these errors and their amendment can be returned in the imputed probability matrix if return.imputed = TRUE.
If return.imputed = TRUE
and the object has class
"bc","dh","riself"
then the marker probability matrix is
returned for the linkage groups that have been constructed using the
algorithm. Each linkage group is named identically to the linkage groups
of the map and contains an ordered "map"
element and a "data"
element consisting of marker probabilities of the A allele being
present (i.e. P(A) = 1, P(B) = 0). Both elements contain a
possibly reduced version of the marker set that includes all
non-colocating markers as well as the first marker of any set of
co-locating markers.
The function returns a cross object with the same class structure
as the input cross object. The object is a list
with usual components "pheno"
and "geno"
. If markers were
omitted for any reason during the construction, the object will have an
"omit"
component with all omitted markers in a collated
matrix. If return.imputed = TRUE
then the object will also
contain an "imputed.geno"
element.
Julian Taylor, Dave Butler, Timothy Close, Yonghui Wu, Stefano Lonardi
Wu, Y., Bhat, P., Close, T.J., Lonardi, S. (2008) Efficient and Accurate Construction of Genetic Linkage Maps from Minimum Spanning Tree of a Graph. PLoS Genetics, 4(10).
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
mstmap.data.frame
and breakCross
data(mapDH, package = "ASMap")

## bulking linkage groups and reconstructing entire linkage map
test1 <- mstmap(mapDH, bychr = FALSE, dist.fun = "kosambi", trace = FALSE)
pull.map(test1)

## one linkage group at a time (possibly break established linkage
## groups)
test2 <- mstmap(mapDH, bychr = TRUE, dist.fun = "kosambi", trace = FALSE)
pull.map(test2)

## one linkage group at a time (do not break established linkage groups)
test3 <- mstmap(mapDH, bychr = TRUE, dist.fun = "kosambi", p.value = 2,
                trace = FALSE)
pull.map(test3)

## impute before clustering and detect genotyping errors, pipe output to
## file
test4 <- mstmap(mapDH, bychr = FALSE, dist.fun = "kosambi", trace = TRUE,
                mvest.bc = TRUE, detectBadData = TRUE)
pull.map(test4)
unlink("MSToutput.txt")
Extremely fast linkage map construction for data frame objects utilizing the source code for MSTmap (see Wu et al., 2008). The construction includes linkage group clustering, marker ordering and genetic distance calculations.
## S3 method for class 'data.frame'
mstmap(object, pop.type = "DH", dist.fun = "kosambi",
       objective.fun = "COUNT", p.value = 1e-06, noMap.dist = 15,
       noMap.size = 0, miss.thresh = 1, mvest.bc = FALSE,
       detectBadData = FALSE, as.cross = TRUE, return.imputed = FALSE,
       trace = FALSE, ...)
object |
A |
pop.type |
Character string specifying the population type of the data frame
|
dist.fun |
Character string defining the distance function used for calculation of
genetic distances. Options are |
objective.fun |
Character string defining the objective function to be used when
constructing the map. Options are |
p.value |
Numerical value to specify the threshold to use when clustering
markers. Defaults to |
noMap.dist |
Numerical value to specify the smallest genetic distance a set of
isolated markers can appear distinct from other linked markers. Isolated
markers will appear in their own linkage groups and will be of size
specified by |
noMap.size |
Numerical value to specify the maximum size of isolated marker linkage
groups that have been identified using |
miss.thresh |
Numerical value to specify the threshold proportion of missing marker scores allowable in each of the markers. Markers above this threshold will not be included in the linkage map. Default is 1. |
mvest.bc |
Logical value. If |
detectBadData |
Logical value. If |
as.cross |
Logical value. If |
return.imputed |
Logical value. If |
trace |
An automatic tracing facility. If |
... |
Currently ignored. |
The data frame object
must have an explicit format with markers
in rows and genotypes in columns. The marker names are required to be in
the rownames
component and the genotype names are
required to be in the names
component of the object
. In
each set of names there must be no spaces. If spaces are detected they
are exchanged for a "-". Each of the columns of the data frame must be of class
"character"
(not factors). If converting from a matrix, this can
easily be achieved by using the stringsAsFactors = FALSE
argument
for any data.frame
method.
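As an illustration of the required layout, the following base R sketch builds a small data frame with markers in rows, genotypes in columns and every column of class "character". The marker and genotype names are made up for the example.

```r
## Toy marker matrix: 3 markers (rows) by 4 genotypes (columns)
m <- matrix(c("A", "B", "A", "U",
              "B", "B", "A", "A",
              "A", "A", "B", "B"),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("mk 1", "mk 2", "mk 3"),
                            c("G1", "G2", "G3", "G4")))

## Convert to a data frame, keeping every column as character
df <- as.data.frame(m, stringsAsFactors = FALSE)

## Replace spaces in the names up front rather than relying on the
## automatic exchange described above
rownames(df) <- gsub(" ", "-", rownames(df))
names(df) <- gsub(" ", "-", names(df))

## Every column should be character, not factor
stopifnot(all(vapply(df, is.character, logical(1))))
```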
It is important to know what population type the data frame
object
is and to correctly input this into pop.type
. If
pop.type = "ARIL"
then it is assumed that the minimal number of heterozygotes have been
set to missing before proceeding. The advanced RIL population is then
treated like a backcross population for the purpose of linkage map
construction. Genetic distances are adjusted post construction.
For non-advanced RIL populations pop.type =
"RILn"
, the number of generations of selfing is limited to 20 to
ensure sensible input.
The content of the markers in object
can either be all numeric
(see below) or all character. If markers are of type character then
the following allelic content must be explicitly adhered to. For pop.type
"BC"
,
"DH"
or "ARIL"
the two allele types should
be represented as ("A"
or "a"
) and ("B"
or
"b"
). For non-advanced RIL populations (pop.type = "RILn"
)
phase unknown heterozygotes should be represented as
"X"
. For all populations, missing marker scores should be represented
as ("U"
or "-"
).
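A quick check of the coding rules above can be sketched in base R. The helper `checkAlleles` is hypothetical and not part of the package, and the pattern used to recognise "RILn" values (for example "RIL6") is an assumption about how n is supplied.

```r
## Hypothetical helper: TRUE if every score uses the coding described above
checkAlleles <- function(df, pop.type = "DH") {
  allowed <- c("A", "a", "B", "b", "U", "-")
  ## "X" (phase unknown heterozygote) is only valid for RILn populations;
  ## the "RIL<number>" form is assumed here
  if (grepl("^RIL[0-9]+$", pop.type)) allowed <- c(allowed, "X")
  all(unlist(df) %in% allowed)
}

scores <- data.frame(G1 = c("A", "b"), G2 = c("U", "X"),
                     stringsAsFactors = FALSE)
okDH  <- checkAlleles(scores, pop.type = "DH")    # "X" is not a DH code
okRIL <- checkAlleles(scores, pop.type = "RIL6")  # "X" allowed for RILn
```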
This function also extends the functionality of the MSTmap
algorithm by allowing users to input a complete numeric data frame of
marker probabilities for pop.type
"BC"
, "DH"
or
"ARIL"
. The values must lie between 0 (B) and 1 (A) inclusive and be
representative of the probability that the A allele is present. No
missing values are allowed.
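The numeric form can be sketched as follows: hard calls are mapped to the probability of the A allele and the constraints stated above are checked. The object names are illustrative only, and expressing an uncertain call as 0.5 is an assumption made for the example.

```r
## Map hard calls to P(A): A -> 1, B -> 0
calls <- data.frame(G1 = c("A", "B"), G2 = c("B", "B"), G3 = c("A", "A"),
                    row.names = c("mk1", "mk2"), stringsAsFactors = FALSE)
probs <- as.data.frame(lapply(calls, function(x) as.numeric(toupper(x) == "A")),
                       row.names = rownames(calls))

## An uncertain call expressed as an intermediate probability (assumption)
probs["mk1", "G2"] <- 0.5

## Checks implied by the text: all numeric, within [0, 1], no missing values
vals <- unlist(probs)
stopifnot(is.numeric(vals), !anyNA(vals), all(vals >= 0 & vals <= 1))
```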
The algorithm allows an adjustment of the p.value
threshold for
clustering of markers to distinct linkage groups (see Wu et al.,
2008). The appropriate threshold is highly dependent on the number of individuals in
the population. As the number of individuals increases the
p.value
threshold should be decreased accordingly. This may
require some trial and error to achieve desired results.
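To see why the threshold needs to shrink as the population grows, the sketch below evaluates a Hoeffding-type bound of the kind discussed in Wu et al. (2008). The exact form used internally is an assumption here, and `kosambiRF` and `linkP` are hypothetical helper names.

```r
## Kosambi inverse map function: distance in cM -> recombination fraction
kosambiRF <- function(d) 0.5 * tanh(2 * d / 100)

## Assumed Hoeffding-type bound on the chance that two unlinked markers
## appear closer than d cM in a population of n individuals
linkP <- function(d, n) exp(-2 * n * (0.5 - kosambiRF(d))^2)

## Threshold implied by a 30 cM cut-off for three population sizes
p <- sapply(c(100, 200, 400), function(n) linkP(30, n))
p
```

Under this assumption the implied p-value falls rapidly with population size, so a threshold that suits 100 individuals would be far too lenient for 400.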
If mvest.bc = TRUE
and the population type is "BC","DH","ARIL"
then missing values are imputed before markers are clustered into
linkage groups. This is only a simple imputation that places a 0.5
probability of the missing observation being one allele or the other and
is used to assist the clustering algorithm when there are known to be high numbers of
missing observations between pairs of markers.
It should be highlighted that for population types
"BC","DH","ARIL"
, imputation of missing values occurs
regardless of the value of mvest.bc
. This is achieved using an EM algorithm that is
tightly coupled with marker ordering (see Wu et al., 2008). Initially
a marker order is obtained omitting missing marker scores and then
imputation is performed based on the underlying recombination probabilities
of the flanking markers with the markers containing the missing
value. The recombination probabilities are then recomputed and the
pairwise distances are updated. The ordering algorithm is then
run again and the complete process is repeated until
convergence. Note, the imputed probability matrix for the linkage map
being constructed is returned if return.imputed = TRUE
.
For populations "BC","DH","ARIL"
, if detectBadData = TRUE
,
the marker ordering algorithm also
includes the detection of genotyping errors. For any individual
genotype, the detection method is based on a weighted Euclidean metric
(see Wu et al., 2008) that is a function of the
recombination probabilities of all the markers with the marker containing
the suspicious observation. Any genotyping errors detected are set to
missing and the missing values are then imputed as part of the marker
ordering algorithm. Note, the detection of these errors and their
amendment is returned in the imputed probability matrix if
return.imputed = TRUE
.
If as.cross = TRUE
then the constructed object is returned as a
qtl cross object with the appropriate class structure. For "RILn"
populations the constructed object is given the class "bcsft"
by
using the qtl package conversion function convert2bcsft
with arguments F.gen = n
and BC.gen =
0
. For "ARIL"
populations the constructed object is given the
class "riself"
.
If return.imputed = TRUE
and pop.type
is one of
"BC","DH","ARIL"
, then the marker probability matrix is
returned for the linkage groups that have been constructed using the
algorithm. Each linkage group is named identically to the linkage groups
of the map and, if as.cross = TRUE
, contains an ordered
"map"
element and a "data"
element consisting of marker probabilities of the A allele being present
(i.e. P(A) = 1, P(B) = 0). Both elements contain a
possibly reduced version of the marker set that includes all
non-colocating markers as well as the first marker of any set of
co-locating markers. If as.cross = FALSE
then an ordered data frame of marker
probabilities is returned.
If as.cross = TRUE
the function returns an R/qtl cross object with the appropriate
class structure. The object is a list with usual components
"pheno"
and "geno"
. If as.cross = FALSE
the
function returns an ordered data frame object
with additional columns that indicate the linkage group, the position
and marker names and genetic distance of the markers within each
linkage group. If markers were omitted for any reason during the
construction, the object will have an "omit"
component with
all omitted markers in a collated matrix. If return.imputed =
TRUE
then the object will also contain an "imputed.geno"
element.
Julian Taylor, Dave Butler, Timothy Close, Yonghui Wu, Stefano Lonardi
Wu, Y., Bhat, P., Close, T.J., Lonardi, S. (2008) Efficient and Accurate Construction of Genetic Linkage Maps from Minimum Spanning Tree of a Graph. PLoS Genetics, 4(10).
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## forming data frame object from R/qtl object
dfg <- t(do.call("cbind", lapply(mapDH$geno, function(el) el$data)))
dimnames(dfg)[[2]] <- as.character(mapDH$pheno[["Genotype"]])
dfg <- dfg[sample(1:nrow(dfg), nrow(dfg), replace = FALSE),]
dfg[dfg == 1] <- "A"
dfg[dfg == 2] <- "B"
dfg[is.na(dfg)] <- "U"
dfg <- cbind.data.frame(dfg, stringsAsFactors = FALSE)

## construct map
testd <- mstmap(dfg, dist.fun = "kosambi", trace = FALSE)
pull.map(testd)

## let's get a timing on that ...
system.time(testd <- mstmap(dfg, dist.fun = "kosambi", trace = FALSE))
Parameter initialization function for pushCross
and pullCross
pp.init(seg.thresh = 0.05, seg.ratio = NULL, miss.thresh = 0.1, max.rf = 0.25, min.lod = 3)
seg.thresh |
Numerical value between zero and one determining the p-value threshold for the test of marker segregation distortion. |
seg.ratio |
A character string of the form "AA:BB" or "AA:AB:BB" describing the ratio of the alleles. |
miss.thresh |
Numerical value between zero and one determining the proportion of missing values. |
max.rf |
The maximum recombination fraction to consider when attempting to cluster pushed markers back into linkage groups. |
min.lod |
The minimum LOD score to consider when attempting to cluster pushed markers back into linkage groups. |
This parameter initialization function is used by the function pullCross
to
pull markers from a linkage map and pushCross
to push markers
back into a linkage map. How the arguments seg.thresh
,
seg.ratio
and miss.thresh
are used depends on which
function is called. See pushCross
and pullCross
for
more details.
Returns the user-defined values for each of the parameters.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## pull markers from a linkage map with a segregation distortion
pars <- pp.init(seg.thresh = 0.05)
mapDH.s <- pullCross(mapDH, type = "seg.distortion", pars = pars)
mapDH.s$seg.distortion$table
Profile individual genotype statistics for the current linkage map order of an R/qtl cross object
profileGen(cross, chr, bychr = TRUE, stat.type = c("xo", "dxo", "miss"), id = "Genotype", xo.lambda = NULL, ...)
cross |
An qtl |
chr |
Character vector of linkage group names used for subsetting the linkage map. |
bychr |
Logical vector determining whether statistics should be plotted by chromosome (see Details). |
stat.type |
Character string of any combination of |
id |
Character string determining the column of |
xo.lambda |
A numerical value for the expected rate of recombination (see Details). |
... |
Other arguments to be passed to the high level lattice plot. |
This function uses statGen
to profile statistics for the
genotypes for the current order of the linkage map. Any combination of
"xo"
or "dxo"
or "miss"
may be given and will be
plotted simultaneously. If bychr = TRUE
then the plots will be further partitioned by
linkage groups given by chr
.
If a numerical value is given for xo.lambda
then the
recombination count for each genotype is tested against the expected
recombination rate xo.lambda
using a simple one-tailed test of a
Poisson mean. Any lines that have a p-value less than a family-wise
error rate based on a Bonferroni adjustment of the usual alpha level of 0.05 are
annotated on the profiles being plotted.
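The test described above can be sketched in base R as follows. The crossover counts are invented for illustration, and the exact tail convention used inside the package is an assumption.

```r
## Hypothetical crossover counts for five genotypes
xo <- c(G1 = 22, G2 = 27, G3 = 61, G4 = 19, G5 = 30)
xo.lambda <- 25

## One-tailed test: P(X >= observed count) under Poisson(xo.lambda)
pval <- ppois(xo - 1, lambda = xo.lambda, lower.tail = FALSE)

## Bonferroni-adjusted family-wise threshold at alpha = 0.05
flag <- pval < 0.05 / length(xo)
names(xo)[flag]
```

Only genotypes with a count far above `xo.lambda` survive the adjusted threshold, which is the behaviour wanted when screening for lines with inflated recombination.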
A lattice panel plot is displayed with panels described by the stat.type
given
in the call, and the genotype statistics are returned invisibly. If
xo.lambda
is not NULL then these statistics also include a
logical vector named "xo.lambda"
that is returned from testing
the individuals for inflated recombination rates (see Details).
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## profile all genotype crossover and double crossover statistics
profileGen(mapDH, bychr = FALSE, stat.type = c("xo","dxo"),
           xo.lambda = 25, layout = c(1,3))
Graphically profile individual marker and interval statistics for an R/qtl cross object
profileMark(cross, chr, stat.type = "marker", use.dist = TRUE, map.function = "kosambi", crit.val = NULL, display.markers = FALSE, mark.line = FALSE, ...)
cross |
An R/qtl |
chr |
Character vector of linkage group names used for subsetting the linkage map. |
stat.type |
Character string of either |
use.dist |
Logical value determining whether the actual map distances should be used
to represent marker positions. If |
map.function |
Character string of either |
crit.val |
The critical value to be used in displaying marker or intervals above a certain threshold (see Details). |
display.markers |
A logical value determining whether marker names should be displayed on the bottom axis. |
mark.line |
A logical value determining whether vertical lines should be drawn at marker positions. This may be useful to line up marker positions across several plots. |
... |
Other arguments to be passed to the high level lattice plot. |
This graphical function calls the function statMark
to retrieve
marker and interval statistics. If "marker"
is given as the
stat.type
then the complete set of marker statistics is plotted
simultaneously. If "interval"
is given as the
stat.type
then the function simultaneously plots the complete set
of interval statistics. Both can also be chosen.
This function also allows users to choose any combination of marker or interval statistics they would like to view. The set of available marker statistics that can be profiled is given below:
"seg.dist": Profile the -log10 p-value from a test of segregation distortion for each marker.
"miss": Profile the proportion of missing values for each marker.
"prop": Profile the allele proportions for each marker.
"dxo": Profile the number of double crossovers occurring at each marker.
The set of available interval statistics that can be profiled is given below:
"erf": Profile the recombination fractions for the intervals.
"lod": Profile the LOD score for the test of no linkage between markers in an interval.
"dist": Profile the interval map distance taken from the map component of each linkage group.
"mrf": Profile the map recombination fraction for the intervals.
"recomb": Profile the actual number of recombinations within each of the intervals.
If crit.val="bonf"
and marker statistics are plotted then any
markers that have a p-value for the test of segregation distortion less
than the family-wise error rate based on a Bonferroni adjustment of the
usual 0.05 alpha level are annotated on each of the marker plots. If any interval statistics
are being plotted then any intervals that have a p-value for the test of
no linkage that is less than a Bonferroni adjustment of the usual 0.05
alpha level are annotated on each of the interval statistics plots.
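The marker-level part of this rule can be sketched with a chi-square goodness-of-fit test in base R. The allele counts are invented, the expected 1:1 ratio assumes a doubled haploid population, and the package may compute its segregation test differently.

```r
## Hypothetical allele counts (A, B) for four DH markers
counts <- rbind(mk1 = c(52, 48), mk2 = c(75, 25),
                mk3 = c(45, 55), mk4 = c(50, 50))

## Chi-square goodness-of-fit test against the expected 1:1 ratio
pval <- apply(counts, 1, function(x) chisq.test(x, p = c(0.5, 0.5))$p.value)

## -log10 p-value, the scale profiled by "seg.dist", with a Bonferroni cut-off
score <- -log10(pval)
crit <- -log10(0.05 / nrow(counts))
names(score)[score > crit]
```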
A lattice panel plot is displayed with panels described by the
stat.type
given in the call and the complete marker/interval statistics
are returned invisibly. If crit.val
is not NULL then both
the marker and interval statistics are returned with an extra logical column called
"crit.val"
from testing markers for segregation distortion and
intervals for weak linkage (see Details).
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## profile chosen statistics
profileMark(mapDH, stat.type = c("seg.dist","prop","erf"),
            layout = c(1,4), type = "l")
Pull markers of a certain type from a linkage map and place them aside in the R/qtl object, keeping their connections with the reduced linkage map where appropriate.
pullCross(object, chr, type = c("co.located","seg.distortion", "missing"), pars = NULL, replace = FALSE, ...)
object |
An qtl |
chr |
A character vector of linkage group names with which to subset the linkage map before pulling any markers. |
type |
A character string determining the type of markers to be pulled from the map (see Details). |
pars |
A list of parameters that are used by |
replace |
A logical value determining whether the markers and summary of marker information that are
pulled from the map replace information that is already residing in the
|
... |
Currently ignored. |
This function gives users the ability to "pull" markers of several
different types from the linkage map and place them in appropriately named
elements of the cross object. These elements can be examined by the
user and can even be "pushed" back using the complementary command
pushCross
.
Currently supported types are:
type = "co.located"
. This type gives the user the ability to
reduce a linkage map to a unique set of markers for the purpose of
efficient map construction. Co-located markers are pulled from the
linkage map using the approach of findDupMarkers
from the
qtl package and are placed aside
in a separate list element called "co.located"
. This element
contains the removed marker data as well as a table
that displays the connections between the co-located markers and the
markers that remain in the linkage map. If required, this table is used
by pushCross
to "push" the co-located markers back into the
linkage map.
type = "seg.distortion"
. Users can pull markers with
segregation distortion from a linkage map with two different
thresholding mechanisms called using pars
. If the list argument
pars
is used with an element called seg.thresh
then markers are pulled from the map if the p-value from the test for segregation distortion
is LESS than seg.thresh
. Values of seg.thresh
must be
between 0 and 1. If pars
contains an element
seg.ratio
then markers are pulled from the map based on the
ratio provided. The ratio must be in character format and of the type
"AA:BB" for two allele populations and "AA:AB:BB" for three allele
populations (see Examples for more details). Markers are pulled if their
allele proportions are GREATER than the largest proportional ratio or LESS
than the smallest proportional ratio given in seg.ratio
. If neither
thresholding mechanism is given then the default is
to use seg.thresh = 0.05
. If markers are found matching the above
criteria they are pulled from the linkage map and placed aside in an
element called "seg.distortion"
. This element contains the
removed distorted marker data as well as a table summarizing each of the
markers. See examples below for more detail.
type = "missing"
. Users can pull markers with a
proportional amount of missing allele scores. If pars
contains an
element miss.thresh
then markers are pulled from the linkage map
that have a proportion of missing values GREATER than
miss.thresh
. If no value is given for miss.thresh
then it defaults
to 0.1 or 10% missing values. If markers are found matching the above
criteria they are pulled from the map and are placed aside in a separate list element
called "missing"
. This element contains the
removed marker data as well as a table summarizing each of the
markers. See examples below for more detail.
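The ratio and missing-value rules described in the list above can be sketched in base R. The genotype matrix is invented and the code only mirrors the stated rules; it is not the package's implementation.

```r
## Hypothetical genotype matrix: markers in rows, NA = missing score
g <- rbind(mk1 = c("A", "A", "B", "B", "A", "B", "A", "B", "A", "B"),
           mk2 = c("A", "A", "A", "A", "A", "A", "A", "B", "A", "A"),
           mk3 = c("A", NA,  NA,  "B", "A", "B", "B", "A", "B", "A"))

## Parse a ratio such as "56:44" into proportions
ratio <- as.numeric(strsplit("56:44", ":")[[1]])
prop <- ratio / sum(ratio)

## Proportion of A alleles among the non-missing scores of each marker
propA <- rowMeans(g == "A", na.rm = TRUE)
pullSeg <- propA > max(prop) | propA < min(prop)

## Proportion of missing scores compared with miss.thresh = 0.1
pullMiss <- rowMeans(is.na(g)) > 0.1

names(which(pullSeg))   # markers outside the 56:44 band
names(which(pullMiss))  # markers with more than 10% missing scores
```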
The cross object is returned with the same class structure as the input cross object and additional elements corresponding to the marker types pulled from the map.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## pull co-located markers from linkage map
mapDH.c <- pullCross(mapDH, type = "co.located")
mapDH.c$co.located$table

## pull distorted markers from linkage map using seg.thresh
mapDH.s <- pullCross(mapDH, type = "seg.distortion",
                     pars = list(seg.thresh = 0.05))
mapDH.s$seg.distortion$table

## pull distorted markers from linkage map using seg.ratio
mapDH.s <- pullCross(mapDH, type = "seg.distortion",
                     pars = list(seg.ratio = "56:44"))
mapDH.s$seg.distortion$table
Push unlinked markers or markers that were originally placed aside by
pullCross
back into linkage groups of an established R/qtl linkage map.
pushCross(object, type = c("co.located","seg.distortion", "missing","unlinked"), unlinked.chr = NULL, pars = NULL, ...)
object |
An R/qtl |
type |
A character string determining the type of markers to be pushed into the linkage map (see Details). |
unlinked.chr |
A character string of linkage group names containing markers that
require pushing into the remaining linkage groups of the object. This is only useful when
|
pars |
A list of parameters that are used by |
... |
Currently ignored. |
This function was written explicitly to complement pullCross
by
"pushing" markers of certain types back into linkage groups of an
established linkage map.
Currently supported marker types are:
type = "co.located"
. Users can push co-located markers back
into the linkage map that have been set aside in the cross object element
co.located
. To ensure this can be used at any stage of the linkage map
construction process the function disregards the linkage group information
provided in the table formed by using pullCross
. Instead it uses the
current positions of the markers in the reduced linkage map to determine
where to push the co-located markers back to.
type = "seg.distortion"
. Users can push markers from the
"seg.distortion"
element of the object back into a linkage map using the thresholding
mechanisms seg.thresh
and seg.ratio
called using
pars
. If seg.thresh
is given then the markers are pushed
back that have p-values that are GREATER than seg.thresh
. If
pars
contains an element seg.ratio
then markers are pushed
back based on the ratio provided. The ratio must be in character format and of the type
"AA:BB" for two allele populations and "AA:AB:BB" for three allele
populations (see Examples for more details). Markers are pushed back if their
allele proportions are LESS than the largest proportional ratio and GREATER
than the smallest proportional ratio given in seg.ratio
. If neither
thresholding mechanism is given then the default is to use seg.thresh = 0.05
.
type = "missing"
. Users can push markers from the object
element "missing"
back into the linkage map using the
thresholding parameter miss.thresh
called using
pars
. Markers will be pushed back that have a
proportion of missing values LESS than miss.thresh
. If no value
is given for this parameter it defaults to 0.1 or 10% missing values.
type = "unlinked"
. Users can push unlinked markers that
reside in linkage groups of the established linkage map. If this type is
chosen unlinked.chr
must be a character string of linkage group
names in the object.
For types "seg.distortion"
, "missing"
and
"unlinked"
a fast clustering method is used to allocate markers
to established linkage groups. This is done very
efficiently by reducing the constructed linkage map to a skeleton set
of markers before checking linkages. How these linkages are formed can
be tweaked by setting max.rf
and min.lod
when calling
pars
. These currently default to max.rf = 0.25
and
min.lod = 3
.
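The role of the two thresholds can be illustrated with a standard two-point calculation in base R. The counts are invented, the formula is the usual LOD for a two-genotype population, and whether the package uses strict or non-strict inequalities is an assumption.

```r
## Two-point statistics between a pushed marker and a skeleton marker
## for a two-genotype population (hypothetical counts)
n <- 150      # individuals scored at both markers
nrec <- 18    # observed recombinants

rf <- nrec / n
## LOD for the estimated rf against no linkage (rf = 0.5)
lod <- nrec * log10(rf) + (n - nrec) * log10(1 - rf) + n * log10(2)

## Linkage rule using the defaults quoted above
linked <- rf <= 0.25 && lod >= 3
```

Lowering max.rf or raising min.lod makes the allocation more conservative, leaving more markers unallocated.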
Users should explicitly avoid the use of "UL" as part of a linkage group name as
this is used internally to name unlinked groups of markers if required.
It should also be noted that this function does not re-construct the
object after allocating markers to linkage groups. For efficient linkage map
reconstruction of an R/qtl object see mstmap.cross()
.
The cross object is returned with the same class structure as the input cross object, with markers of the chosen types pushed into linkage groups of the established linkage map. If all markers of an element type are pushed back then the element is removed from the object.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")

## pull co-located markers from map
mapDH.c <- pullCross(mapDH, type = "co.located")
mapDH.c$co.located$table

## push co-located markers back into linkage map
mapDH.z <- pushCross(mapDH.c, type = "co.located")
pull.map(mapDH.z)
P-value graph to determine threshold for marker clustering
pValue(dist = seq(25,40, by = 5), pop.size = 100:500, map.function = "kosambi", LOD = FALSE)
dist |
Numeric range of genetic distances in cM. |
pop.size |
Numeric range of population sizes. |
map.function |
Character string of either |
LOD |
If |
This function provides the ability to create a user specified p-value plot similar to Figure 1.1 in the vignette for the package.
A plot is displayed showing the -log10 p-value (or LOD score) of linkage against the range of specified population sizes for each of the specified genetic distances.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
mstmap.cross
and mstmap.data.frame
pValue(dist = seq(25, 40, by = 2))
Very quick estimation of genetic map distances for a constructed R/qtl object
quickEst(object, chr, map.function = "kosambi", ...)
object |
An R/qtl |
chr |
A character string of linkage group names that require (re)estimation of their genetic map distances. |
map.function |
Character string of either |
... |
Other arguments passed to |
For linkage groups with large numbers of markers, the Hidden Markov algorithm in est.map
can be extremely slow. The computational burden for this algorithm
increases as the number of missing values and genotyping errors
increase. quickEst
circumvents this by using the Viterbi
algorithm computationally implemented in argmax.geno
of the
qtl package. Initial conservative estimates of the map distances
are calculated by inverting the recombination fractions output by
est.rf
. These are then passed to argmax.geno
and
imputation of missing allele scores is performed along with
re-estimation of map distances.
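The inversion step can be sketched in base R with the Kosambi map function. The recombination fractions are invented and `kosambiDist` is a hypothetical helper; this shows only the initial conversion, not the subsequent Viterbi-based re-estimation.

```r
## Kosambi map function: recombination fraction -> distance in cM
kosambiDist <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))

## Hypothetical adjacent-marker recombination fractions for one linkage group
rf <- c(0.02, 0.10, 0.05)

## Cumulative marker positions, starting the map at 0 cM
pos <- c(0, cumsum(kosambiDist(rf)))
round(pos, 2)
```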
The cross object is returned with the same class structure as the input cross object.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
mapDH1 <- quickEst(mapDH, map.function = "kosambi")
Individual genotype statistics for the current linkage map order of an R/qtl cross object
statGen(cross, chr, bychr = TRUE, stat.type = c("xo","dxo", "miss"), id = "Genotype")
cross |
An R/qtl |
chr |
Character vector of linkage group names used for subsetting the linkage map. |
bychr |
Logical vector determining whether statistics should be plotted by chromosome (see Details). |
stat.type |
Character string of any combination of |
id |
Character string determining the column of |
This function is used in profileGen to plot any combination of returned linkage map statistics on a single graphical display.
A list with elements named by the stat.type used in the call. If bychr = TRUE then each element is a data frame of statistics with columns named by the linkage groups. If bychr = FALSE then each element is a vector of statistics named by the stat.type.
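To make the three statistic types concrete, here is a toy base R sketch for a single individual, reading "xo", "dxo" and "miss" as crossovers, double crossovers and missing calls. The genotype vector is invented and the counting rules are simplified assumptions, not the implementation used by statGen.

```r
# Invented allele calls for one individual across ordered markers
# ("A"/"B" alleles, NA = missing call)
g <- c("A", "A", "B", "A", "A", NA, "B", "B")

# Crossovers: allele switches between neighbouring non-missing calls
obs <- g[!is.na(g)]
xo <- sum(obs[-1] != obs[-length(obs)])

# Double crossovers (simplified): a call that differs from both of its
# non-missing neighbours
mid <- 2:(length(obs) - 1)
dxo <- sum(obs[mid] != obs[mid - 1] & obs[mid] != obs[mid + 1])

# Missing calls
miss <- sum(is.na(g))
```

An individual with unusually many crossovers or double crossovers relative to the rest of the population is the sort of case these statistics are meant to flag for inspection.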
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
## produce all genotype crossover and double crossover statistics
sg <- statGen(mapDH, stat.type = c("xo","dxo"))
Individual marker and interval statistics for an R/qtl cross object
statMark(cross, chr, stat.type = c("marker","interval"), map.function = "kosambi")
cross |
A qtl |
chr |
Character vector of linkage group names used for subsetting the linkage map. |
stat.type |
Character string of either |
map.function |
Character string of either |
If "marker" is chosen then a call to geno.table from qtl is used to return individual marker statistics for segregation distortion, as well as allele and missing value proportions. For the current map order, the number of double crossovers at each marker is also returned.
If "interval" is chosen then interval statistics are returned for the current map order. These include the estimated recombination fraction and LOD score between adjacent markers, calculated from est.rf in qtl. Also returned are the map interval distances and converted map recombination fractions extracted from the "map" component of each linkage group, as well as the actual number of recombinations between markers.
This function is used in profileMark to plot any combination of returned linkage map statistics on a single graphical display.
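The marker statistics described above can be illustrated for a single marker in base R. This sketch assumes a doubled haploid population with an expected 1:1 segregation ratio; the allele calls are invented, and the code is not the one used by geno.table or statMark.

```r
# Invented allele calls for one marker (NA = missing call)
calls <- c(rep("AA", 60), rep("BB", 40), rep(NA, 5))

# Allele counts among the observed calls
counts <- as.vector(table(factor(calls, levels = c("AA", "BB"))))

# Missing value proportion and allele proportions
miss_prop <- mean(is.na(calls))
allele_prop <- counts / sum(counts)

# Chi-squared goodness-of-fit test against the assumed 1:1 ratio;
# a small p-value suggests segregation distortion
test <- chisq.test(counts, p = c(0.5, 0.5))
p_value <- test$p.value
```

With 60 and 40 observed calls the test statistic is 4 on one degree of freedom, so this marker would be flagged at the 5% level but not at a stricter threshold.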
A list named by the stat.type used in the call. Each element is a data frame of statistics with columns named by the statistic.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
data(mapDH, package = "ASMap")
## produce all statistics
sm <- statMark(mapDH, stat.type = c("marker","interval"))
Subset an R/qtl object by chromosome or by individuals for populations used within the R/ASMap package.
subsetCross(cross, chr, ind, ...)
cross |
A |
chr |
Optional vector specifying which chromosomes to keep or
discard. This may be a logical, numeric, or character string
vector. See |
ind |
Optional vector specifying which individuals to keep or discard. This may be a logical or numeric vector (see Details). |
... |
Kept for compatibility with |
This function is a replacement version of subset.cross that should be used if the cross object contains any or all of the components "co.located", "seg.distortion" and "missing" created by a pullCross call. For a given ind, the function calls subset.cross to ensure that all elements created from calls to native R/qtl functions are subsetted appropriately. In addition, the "co.located", "seg.distortion" and "missing" elements are also subsetted and, if the components "seg.distortion" and "missing" exist, the statistics in their respective tables are recalculated.
It provides identical functionality to subset.cross, with the exception that ind can only be a logical or numeric vector.
The cross object is returned with the appropriate subsetting.
Julian Taylor
Taylor, J., Butler, D. (2017) R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis. Journal of Statistical Software, 79(6), 1–29.
subset.cross and pullCross
data(mapDH, package = "ASMap")
mapDH.s <- pullCross(mapDH, type = "seg.distortion")
mapDH.s <- subsetCross(mapDH.s, ind = 3:218)
dim(mapDH.s$seg.distortion$data)