genomicSimulationC 0.2.6
Recombination Calculators

Experimental functions for retroactively calculating the number of recombinations.

Functions

int * gsc_calculate_min_recombinations_fw1 (gsc_SimData *d, gsc_MapID mapid, char *parent1, unsigned int p1num, char *parent2, unsigned int p2num, char *offspring, int certain)
 Identify markers in the genotype of offspring where recombination from its parents occurred.
 
int * gsc_calculate_min_recombinations_fwn (gsc_SimData *d, gsc_MapID mapid, char *parent1, unsigned int p1num, char *parent2, unsigned int p2num, char *offspring, int window_size, int certain)
 Identify markers in the genotype of offspring where recombination from its parents occurred, as judged by the marker itself and a short window around it.
 
static int gsc_has_same_alleles (const char *p1, const char *p2, const size_t i)
 Simple operator to determine if at marker i, two genotypes share at least one allele.
 
static int gsc_has_same_alleles_window (const char *g1, const char *g2, const size_t start, const size_t w)
 Simple operator to determine if, at every marker in the window of w markers starting at index start, two genotypes share at least one allele.
 
int gsc_calculate_recombinations_from_file (gsc_SimData *d, const char *input_file, const char *output_file, int window_len, int certain)
 Provides guesses as to the location of recombination events that led to the creation of certain genotypes from certain other genotypes.
 

Detailed Description

Experimental functions for retroactively calculating the number of recombinations.

This functionality is for interest only. It is neither clear nor tidy, and it has not been checked against real data.

Function Documentation

◆ gsc_calculate_min_recombinations_fw1()

int * gsc_calculate_min_recombinations_fw1 ( gsc_SimData *  d,
gsc_MapID  mapid,
char *  parent1,
unsigned int  p1num,
char *  parent2,
unsigned int  p2num,
char *  offspring,
int  certain 
)

Identify markers in the genotype of offspring where recombination from its parents occurred.

This function is fairly low-level: it takes raw allele vectors rather than genotype names, so end users are advised to use a wrapper such as gsc_calculate_recombinations_from_file().

See also
gsc_calculate_recombinations_from_file()

The function reads from start to end along each chromosome. At each marker, it checks whether the offspring's alleles could only have come from one parent, that is, whether the parentage of that allele is known. If so, it saves the provided id number of the source parent to the matching position in the result vector. If not, its behaviour depends on the certain parameter.
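The marker-by-marker scan described above can be sketched for a single chromosome as follows. This is an illustrative reimplementation, not the library's code: the names sketch_shares and sketch_origins_single_marker are hypothetical, "could only have come from one parent" is interpreted as "shares an allele with exactly one parent", and the value carried forward before any informative marker is assumed to be 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 1 if the genotypes share an allele at marker i.
 * Genotypes hold two chars per marker, at [2*i] and [2*i + 1]. */
static int sketch_shares(const char *g1, const char *g2, size_t i) {
    const char a = g1[2 * i], b = g1[2 * i + 1];
    const char c = g2[2 * i], d = g2[2 * i + 1];
    return a == c || a == d || b == c || b == d;
}

/* Fill out[0..n_markers-1] with the guessed parent of origin at each marker
 * of ONE chromosome. Chromosome boundaries are not handled in this sketch. */
static void sketch_origins_single_marker(const char *parent1, unsigned int p1num,
                                         const char *parent2, unsigned int p2num,
                                         const char *offspring, size_t n_markers,
                                         int certain, int *out) {
    assert(out != NULL);
    int last_known = 0; /* assumption: nothing is known at the chromosome start */
    for (size_t i = 0; i < n_markers; ++i) {
        const int from1 = sketch_shares(parent1, offspring, i);
        const int from2 = sketch_shares(parent2, offspring, i);
        if (from1 != from2) {
            /* Exactly one parent is compatible: parentage is known here. */
            last_known = from1 ? (int)p1num : (int)p2num;
            out[i] = last_known;
        } else {
            /* Ambiguous marker: 0 if certain, else the last known parent. */
            out[i] = certain ? 0 : last_known;
        }
    }
}
```

With the made-up inputs parent1 = "AAAAAAAA", parent2 = "TTAATTTT" and offspring = "AAAATTTT" (four markers, ids 1 and 2), this sketch yields 1 0 2 2 when certain is true and 1 1 2 2 when it is false.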

Parents do not have to be directly identified as parents by the pedigree functionality of this library. A sample usage is performing a cross then multiple generations of selfing, then comparing the final inbred to the original two lines of the cross.

Parameters
d: pointer to the gsc_SimData struct whose genetic map matches the provided genotypes.
mapid: ID of the map from which to calculate potential historical crossovers, or NO_MAP to use the first-loaded/primary map by default.
parent1: a character vector containing one parent's alleles at each marker in the gsc_SimData.
p1num: an integer used in the returned vector to identify areas of the genome that come from the first parent.
parent2: a character vector containing the other parent's alleles at each marker in the gsc_SimData.
p2num: an integer used in the returned vector to identify areas of the genome that come from the second parent.
offspring: a character vector containing the alleles at each marker in the gsc_SimData of the genotype whose likely recombinations we want to identify.
certain: a boolean. If TRUE, markers where the parent of origin cannot be identified are set to 0. If FALSE, they are set to the id of the parent that provided the most recently identified allele in that chromosome.
Returns
a heap vector of length d->n_markers containing the id of the parent of origin at each marker in the offspring genotype.

Definition at line 7148 of file sim-operations.c.


◆ gsc_calculate_min_recombinations_fwn()

int * gsc_calculate_min_recombinations_fwn ( gsc_SimData *  d,
gsc_MapID  mapid,
char *  parent1,
unsigned int  p1num,
char *  parent2,
unsigned int  p2num,
char *  offspring,
int  window_size,
int  certain 
)

Identify markers in the genotype of offspring where recombination from its parents occurred, as judged by the marker itself and a short window around it.

This function is fairly low-level: it takes raw allele vectors rather than genotype names, so end users are advised to use a wrapper such as gsc_calculate_recombinations_from_file().

See also
gsc_calculate_recombinations_from_file()

The function reads from start to end along each chromosome. At each marker, it checks whether the offspring's alleles in the window centered at that marker could have come from one parent but not from the other, that is, whether the parentage of that allele is known. If so, it saves the provided id number of the source parent to the matching position in the result vector. If not, its behaviour depends on the certain parameter.

Parents do not have to be directly identified as parents by the pedigree functionality of this library. A sample usage is performing a cross then multiple generations of selfing, then comparing the final inbred to the original two lines of the cross.

Behaviour when the window size is not an odd integer has not been tested.
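A minimal sketch of the windowed variant for a single chromosome is given here, under stated assumptions. The names are hypothetical, the window is taken as window_size / 2 markers on each side of the current marker, and it is clipped at the ends of the chromosome; how the library itself treats chromosome ends or even window sizes is not specified on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 1 if g1 and g2 share an allele at EVERY marker in
 * [start, start + w). Two chars per marker, at [2*j] and [2*j + 1]. */
static int sketch_window_shares(const char *g1, const char *g2,
                                size_t start, size_t w) {
    for (size_t j = start; j < start + w; ++j) {
        const char a = g1[2 * j], b = g1[2 * j + 1];
        const char c = g2[2 * j], d = g2[2 * j + 1];
        if (!(a == c || a == d || b == c || b == d)) return 0;
    }
    return 1;
}

/* Guess the parent of origin at each marker of ONE chromosome, judging each
 * marker by a window centred on it (clipped at the chromosome ends). */
static void sketch_origins_windowed(const char *parent1, unsigned int p1num,
                                    const char *parent2, unsigned int p2num,
                                    const char *offspring, size_t n_markers,
                                    size_t window_size, int certain, int *out) {
    assert(out != NULL);
    const size_t half = window_size / 2;
    int last_known = 0; /* assumption: nothing is known at the chromosome start */
    for (size_t i = 0; i < n_markers; ++i) {
        const size_t lo = (i >= half) ? i - half : 0;
        const size_t hi = (i + half < n_markers) ? i + half : n_markers - 1;
        const int from1 = sketch_window_shares(parent1, offspring, lo, hi - lo + 1);
        const int from2 = sketch_window_shares(parent2, offspring, lo, hi - lo + 1);
        if (from1 != from2) {
            /* The whole window is compatible with exactly one parent. */
            last_known = from1 ? (int)p1num : (int)p2num;
            out[i] = last_known;
        } else {
            out[i] = certain ? 0 : last_known;
        }
    }
}
```

In this sketch a wider window can resolve a marker that is uninformative on its own, but any window that straddles a crossover matches neither parent in full, so markers next to the crossover come out as unknown.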

Parameters
d: pointer to the gsc_SimData struct whose genetic map matches the provided genotypes.
mapid: ID of the map from which to calculate potential historical crossovers, or NO_MAP to use the first-loaded/primary map by default.
parent1: a character vector containing one parent's alleles at each marker in the gsc_SimData.
p1num: an integer used in the returned vector to identify areas of the genome that come from the first parent.
parent2: a character vector containing the other parent's alleles at each marker in the gsc_SimData.
p2num: an integer used in the returned vector to identify areas of the genome that come from the second parent.
offspring: a character vector containing the alleles at each marker in the gsc_SimData of the genotype whose likely recombinations we want to identify.
window_size: an odd integer representing the number of markers to check for known parentage around each marker.
certain: a boolean. If TRUE, markers where the parent of origin cannot be identified are set to 0. If FALSE, they are set to the id of the parent that provided the most recently identified allele in that chromosome.
Returns
a heap vector of length d->n_markers containing the id of the parent of origin at each marker in the offspring genotype.

Definition at line 7257 of file sim-operations.c.


◆ gsc_calculate_recombinations_from_file()

int gsc_calculate_recombinations_from_file ( gsc_SimData *  d,
const char *  input_file,
const char *  output_file,
int  window_len,
int  certain 
)

Provides guesses as to the location of recombination events that led to the creation of certain genotypes from certain other genotypes.

The input file, which lists each target alongside the two parents it should be compared against, should have the format:

[target name] [parent1name] [parent2name]

[target name] [parent1name] [parent2name]

...

The tab-separated output file produced by this function will have the format:

[marker 1 name] [marker 2 name]...

[target name] [tab-separated recombination vector, containing the index at each marker of the parent the function guesses the target's alleles came from, or 0 if this is unknown]

...
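As an illustration of the output layout above, the following sketch formats one target row into a buffer. The function name sketch_format_row is hypothetical, and the trailing newline and the rejection of names containing tabs are assumptions made for this example rather than documented library behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "[target name]\t[origin 1]\t[origin 2]...\n" into buf.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if the name contains a tab or the buffer is too small. */
static int sketch_format_row(char *buf, size_t bufsize, const char *target,
                             const int *origins, size_t n_markers) {
    assert(buf != NULL && target != NULL);
    /* A tab inside the name would corrupt the tab-separated layout. */
    if (strchr(target, '\t') != NULL) return -1;

    int written = snprintf(buf, bufsize, "%s", target);
    if (written < 0 || (size_t)written >= bufsize) return -1;
    size_t used = (size_t)written;

    for (size_t i = 0; i < n_markers; ++i) {
        written = snprintf(buf + used, bufsize - used, "\t%d", origins[i]);
        if (written < 0 || (size_t)written >= bufsize - used) return -1;
        used += (size_t)written;
    }

    written = snprintf(buf + used, bufsize - used, "\n");
    if (written < 0 || (size_t)written >= bufsize - used) return -1;
    return (int)(used + 1);
}
```

Formatting into a buffer rather than writing straight to a file keeps the sketch easy to check in isolation; a real writer would more likely stream each field to the output file.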

Parents do not have to be directly identified as parents by the pedigree functionality of this library. A sample usage is performing a cross then multiple generations of selfing, then comparing the final inbred to the original two lines of the cross.

Parameters
d: pointer to the gsc_SimData struct containing the genotypes and map under consideration.
input_file: string containing the name of the file that lists the offspring and the pairs of parents for which to calculate recombinations.
output_file: string containing the filename to which to save the results.
window_len: an odd integer representing the number of markers to check for known parentage around each marker.
certain: TRUE to fill locations where parentage is unknown with 0, FALSE to fill locations where parentage is unknown with the most recent known parent.
Returns
0 on success.

Definition at line 7374 of file sim-operations.c.


◆ gsc_has_same_alleles()

static int gsc_has_same_alleles ( const char *  p1,
const char *  p2,
const size_t  i 
)
inline static

Simple operator to determine if at marker i, two genotypes share at least one allele.

Checks only three of the four possible permutations, because it assumes there cannot be more than two alleles at a given marker.

Parameters
p1: pointer to a character array genotype of the type stored in a gsc_AlleleMatrix (2*n_markers long, representing the two alleles at a marker consecutively) for the first of the genotypes to compare.
p2: pointer to a character array genotype for the second of the genotypes to compare.
i: index of the marker at which to perform the check.
Returns
boolean result of the check
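To illustrate why three comparisons can suffice, here is a minimal sketch. It is not the library's implementation: the name sketch_shares_allele is hypothetical, and which three of the four pairings are tested is a choice made for this example. If at most two distinct alleles exist at the marker, the skipped pairing can never be the only one that matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the helper documented above. Genotypes are
 * 2*n_markers chars long, with marker i's alleles at [2*i] and [2*i + 1]. */
static int sketch_shares_allele(const char *g1, const char *g2, size_t i) {
    assert(g1 != NULL && g2 != NULL);
    const char a = g1[2 * i], b = g1[2 * i + 1];
    const char c = g2[2 * i], d = g2[2 * i + 1];
    /* The pairing b == c is skipped. With only two possible alleles, if
     * b == c while a != c and b != d, then a and d must both be the other
     * allele, so a == d already catches the match. */
    return a == c || b == d || a == d;
}
```

For genotypes drawn from two alleles, this three-way check agrees with the full four-way comparison for every combination; with three or more alleles at a marker it could miss a match.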

Definition at line 1815 of file sim-operations.h.


◆ gsc_has_same_alleles_window()

static int gsc_has_same_alleles_window ( const char *  g1,
const char *  g2,
const size_t  start,
const size_t  w 
)
inline static

Simple operator to determine if, at every marker in the window of w markers starting at index start, two genotypes share at least one allele.

Checks only three of the four possible permutations at each marker, because it assumes there cannot be more than two alleles at a given marker. For the return value to be true, there must be at least one match at every one of the markers in the window.

Parameters
g1: pointer to a character array genotype of the type stored in a gsc_AlleleMatrix (2*n_markers long, representing the two alleles at a marker consecutively) for the first of the genotypes to compare.
g2: pointer to a character array genotype for the second of the genotypes to compare.
start: index of the first marker in the window over which to perform the check.
w: length of the window over which to perform the check.
Returns
boolean result of the check
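The window check can be sketched as a loop over the per-marker comparison. This is an illustration under assumptions: the name sketch_shares_allele_window is hypothetical, the window is taken to cover exactly w markers beginning at start (following the description of w as a length), and the caller is assumed to keep the window inside the genotype.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 only if g1 and g2 share at least one allele at every marker in
 * [start, start + w); returns 0 as soon as one marker has no match. */
static int sketch_shares_allele_window(const char *g1, const char *g2,
                                       size_t start, size_t w) {
    assert(g1 != NULL && g2 != NULL);
    for (size_t j = start; j < start + w; ++j) {
        const char a = g1[2 * j], b = g1[2 * j + 1];
        const char c = g2[2 * j], d = g2[2 * j + 1];
        /* Three of the four pairings, valid for at most two alleles. */
        if (!(a == c || b == d || a == d)) return 0;
    }
    return 1; /* an empty window (w == 0) is vacuously a match */
}
```

Returning early on the first mismatching marker avoids scanning the rest of the window, which does not change the result because every marker must match.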

Definition at line 1832 of file sim-operations.h.
