genomicSimulationC 0.3
How the simulation stores data.
Modules | |
Deletor Functions | |
For deleting data structures and freeing their associated memory. | |
Data Structures | |
struct | gsc_TableSize |
struct | gsc_MarkerBlocks |
A struct used to store a set of blocks of markers. | |
struct | gsc_DecimalMatrix |
A row-major heap matrix that contains floating point numbers. | |
struct | gsc_PedigreeID |
A type representing a program-lifetime-unique identifier for a genotype, to be used in tracking pedigree. | |
struct | gsc_GroupNum |
A type representing the identifier of a group of genotypes. | |
struct | gsc_EffectID |
A type representing a particular loaded set of marker effects. | |
struct | gsc_LabelID |
A type representing a particular custom label. | |
struct | gsc_MapID |
A type representing a particular loaded recombination map. | |
struct | gsc_MultiIDSet |
Simple crate that stores a GroupNum, a MapID, and an EffectID. | |
struct | gsc_GenOptions |
A type that contains choices of settings for gsc_SimData functions that create a new gsc_AlleleMatrix/generation. | |
struct | gsc_SimpleLinkageGroup |
Parameters for simulating meiosis on a linkage group whose markers are stored contiguously in the simulation. | |
struct | gsc_ReorderedLinkageGroup |
Parameters for simulating meiosis on a linkage group whose markers are re-ordered compared to the first recombination map. | |
struct | gsc_LinkageGroup |
A generic store for a linkage group, used to simulate meiosis on a certain subset of markers. | |
struct | gsc_RecombinationMap |
A type that stores linkage groups and crossover probabilities for simulating meiosis. | |
struct | gsc_KnownGenome |
A type that stores the genome structure used in simulation. | |
struct | gsc_AlleleMatrix |
struct | gsc_MarkerEffects |
A type that stores the information needed to calculate breeding values from alleles at markers. | |
struct | gsc_SimData |
Composite type that is used to run crossing simulations. | |
struct | gsc_TableFileReader |
Stream reader for files of some tabular format. | |
struct | gsc_TableFileCell |
Represent a cell read by a gsc_TableFileReader. | |
struct | gsc_MapfileUnit |
Unprocessed data for one marker (linkage group and position) loaded from a map file. | |
struct | gsc_EffectfileUnit |
Unprocessed data for one marker effect loaded from an effect file. | |
struct | gsc_GenotypeFile_MatrixFormat |
Variants in the format of a genotype matrix file. | |
struct | gsc_FileFormatSpec |
File format specifier for the genotype input file. | |
Macros | |
#define | GSC_CREATE_BUFFER(n, type, length) |
Macro to create a stretchy buffer of any type and some length. | |
#define | GSC_BUFFER_ISHEAP(n) n##cap >= sizeof(n##stack)/sizeof(n##stack[0]) |
For debugging purposes. | |
#define | GSC_FINALISE_BUFFER(n, as, nentries) |
Macro to convert a stretchy buffer to a solid heap vector. | |
#define | GSC_DELETE_BUFFER(n) |
Macro to delete a stretchy buffer. | |
#define | GSC_STRETCH_BUFFER(n, newlen) |
Macro to expand the capacity of a stretchy buffer. | |
#define | GSC_NO_PEDIGREE (gsc_PedigreeID){.id=GSC_NA_ID} |
Empty/null value for pedigree fields. | |
#define | GSC_NO_GROUP (gsc_GroupNum){.num=GSC_NA_ID} |
Empty/null value for group allocations. | |
#define | GSC_NO_EFFECTSET (gsc_EffectID){.id=GSC_NA_ID} |
Empty/null value for effect set identifiers. | |
#define | GSC_NO_LABEL (gsc_LabelID){.id=GSC_NA_ID} |
Empty/null value for custom label identifiers. | |
#define | GSC_NO_MAP (gsc_MapID){.id=GSC_NA_ID} |
Empty/null value for recombination map identifiers. | |
#define | GSC_DETECT_FILE_FORMAT ((gsc_FileFormatSpec){.filetype=GSC_GENOTYPEFILE_UNKNOWN}) |
File format specifier to instruct genomicSimulation loaders to auto-detect all details of the file format. | |
Typedefs | |
typedef struct gsc_AlleleMatrix | gsc_AlleleMatrix |
A linked list entry that stores a matrix of alleles for a set of SNP markers and genotypes. | |
Enumerations | |
enum | gsc_TableFileCurrentStatus { GSC_TABLEFILE_NEWLINE , GSC_TABLEFILE_COLUMNGAP , GSC_TABLEFILE_CONTENTS , GSC_TABLEFILE_ERROR_EOF , GSC_TABLEFILE_ERROR_EOBUF } |
Represent possible states of the cursor of a gsc_TableFileReader. | |
enum | gsc_GenotypeFileCellStyle { GSC_GENOTYPECELLSTYLE_PAIR , GSC_GENOTYPECELLSTYLE_COUNT , GSC_GENOTYPECELLSTYLE_ENCODED , GSC_GENOTYPECELLSTYLE_SLASHPAIR , GSC_GENOTYPECELLSTYLE_UNKNOWN } |
Represent possible representations of alleles at a marker in a genotype file. | |
enum | gsc_GenotypeFileType { GSC_GENOTYPEFILE_UNKNOWN , GSC_GENOTYPEFILE_MATRIX , GSC_GENOTYPEFILE_BED , GSC_GENOTYPEFILE_PED , GSC_GENOTYPEFILE_VCF } |
Enumerate types of genotype files that the simulation knows how to load. | |
Functions | |
gsc_TableFileReader | gsc_tablefilereader_create (const char *filename) |
Open a file for reading with gsc_TableFileReader. | |
void | gsc_tablefilereader_close (gsc_TableFileReader *tbl) |
Close a gsc_TableFileReader's file pointer. | |
void | gsc_helper_tablefilereader_refill_buffer (gsc_TableFileReader *tbl) |
Read another buffer's worth of characters from a gsc_TableFileReader's file. | |
enum gsc_TableFileCurrentStatus | gsc_helper_tablefilereader_classify_char (gsc_TableFileReader *tbl) |
Classify the character under the cursor of a TableFileReader as cell contents or otherwise. | |
void | gsc_tablefilecell_deep_copy (gsc_TableFileCell *c) |
Allocate memory to store a deep copy of a gsc_TableFileCell, if previously only a shallow copy. | |
gsc_TableFileCell | gsc_tablefilereader_get_next_cell (gsc_TableFileReader *tbl) |
Read forwards in TableFileReader and return the next cell's contents, as well as how many column gaps and newlines preceded it. | |
gsc_FileFormatSpec | gsc_define_matrix_format_details (const GSC_LOGICVAL has_header, const GSC_LOGICVAL markers_as_rows, const enum gsc_GenotypeFileCellStyle cell_style) |
Give genomicSimulation hints on the format of a genotype matrix file to be loaded. | |
Variables | |
const gsc_GenOptions | GSC_BASIC_OPT |
Default parameter values for GenOptions, to help with quick scripts and prototypes. | |
How the simulation stores data.
genomicSimulation is a state-based package. These data structures store the library's data and state. Many of them use dynamically allocated memory, so when you are finished with one, call its delete_ function if one exists.
gsc_SimData is the central state/data storage struct, and so is a required parameter to most user-facing functions. It contains pointers to a gsc_KnownGenome (storing the loaded markers and any recombination maps), a gsc_MarkerEffects (storing the loaded allele effects), and a gsc_AlleleMatrix (storing metadata and genotypes of founders and simulated offspring).
Other structs in this group (gsc_TableSize, gsc_MarkerBlocks, gsc_GenOptions) represent specific types of data and are used as parameters and return values of certain functions.
#define GSC_BUFFER_ISHEAP | ( | n | ) | n##cap >= sizeof(n##stack)/sizeof(n##stack[0]) |
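For debugging purposes. Evaluates to true when {n}cap is at least the number of entries that {n}stack can hold, which is the condition the macro uses to identify a heap-allocated buffer.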
Definition at line 448 of file sim-operations.h.
#define GSC_CREATE_BUFFER | ( | n, | |
type, | |||
length | |||
) |
Macro to create a stretchy buffer of any type and some length.
After this macro is run, a buffer with the requested type and capacity will exist in the scope under the requested name.
This macro will also create two helper variables, whose names are generated from the name of the buffer: {n}cap, the number of entries the buffer can currently hold, and {n}stack, the stack array that backs small buffers.
Use this buffer only within one local scope. The buffer macros won't work for buffers that have escaped their scope and so left their helper variables behind.
The buffer is allocated on the stack if its total size does not exceed CONTIG_WIDTH*sizeof(int) bytes. Otherwise, it is allocated on the heap. For safety, call GSC_DELETE_BUFFER on the buffer once you have finished using it, even if you believe it is small enough to have been allocated on the stack.
n | name for the buffer. |
type | type of each entry in the buffer (eg int). |
length | number of entries the buffer should be able to hold. |
Definition at line 438 of file sim-operations.h.
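The stack-or-heap behaviour described above can be sketched in a few lines of C. This is an illustrative re-implementation, not the library's macro: the DEMO_ names, the 1024-byte threshold and the pointer-comparison heap test are all assumptions made for the example. Only the {n}cap and {n}stack naming scheme is taken from this page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical threshold standing in for CONTIG_WIDTH*sizeof(int). */
#define DEMO_STACK_BYTES 1024

/* Declares n (the buffer), n##stack (stack storage) and n##cap (capacity in
 * entries). Falls back to the heap when `length` entries will not fit. */
#define DEMO_CREATE_BUFFER(n, type, length)                              \
    type n##stack[DEMO_STACK_BYTES / sizeof(type)];                      \
    size_t n##cap = sizeof(n##stack) / sizeof(n##stack[0]);              \
    type* n = n##stack;                                                  \
    if ((size_t)(length) > n##cap) {                                     \
        n = malloc(sizeof(type) * (size_t)(length));                     \
        n##cap = (n == NULL) ? 0 : (size_t)(length);                     \
    }

/* Simplified heap test for this sketch: compares pointers, not capacities. */
#define DEMO_BUFFER_ISHEAP(n) ((n) != n##stack)

/* Safe to call whether the buffer ended up on the stack or on the heap. */
#define DEMO_DELETE_BUFFER(n)                                            \
    do {                                                                 \
        if ((n) != n##stack) { free(n); }                                \
        (n) = NULL;                                                      \
        n##cap = 0;                                                      \
    } while (0)

/* Sums 0..count-1 through a scratch buffer. Returns -1 if allocation failed.
 * *used_heap reports where the scratch buffer was stored. */
long demo_sum_with_buffer(size_t count, int* used_heap) {
    assert(used_heap != NULL);
    DEMO_CREATE_BUFFER(scratch, long, count);
    if (scratch == NULL) { return -1; }
    *used_heap = DEMO_BUFFER_ISHEAP(scratch);

    long total = 0;
    for (size_t i = 0; i < count; ++i) { scratch[i] = (long)i; }
    for (size_t i = 0; i < count; ++i) { total += scratch[i]; }

    DEMO_DELETE_BUFFER(scratch); /* always delete, even for small buffers */
    return total;
}
```

Calling demo_sum_with_buffer with a small count keeps the scratch space on the stack, while a count of 1000 exceeds the threshold and moves it to the heap. In both cases the same delete call cleans up, which is why deleting unconditionally is the safe habit.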
#define GSC_DELETE_BUFFER | ( | n | ) |
Macro to delete a stretchy buffer.
The buffer named {n}, and its assistant variables {n}cap and {n}stack, must exist in the current scope. They are created by GSC_CREATE_BUFFER.
n | name of the buffer. |
Definition at line 477 of file sim-operations.h.
#define GSC_DETECT_FILE_FORMAT ((gsc_FileFormatSpec){.filetype=GSC_GENOTYPEFILE_UNKNOWN}) |
File format specifier to instruct genomicSimulation loaders to auto-detect all details of the file format.
Definition at line 1093 of file sim-operations.h.
#define GSC_FINALISE_BUFFER | ( | n, | |
as, | |||
nentries | |||
) |
Macro to convert a stretchy buffer to a solid heap vector.
The buffer named {n}, and its assistant variables {n}cap and {n}stack, must exist in the current scope. They are created by GSC_CREATE_BUFFER.
This is an alternative to GSC_DELETE_BUFFER, if you want to keep the results.
n | name of the buffer. |
as | name of the finalised buffer. |
nentries | number of entries to copy, if less than buffer capacity |
Definition at line 463 of file sim-operations.h.
#define GSC_NO_EFFECTSET (gsc_EffectID){.id=GSC_NA_ID} |
Empty/null value for effect set identifiers.
Has short name: NO_EFFECTSET
Definition at line 590 of file sim-operations.h.
#define GSC_NO_GROUP (gsc_GroupNum){.num=GSC_NA_ID} |
Empty/null value for group allocations.
Has short name: NO_GROUP
Definition at line 578 of file sim-operations.h.
#define GSC_NO_LABEL (gsc_LabelID){.id=GSC_NA_ID} |
Empty/null value for custom label identifiers.
Has short name: NO_LABEL
Definition at line 602 of file sim-operations.h.
#define GSC_NO_MAP (gsc_MapID){.id=GSC_NA_ID} |
Empty/null value for recombination map identifiers.
Has short name: NO_MAP
Definition at line 614 of file sim-operations.h.
#define GSC_NO_PEDIGREE (gsc_PedigreeID){.id=GSC_NA_ID} |
Empty/null value for pedigree fields.
Has short name: NO_PEDIGREE
Definition at line 565 of file sim-operations.h.
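The GSC_NO_* macros above all follow one pattern: an identifier is wrapped in a single-field struct, and the empty value is a compound literal holding a reserved number. A minimal sketch of that pattern follows. The Demo types, the DEMO_ macros and the choice of 0 as the reserved value are assumptions for illustration; this page does not state the value of GSC_NA_ID.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed reserved value; the real GSC_NA_ID is not documented here. */
#define DEMO_NA_ID 0u

/* Distinct struct types stop one kind of ID being passed where another
 * kind is expected: the compiler rejects the mismatch. */
typedef struct { unsigned int id; } DemoPedigreeID;
typedef struct { unsigned int num; } DemoGroupNum;

#define DEMO_NO_PEDIGREE ((DemoPedigreeID){.id = DEMO_NA_ID})
#define DEMO_NO_GROUP ((DemoGroupNum){.num = DEMO_NA_ID})

/* Returns nonzero if g refers to a real group rather than the empty value. */
int demo_group_is_valid(DemoGroupNum g) {
    return g.num != DEMO_NO_GROUP.num;
}

/* Returns the first group (numbered from 1) holding at least min_size
 * members, or DEMO_NO_GROUP if no group is large enough. */
DemoGroupNum demo_first_group_of_size(const unsigned int* sizes,
                                      size_t n_groups,
                                      unsigned int min_size) {
    assert(sizes != NULL || n_groups == 0);
    for (size_t i = 0; i < n_groups; ++i) {
        if (sizes[i] >= min_size) {
            return (DemoGroupNum){.num = (unsigned int)i + 1};
        }
    }
    return DEMO_NO_GROUP;
}
```

A caller compares the returned value's field against the empty value (here through demo_group_is_valid) before using it, in the same way a pointer-returning function would be checked against NULL.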
#define GSC_STRETCH_BUFFER | ( | n, | |
newlen | |||
) |
Macro to expand the capacity of a stretchy buffer.
The buffer named {n}, and its assistant variables {n}cap and {n}stack, must exist in the current scope. They are created by GSC_CREATE_BUFFER.
After this macro executes, the buffer named {n} will have the capacity to hold {n}cap entries. Unless memory allocation failed, {n}cap will be greater than or equal to the requested new length. Check the value of {n}cap to confirm that resizing succeeded.
n | name of the buffer. |
newlen | after execution, the buffer should be able to hold this many entries, unless memory allocation failed (can be checked with {n}cap >= newlen ) |
Definition at line 498 of file sim-operations.h.
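The grow-then-check pattern described above, together with the finalise step, can be sketched as follows. These DEMO_ macros are hypothetical stand-ins written for this example and are not the library's definitions. In particular, the fixed 64-entry stack array, the doubling strategy, and the requirement that the finalised pointer be declared beforehand are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Grow buffer n to hold at least newlen entries. On allocation failure the
 * buffer and n##cap are left unchanged, so callers check n##cap afterwards. */
#define DEMO_STRETCH_BUFFER(n, newlen)                                   \
    do {                                                                 \
        if ((size_t)(newlen) > n##cap) {                                 \
            void* n##grown = malloc(sizeof(n[0]) * (size_t)(newlen));    \
            if (n##grown != NULL) {                                      \
                memcpy(n##grown, n, sizeof(n[0]) * n##cap);              \
                if (n != n##stack) { free(n); }                          \
                n = n##grown;                                            \
                n##cap = (size_t)(newlen);                               \
            }                                                            \
        }                                                                \
    } while (0)

/* Stack storage for 64 entries (an arbitrary size), stretched if needed. */
#define DEMO_CREATE_BUFFER(n, type, length)                              \
    type n##stack[64];                                                   \
    size_t n##cap = sizeof(n##stack) / sizeof(n##stack[0]);              \
    type* n = n##stack;                                                  \
    DEMO_STRETCH_BUFFER(n, length)

/* Copy the first nentries entries into an exactly sized heap vector `as`
 * (NULL if allocation fails), then release the stretchy buffer. */
#define DEMO_FINALISE_BUFFER(n, as, nentries)                            \
    do {                                                                 \
        as = malloc(sizeof(n[0]) * (size_t)(nentries));                  \
        if (as != NULL) {                                                \
            memcpy(as, n, sizeof(n[0]) * (size_t)(nentries));            \
        }                                                                \
        if (n != n##stack) { free(n); }                                  \
        n = NULL;                                                        \
        n##cap = 0;                                                      \
    } while (0)

/* Collects the squares 0, 1, 4, ... of `count` integers, doubling the buffer
 * whenever it fills. Returns a heap vector the caller frees, or NULL. */
int* demo_collect_squares(size_t count) {
    DEMO_CREATE_BUFFER(buf, int, 4);
    size_t used = 0;
    for (size_t i = 0; i < count; ++i) {
        if (used == bufcap) {
            DEMO_STRETCH_BUFFER(buf, bufcap * 2);
            if (bufcap <= used) { /* resize failed: clean up and give up */
                if (buf != bufstack) { free(buf); }
                return NULL;
            }
        }
        buf[used++] = (int)(i * i);
    }
    assert(used <= bufcap);
    int* result = NULL;
    DEMO_FINALISE_BUFFER(buf, result, used);
    return result;
}
```

The check of bufcap straight after stretching mirrors the advice above: the capacity variable, not a return value, is what tells the caller whether the resize succeeded.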
typedef struct gsc_AlleleMatrix gsc_AlleleMatrix |
A linked list entry that stores a matrix of alleles for a set of SNP markers and genotypes.
The simulation stores its genotypes in a list of AlleleMatrix nodes. Each node can store up to CONTIG_WIDTH genotypes.
Has short name: AlleleMatrix
Definition at line 787 of file sim-operations.h.
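The storage scheme described above, a linked list whose nodes each hold a bounded number of genotypes, can be sketched as below. DemoNode, its fields and the width of 4 are hypothetical simplifications. The real gsc_AlleleMatrix stores allele strings and metadata rather than a single int per genotype.

```c
#include <assert.h>
#include <stdlib.h>

/* Tiny node capacity for illustration; stands in for CONTIG_WIDTH. */
#define DEMO_CONTIG_WIDTH 4

typedef struct DemoNode {
    int genotypes[DEMO_CONTIG_WIDTH]; /* stand-in for per-genotype data */
    size_t n_genotypes;               /* slots in use in this node */
    struct DemoNode* next;            /* next node, or NULL at the end */
} DemoNode;

/* Stores value in the first node with a free slot, appending a new node if
 * every existing node is full. Returns 1 on success, 0 if allocation fails. */
int demo_append(DemoNode** head, int value) {
    assert(head != NULL);
    DemoNode** link = head;
    while (*link != NULL && (*link)->n_genotypes == DEMO_CONTIG_WIDTH) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        DemoNode* node = malloc(sizeof(DemoNode));
        if (node == NULL) { return 0; }
        node->n_genotypes = 0;
        node->next = NULL;
        *link = node;
    }
    (*link)->genotypes[(*link)->n_genotypes++] = value;
    return 1;
}

/* Total genotypes stored across all nodes. */
size_t demo_count_genotypes(const DemoNode* head) {
    size_t total = 0;
    for (; head != NULL; head = head->next) { total += head->n_genotypes; }
    return total;
}

/* Number of nodes in the list. */
size_t demo_count_nodes(const DemoNode* head) {
    size_t nodes = 0;
    for (; head != NULL; head = head->next) { ++nodes; }
    return nodes;
}

/* Frees every node in the list. */
void demo_delete_list(DemoNode* head) {
    while (head != NULL) {
        DemoNode* next = head->next;
        free(head);
        head = next;
    }
}
```

With a width of 4, appending ten values produces three nodes holding 4, 4 and 2 genotypes, so the number of nodes grows with the population while each node stays a fixed size.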
enum gsc_GenotypeFileCellStyle |
Represent possible representations of alleles at a marker in a genotype file.
Enumerator | |
---|---|
GSC_GENOTYPECELLSTYLE_PAIR | |
GSC_GENOTYPECELLSTYLE_COUNT | |
GSC_GENOTYPECELLSTYLE_ENCODED | |
GSC_GENOTYPECELLSTYLE_SLASHPAIR | |
GSC_GENOTYPECELLSTYLE_UNKNOWN |
Definition at line 959 of file sim-operations.h.
enum gsc_GenotypeFileType |
Enumerate types of genotype files that the simulation knows how to load.
The format of the file cannot be automatically detected in all cases. This type exists so that users can specify the format of input files.
Has short name: GenotypeFileType
The format of the file is decoded separately from the format of the alleles of each genotype at each marker. In the templates of each format, "[alleles]" can represent:
IUPAC nucleotide encoding: Code => Alleles key: A => AA ; C => CC ; G => GG ; T => TT ; R => AG ; Y => CT ; S => CG ; W => AT ; K => GT ; M => AC
The format of the "[alleles]" cells in a file can be automatically determined out of these options.
Definition at line 1011 of file sim-operations.h.
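The IUPAC key listed above maps one character to a pair of alleles. A minimal sketch of decoding it is shown here. The function name demo_decode_iupac and its return convention are hypothetical and say nothing about how the library's loaders are implemented. Only the ten code-to-pair mappings come from this page.

```c
#include <assert.h>
#include <string.h>

/* Decodes one IUPAC nucleotide code into its two alleles, using the key
 * A=>AA, C=>CC, G=>GG, T=>TT, R=>AG, Y=>CT, S=>CG, W=>AT, K=>GT, M=>AC.
 * Writes a NUL-terminated pair into out and returns 1, or returns 0 and
 * leaves out untouched if the code is not in the key. */
int demo_decode_iupac(char code, char out[3]) {
    const char* pair;
    assert(out != NULL);
    switch (code) {
        case 'A': pair = "AA"; break;
        case 'C': pair = "CC"; break;
        case 'G': pair = "GG"; break;
        case 'T': pair = "TT"; break;
        case 'R': pair = "AG"; break;
        case 'Y': pair = "CT"; break;
        case 'S': pair = "CG"; break;
        case 'W': pair = "AT"; break;
        case 'K': pair = "GT"; break;
        case 'M': pair = "AC"; break;
        default: return 0;
    }
    memcpy(out, pair, 3); /* two alleles plus the terminating NUL */
    return 1;
}
```

The sketch treats any character outside the key as unrecognised rather than guessing, so a caller can decide separately how to handle missing data.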
enum gsc_TableFileCurrentStatus |
Represent possible states of the cursor of a gsc_TableFileReader.
Enumerator | |
---|---|
GSC_TABLEFILE_NEWLINE | |
GSC_TABLEFILE_COLUMNGAP | |
GSC_TABLEFILE_CONTENTS | |
GSC_TABLEFILE_ERROR_EOF | |
GSC_TABLEFILE_ERROR_EOBUF |
Definition at line 898 of file sim-operations.h.
gsc_FileFormatSpec gsc_define_matrix_format_details | ( | const GSC_LOGICVAL | has_header, |
const GSC_LOGICVAL | markers_as_rows, | ||
const enum gsc_GenotypeFileCellStyle | cell_style | ||
) |
Give genomicSimulation hints on the format of a genotype matrix file to be loaded.
Sometimes genomicSimulation's automatic file format detection may misinterpret the formatting of a genotype matrix (eg assuming markers are columns when they are actually rows of the matrix; assuming there is no header row even though there is one; being unable to determine that the body of the matrix contains alternate allele counts because there are some confusingly-placed "NA"s). For particularly large files, the format detection process might slow down file imports or require more memory.
To bypass part or all of the formatting detection steps when importing a genotype matrix file, this function can be used to provide the final parameter for
has_header | GSC_TRUE if the genotype matrix to be imported definitely has a header row, GSC_FALSE if the genotype matrix has no header row, or some other value (eg GSC_NA) to not bypass the header detection steps of the import process. |
markers_as_rows | GSC_TRUE if each row in the genotype matrix represents a genetic marker, GSC_FALSE if each column of the genotype matrix represents a genetic marker, or some other value (eg GSC_NA) to not bypass the orientation detection steps of the import process. |
cell_style | The style in which the alleles of a candidate at a marker are encoded in the body cells of the genotype matrix. Use GSC_GENOTYPECELLSTYLE_UNKNOWN to not bypass the cell style detection step of the import process. |
Definition at line 7051 of file sim-operations.c.
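The parameters above are three-state hints: a definite true or false bypasses a detection step, and any other value leaves that step to run. The sketch below illustrates this idea in isolation. DemoLogic, demo_resolve and the counting detector are hypothetical names. The numeric values chosen for the enum are assumptions and do not reflect how GSC_LOGICVAL is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-state value standing in for GSC_LOGICVAL. */
typedef enum { DEMO_FALSE = 0, DEMO_TRUE = 1, DEMO_NA = -1 } DemoLogic;

/* A detection step: inspects some context and returns nonzero for "yes". */
typedef int (*DemoDetector)(void* ctx);

/* Resolves one format property. A definite hint is trusted and the detector
 * is never called; any other hint value falls back to running the detector. */
int demo_resolve(DemoLogic hint, DemoDetector detect, void* ctx) {
    assert(detect != NULL);
    if (hint == DEMO_TRUE) { return 1; }
    if (hint == DEMO_FALSE) { return 0; }
    return detect(ctx) != 0;
}

/* Example detector that always answers "yes" and counts its invocations,
 * so that a caller can observe whether detection was bypassed. */
int demo_counting_detector(void* ctx) {
    int* calls = ctx;
    ++*calls;
    return 1;
}
```

Passing DEMO_FALSE returns 0 without touching the call counter, while passing DEMO_NA runs the detector once. This is the saving the hints are meant to provide for large files, where each detection step may be costly.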
enum gsc_TableFileCurrentStatus gsc_helper_tablefilereader_classify_char | ( | gsc_TableFileReader * | tbl | ) |
Classify the character under the cursor of a TableFileReader as cell contents or otherwise.
Does not update tbl->cursor, so repeated calls of this same function without updating the cursor in between will return the same result.
Definition at line 5167 of file sim-operations.c.
void gsc_helper_tablefilereader_refill_buffer | ( | gsc_TableFileReader * | tbl | ) |
Read another buffer's worth of characters from a gsc_TableFileReader's file.
Definition at line 5153 of file sim-operations.c.
void gsc_tablefilecell_deep_copy | ( | gsc_TableFileCell * | c | ) |
Allocate memory to store a deep copy of a gsc_TableFileCell, if previously only a shallow copy.
The deep copy will be a null-terminated string even if the shallow copy was not null-terminated.
After this call, the cell is stored in heap memory and will need to be freed once the cell is no longer needed. Schema for doing this: if (!mycell.isCellShallow) { GSC_FREE(mycell.cell); }
Definition at line 5196 of file sim-operations.c.
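The shallow-versus-deep distinction described above can be sketched with a small stand-in struct. DemoShallowCell and its cell_len field are assumptions made for this example. Only the cell and isCellShallow member names appear on this page, and the real struct may hold other fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a table cell. While shallow, `cell` points into
 * a buffer owned by someone else and need not be NUL-terminated. */
typedef struct {
    char* cell;
    size_t cell_len;   /* assumed field: number of characters in the cell */
    int isCellShallow; /* nonzero while `cell` is borrowed */
} DemoShallowCell;

/* Replaces a borrowed cell with an owned, NUL-terminated heap copy.
 * Does nothing if the cell is already deep, or if allocation fails. */
void demo_cell_deep_copy(DemoShallowCell* c) {
    assert(c != NULL);
    if (!c->isCellShallow) { return; }
    char* copy = malloc(c->cell_len + 1);
    if (copy == NULL) { return; } /* cell stays shallow */
    memcpy(copy, c->cell, c->cell_len);
    copy[c->cell_len] = '\0';
    c->cell = copy;
    c->isCellShallow = 0;
}

/* Frees the cell's memory only if the cell owns it. */
void demo_cell_release(DemoShallowCell* c) {
    assert(c != NULL);
    if (!c->isCellShallow) { free(c->cell); }
    c->cell = NULL;
    c->cell_len = 0;
    c->isCellShallow = 1; /* nothing owned any more */
}
```

After the deep copy, later changes to the original buffer (such as a refill) no longer affect the cell's contents, which is the reason to pay for the allocation when a cell must outlive the buffer it was read from.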
void gsc_tablefilereader_close | ( | gsc_TableFileReader * | tbl | ) |
Close a gsc_TableFileReader's file pointer.
Definition at line 5141 of file sim-operations.c.
gsc_TableFileReader gsc_tablefilereader_create | ( | const char * | filename | ) |
Open a file for reading with gsc_TableFileReader.
On successfully opening the file, fills the TableFileReader buffer for the first time.
Definition at line 5121 of file sim-operations.c.
gsc_TableFileCell gsc_tablefilereader_get_next_cell | ( | gsc_TableFileReader * | tbl | ) |
Read forwards in TableFileReader and return the next cell's contents, as well as how many column gaps and newlines preceded it.
Cells can be of unlimited length, as long as they fit in memory.
Definition at line 5211 of file sim-operations.c.
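The cell-by-cell reading described above can be sketched over an in-memory string. This simplified reader is not the library's implementation: it has no file handle or refillable buffer, the Demo names and fields are hypothetical, and the choice of tab and comma as the only column separators is an assumption. It shows a cursor classifier in the spirit of gsc_TableFileCurrentStatus and a reader that counts the separators it skips.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_NEWLINE, DEMO_COLUMNGAP, DEMO_CONTENTS, DEMO_END } DemoStatus;

typedef struct {
    const char* text; /* whole input, NUL-terminated */
    size_t cursor;    /* index of the next unread character */
} DemoReader;

typedef struct {
    const char* cell;    /* shallow view into the text; NULL at end of input */
    size_t cell_len;     /* characters in the cell (no NUL terminator) */
    int gaps_before;     /* column separators skipped before this cell */
    int newlines_before; /* newlines skipped before this cell */
} DemoTableCell;

/* Classifies the character under the cursor without moving the cursor. */
DemoStatus demo_classify_char(const DemoReader* r) {
    char c = r->text[r->cursor];
    if (c == '\0') { return DEMO_END; }
    if (c == '\n') { return DEMO_NEWLINE; }
    if (c == '\t' || c == ',') { return DEMO_COLUMNGAP; }
    return DEMO_CONTENTS;
}

/* Skips separators (counting them), then returns the next run of contents. */
DemoTableCell demo_next_cell(DemoReader* r) {
    DemoTableCell out = {NULL, 0, 0, 0};
    assert(r != NULL && r->text != NULL);
    for (;;) {
        DemoStatus s = demo_classify_char(r);
        if (s == DEMO_NEWLINE) { ++out.newlines_before; }
        else if (s == DEMO_COLUMNGAP) { ++out.gaps_before; }
        else { break; }
        ++r->cursor;
    }
    if (demo_classify_char(r) == DEMO_END) { return out; }
    out.cell = r->text + r->cursor;
    while (demo_classify_char(r) == DEMO_CONTENTS) {
        ++r->cursor;
        ++out.cell_len;
    }
    return out;
}
```

Because each returned cell reports the gaps and newlines that came before it, a caller can reconstruct row and column positions without the reader having to know the table's shape in advance.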
const gsc_GenOptions GSC_BASIC_OPT | extern |
Default parameter values for GenOptions, to help with quick scripts and prototypes.
Has short name: BASIC_OPT
Definition at line 10 of file sim-operations.c.