FastqPuri
Classes | Typedefs | Functions
tree.h File Reference

Construction of tree, check paths, write tree, read in tree. More...

#include <stdint.h>
#include "defines.h"
#include "fa_read.h"
Include dependency graph for tree.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Classes

struct  _node
 Node structure: formed out of T_ACGT pointers to Node structure. More...
 
struct  _tree
 structure containing a T_ACGT-tree. More...
 

Typedefs

typedef struct _node Node
 Node structure: formed out of T_ACGT pointers to Node structure.
 
typedef struct _tree Tree
 structure containing a T_ACGT-tree. More...
 

Functions

Nodeget_new_pool (Tree *tree_ptr)
 reallocs pool_2D (++NPOOL_2D) if all existing nodes have been used More...
 
Nodenew_node_buf (Tree *tree_ptr)
 moves to the next node (allocating new memory if necessary) More...
 
void free_all_nodes (Tree *tree_ptr)
 frees the whole tree structure More...
 
void insert_Lmer (Tree *tree_ptr, char *Lmer)
 Lmer insertion in the tree (depth L).
 
void insert_entry (Tree *tree_ptr, Fa_entry *entry)
 fasta entry insertion in the tree (depth L).
 
double check_path (Tree *tree_ptr, char *read, int Lread)
 checks if read is found in tree and outputs a score More...
 
Treetree_from_fasta (Fa_data *fasta, int L)
 create Tree structure from fasta structure. More...
 
void save_tree (Tree *tree_ptr, char *filename)
 saves Tree to disk in filename More...
 
Treeread_tree (char *filename)
 read tree from file More...
 

Detailed Description

Construction of tree, check paths, write tree, read in tree.

Author
Paula Perez paula.nosp@m.pere.nosp@m.zrubi.nosp@m.o@gm.nosp@m.ail.c.nosp@m.om
Date
18.08.2017

Typedef Documentation

◆ Tree

typedef struct _tree Tree

structure containing a T_ACGT-tree.

The tree structure is stored in a pointer to pointer to Node. We grow the structure on the flight as we need more memory. In the outer direction, we start by allocating NPOOL_2D pointers to Node. In the inner direction, we allocate NPOOL_1D Nodes and fill them as we read the fasta file. When all of them are allocated, we allocate again NPOOL_1D. If NPOOL_2D pointers to Node are allocated, the outer dimension is reallocated with +NPOOL_2D extra elements. L is the depth of the tree, pool_count is the number on Node* elements used so far, pool_available is the number of Nodes available in every moment, and nnodes is the total number of nodes filled in. We limit the number of allocated nodes to UINT_MAX (we cannot count more nodes!).

Function Documentation

◆ check_path()

double check_path ( Tree tree_ptr,
char *  read,
int  Lread 
)

checks if read is found in tree and outputs a score

Parameters
tree_ptrpointer to Tree structure
readRead or reverse complement
Lreadlength of read
Returns
score = (number of Lmers of reads found in read) / (Lread-L+1)

◆ free_all_nodes()

void free_all_nodes ( Tree tree_ptr)

frees the whole tree structure

Parameters
tree_ptrpointer to Tree structure

This function deallocates the memory allocated in a Tree structure.

◆ get_new_pool()

Node* get_new_pool ( Tree tree_ptr)

reallocs pool_2D (++NPOOL_2D) if all existing nodes have been used

Parameters
tree_ptrpointer to Tree structure

◆ new_node_buf()

Node* new_node_buf ( Tree tree_ptr)

moves to the next node (allocating new memory if necessary)

Parameters
tree_ptrpointer to Tree structure
Returns
address to next node

The function checks if there are available nodes (information stored in the variable tree_ptr -> pool_available) and goes to the next node. If there is no nodes left, it allocates a new pool_1D, and if there is no room left in the outter dimension, it reallocates NPOOL_2D more Node*'s. If the number of nodes reaches UINT_MAX, the program returns an error message and exits.

◆ read_tree()

Tree* read_tree ( char *  filename)

read tree from file

Parameters
filenamestring with the filename
Returns
pointer to Tree structure

This function unwinds the process carried out in save_tree and assigns addresses to the children of every given node.

◆ save_tree()

void save_tree ( Tree tree_ptr,
char *  filename 
)

saves Tree to disk in filename

Parameters
tree_ptrpointer to Tree structure
filenamestring containing filename

The tree structure is stored as follows: every address is stored in a uint32_t (we are not allowing trees with more than UINT_MAX nodes). For every node, the addresses of the children are stored in the following fashion:

  • If it is pointing to NULL: 0.
  • Otherwise: i2, the index in the outer dimension of pool_2D is identified, and the difference jump = pool_2D[i][j].children[k] - pool_2D[i2] is computed. i2*NPOOL_D1 + jump is then stored for child k.

◆ tree_from_fasta()

Tree* tree_from_fasta ( Fa_data fasta,
int  L 
)

create Tree structure from fasta structure.

Parameters
fastapointer to fasta structure
Ltree length