FastqPuri
trim.h File Reference

Trims/filters sequences based on quality, N's, and contaminations.

#include <stdio.h>
#include "Lmer.h"
#include "fq_read.h"
#include "defines.h"
#include "tree.h"
#include "bloom.h"
#include "adapters.h"

Functions

int trim_adapter (Fq_read *seq, Ad_seq *adap_list)
 trims sequence based on the presence of adapter sequences
 
int trim_sequenceN (Fq_read *seq)
 trims sequence based on presence of N nucleotides
 
int trim_sequenceQ (Fq_read *seq)
 trims sequence based on lowQ base calls
 
bool is_read_inTree (Tree *tree_ptr, Fq_read *seq)
 checks if a read is contained in the tree. It computes the score for the read and its reverse complement; if one of them exceeds the user-selected threshold, it returns true. Otherwise, it returns false.
 
bool is_read_inBloom (Bfilter *tree_ptr, Fq_read *seq, Bfkmer *ptr_Bfkmer)
 checks if a read is in the Bloom filter. It computes the score for the read and returns true if it exceeds the user-selected threshold. Returns false otherwise.
 
int Qtrim_global (Fq_read *seq, int left, int right, char type)
 trims left nucleotides from the left end and right nucleotides from the right end
 

Detailed Description

Trims/filters sequences based on quality, N's, and contaminations.

Author
Paula Perez paulaperezrubio@gmail.com
Date
24.08.2017

Function Documentation

◆ is_read_inBloom()

bool is_read_inBloom ( Bfilter *ptr_bf,
Fq_read *seq,
Bfkmer *ptr_bfkmer
)

Checks if a read is in the Bloom filter. It computes the score for the read and returns true if it exceeds the user-selected threshold. Returns false otherwise.

Parameters
ptr_bf  pointer to Bfilter
seq  fastq read
ptr_bfkmer  pointer to Procs_kmer structure (will store global)
Returns
true if read was found, false otherwise

◆ is_read_inTree()

bool is_read_inTree ( Tree *tree_ptr,
Fq_read *seq
)

Checks if a read is contained in the tree. It computes the score for the read and its reverse complement; if one of them exceeds the user-selected threshold, it returns true. Otherwise, it returns false.

Parameters
tree_ptr  pointer to Tree structure
seq  fastq read
Returns
true if read was found, false otherwise
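
The "either strand" decision described above can be illustrated with a minimal sketch. It is not the FastqPuri implementation: the read is a plain C string rather than an Fq_read, and reverse_complement, score_fn, demo_prefix_score and exceeds_on_either_strand are hypothetical names introduced here. How the real score is computed from the tree is not documented in this file, so the scoring function is passed in as a callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the reverse complement of seq (length n)
 * into out, which must hold at least n + 1 bytes. Any character other
 * than upper-case A, C, G, T is mapped to 'N'. */
void reverse_complement(const char *seq, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++) {
        switch (seq[n - 1 - i]) {
        case 'A': out[i] = 'T'; break;
        case 'C': out[i] = 'G'; break;
        case 'G': out[i] = 'C'; break;
        case 'T': out[i] = 'A'; break;
        default:  out[i] = 'N'; break;
        }
    }
    out[n] = '\0';
}

/* Hypothetical scoring callback; stands in for the undocumented score. */
typedef double (*score_fn)(const char *seq, size_t n, void *ctx);

/* Toy scorer for demonstration only: 1.0 if seq starts with the string
 * passed in ctx, 0.0 otherwise. */
double demo_prefix_score(const char *seq, size_t n, void *ctx)
{
    const char *prefix = (const char *)ctx;
    size_t k = strlen(prefix);
    return (k <= n && strncmp(seq, prefix, k) == 0) ? 1.0 : 0.0;
}

/* Returns true if the score of the read or of its reverse complement
 * exceeds threshold. scratch must hold at least n + 1 bytes. */
bool exceeds_on_either_strand(const char *seq, size_t n, char *scratch,
                              score_fn score, void *ctx, double threshold)
{
    if (score(seq, n, ctx) > threshold)
        return true;
    reverse_complement(seq, n, scratch);
    return score(scratch, n, ctx) > threshold;
}
```

With the toy scorer and the prefix "CGTT", the read "AACG" is reported as found because its reverse complement is "CGTT", while "AAAA" is not.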

◆ Qtrim_global()

int Qtrim_global ( Fq_read *seq,
int  left,
int  right,
char  type 
)

Trims left nucleotides from the left end and right nucleotides from the right end.

Parameters
seq  fastq read
left  number of nucleotides to be trimmed from the left
right  number of nucleotides to be trimmed from the right
type  char indicating the type of trimming (Q,A).
Returns
2, since all reads are accepted and trimmed
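
As an illustration of the arithmetic behind a global trim, here is a minimal sketch on a plain C string. It assumes nothing about the layout of Fq_read; trim_both_ends is a hypothetical helper, and the -1 return for an invalid request is a choice made for this sketch, not documented behaviour of Qtrim_global.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes `left` characters from the start and
 * `right` characters from the end of a NUL-terminated string, in place.
 * Returns the new length, or -1 if the request is invalid (negative
 * counts, or more characters requested than the string holds). */
int trim_both_ends(char *s, int left, int right)
{
    int len = (int)strlen(s);
    if (left < 0 || right < 0 || left + right > len)
        return -1;
    int new_len = len - left - right;
    memmove(s, s + left, (size_t)new_len);  /* regions may overlap */
    s[new_len] = '\0';
    return new_len;
}
```

For a real read, the same offsets would have to be applied to the quality string as well, so that bases and qualities stay aligned.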

◆ trim_adapter()

int trim_adapter ( Fq_read *seq,
Ad_seq *adap_list
)

Trims sequence based on the presence of adapter sequences.

  • if (adapter length < 16) -> search for seeds 8 nucleotides long,
  • else -> search for seeds 16 nucleotides long,
  • if (seed found) -> calculate score; if score > threshold -> adapter found, trim / discard and exit,
  • else -> search for seeds 8 nucleotides long.

Parameters
seq  pointer to Fq_read
adap_list  array of Ad_seq
Returns
-1 error, 0 discarded, 1 accepted as is, 2 accepted and trimmed
Note
Global input parameters from par_TF are also used
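
The seed-length rule described above can be sketched as follows. This is a simplified illustration, not the library's code: seed_length and find_adapter_seed are hypothetical names, the seed is assumed to be the first k nucleotides of the adapter, and a plain linear exact-match scan stands in for whatever lookup the real implementation uses. Scoring and the fallback to shorter seeds are left out.

```c
#include <stddef.h>
#include <string.h>

/* Seed length as described: 8 for adapters shorter than 16 nucleotides,
 * 16 otherwise. */
size_t seed_length(size_t adapter_len)
{
    return adapter_len < 16 ? 8 : 16;
}

/* Hypothetical helper: returns the first position in `read` where the
 * leading seed of `adapter` occurs exactly, or -1 if it does not occur
 * (or if adapter or read is shorter than the seed). */
long find_adapter_seed(const char *read, const char *adapter)
{
    size_t alen = strlen(adapter);
    size_t rlen = strlen(read);
    size_t k = seed_length(alen);

    if (alen < k || rlen < k)
        return -1;
    for (size_t i = 0; i + k <= rlen; i++) {
        if (strncmp(read + i, adapter, k) == 0)
            return (long)i;
    }
    return -1;
}
```

A seed hit only marks a candidate position; according to the description, the alignment score at that position still has to exceed the threshold before the read is trimmed or discarded.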

◆ trim_sequenceN()

int trim_sequenceN ( Fq_read *seq)

trims sequence based on presence of N nucleotides

Parameters
seq  fastq read
Returns
-1 error, 0 discarded, 1 accepted as is, 2 accepted and trimmed

This function calls a different function depending on the method passed as input par_TF.trimN:

  • NO(0): accepts it as is (1),
  • ALL(1): accepts it as is if no N's are found (1), rejects it otherwise (0),
  • ENDS(2): trims the ends and accepts it if it is longer than minL (2 if trimming, 1 if no trimming), rejects it otherwise (0),
  • STRIP(3): finds the longest N-free subsequence and trims the read to it if it is at least minL nucleotides long (2 if trimming, 1 if no N's are found), rejects it otherwise (0).
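
To make the STRIP strategy concrete, here is a minimal sketch on a plain C string. It is an illustration under stated assumptions, not the FastqPuri code: longest_n_free and strip_n are hypothetical names, only upper-case 'N' is treated as an undetermined base, and an N-free read shorter than min_len is rejected (the description above does not say what happens in that case).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the longest run of characters without 'N'
 * and stores its start offset and length. Ties keep the leftmost run. */
void longest_n_free(const char *seq, size_t *start, size_t *len)
{
    size_t best_start = 0, best_len = 0, run_start = 0;

    for (size_t i = 0; ; i++) {
        if (seq[i] == 'N' || seq[i] == '\0') {
            if (i - run_start > best_len) {
                best_len = i - run_start;
                best_start = run_start;
            }
            if (seq[i] == '\0')
                break;
            run_start = i + 1;
        }
    }
    *start = best_start;
    *len = best_len;
}

/* Return codes follow the documented convention:
 * 0 rejected, 1 accepted as is, 2 accepted and trimmed. */
int strip_n(char *seq, size_t min_len)
{
    size_t total = strlen(seq), start, len;

    longest_n_free(seq, &start, &len);
    if (len < min_len)
        return 0;               /* assumption: short reads are rejected */
    if (len == total)
        return 1;               /* no N's found, nothing to trim */
    memmove(seq, seq + start, len);
    seq[len] = '\0';
    return 2;
}
```

For example, with min_len set to 5 the read "ACNGTTACGNA" is trimmed to "GTTACG", while "ANCNG" with min_len 2 is rejected because its longest N-free run has length 1.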

◆ trim_sequenceQ()

int trim_sequenceQ ( Fq_read *seq)

Trims sequence based on lowQ base calls.

Parameters
seq  fastq read
Returns
-1 error, 0 discarded, 1 accepted as is, 2 accepted and trimmed

This function calls a different function depending on the method passed as input par_TF.trimQ:

  • NO(0): accepts it as is (1),
  • FRAC(1): accepts it if fewer than par_TF.nlowQ lowQ nucleotides are found (1), rejects it otherwise (0),
  • ENDS(2): trims the ends and accepts it if it is longer than minL (2 if trimming, 1 if no trimming), rejects it otherwise (0),
  • ENDSFRAC(3): trims the ends and accepts it if the remaining sequence is at least minL bases long and contains fewer than nlowQ lowQ nucleotides (2 if trimming, 1 if no trimming), rejects it otherwise (0),
  • GLOBAL(4): trims globleft nucleotides from the left and globright nucleotides from the right (returns 2).
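
The FRAC and ENDS ideas can be sketched on plain strings as follows. This is a hedged illustration rather than the library's implementation: count_low_q and trim_q_ends are hypothetical names, the sequence and quality strings are assumed to have the same length, and a base counts as lowQ when its quality character compares below the threshold character min_q (which presumes a single ASCII-offset quality encoding). Low-quality bases in the interior of the read are left untouched by the ENDS sketch.

```c
#include <stddef.h>
#include <string.h>

/* FRAC-like building block: counts the bases whose quality character
 * is below min_q. */
size_t count_low_q(const char *qual, char min_q)
{
    size_t n = 0;
    for (; *qual != '\0'; qual++) {
        if (*qual < min_q)
            n++;
    }
    return n;
}

/* ENDS-like sketch: strips low-quality bases from both ends of seq and
 * qual, in place. Return codes follow the documented convention:
 * 0 rejected, 1 accepted as is, 2 accepted and trimmed. */
int trim_q_ends(char *seq, char *qual, char min_q, size_t min_len)
{
    size_t len = strlen(qual);
    size_t first = 0, last = len;

    while (first < len && qual[first] < min_q)
        first++;
    while (last > first && qual[last - 1] < min_q)
        last--;

    size_t kept = last - first;
    if (kept < min_len)
        return 0;
    if (kept == len)
        return 1;
    memmove(seq, seq + first, kept);
    seq[kept] = '\0';
    memmove(qual, qual + first, kept);
    qual[kept] = '\0';
    return 2;
}
```

An ENDSFRAC-style check could be built from the two pieces by calling trim_q_ends first and then comparing count_low_q on the trimmed quality string against the allowed number of lowQ bases.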