
NAME

Sanger::CGP::TraFiC::Filter

SYNOPSIS

 use Sanger::CGP::TraFiC::Filter;
 my $filter = Sanger::CGP::TraFiC::Filter->new;
 $filter->set_output($output_dir);
 $filter->set_clusters($sample_clusters);

 $filter->add_filter_file($filter_on_this);
 $filter->add_filter_file($filter_on_this_too);
 ...
 # OR
 $filter->add_filter_file(\@list_of_filter_files);

 $filter->filter;

GENERAL

Provides bulk filtering of a set of clusters. This allows further, arbitrary processing of the data after the main processing has completed.

METHODS

User functions

set_output

Set the location that output files are written to.

set_clusters

Specify the file that contains the clusters of interest.

add_filter_file

Add files that are used to filter the content of set_clusters. Accepts either a simple scalar for one file or an array reference for several files.

It is recommended that the first file added is the matched normal sample, as this will remove the most noise from the data.
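
The scalar-or-array-reference calling convention can be sketched as follows. This is a minimal, self-contained illustration and not the module's implementation: the file-level @filter_files list, the stand-alone add_filter_file function and the file names are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object's internal list of filter files.
my @filter_files;

# Accept either one file name (plain scalar) or several (array reference).
sub add_filter_file {
    my ($files) = @_;
    if (ref($files) eq 'ARRAY') {
        push @filter_files, @{$files};   # several files, order preserved
    }
    else {
        push @filter_files, $files;      # a single file
    }
    return scalar @filter_files;         # number of files queued so far
}

# Illustrative file names only; the matched normal is added first.
add_filter_file('matched_normal.clusters');
add_filter_file(['unmatched_a.clusters', 'unmatched_b.clusters']);

print "$_\n" for @filter_files;
```

Keeping the files in the order they were added is what makes the "matched normal first" recommendation meaningful in this sketch.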

filter

Filter the content of set_clusters against each of the files supplied via add_filter_file. A filter file is only sorted and loaded if data remains in the data set loaded from set_clusters.
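
The early-exit behaviour can be pictured with the sketch below. It assumes, purely for illustration, that clusters and filter entries are simple string keys held in memory and that a cluster is dropped when the same key appears in a filter set. The real matching rules and file formats are not described here, so every name and value in the example is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical in-memory data: cluster keys of interest ...
my %clusters = map { $_ => 1 } qw(chr1:100 chr1:500 chr2:250);

# ... and filter sets, in the order they were added (normal first).
my @filter_sets = (
    [qw(chr1:100 chr2:250)],
    [qw(chr1:500)],
    [qw(chr3:900)],   # never consulted: nothing is left by this point
);

my $sets_used = 0;
for my $set (@filter_sets) {
    last unless %clusters;          # stop once no clusters remain
    $sets_used++;
    delete @clusters{ @{$set} };    # drop every cluster seen in this set
}

printf "%d cluster(s) remain after %d filter set(s)\n",
    scalar(keys %clusters), $sets_used;
```

Checking for remaining data before touching each filter set avoids the cost of sorting and loading files that can no longer change the result.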

output

Write out the filtered data. The data is sorted as part of this step.

set_min_reads

Set the minimum number of reads that must support a filtering record for it to be used.

Defaults to 5.
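
As a rough illustration of the threshold, the sketch below keeps only filtering records with enough supporting reads. The record layout (hash references with id and reads keys) is hypothetical, and treating the minimum as inclusive is an assumption based on the wording above.

```perl
use strict;
use warnings;

my $min_reads = 5;   # documented default for set_min_reads

# Hypothetical filter records; the real record layout is not shown here.
my @filter_records = (
    { id => 'f1', reads => 12 },
    { id => 'f2', reads => 5  },
    { id => 'f3', reads => 4  },
);

# Keep only records supported by at least $min_reads reads (assumed inclusive).
my @usable = grep { $_->{reads} >= $min_reads } @filter_records;

print join(',', map { $_->{id} } @usable), "\n";
```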

Internal functions

_load_clusters

Sort and then load cluster information into a data structure.

The sort is always invoked to ensure that the data is consistent. Most of the input should be relatively small, so this adds little overhead.
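
A sort-then-load step of this kind might look like the sketch below. It is not the module's code: the tab-separated record layout (chromosome, start, end), the in-memory sort and the hash-of-arrays structure are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical tab-separated records: chromosome, start, end.
my @lines = (
    "chr2\t300\t400",
    "chr1\t900\t950",
    "chr1\t100\t180",
);

# Always sort first (chromosome, then numeric start) so that the
# loaded structure is consistent whatever order the input arrived in.
my @sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] or $a->[2] <=> $b->[2] }
    map  { my ($chr, $start) = split /\t/, $_; [ $_, $chr, $start ] }
    @lines;

# Load into a hash of arrays keyed by chromosome.
my %clusters;
for my $line (@sorted) {
    my ($chr, $start, $end) = split /\t/, $line;
    push @{ $clusters{$chr} }, [ $start, $end ];
}

print scalar(@{ $clusters{chr1} }), " cluster(s) loaded for chr1\n";
```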
