Concordia
IndexSearcher Class Reference

#include <index_searcher.hpp>

Public Member Functions

 IndexSearcher ()
 
virtual ~IndexSearcher ()
 
MatchedPatternFragment simpleSearch (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, boost::shared_ptr< std::vector< saidx_t > > SA, const std::string &pattern, bool byWhitespace=false) throw (ConcordiaException)
 
MatchedPatternFragment lexiconSearch (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, boost::shared_ptr< std::vector< saidx_t > > SA, const std::string &pattern, bool byWhitespace=false) throw (ConcordiaException)
 
std::vector< AnubisSearchResult > anubisSearch (boost::shared_ptr< ConcordiaConfig > config, boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, boost::shared_ptr< std::vector< saidx_t > > SA, const std::string &pattern) throw (ConcordiaException)
 
boost::shared_ptr< ConcordiaSearchResult > concordiaSearch (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, boost::shared_ptr< std::vector< saidx_t > > SA, const std::string &pattern, bool byWhitespace=false) throw (ConcordiaException)
 

Detailed Description

Class for searching the index with a sentence. In all searches the sentence is first hashed and then used as a query.

IndexSearcher performs the simpleSearch on its own, but uses a ConcordiaSearcher object to carry out concordiaSearch.
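
The hash-then-query flow described above can be sketched in isolation. The following is a minimal, self-contained illustration and not the library's implementation: ToyHasher, ToyIndex, buildIndex and simpleSearchSketch are hypothetical names, the word-to-integer coding merely stands in for HashGenerator, and the layout chosen for T, markers and SA (including the -1 sentence separator) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for HashGenerator: gives every distinct
// whitespace-separated word a small integer code on first sight.
class ToyHasher {
public:
    std::vector<int> hash(const std::string &sentence) {
        std::vector<int> codes;
        std::istringstream in(sentence);
        std::string word;
        while (in >> word) {
            auto it = codes_.find(word);
            if (it == codes_.end()) {
                const int next = static_cast<int>(codes_.size());
                it = codes_.emplace(word, next).first;
            }
            codes.push_back(it->second);
        }
        return codes;
    }
private:
    std::map<std::string, int> codes_;
};

// Toy counterparts of the T, markers and SA arguments (layout is assumed).
struct ToyIndex {
    std::vector<int> T;        // hashed sentences, each followed by -1
    std::vector<int> markers;  // markers[i] = id of the sentence holding T[i]
    std::vector<int> SA;       // suffix start positions in sorted suffix order
};

ToyIndex buildIndex(ToyHasher &hasher, const std::vector<std::string> &sentences) {
    ToyIndex idx;
    for (std::size_t id = 0; id < sentences.size(); ++id) {
        std::vector<int> codes = hasher.hash(sentences[id]);
        codes.push_back(-1);  // separator: no match can span two sentences
        for (int c : codes) {
            idx.T.push_back(c);
            idx.markers.push_back(static_cast<int>(id));
        }
    }
    idx.SA.resize(idx.T.size());
    for (std::size_t i = 0; i < idx.SA.size(); ++i) idx.SA[i] = static_cast<int>(i);
    // Naive construction: sort suffix start positions by suffix content.
    const std::vector<int> &T = idx.T;
    std::sort(idx.SA.begin(), idx.SA.end(), [&T](int a, int b) {
        return std::lexicographical_compare(T.begin() + a, T.end(), T.begin() + b, T.end());
    });
    return idx;
}

// Returns the sorted ids of sentences that contain the hashed pattern.
std::vector<int> simpleSearchSketch(ToyHasher &hasher, const ToyIndex &idx,
                                    const std::string &pattern) {
    assert(idx.SA.size() == idx.T.size() && idx.markers.size() == idx.T.size());
    const std::vector<int> p = hasher.hash(pattern);
    std::vector<int> ids;
    if (p.empty()) return ids;
    const std::vector<int> &T = idx.T;
    // End of the first p.size() symbols of the suffix starting at s.
    auto prefixEnd = [&T, &p](int s) {
        return T.begin() + std::min(T.size(), static_cast<std::size_t>(s) + p.size());
    };
    // Binary search for the block of suffixes that start with the pattern.
    auto lo = std::lower_bound(idx.SA.begin(), idx.SA.end(), p,
        [&](int s, const std::vector<int> &q) {
            return std::lexicographical_compare(T.begin() + s, prefixEnd(s), q.begin(), q.end());
        });
    auto hi = std::upper_bound(lo, idx.SA.end(), p,
        [&](const std::vector<int> &q, int s) {
            return std::lexicographical_compare(q.begin(), q.end(), T.begin() + s, prefixEnd(s));
        });
    for (auto it = lo; it != hi; ++it) ids.push_back(idx.markers[*it]);
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    return ids;
}
```

With an index built from the sentences "the cat sat", "a cat sat down" and "the dog", calling simpleSearchSketch with the pattern "cat sat" yields the sentence ids 0 and 1. The sketch sorts suffixes naively for brevity; a real index would be built with a dedicated suffix array construction routine.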

Constructor & Destructor Documentation

IndexSearcher::IndexSearcher ( )
explicit

Constructor.

IndexSearcher::~IndexSearcher ( )
virtual

Destructor.

Member Function Documentation

std::vector< AnubisSearchResult > IndexSearcher::anubisSearch ( boost::shared_ptr< ConcordiaConfig >  config,
boost::shared_ptr< HashGenerator >  hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
boost::shared_ptr< std::vector< saidx_t > >  SA,
const std::string &  pattern
) throw (ConcordiaException)
Deprecated:
Finds the examples in the index whose resemblance to the pattern is maximal. This method may be very slow; use concordiaSearch instead.

Parameters
config: Concordia config object (used to read the anubis threshold parameter)
hashGenerator: hash generator used to convert the input sentence to a hash
T: hashed index to search in
markers: markers array used for searching
SA: suffix array used for searching
pattern: string pattern to search for in the index
Returns
vector of results
Exceptions
ConcordiaException

boost::shared_ptr< ConcordiaSearchResult > IndexSearcher::concordiaSearch ( boost::shared_ptr< HashGenerator >  hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
boost::shared_ptr< std::vector< saidx_t > >  SA,
const std::string &  pattern,
bool  byWhitespace = false
) throw (ConcordiaException)

Performs a concordia lookup on the RAM-based index. This functionality is unique to the library and is designed to facilitate computer-aided translation. For more information, see Concordia searching.

Parameters
hashGenerator: hash generator used to convert the input sentence to a hash
T: hashed index to search in
markers: markers array used for searching
SA: suffix array used for searching
pattern: pattern to search for in the index
byWhitespace: whether to tokenize the pattern by white space
Returns
result of the search
Exceptions
ConcordiaException

MatchedPatternFragment IndexSearcher::lexiconSearch ( boost::shared_ptr< HashGenerator >  hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
boost::shared_ptr< std::vector< saidx_t > >  SA,
const std::string &  pattern,
bool  byWhitespace = false
) throw (ConcordiaException)

Performs a search intended for lexicons, in the scenario where Concordia is fed a lexicon (glossary) instead of a translation memory (TM). Like the simple search, the lexicon search requires the match to cover the whole pattern; in addition, it requires the match to be the whole example source.

Parameters
hashGenerator: hash generator used to convert the input sentence to a hash
T: hashed index to search in
markers: markers array used for searching
SA: suffix array used for searching
pattern: string pattern to search for in the index
byWhitespace: whether to tokenize the pattern by white space
Returns
matched pattern fragment, containing occurrences of the pattern in the index
Exceptions
ConcordiaException
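
The difference between the simple and lexicon match conditions can be shown on plain token sequences. This is only a sketch of the two conditions as described for lexiconSearch, not the library's code: toTokens, matchesSimple and matchesLexicon are hypothetical helpers, and whitespace tokenization stands in for the hashing step.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Splits on whitespace; a stand-in for the library's hashing step.
std::vector<std::string> toTokens(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) tokens.push_back(word);
    return tokens;
}

// Simple-search condition: the whole pattern occurs contiguously
// somewhere in the example source.
bool matchesSimple(const std::string &exampleSource, const std::string &pattern) {
    const std::vector<std::string> e = toTokens(exampleSource);
    const std::vector<std::string> p = toTokens(pattern);
    return !p.empty() && std::search(e.begin(), e.end(), p.begin(), p.end()) != e.end();
}

// Lexicon-search condition: the match covers the whole pattern and is
// also the whole example source, so the two token sequences are equal.
bool matchesLexicon(const std::string &exampleSource, const std::string &pattern) {
    const std::vector<std::string> e = toTokens(exampleSource);
    const std::vector<std::string> p = toTokens(pattern);
    return !p.empty() && e == p;
}
```

For a glossary entry "machine translation", the pattern "machine translation" satisfies both conditions, whereas against the longer entry "machine translation system" the same pattern satisfies only the simple condition.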

MatchedPatternFragment IndexSearcher::simpleSearch ( boost::shared_ptr< HashGenerator >  hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
boost::shared_ptr< std::vector< saidx_t > >  SA,
const std::string &  pattern,
bool  byWhitespace = false
) throw (ConcordiaException)

Performs a simple substring lookup in the RAM-based index. For more information, see Simple substring lookup.

Parameters
hashGenerator: hash generator used to convert the input sentence to a hash
T: hashed index to search in
markers: markers array used for searching
SA: suffix array used for searching
pattern: string pattern to search for in the index
byWhitespace: whether to tokenize the pattern by white space
Returns
matched pattern fragment, containing occurrences of the pattern in the index
Exceptions
ConcordiaException


