Concordia
ConcordiaIndex Class Reference

#include <concordia_index.hpp>

Public Member Functions

 ConcordiaIndex (const std::string &hashedIndexFilePath, const std::string &markersFilePath) throw (ConcordiaException)
 
virtual ~ConcordiaIndex ()
 
TokenizedSentence addExample (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, const Example &example)
 
void addTokenizedExample (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, const TokenizedSentence &tokenizedSentence, const SUFFIX_MARKER_TYPE id)
 
void addAllTokenizedExamples (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, const std::vector< TokenizedSentence > &tokenizedSentences, const std::vector< SUFFIX_MARKER_TYPE > &ids)
 
std::vector< TokenizedSentence > addAllExamples (boost::shared_ptr< HashGenerator > hashGenerator, boost::shared_ptr< std::vector< sauchar_t > > T, boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > > markers, const std::vector< Example > &examples)
 
boost::shared_ptr< std::vector< saidx_t > > generateSuffixArray (boost::shared_ptr< std::vector< sauchar_t > > T)
 

Detailed Description

Class for creating and maintaining the index. This class does not hold the index data structures itself; it only operates on them when they are passed to ConcordiaIndex methods through smart pointers. The class stores only the paths to two files, the hashed index and the markers array, which are on-disk backups of the respective data structures.
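
The ownership model described above can be illustrated with a small sketch. This is not the ConcordiaIndex implementation: PathOnlyIndex and appendCode are hypothetical names, std::shared_ptr stands in for boost::shared_ptr, and unsigned char stands in for sauchar_t.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class that keeps only file paths
// and never owns the index data it operates on.
class PathOnlyIndex {
public:
    PathOnlyIndex(std::string hashedIndexFilePath, std::string markersFilePath)
        : hashedIndexFilePath_(std::move(hashedIndexFilePath)),
          markersFilePath_(std::move(markersFilePath)) {}

    // Appends one code to the caller-owned vector; no copy is kept here.
    void appendCode(const std::shared_ptr<std::vector<unsigned char>>& T,
                    unsigned char code) const {
        T->push_back(code);
    }

    const std::string& hashedIndexFilePath() const { return hashedIndexFilePath_; }
    const std::string& markersFilePath() const { return markersFilePath_; }

private:
    std::string hashedIndexFilePath_;  // path to the on-disk hashed index
    std::string markersFilePath_;      // path to the on-disk markers array
};
```

Because the object holds no reference to the data, the caller's vector outlives the index object and remains the only owner of the in-memory structures.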

Constructor & Destructor Documentation

ConcordiaIndex::ConcordiaIndex ( const std::string &  hashedIndexFilePath,
const std::string &  markersFilePath 
) throw (ConcordiaException)
explicit

Constructor.

Parameters
hashedIndexFilePath    path to the hashed index file
markersFilePath    path to the markers array file
Exceptions
ConcordiaException

ConcordiaIndex::~ConcordiaIndex ( )
virtual

Destructor.

Member Function Documentation

std::vector< TokenizedSentence > ConcordiaIndex::addAllExamples ( boost::shared_ptr< HashGenerator > hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
const std::vector< Example > &  examples 
)

Adds multiple examples to the index. The examples are first hashed using the hash generator passed to this method. The hashed examples are then appended to the hashed index and the markers array (both also passed to this method). At the same time, the same examples are appended to the on-disk versions of these two data structures. The method returns a vector of tokenized examples.

Parameters
hashGenerator    hash generator used to prepare the hashes of the examples
T    RAM-based hashed index to be appended to
markers    RAM-based markers array to be appended to
examples    vector of examples to be added to the index
Returns
vector of tokenized examples
Exceptions
ConcordiaException

void ConcordiaIndex::addAllTokenizedExamples ( boost::shared_ptr< HashGenerator > hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
const std::vector< TokenizedSentence > &  tokenizedSentences,
const std::vector< SUFFIX_MARKER_TYPE > &  ids 
)

Adds multiple tokenized examples to the index. The examples are appended to the hashed index and the markers array. At the same time, the same examples are appended to the on-disk versions of these two data structures.

Parameters
hashGenerator    hash generator used to prepare the hashes of the examples
T    RAM-based hashed index to be appended to
markers    RAM-based markers array to be appended to
tokenizedSentences    vector of tokenized sentences to be added
ids    vector of ids of the sentences to be added
Exceptions
ConcordiaException

TokenizedSentence ConcordiaIndex::addExample ( boost::shared_ptr< HashGenerator > hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
const Example &  example 
)

Adds an Example to the index. The example is first hashed using the hash generator passed to this method. The hashed example is then appended to the hashed index and the markers array (both also passed to this method). At the same time, the same example is appended to the on-disk versions of these two data structures. The method returns a tokenized version of the example.

Parameters
hashGenerator    hash generator used to prepare the hash of the example
T    RAM-based hashed index to be appended to
markers    RAM-based markers array to be appended to
example    example to be added to the index
Returns
tokenized example
Exceptions
ConcordiaException
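
The flow described for addExample (hash, append to the in-memory structures, mirror the append to the backups) might look roughly like the sketch below. Everything in it is an assumption made for illustration: ToyHashGenerator, addExampleSketch, the Code and Marker types, the marker layout (sentence id in the high bits, token offset in the low 8 bits), and the use of output streams in place of the two backup files. std::shared_ptr stands in for boost::shared_ptr, and the returned code vector stands in for TokenizedSentence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Code = std::uint32_t;    // stand-in for a hashed word code
using Marker = std::uint32_t;  // stand-in for SUFFIX_MARKER_TYPE

// Hypothetical hash generator: assigns each new word the next free code.
class ToyHashGenerator {
public:
    std::vector<Code> hash(const std::string& sentence) {
        std::vector<Code> out;
        std::istringstream in(sentence);
        std::string word;
        while (in >> word) {
            auto it = codes_.find(word);
            if (it == codes_.end()) {
                it = codes_.emplace(word, static_cast<Code>(codes_.size())).first;
            }
            out.push_back(it->second);
        }
        return out;
    }

private:
    std::map<std::string, Code> codes_;
};

// Appends one sentence to the in-memory structures and to both backups.
std::vector<Code> addExampleSketch(ToyHashGenerator& gen,
                                   const std::shared_ptr<std::vector<Code>>& T,
                                   const std::shared_ptr<std::vector<Marker>>& markers,
                                   std::ostream& indexBackup,
                                   std::ostream& markersBackup,
                                   const std::string& sentence,
                                   Marker id) {
    const std::vector<Code> codes = gen.hash(sentence);
    for (std::size_t offset = 0; offset < codes.size(); ++offset) {
        // Assumed layout: id in the high bits, offset in the low 8 bits.
        const Marker marker = (id << 8) | static_cast<Marker>(offset);
        T->push_back(codes[offset]);
        markers->push_back(marker);
        // Mirror the same append to the backup streams.
        indexBackup.write(reinterpret_cast<const char*>(&codes[offset]), sizeof(Code));
        markersBackup.write(reinterpret_cast<const char*>(&marker), sizeof(Marker));
    }
    return codes;
}
```

Writing to the backups inside the same loop keeps the in-memory and on-disk copies the same length after each call, which is the property the description above relies on.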
void ConcordiaIndex::addTokenizedExample ( boost::shared_ptr< HashGenerator > hashGenerator,
boost::shared_ptr< std::vector< sauchar_t > >  T,
boost::shared_ptr< std::vector< SUFFIX_MARKER_TYPE > >  markers,
const TokenizedSentence &  tokenizedSentence,
const SUFFIX_MARKER_TYPE  id 
)

Adds a tokenized example to the index. The example is appended to the hashed index and the markers array. At the same time, the same example is appended to the on-disk versions of these two data structures.

Parameters
hashGenerator    hash generator used to prepare the hash of the example
T    RAM-based hashed index to be appended to
markers    RAM-based markers array to be appended to
tokenizedSentence    tokenized sentence to be added
id    id of the sentence to be added
Exceptions
ConcordiaException

boost::shared_ptr< std::vector< saidx_t > > ConcordiaIndex::generateSuffixArray ( boost::shared_ptr< std::vector< sauchar_t > >  T)

Generates a suffix array from the hashed index passed to this method.

Returns
the generated suffix array
Exceptions
ConcordiaException
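
For readers unfamiliar with suffix arrays, the sketch below builds one with a naive comparison sort. It only illustrates what the returned structure contains and makes no claim about how generateSuffixArray is implemented; the sauchar_t and saidx_t type names suggest libdivsufsort, but the sketch does not call that library. naiveSuffixArray is a hypothetical name, and unsigned char, int and std::shared_ptr stand in for sauchar_t, saidx_t and boost::shared_ptr.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <numeric>
#include <vector>

// Returns the start positions of all suffixes of T in lexicographic order.
// Naive approach: sort the positions by comparing the suffixes directly.
std::shared_ptr<std::vector<int>> naiveSuffixArray(
        const std::shared_ptr<std::vector<unsigned char>>& T) {
    auto sa = std::make_shared<std::vector<int>>(T->size());
    std::iota(sa->begin(), sa->end(), 0);  // positions 0, 1, ..., n-1
    std::sort(sa->begin(), sa->end(), [&T](int a, int b) {
        return std::lexicographical_compare(T->begin() + a, T->end(),
                                            T->begin() + b, T->end());
    });
    return sa;
}
```

For the byte sequence "banana" this yields the positions 5, 3, 1, 0, 4, 2, corresponding to the sorted suffixes "a", "ana", "anana", "banana", "na", "nana". The naive sort is far slower than a dedicated construction algorithm on large inputs and is shown only for clarity.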

The documentation for this class was generated from the following files: