fr.gouv.culture.sdx.documentbase
Class LuceneDocumentBase

java.lang.Object
  extended by fr.gouv.culture.sdx.utils.AbstractSdxObject
      extended by fr.gouv.culture.sdx.utils.database.DatabaseBacked
          extended by fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
              extended by fr.gouv.culture.sdx.documentbase.SDXDocumentBase
                  extended by fr.gouv.culture.sdx.documentbase.LuceneDocumentBase
All Implemented Interfaces:
org.apache.avalon.framework.configuration.Configurable, org.apache.avalon.framework.context.Contextualizable, Describable, DocumentBase, Encodable, Identifiable, Localizable, org.apache.avalon.framework.logger.LogEnabled, Saveable, SDXDocumentBaseTarget, SdxObject, Searchable, org.apache.avalon.framework.service.Serviceable, Target, org.apache.excalibur.xml.sax.XMLizable
Direct Known Subclasses:
LuceneThesaurus

public class LuceneDocumentBase
extends SDXDocumentBase

A document base within an SDX application.

A document base is a very important object in SDX development. It is where documents are searched and retrieved, and therefore where they are added (indexed), deleted or updated. A search cannot target a unit smaller than a document base. To exclude some parts of a document base, use query constructions or possibly filters.

A document base has a structure; this structure is basically a list of fields. An application may have many document bases, and these document bases may have different structures. As always, indexable documents (XML, HTML or the like) with different structures can be indexed within a single document base.

Most applications will have only one document base, but in some cases it can be useful to have more than one, for example when different kinds of documents are never searched at the same time. In that case, separating them into different document bases speeds up both searching and indexing.

A document base uses an indexer to index documents. It uses repositories to store the documents, either indexable ones or attached ones (i.e. non-indexable documents that are logically dependent on the indexable documents, such as images). An application can get a searcher to perform searches within this document base, possibly together with other document bases.

In order to work properly, a document base must be instantiated in the following sequence: 1) creation, 2) setting the logger (optional, but recommended for error messages), 3) configuration, 4) initialization.
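
The ordering described above can be illustrated with a small stand-in class. This is only a sketch: LifecycleSketch and its state names are hypothetical and are not part of SDX. The real calls are enableLogging(Logger), configure(Configuration) and init(), which take Avalon objects and may throw exceptions that are omitted here.

```java
/**
 * Hypothetical stand-in that mimics the documented lifecycle order:
 * creation, optional logger, configuration, initialization.
 * It is NOT the SDX class; the method names only mirror the real ones.
 */
class LifecycleSketch {
    private enum State { CREATED, CONFIGURED, INITIALIZED }

    private State state = State.CREATED; // step 1: creation
    private boolean loggerSet = false;

    /** Step 2 (optional): mirrors enableLogging(Logger). */
    void enableLogging() {
        loggerSet = true;
    }

    /** Step 3: mirrors configure(Configuration); only valid right after creation. */
    void configure() {
        if (state != State.CREATED) {
            throw new IllegalStateException("configure() must follow creation");
        }
        state = State.CONFIGURED;
    }

    /** Step 4: mirrors init(); only valid once configure() has run. */
    void init() {
        if (state != State.CONFIGURED) {
            throw new IllegalStateException("init() must follow configure()");
        }
        state = State.INITIALIZED;
    }

    boolean isReady() {
        return state == State.INITIALIZED;
    }

    boolean hasLogger() {
        return loggerSet;
    }

    public static void main(String[] args) {
        LifecycleSketch base = new LifecycleSketch();
        base.enableLogging(); // optional, but useful for error messages
        base.configure();
        base.init();
        System.out.println("ready: " + base.isReady());
    }
}
```

In this sketch, calling init() on a freshly created instance fails fast with an IllegalStateException, which is one way to make an out-of-order call visible early.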

See Also:
AbstractSdxObject.enableLogging(org.apache.avalon.framework.logger.Logger), configure(org.apache.avalon.framework.configuration.Configuration), init()

Nested Class Summary
 
Nested classes inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBaseTarget
SDXDocumentBaseTarget.ConfigurationNode
 
Nested classes inherited from class fr.gouv.culture.sdx.documentbase.DocumentBase
DocumentBase.ConfigurationNode
 
Field Summary
protected  FieldList _fieldList
          The (Lucene) fields that are to be handled by the index.
protected  java.util.HashMap _xmlFieldList
          The list of fields with an XML type
static java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
          The implied attribute stating whether this document base is to be exposed to remote access or not.
static java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
          The element used to define system fields in sdx.xconf.
protected  java.lang.String INDEX_DIR_CURRENT
          Directory names for indexes
protected  java.lang.String INDEX_DIR_MAIN
           
protected  long lastDocCount
          Number of documents indexed since the last split
protected  LuceneIndex luceneActiveIndex
          The active index for this document base
protected  LuceneIndex luceneCurrentIndex
          The temporary index for this document base
protected  java.util.Vector luceneSearchIndexList
          The sub-indexes for this document base (first entry is the activeIndex)
protected  java.lang.String SEARCH_INDEX_DIRECTORY_NAME
          The directory name for the index that stores documents' indexation.
protected  int subIndexCount
          Number of subindexes
 
Fields inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase
_documentAdditionStatus, _isIndexOptimized, autoOptimize, baseIndexDir, DOC_ADD_STATUS_ADDED, DOC_ADD_STATUS_FAILURE, DOC_ADD_STATUS_IGNORED, DOC_ADD_STATUS_REFRESHED, DOC_ADD_STATUS_REPLACED, DOC_URL, ELEMENT_NAME_DEFAULT_HPP, ELEMENT_NAME_DEFAULT_MAXSORT, keepOriginalDocuments, scheduler, SDX_DATABASE_FORMAT, SDX_DATABASE_VERSION, SDX_DATABASE_VERSION_2_3, SDX_DATE, SDX_DATE_MILLISECONDS, SDX_ISO8601_DATE, SDX_USER, splitActive, splitDoc, splitSize, splitUnit, useCompoundFiles
 
Fields inherited from class fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
_indexationPipeline, _oaiHarv, ATTRIBUTE_AUTO_OPTIMIZE, ATTRIBUTE_COMPOUND_FILES, ATTRIBUTE_SPLIT_DOC, ATTRIBUTE_SPLIT_SIZE, ATTRIBUTE_SPLIT_UNIT, DBELEM_ATTRIBUTE_DEFAULT, DBELEM_ATTRIBUTE_HPP, DBELEM_ATTRIBUTE_KEEP_ORIGINAL, DBELEM_ATTRIBUTE_MAXSORT, defaultHitsPerPage, defaultMaxSort, defaultRepository, ELEMENT_NAME_INDEX_SPLIT, ELEMENT_NAME_OPTIMIZE, INTERNAL_FIELD_NAME_SDX_OAI_DELETED_RECORD, INTERNAL_FIELD_NAME_SDXALL, INTERNAL_FIELD_NAME_SDXAPPID, INTERNAL_FIELD_NAME_SDXCONTENTLENGTH, INTERNAL_FIELD_NAME_SDXDBID, INTERNAL_FIELD_NAME_SDXDOCID, INTERNAL_FIELD_NAME_SDXDOCTYPE, INTERNAL_FIELD_NAME_SDXMODDATE, INTERNAL_FIELD_NAME_SDXREPOID, INTERNAL_SDXALL_FIELD_VALUE, isDefault, locale, oaiRepo, PROPERTY_NAME_ATTACHED, PROPERTY_NAME_CONTENT_LENGTH, PROPERTY_NAME_DOCTYPE, PROPERTY_NAME_MIMETYPE, PROPERTY_NAME_ORIGINAL, PROPERTY_NAME_PARENT, PROPERTY_NAME_REPO, PROPERTY_NAME_SUB, repoConnectionPool, repositories, useMetadata
 
Fields inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked
_database, CLASS_NAME_SUFFIX, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE
 
Fields inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject
_configuration, _context, _description, _encoding, _id, _locale, _logger, _manager, _xmlizable_objects, _xmlLang, isToSaxInitialized
 
Fields inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase
CLASS_NAME_SUFFIX, PACKAGE_QUALNAME
 
Fields inherited from interface fr.gouv.culture.sdx.utils.Encodable
DEFAULT_ENCODING
 
Fields inherited from interface fr.gouv.culture.sdx.utils.save.Saveable
ALL_SAVE_ATTRIB, PATH_ATTRIB, SAVE_DIRECTORY_PARAM
 
Constructor Summary
LuceneDocumentBase()
          Creates the document base.
 
Method Summary
protected  void addSubIndex()
          Adds a split sub-index and updates the configuration afterwards
protected  void addSubIndex(LuceneIndex index)
          Adds a split sub-index and updates the configuration afterwards
protected  void addToSearchIndex(java.lang.Object indexationDoc, boolean batchIndex)
          Writes a document to the search index
 void backup(SaveParameters save_config)
          Save the DocumentBase data objects
protected  void backupIndexes(SaveParameters save_config)
          Saves the index files
protected  void backupTimeStamp(SaveParameters save_config)
          Save the timestamp files
protected  void compactSearchIndex()
           
 void configure(org.apache.avalon.framework.configuration.Configuration configuration)
          Sets the configuration options for this document base.
protected  void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
           
protected  void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
           
protected  void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
           
protected  void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
           
protected  void configureSearchIndex()
           
 java.util.Date creationDate()
           
 void delete(Document[] docs, org.xml.sax.ContentHandler handler)
          Overrides the parent method only to add Lucene index optimization
protected  void deleteFromSearchIndex(java.lang.String docId)
           
 int docCount()
          TODO: this count needs to be periodically written to a .properties file. TODO: we need a configurable generic mechanism to save such information (certain queries, terms, etc.) to a .properties file, which should be updated after indexation/deletion.
protected  java.lang.String getFormatedSubIndexId(int subIndexNumber)
          Gets the formatted sub-index number (used for directory names)
 Index getIndex()
          Gets the Index object for indexing and searching.
protected  java.lang.Object getIndexationDocument(IndexableDocument doc, java.lang.String storeDocId, java.lang.String repoId, IndexParameters params)
           
 org.apache.lucene.index.IndexReader getIndexReader()
          Returns the index reader for all of this document base's indexes
protected  long getIndexSize(LuceneIndex index)
          Returns the index size
 LuceneIndex getLuceneIndex()
           
 org.apache.lucene.search.Searcher getSearcher()
          Returns the index searcher for all of this document base's indexes
 java.util.HashMap getXMLFieldList()
          Returns the list of XML type fields
 void index(IndexableDocument[] docs, Repository repository, IndexParameters params, org.xml.sax.ContentHandler handler)
          Adds some documents.
 void indexModified()
          Modifies the last modification timestamp file
 void init()
          Initializes the document base.
protected  void initializeVectorizedIndex()
          Initializes the index vector by searching for all sub-indexes in its directory. NB: working as intended.
protected  boolean initToSax()
          Initializes the LinkedHashMap _xmlizable_objects with the objects, in order to describe them in XML
protected  void initVolatileObjectsToSax()
          Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects, in order to describe them in XML. Some objects need to be refreshed each time toSAX is called.
 java.util.Date lastModificationDate()
           
 void mergeBatch()
          Merges a batch of documents (in memory) into the physical index on the file system.
 void mergeCurrentBatch()
          Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base)
 void optimize()
          Optimizes the indexes, repositories and system databases
 void reloadFieldList(java.lang.String appConfString)
          Reload the fieldList of an application
protected  void removeSubIndex()
          Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to do so; kept only as a reminder for future functionality.
protected  void renewKeyIndex()
          Refreshes data for the main and current indexes
 void replaceFieldList(FieldList fieldList)
          Replace the current fieldList by the new one
 void restore(SaveParameters save_config)
          Restore the DocumentBase data objects
protected  void restoreIndexes(SaveParameters save_config)
          Restores the index files
protected  void restoreTimeStamp(SaveParameters save_config)
          Restore the timestamp files
protected  IndexParameters setBaseParameters(IndexParameters params)
          Sets the default pipeline parameters and ensures the params have a pipeline
protected  void setSearchIndexParameters(LuceneIndexParameters params)
          Sets the search index parameters for indexation performance
 boolean splitCheck(boolean currentIndex)
          Returns true when the splitting conditions are reached; if true, the call should be followed by a splitIndex() call
 void splitIndex(boolean currentIndex)
          Splits the current large index into two smaller ones
 
Methods inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase
add, checkIntegrity, configureBase, configureIdGenerator, configureOAIComponents, configureOptimizeTriggers, configureRepositories, configureSplit, delete, deleteIndexableDocumentComponents, deleteRelationsToMastersFromDatabase, getByteSplitSize, getDocument, getDocument, getDocument, getDocument, getOwners, getRelated, getRepositoryConfigurationList, getRepositoryForDocument, getRepositoryForStorage, getSplitDoc, getSplitSize, getSplitUnit, getUseCompoundFiles, handleParameters, index, index, isAutoOptimized, isIndexOptimized, rollbackIndexation, targetTriggered
 
Methods inherited from class fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
addOaiDeletedRecord, configurePipeline, createEntityForDocMetaData, delete, deletePhysicalDocument, getDefaultHitsPerPage, getDefaultMaxSort, getDefaultRepository, getIdGenerator, getIndexationPipeline, getMimeType, getOAIHarvester, getOAIRepository, getPooledRepositoryConnection, getRepository, getSourceValidity, isDefault, isUseMetadata, optimizeDatabase, optimizeRepositories, releasePooledRepositoryConnections, removeOaiDeletedRecord
 
Methods inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked
configure, getClassNameSuffix, getDatabase
 
Methods inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject
configureDescription, contextualize, enableLogging, getBaseAttributes, getConfiguration, getContext, getDescription, getEncoding, getId, getLocale, getLog, getServiceManager, getXmlLang, service, setDescription, setEncoding, setId, setLocale, setUpSdxObject, setUpSdxObject, setXmlLang, toSAX, verifyConfigurationResources
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface fr.gouv.culture.sdx.utils.SdxObject
getLog
 
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled
enableLogging
 
Methods inherited from interface org.apache.avalon.framework.context.Contextualizable
contextualize
 
Methods inherited from interface org.apache.avalon.framework.service.Serviceable
service
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Identifiable
getId, setId
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Describable
getDescription, setDescription
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Encodable
getEncoding, setEncoding
 
Methods inherited from interface fr.gouv.culture.sdx.utils.Localizable
getLocale, getXmlLang, setLocale, setXmlLang
 
Methods inherited from interface org.apache.excalibur.xml.sax.XMLizable
toSAX
 
Methods inherited from interface fr.gouv.culture.sdx.search.Searchable
getId
 

Field Detail

luceneSearchIndexList

protected java.util.Vector luceneSearchIndexList
The sub-indexes for this document base (first entry is the activeIndex)


luceneActiveIndex

protected LuceneIndex luceneActiveIndex
The active index for this document base


luceneCurrentIndex

protected LuceneIndex luceneCurrentIndex
The temporary index for this document base


_fieldList

protected FieldList _fieldList
The (Lucene) fields that are to be handled by the index.


_xmlFieldList

protected java.util.HashMap _xmlFieldList
The list of fields with an XML type


subIndexCount

protected int subIndexCount
Number of subindexes


lastDocCount

protected long lastDocCount
Number of documents indexed since the last split


INDEX_DIR_CURRENT

protected final java.lang.String INDEX_DIR_CURRENT
Directory names for indexes

See Also:
Constant Field Values

INDEX_DIR_MAIN

protected final java.lang.String INDEX_DIR_MAIN
See Also:
Constant Field Values

SEARCH_INDEX_DIRECTORY_NAME

protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
The directory name for the index that stores documents' indexation.

See Also:
Constant Field Values

DBELEM_ATTRIBUTE_REMOTE_ACCESS

public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
The implied attribute stating whether this document base is to be exposed to remote access or not.

See Also:
Constant Field Values

ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS

public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
The element used to define system fields in sdx.xconf.

See Also:
Constant Field Values
Constructor Detail

LuceneDocumentBase

public LuceneDocumentBase()
Creates the document base. After a document base is created, the logger can be set (optional, but recommended for error messages); the document base should then be configured and, after that, initialized in order to work properly.

See Also:
AbstractSdxObject.enableLogging(org.apache.avalon.framework.logger.Logger), configure(org.apache.avalon.framework.configuration.Configuration), init()
Method Detail

configure

public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
               throws org.apache.avalon.framework.configuration.ConfigurationException
Sets the configuration options for this document base.

Specified by:
configure in interface org.apache.avalon.framework.configuration.Configurable
Overrides:
configure in class SDXDocumentBase
Parameters:
configuration - The configuration object from which to build a document base.

Sample configuration entry:

<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
    <sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
        <sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
        <sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
        <sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
        <sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
    </sdx:fieldList>
    <sdx:index>
        <sdx:pipeline sdx:id = "sdxIndexationPipeline">
            <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
            <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
        </sdx:pipeline>
    </sdx:index>
    <sdx:repositories>
        <sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
        <sdx:repository ref = "blah2"/>
    </sdx:repositories>
</sdx:documentBase>
     
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
See Also:
we should link to this in the future when we have better documentation capabilities
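
To make the shape of the field list in the sample entry easier to see, the following sketch reads a trimmed copy of it with the JDK's DOM parser and collects each field's code and type. It is an illustration only: SDX itself receives this data as an Avalon Configuration object, not through this code. The class name FieldListSketch is hypothetical, and because the sample does not declare the sdx prefix, the parser is deliberately left non-namespace-aware.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper; not part of SDX. */
class FieldListSketch {
    // Trimmed copy of the sample entry: only the field list is kept.
    static final String SAMPLE =
          "<sdx:documentBase sdx:id=\"myDocumentBaseName\" sdx:type=\"lucene\">"
        + "<sdx:fieldList xml:lang=\"fr-FR\">"
        + "<sdx:field code=\"fieldName\" type=\"word\"/>"
        + "<sdx:field code=\"fieldName2\" type=\"field\" brief=\"true\"/>"
        + "<sdx:field code=\"fieldName3\" type=\"date\"/>"
        + "<sdx:field code=\"fieldName4\" type=\"unindexed\"/>"
        + "</sdx:fieldList>"
        + "</sdx:documentBase>";

    /** Returns field code -> field type, in document order. */
    static Map<String, String> fieldTypes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The sdx prefix is undeclared in the sample, so treat
            // "sdx:field" as a plain element name.
            factory.setNamespaceAware(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            org.w3c.dom.Document doc =
                builder.parse(new InputSource(new StringReader(xml)));

            NodeList fields = doc.getElementsByTagName("sdx:field");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                result.put(field.getAttribute("code"), field.getAttribute("type"));
            }
            return result;
        } catch (Exception e) {
            // Wrap parser errors so callers do not need checked exceptions.
            throw new IllegalArgumentException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        fieldTypes(SAMPLE).forEach((code, type) ->
            System.out.println(code + " -> " + type));
    }
}
```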

configureDocumentBase

protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
                              throws org.apache.avalon.framework.configuration.ConfigurationException
Specified by:
configureDocumentBase in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureFieldList

protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
                           throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

reloadFieldList

public void reloadFieldList(java.lang.String appConfString)
                     throws SDXException
Reload the fieldList of an application

Parameters:
appConfString - The path of the configuration file which contains the new fieldList (e.g. file:///myFiles/application.xconf, cocoon://myApplication/conf/application.xconf)
Throws:
SDXException

replaceFieldList

public void replaceFieldList(FieldList fieldList)
                      throws org.apache.avalon.framework.configuration.ConfigurationException
Replace the current fieldList by the new one

Parameters:
fieldList - The new fieldList which replaces the old one
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureSearchIndex

protected void configureSearchIndex()
                             throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureOAIRepository

protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
                               throws org.apache.avalon.framework.configuration.ConfigurationException
Specified by:
configureOAIRepository in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

configureOAIHarvester

protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
                              throws org.apache.avalon.framework.configuration.ConfigurationException
Specified by:
configureOAIHarvester in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

index

public void index(IndexableDocument[] docs,
                  Repository repository,
                  IndexParameters params,
                  org.xml.sax.ContentHandler handler)
           throws SDXException,
                  org.xml.sax.SAXException,
                  org.apache.cocoon.ProcessingException
Description copied from class: SDXDocumentBase
Adds some documents.

Specified by:
index in interface DocumentBase
Overrides:
index in class SDXDocumentBase
Parameters:
docs - The documents to add.
repository - The repository where to store the documents. If null is passed, the default repository will be used.
params - The parameters for this adding action.
handler - A content handler to which information about the process is sent (may be null). TODO: what kind of information? -pb
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException

delete

public void delete(Document[] docs,
                   org.xml.sax.ContentHandler handler)
            throws SDXException,
                   org.xml.sax.SAXException,
                   org.apache.cocoon.ProcessingException
Overrides the parent method only to add Lucene index optimization

Specified by:
delete in interface DocumentBase
Overrides:
delete in class SDXDocumentBase
Parameters:
docs - The documents to delete.
handler - A content handler to feed with information.
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException

setBaseParameters

protected IndexParameters setBaseParameters(IndexParameters params)
Sets the default pipeline parameters and ensures the params have a pipeline

Overrides:
setBaseParameters in class SDXDocumentBase
Parameters:
params - The params object provided by the user at indexation time

getXMLFieldList

public java.util.HashMap getXMLFieldList()
Description copied from class: SDXDocumentBase
Returns the list of XML type fields

Specified by:
getXMLFieldList in class SDXDocumentBase

getIndex

public Index getIndex()
Gets the Index object for indexing and searching.

Returns:
The LuceneIndex object.

getLuceneIndex

public LuceneIndex getLuceneIndex()

setSearchIndexParameters

protected void setSearchIndexParameters(LuceneIndexParameters params)
Sets the search index parameters for indexation performance

Parameters:
params - The Lucene-specific params to use

addToSearchIndex

protected void addToSearchIndex(java.lang.Object indexationDoc,
                                boolean batchIndex)
                         throws SDXException
Writes a document to the search index

Specified by:
addToSearchIndex in class SDXDocumentBase
Parameters:
indexationDoc - The Document to add
batchIndex -
Throws:
SDXException

deleteFromSearchIndex

protected void deleteFromSearchIndex(java.lang.String docId)
                              throws SDXException
Specified by:
deleteFromSearchIndex in class SDXDocumentBase
Throws:
SDXException

compactSearchIndex

protected void compactSearchIndex()
                           throws SDXException
Specified by:
compactSearchIndex in class SDXDocumentBase
Throws:
SDXException

getIndexationDocument

protected java.lang.Object getIndexationDocument(IndexableDocument doc,
                                                 java.lang.String storeDocId,
                                                 java.lang.String repoId,
                                                 IndexParameters params)
                                          throws SDXException
Specified by:
getIndexationDocument in class SDXDocumentBase
Throws:
SDXException

lastModificationDate

public java.util.Date lastModificationDate()

creationDate

public java.util.Date creationDate()

init

public void init()
          throws SDXException
Description copied from interface: DocumentBase
Initializes the document base.

This method must be called after the logger has been set and the configuration done.

Specified by:
init in interface DocumentBase
Overrides:
init in class SDXDocumentBase
Throws:
SDXException

initToSax

protected boolean initToSax()
Description copied from class: AbstractSdxObject
Initializes the LinkedHashMap _xmlizable_objects with the objects, in order to describe them in XML

Overrides:
initToSax in class SDXDocumentBase

initVolatileObjectsToSax

protected void initVolatileObjectsToSax()
Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects, in order to describe them in XML. Some objects need to be refreshed each time toSAX is called.

Overrides:
initVolatileObjectsToSax in class SDXDocumentBase

optimize

public void optimize()
Optimizes the indexes, repositories and system databases

Specified by:
optimize in interface DocumentBase
Specified by:
optimize in class SDXDocumentBase

mergeBatch

public void mergeBatch()
                throws SDXException
Description copied from class: SDXDocumentBase
Merges a batch of documents (in memory) into the physical index on the file system.

Specified by:
mergeBatch in class SDXDocumentBase
Throws:
SDXException

mergeCurrentBatch

public void mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base)

Specified by:
mergeCurrentBatch in class SDXDocumentBase

indexModified

public void indexModified()
Modifies the last modification timestamp file

Specified by:
indexModified in class SDXDocumentBase

splitIndex

public void splitIndex(boolean currentIndex)
                throws java.io.IOException,
                       SDXException
Splits the current large index into two smaller ones

Specified by:
splitIndex in class SDXDocumentBase
Throws:
java.io.IOException
SDXException

initializeVectorizedIndex

protected void initializeVectorizedIndex()
                                  throws org.apache.avalon.framework.configuration.ConfigurationException
Initializes the index vector by searching for all sub-indexes in its directory. NB: working as intended.

Throws:
org.apache.avalon.framework.configuration.ConfigurationException

addSubIndex

protected void addSubIndex()
Adds a split sub-index and updates the configuration afterwards


removeSubIndex

protected void removeSubIndex()
Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to do so; kept only as a reminder for future functionality.


splitCheck

public boolean splitCheck(boolean currentIndex)
                   throws SDXException
Returns true when the splitting conditions are reached; if true, the call should be followed by a splitIndex() call

Specified by:
splitCheck in class SDXDocumentBase
Throws:
SDXException
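
The check-then-split convention described for splitCheck can be sketched as follows. Everything in this block is a hypothetical stand-in, not SDX code: the Splittable interface only mirrors the two public method names (the real methods also declare SDXException and, for splitIndex, java.io.IOException), and the document-count threshold used by CountingBase is an assumption chosen for illustration rather than the actual splitting rule.

```java
/** Hypothetical sketch of the "splitCheck(), then splitIndex()" convention. */
class SplitSketch {

    /** Stand-in for the two public split methods; checked exceptions omitted. */
    interface Splittable {
        boolean splitCheck(boolean currentIndex);
        void splitIndex(boolean currentIndex);
    }

    /** Calls splitIndex() only when splitCheck() reports that a split is due. */
    static boolean splitIfNeeded(Splittable base, boolean currentIndex) {
        if (base.splitCheck(currentIndex)) {
            base.splitIndex(currentIndex);
            return true;
        }
        return false;
    }

    /** Toy implementation: a split is due once a document threshold is reached. */
    static class CountingBase implements Splittable {
        private final long threshold;   // assumed rule, for illustration only
        long docsSinceSplit = 0;        // plays the role of lastDocCount
        int subIndexCount = 1;

        CountingBase(long threshold) {
            this.threshold = threshold;
        }

        void addDocuments(long n) {
            docsSinceSplit += n;
        }

        @Override
        public boolean splitCheck(boolean currentIndex) {
            return docsSinceSplit >= threshold;
        }

        @Override
        public void splitIndex(boolean currentIndex) {
            subIndexCount++;      // a new sub-index is started
            docsSinceSplit = 0;   // the counter restarts after a split
        }
    }

    public static void main(String[] args) {
        CountingBase base = new CountingBase(3);
        base.addDocuments(3);
        System.out.println("split performed: " + splitIfNeeded(base, false));
    }
}
```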

getIndexSize

protected long getIndexSize(LuceneIndex index)
Returns the index size


getSearcher

public org.apache.lucene.search.Searcher getSearcher()
                                              throws SDXException
Returns the index searcher for all of this document base's indexes

Throws:
SDXException

getIndexReader

public org.apache.lucene.index.IndexReader getIndexReader()
                                                   throws SDXException
Returns the index reader for all of this document base's indexes

Throws:
SDXException

getFormatedSubIndexId

protected java.lang.String getFormatedSubIndexId(int subIndexNumber)
Gets the formatted sub-index number (used for directory names)


addSubIndex

protected void addSubIndex(LuceneIndex index)
Adds a split sub-index and updates the configuration afterwards


renewKeyIndex

protected void renewKeyIndex()
Refreshes data for the main and current indexes


backup

public void backup(SaveParameters save_config)
            throws SDXException
Save the DocumentBase data objects

Specified by:
backup in interface Saveable
Overrides:
backup in class SDXDocumentBase
Throws:
SDXException
See Also:
Saveable.backup(fr.gouv.culture.sdx.utils.save.SaveParameters)

backupIndexes

protected void backupIndexes(SaveParameters save_config)
                      throws SDXException
Saves the index files

Specified by:
backupIndexes in class SDXDocumentBase
Throws:
SDXException

backupTimeStamp

protected void backupTimeStamp(SaveParameters save_config)
                        throws SDXException
Save the timestamp files

Specified by:
backupTimeStamp in class SDXDocumentBase
Throws:
SDXException

restore

public void restore(SaveParameters save_config)
             throws SDXException
Restore the DocumentBase data objects

Specified by:
restore in interface Saveable
Overrides:
restore in class SDXDocumentBase
Throws:
SDXException
See Also:
Saveable.restore(fr.gouv.culture.sdx.utils.save.SaveParameters)

restoreIndexes

protected void restoreIndexes(SaveParameters save_config)
                       throws SDXException
Restores the index files

Specified by:
restoreIndexes in class SDXDocumentBase
Throws:
SDXException

restoreTimeStamp

protected void restoreTimeStamp(SaveParameters save_config)
                         throws SDXException
Restore the timestamp files

Specified by:
restoreTimeStamp in class SDXDocumentBase
Throws:
SDXException

docCount

public int docCount()
TODO: this count needs to be periodically written to a .properties file. TODO: we need a configurable generic mechanism to save such information (certain queries, terms, etc.) to a .properties file, which should be updated after indexation/deletion.

Returns:
the number of documents in all sub-indexes
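
The TODO above mentions writing the count to a .properties file. A minimal sketch of what that could look like with java.util.Properties is given below. It is not the SDX implementation: the class DocCountSketch, the property key "docCount" and the per-sub-index counts are all assumptions made for illustration, and the text is kept in memory rather than written to a real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper; not part of SDX. */
class DocCountSketch {
    static final String KEY = "docCount"; // assumed property name

    /** Sums the document counts of all sub-indexes. */
    static int totalDocCount(int[] subIndexCounts) {
        int total = 0;
        for (int count : subIndexCounts) {
            total += count;
        }
        return total;
    }

    /** Serializes the count in .properties syntax. */
    static String toPropertiesText(int docCount) {
        Properties props = new Properties();
        props.setProperty(KEY, Integer.toString(docCount));
        StringWriter out = new StringWriter();
        try {
            props.store(out, "document base statistics");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Reads the count back; returns 0 when the key is absent. */
    static int fromPropertiesText(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Integer.parseInt(props.getProperty(KEY, "0"));
    }

    public static void main(String[] args) {
        int total = totalDocCount(new int[] {10, 5, 7});
        String saved = toPropertiesText(total);
        System.out.println("restored count: " + fromPropertiesText(saved));
    }
}
```

Refreshing the stored value after each indexation or deletion, as the TODO suggests, would keep the file in step with the indexes.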


Copyright © 2000-2003 Ministere de la culture et de la communication / AJLSM. All Rights Reserved.