An administrator can configure the location of the full-text search index and related settings, start and pause the crawler, and even delete the index and reindex the entire system if necessary. The Audit Log captures both the searches performed by users and administrative activities such as reindexing.

Full-Text Search Configuration

The Full-Text Search Configuration page allows you to configure the index and review statistics about your index.

  • Go to (Admin) > Site > Admin Console.
  • In the Management section, click Full-text search.

Index Configuration

  • Set Path. You can change the directory that stores the index (default: <tomcat>/temp/labkey_full_text_index) by entering a new path and clicking the Set Path button.
    • Changing the location of the index requires re-indexing all data, which may affect performance.
    • Resetting the path of the index is especially useful if you are running multiple LabKey deployments on the same Tomcat instance, because it allows each LabKey deployment to use a unique index.
    • Be sure to place your labkey_full_text_index directory on a drive with sufficient space.
    • Never place the search index on an NFS drive or AWS EFS.
  • Start/Pause Crawler. The crawler, or document indexer, continually inventories your site when running. You might pause it to diagnose issues with memory or performance.
  • Delete Index. You can delete the entire index for your server. Please do this with caution because rebuilding the index can be slow.
  • Directory Type. This setting lets you change the search indexing directory type. The setting marked "Default (MMapDirectory)" allows the underlying search library to choose the directory implementation (based on operating system and 32-bit vs. 64-bit). The other options override the default heuristic and hard-code a specific directory implementation. They are provided in case the "Default" setting causes problems on a specific deployment. Use the default type unless you see a problem with search. Contact LabKey for assistance if full-text indexing or searching seems to have difficulty with the default setting.
  • Indexed File Size Limit. The default setting is 100 MB. Files larger than the limit set on this page will not be indexed. You can change this cap, but doing so is generally not recommended. Increasing it will result in additional memory usage; if you increase it beyond 100 MB, we recommend you also increase your heap size to 4 GB or larger. The size of xlsx files is limited to 1/5 of the total file size limit (20 MB by default).
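
As a rough illustration of how the file size cap described above plays out, here is a minimal Python sketch. It is not LabKey's implementation: the function name is hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes and that a file exactly at the limit is still indexed. Only the 100 MB default and the 1/5 rule for xlsx files come from this page.

```python
MB = 1024 * 1024  # assumption: the limit is counted in binary megabytes


def is_within_index_limit(file_name: str, size_bytes: int, limit_mb: int = 100) -> bool:
    """Return True if a file of this size would fall under the indexing cap.

    Hypothetical helper for illustration only; not part of any LabKey API.
    """
    limit_bytes = limit_mb * MB
    # Per the text above, xlsx files are capped at one fifth of the general limit.
    if file_name.lower().endswith(".xlsx"):
        limit_bytes //= 5
    # Assumption: "larger than the limit" is strict, so equal sizes pass.
    return size_bytes <= limit_bytes
```

With the default limit, a 50 MB text file passes this check, while a 50 MB xlsx file does not because its effective cap is 20 MB.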

Index Statistics

This section summarizes the content that has been indexed by the system and identifies the limits that the LabKey team has set for the indexer. These limits enhance performance. For example, you will see the "Maximum Size" of files that will be scanned by the indexer; the maximum size allows the system to avoid indexing exceptionally large files.

Search Statistics

This section lists the average time in milliseconds for each phase of searching the index, from creating the query to processing hits.

Audit Log

To see the search audit log:

  • Go to (Admin) > Site > Admin Console.
  • In the Management section, click Audit Log.
  • Choose the Search option in the pulldown menu.

This displays the log of audited search events for your system. For example, you can see the terms entered by users in the search box. If someone has deleted your search index, this event will be displayed in the list, along with information on the user who ordered the delete.

Set Up Folder-Specific Search Boxes

By default, a site-wide search box is included in the LabKey header. You can add additional search boxes to individual projects or folders and choose how they are scoped.

  • Add a Search web part to either column on the page.
  • This search will only search the container where you created it.
  • To also include subfolders, select Customize from the (triangle) menu and check the box to "Search subfolders".

As an example of a search box applied to a particular container, use the search box to the right of this page you are reading. It will search only the current folder (the LabKey documentation).

List and External Schema Metadata

By default, the search index includes metadata for lists and external schemas (including table names, table descriptions, column names, column labels, and column descriptions).

You can control indexing of List metadata when creating or editing a list definition under Advanced List Settings > Search Indexing Options. Learn more here: Edit a List Design

You can turn off indexing of external schema metadata by unchecking the checkbox Index Schema Meta Data when creating or editing an external schema definition. Learn more here: External Schemas and Data Sources

Include/Exclude a Folder from Search

You may want to exclude the contents of certain folders from searches. For example, you may not want archived folders or work in progress to appear in search results.

To exclude the contents of a folder from searches:
  • Navigate to the folder and select (Admin) > Folder > Management.
  • Click the Search tab.
  • Uncheck the checkbox Include this folder's contents in multi-folder search results.
  • Click Save.

Note that this does not exclude the contents from indexing, so searches that originate from that folder will still include its contents in the results. Searches from any other folder will not, even if they specify a site- or subfolder-inclusive scope.

Exclude a File/Directory from Search Indexing

LabKey generally indexes the file system directories associated with projects and folders, i.e. the contents of the @files and other filesets. Some file and directory patterns are ignored (skipped during indexing), including:

  • Contents of directories named ".Trash" or ".svn"
  • Files with names that start with "."
  • Anything on a path that includes "no_crawl"
  • Contents of any directory containing a file named ".nocrawl"
  • On PostgreSQL, contents of a directory containing a file named "PG_VERSION"

To exclude a file or the content of a directory from indexing, you may be able to employ one of the above conventions.
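
To check ahead of time whether a given file would likely be skipped under the conventions listed above, something like the following Python sketch may help. It is an approximation written for this page, not the crawler's actual logic: the function name is hypothetical, and it assumes that marker files and skipped directory names apply to everything beneath them, and that the "." rule is checked against the final path component only.

```python
from pathlib import Path

SKIPPED_DIR_NAMES = {".Trash", ".svn"}


def would_be_skipped(path: Path, root: Path, on_postgres: bool = True) -> bool:
    """Approximate the ignore conventions for a file under a file root.

    Hypothetical helper; the recursive treatment of markers is an assumption.
    """
    path, root = Path(path), Path(root)
    # Anything on a path that includes "no_crawl"
    if "no_crawl" in path.as_posix():
        return True
    # Files with names that start with "."
    if path.name.startswith("."):
        return True
    marker_files = [".nocrawl"]
    if on_postgres:
        # The PG_VERSION rule is described as applying on PostgreSQL only.
        marker_files.append("PG_VERSION")
    # Walk up from the file's directory to the root, checking each directory.
    for parent in path.parents:
        if parent.name in SKIPPED_DIR_NAMES:
            return True
        if any((parent / marker).exists() for marker in marker_files):
            return True
        if parent == root:
            break
    return False
```

For example, creating an empty file named ".nocrawl" in a directory would cause this check to report every file in that directory as skipped.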

Troubleshoot Search Indexing

Search Index not Initialized

When the search index cannot be found or properly initialized, you may see an error similar to "ERROR LuceneSearchServiceImpl <DATE> Module Upgrade : Error: Unable to initialize search index. Search will be disabled and new documents will not be indexed for searching until this is corrected and the server is restarted." A path to the .tip file will be included in the error text.

Options for resolving this include:

  • If you know the path is incorrect, and know where the .tip file resides, you can correct the Path to full-text search index and try again.
  • You can also Delete the current index and start the crawler again, essentially re-indexing all of your data. This option may take time but will run in the background, so it can complete while you do other work on the server.

Threads Hang During Search

If you experience Tomcat threads hanging during search operation, check to ensure that your search index is NOT on an NFS filesystem or on AWS EFS. These filesystems should never be used for a full-text search index.
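
One way to check where the index lives on a Linux server is to look up the filesystem type of the mount that contains the index path. The Python sketch below is an illustration only, not a LabKey tool: the function names are hypothetical, and it assumes the mount table has the whitespace-separated layout of /proc/mounts (device, mount point, type, ...), that mount points contain no escaped spaces, and that EFS volumes appear with an NFS type such as nfs4.

```python
from pathlib import PurePosixPath
from typing import Optional

NFS_TYPES = {"nfs", "nfs4"}  # assumption: EFS mounts report one of these types


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the type of the deepest mount point containing `path`, if any."""
    target = PurePosixPath(path)
    best_mount, best_type = None, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = PurePosixPath(fields[1]), fields[2]
        if target == mount_point or mount_point in target.parents:
            # Prefer the deepest mount; on ties, the later entry wins.
            if best_mount is None or len(mount_point.parts) >= len(best_mount.parts):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path: str, mounts_text: str) -> bool:
    """Return True if `path` sits on a mount whose type looks like NFS."""
    return filesystem_type(path, mounts_text) in NFS_TYPES
```

On a Linux host, the mount table text could be read from /proc/mounts and passed in together with the index path shown on the Full-Text Search Configuration page.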

No Search Results When Expected

If you encounter unexpected search results, particularly a lack of any results when matches are known, you may need to rebuild the search index. This situation may occur in particular on a development machine where you routinely pause the crawler.

To rebuild the index, pause the crawler (if running), delete the index, and restart the crawler to rebuild it.
