The Document Abstraction Workflow supports the movement and tracking of documents through the following general process. All steps are optional for any given document. Each part of the workflow may be configured to suit your specific needs:
  • Document Upload: with or without initial automatic abstraction using an NLP Engine to obtain some metadata and text fields.
  • Assignment to a Manual Abstractor and/or Reviewer - may be done automatically or manually.
  • Abstraction of Information
  • Review of Abstracted Information
  • Potential Reprocessing or Additional Abstraction Rounds
  • Approval
Different types of documents (for example, Pathology Reports and Cytogenetics Reports) can be processed through the same workflow, task list and assignment process, each using abstraction algorithms specific to the type of document. The assignment process itself can also be customized based on the type of disease discussed in the document.

Roles and Tasks

  • NLP/Abstraction Administrator:
    • Review list of documents ready for abstraction
    • Make assignments of roles and tasks to others
    • Manage project groups corresponding to the expected disease groups and document types
    • Create document processing configurations
  • Abstractor:
    • Choose a document to abstract from assigned list
    • Abstract document
    • Submit abstraction for review - or approval if no reviewer is assigned
  • Reviewer:
    • Review list of documents ready for review
    • Review abstraction results
    • Mark document as ready to progress to the next stage - either approve or reject
    • Review and potentially edit previously approved abstraction results

It is important to note that documents to be abstracted may well contain protected health information (PHI). Protection of PHI is strictly managed by LabKey Server, and with the addition of the nlp_premium, compliance, and complianceActivities modules, all access to documents, task lists, and other resources containing PHI can be gated by permissions and made subject to approval of terms of use specific to the user's intended activity. Further, all access that is granted, including viewing, abstracting, and reviewing, can be logged for audit or other review.

All sample screenshots and information shown in this documentation are fictitious.

Abstraction Workflow

The document itself passes through a series of states within the process:

  • Ready for assignment: when automatic abstraction is complete, automatic assignment was not completed, or reviewer requests re-abstraction
  • Ready for manual abstraction: once an abstractor is assigned
  • Ready for review: when abstraction is complete, if a reviewer is assigned
  • (optional) Ready for reprocessing: if requested by the reviewer
  • Approved
Passage of a document through these stages can be managed by a BPMN (Business Process Model and Notation) workflow engine. LabKey Server uses an Activiti Workflow to automatically advance the document to the correct state upon completion of the prior state. Users assigned as abstractors and reviewers can see lists of tasks assigned to them and mark them as completed when done.
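
The states listed above can be pictured as a small state machine. The following is only an illustrative sketch, not the Activiti workflow definition that LabKey Server uses: the event names and the `advance` function are hypothetical, and only the transitions named in the list are modeled. The case where no reviewer is assigned, and whatever follows "Ready for reprocessing", are left out because this topic does not spell them out.

```python
from enum import Enum


class DocState(Enum):
    READY_FOR_ASSIGNMENT = "Ready for assignment"
    READY_FOR_MANUAL_ABSTRACTION = "Ready for manual abstraction"
    READY_FOR_REVIEW = "Ready for review"
    READY_FOR_REPROCESSING = "Ready for reprocessing"
    APPROVED = "Approved"


# Hypothetical event names; they are not identifiers from LabKey Server.
TRANSITIONS = {
    (DocState.READY_FOR_ASSIGNMENT, "abstractor_assigned"):
        DocState.READY_FOR_MANUAL_ABSTRACTION,
    (DocState.READY_FOR_MANUAL_ABSTRACTION, "abstraction_complete"):
        DocState.READY_FOR_REVIEW,
    (DocState.READY_FOR_REVIEW, "approved"):
        DocState.APPROVED,
    (DocState.READY_FOR_REVIEW, "reprocessing_requested"):
        DocState.READY_FOR_REPROCESSING,
    (DocState.READY_FOR_REVIEW, "reabstraction_requested"):
        DocState.READY_FOR_ASSIGNMENT,
}


def advance(state, event):
    """Return the next state, or raise ValueError if the event does not apply."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"No transition from {state.value!r} on event {event!r}"
        ) from None


# Walk one document through the simplest path.
state = DocState.READY_FOR_ASSIGNMENT
for event in ("abstractor_assigned", "abstraction_complete", "approved"):
    state = advance(state, event)
print(state.value)  # Approved
```

Keeping the transitions in a lookup table makes it easy to see which state changes are allowed and to reject any that are not.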

Abstraction Task List

The Abstraction Task List web part on the Portal tab will be unique for each user, showing a tailored view of the particular tasks they are to complete. Typically a user will have only one type of task to perform, but if they play different roles, such as for different document types, they will see multiple lists. Tasks may be grouped in batches, such as by focus area or priority, making it easier to work and communicate efficiently. Below the personalized task list(s), the All Cases list gives an overview of the latest status of all cases visible to the user in this container - both those in progress and those whose results have been approved. In this screenshot, an admin user has assignment tasks, and is also assigned one document to abstract and another to review.

All task lists can be sorted to provide the most useful ordering to the individual user. Save the desired sorted grid as the "default" view to use it for automatically ordering your tasks. When an abstraction or review task is completed, the user will advance to the next task on their default view of the appropriate task list.
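
To make the "advance to the next task" behavior concrete, here is a minimal sketch of how a next task could be picked from a sorted list. It is an assumption-laden illustration rather than LabKey Server code: the `Task` class, its fields, and the `next_task` function are all hypothetical, and the sort key stands in for whatever ordering the user saved as their default view.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Task:
    # Hypothetical fields, chosen only for this example.
    document_id: int
    priority: int
    completed: bool = False


def next_task(tasks: Iterable[Task],
              sort_key: Callable[[Task], object]) -> Optional[Task]:
    """Return the first incomplete task in the user's preferred ordering."""
    for task in sorted(tasks, key=sort_key):
        if not task.completed:
            return task
    return None  # nothing left to do


tasks = [
    Task(document_id=101, priority=2),
    Task(document_id=102, priority=1, completed=True),
    Task(document_id=103, priority=3),
]

# The "default view" here sorts by priority, lowest number first.
upcoming = next_task(tasks, sort_key=lambda t: t.priority)
print(upcoming.document_id)  # 101
```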


Following the initial step of automatic abstraction using the NLP engine, many documents will also be assigned for manual abstraction. The manual abstractor begins with the information garnered by the NLP engine and validates, corrects, and adds additional information to the abstracted results.

The assignment of documents to individual abstractors may be done automatically or manually by an administrator. An administrator can also choose to bypass the abstraction step by unassigning the manual abstractor, immediately forwarding the document to the review phase.


The assigned user completes a manual document abstraction following the steps outlined here:


Once abstraction is complete, the document is "ready for review" (if a reviewer is assigned) and the task moves to the assigned reviewer. If the administrator chooses to bypass the review step, they can leave the reviewer task unassigned for that document.
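
The effect of leaving a role unassigned can be summarized in a few lines. This sketch assumes a simplified model in which each manual step runs only when a user is assigned to it; the `manual_steps` function and the user names are hypothetical and are not part of any LabKey Server API.

```python
from typing import List, Optional


def manual_steps(abstractor: Optional[str], reviewer: Optional[str]) -> List[str]:
    """List the manual steps a document goes through, given its assignments.

    An unassigned role (None) means that step is bypassed.
    """
    steps = []
    if abstractor is not None:
        steps.append("abstraction")
    if reviewer is not None:
        steps.append("review")
    return steps


print(manual_steps("abstractor1", "reviewer1"))  # ['abstraction', 'review']
print(manual_steps(None, "reviewer1"))           # ['review']
print(manual_steps("abstractor1", None))         # ['abstraction']
```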

Reviewers select their tasks from their personalized task list, but can also see other cases on the All Tasks list. In addition to reviewing new abstractions, they can review and potentially reject previously approved abstraction results. Abstraction administrators may also perform this second level review. A rejected document is returned for additional steps as described in the table here.

Developer Note: Retrieving Approved Data via API

The client API can be used to retrieve information about imported documents and results. However, the task status is not stored directly; it is calculated at render time when the status is displayed. When querying for the "status" of a document, such as "Ready For Review" or "Approved," the reportId must be provided in addition to the taskKey. For example, a query like the following will return the expected calculated status value:

SELECT reportId, taskKey FROM Report WHERE ReportId = [remainder of the query]

