Training

Create high-quality training data by labeling task files and marking them for AI model training across different processing steps.

Training Data Workflow

Mark for Training

Flag correctly labeled tasks as training data:

  1. Complete Labeling - Apply all necessary labels to a task file
  2. Review Accuracy - Verify labels are correctly placed and complete
  3. Mark for Training - Click the "Mark for training" button (Alt+A)
  4. Training Dataset - Task becomes part of your AI model training data

Marked tasks display a "Marked for training" badge and contribute to model improvement when sufficient training data is available.
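
The marking workflow above can be pictured as a small state model. The sketch below is illustrative only: the Task class, its fields, and the toggle_training_mark function are hypothetical names, not the product's API. The rule that a task needs labels and completed processing before it can be marked is an assumption based on the troubleshooting notes later in this page.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical in-memory model of a task file (not the product's schema)."""
    name: str
    labels: list[str] = field(default_factory=list)
    processing_complete: bool = True
    marked_for_training: bool = False


def toggle_training_mark(task: Task) -> bool:
    """Mark or unmark a task, mirroring the Alt+A toggle.

    Assumption: marking is only allowed once the task has labels and
    processing has finished. Unmarking is always allowed.
    """
    if task.marked_for_training:
        task.marked_for_training = False
    elif task.labels and task.processing_complete:
        task.marked_for_training = True
    else:
        raise ValueError("task needs labels and completed processing before marking")
    return task.marked_for_training


invoice = Task(name="invoice-001.pdf", labels=["Invoice"])
print(toggle_training_mark(invoice))  # True: the task is now marked
```

Calling toggle_training_mark a second time on the same task would unmark it, which matches the mark/unmark behavior of the shortcut.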

Manual Prediction

Generate AI predictions on-demand for any task:

  1. Select Task - Open the task file in the editor
  2. Predict - Click the "Predict" button or press Alt+P
  3. Review Results - Examine AI-generated predictions
  4. Validate - Correct any inaccurate predictions manually
  5. Mark for Training - Mark the corrected task as training data

Erase All Labels

Clear all labels and predictions to start fresh:

  1. Select Task - Open the task file requiring label removal
  2. Clear Labels - Click the "Erase" button or press Alt+C
  3. Confirm Removal - All manual and predicted labels are removed
  4. Relabel - Apply new labels manually or trigger prediction

Step-Specific Labeling

Different processing steps require different labeling approaches and support unique interaction methods.

Classification Labels

Categorize documents and define routing logic:

Label Application

  • Keyboard Shortcuts - Assign Ctrl + key combinations for quick labeling
  • Multiple Categories - Apply multiple classification labels per task
  • Confidence Thresholds - Set minimum prediction confidence levels
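
As a rough illustration of how a minimum confidence threshold behaves, the sketch below keeps only predictions at or above a chosen cutoff. The Prediction type, the filter_by_confidence function, and the 0.0 to 1.0 score range are assumptions made for this example; the product's actual prediction format may differ.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """Hypothetical prediction record."""
    label: str
    confidence: float  # assumed to be a score between 0.0 and 1.0


def filter_by_confidence(predictions, min_confidence):
    """Keep predictions whose confidence meets or exceeds the threshold."""
    return [p for p in predictions if p.confidence >= min_confidence]


predicted = [
    Prediction("Invoice", 0.92),
    Prediction("Receipt", 0.41),
    Prediction("Contract", 0.75),
]
accepted = filter_by_confidence(predicted, min_confidence=0.7)
print([p.label for p in accepted])  # ['Invoice', 'Contract']
```

A higher threshold leaves fewer predictions to review but pushes more labeling back to manual work.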

Right-Click Removal

Remove classification labels using context menus:

  1. Right-click on applied classification label
  2. Select "Remove" from context menu
  3. Label removed instantly without confirmation

Hotkey Configuration

Set up keyboard shortcuts for efficient labeling:

  • Ctrl Key - Toggle Ctrl key requirement for shortcut
  • Suffix Key - Define the letter/number for the combination
  • Quick Access - Label documents rapidly with keyboard shortcuts
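
The two settings above (an optional Ctrl requirement plus a suffix key) can be thought of as a small lookup from key presses to labels. This is a minimal sketch under that assumption; the Hotkey class and resolve_label function are hypothetical and do not reflect how the editor stores shortcuts internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotkey:
    """Hypothetical shortcut: optional Ctrl modifier plus one suffix key."""
    label: str
    suffix_key: str
    requires_ctrl: bool = True

    def matches(self, key: str, ctrl_pressed: bool) -> bool:
        # Compare keys case-insensitively; the Ctrl state must match exactly.
        return key.lower() == self.suffix_key.lower() and ctrl_pressed == self.requires_ctrl


def resolve_label(hotkeys, key, ctrl_pressed):
    """Return the label bound to a key press, or None if nothing matches."""
    for hotkey in hotkeys:
        if hotkey.matches(key, ctrl_pressed):
            return hotkey.label
    return None


hotkeys = [
    Hotkey(label="Invoice", suffix_key="I"),
    Hotkey(label="Receipt", suffix_key="R", requires_ctrl=False),
]
print(resolve_label(hotkeys, "i", ctrl_pressed=True))   # Invoice
print(resolve_label(hotkeys, "r", ctrl_pressed=False))  # Receipt
print(resolve_label(hotkeys, "i", ctrl_pressed=False))  # None
```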

Segmentation Labels

Define document regions and content boundaries:

Visual Segmentation

Create segments directly on document pages:

  • Manual Drawing - Draw segment boundaries on document image
  • Coordinate-Based - Define precise segment coordinates
  • Visual Feedback - Segments display with colored boundaries
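
For coordinate-based segments, it can help to picture each segment as a rectangle on a page. The sketch below assumes pixel coordinates with the origin at the top-left corner; the Segment class and its field names are hypothetical, and the coordinate system the product uses may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical rectangular segment in page pixel coordinates."""
    page: int
    left: float
    top: float
    right: float
    bottom: float

    def __post_init__(self):
        # Reject empty or inverted rectangles early.
        if self.right <= self.left or self.bottom <= self.top:
            raise ValueError("segment must have positive width and height")

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)

    def contains(self, x: float, y: float) -> bool:
        """True if the point lies inside or on the segment boundary."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


header = Segment(page=1, left=50, top=40, right=550, bottom=120)
print(header.area)                # 40000
print(header.contains(100, 100))  # True
```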

Right-Click Editing

Modify or remove segments using context menus:

  1. Right-click on existing segment
  2. Select action from context menu:
    • Edit - Modify segment boundaries
    • Delete - Remove segment completely
  3. Visual Update - Changes reflected immediately on document

Automatic Segmentation

AI-assisted segment creation with manual review:

  • Model Predictions - AI suggests segment boundaries
  • Manual Adjustment - Refine predicted segments as needed
  • Approval Process - Review and approve segments before processing

Extraction Labels

Annotate specific data fields and content areas:

Entity Placement

Place extraction labels on target content:

  • Text Selection - Select text to annotate with entity labels
  • Field Mapping - Connect selected text to extraction fields
  • Multi-Selection - Apply same label to multiple text selections

Right-Click Management

Remove or modify extraction annotations:

  1. Right-click on placed annotation
  2. Context Options:
    • Edit Label - Change annotation type or properties
    • Remove - Delete annotation completely
    • Copy - Duplicate annotation settings
  3. Immediate Update - Changes applied without additional confirmation

Field Descriptions

Enhance AI model understanding with descriptive labels:

  • Label Names - Clear, descriptive field identifiers
  • Descriptions - Detailed explanations helping AI identify correct content
  • Examples - Sample text patterns the label should match
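
One way to keep the three parts above together is to treat each extraction label as a small record and check it for gaps before use. The FieldLabel class, the missing_guidance helper, and the sample field are all hypothetical, written only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class FieldLabel:
    """Hypothetical definition of an extraction label."""
    name: str
    description: str
    examples: list[str] = field(default_factory=list)


def missing_guidance(label: FieldLabel) -> list[str]:
    """List which of the three recommended parts are empty."""
    problems = []
    if not label.name.strip():
        problems.append("name")
    if not label.description.strip():
        problems.append("description")
    if not label.examples:
        problems.append("examples")
    return problems


invoice_number = FieldLabel(
    name="invoice_number",
    description="Unique identifier printed near the top of the invoice, often prefixed with 'INV'.",
    examples=["INV-2024-0042", "INV 88311"],
)
print(missing_guidance(invoice_number))  # []
print(missing_guidance(FieldLabel(name="total", description="")))  # ['description', 'examples']
```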

Grouping Labels

Organize document pages into logical groups:

Context Grouping

Define page relationships and grouping rules:

  • Page Selection - Choose pages belonging to same group
  • Group Names - Assign descriptive names to page groups
  • Relationship Rules - Define how pages relate within groups

Right-Click Operations

Manage page groups efficiently:

  1. Right-click on grouped pages
  2. Group Options:
    • Ungroup - Remove pages from current group
    • Rename Group - Change group identifier
    • Add Pages - Include additional pages in group
  3. Visual Organization - Group changes reflected in page layout

Label Quality Guidelines

Consistency Standards

Maintain high-quality training data:

  • Consistent Naming - Use standardized label names across similar documents
  • Complete Coverage - Label all relevant content, not just obvious examples
  • Accurate Boundaries - Ensure precise text selection and segment boundaries
  • Field Validation - Verify extracted content matches expected data types
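
Field validation can be as simple as checking labeled text against the format a field is expected to hold. The sketch below is an assumption-heavy example: the two field types, their formats (ISO dates and plain decimal amounts), and the function names are hypothetical and would need to be adapted to the fields in an actual project.

```python
import re
from datetime import datetime


def is_valid_date(text: str) -> bool:
    """Assumed format: an ISO-style date such as 2024-03-15."""
    try:
        datetime.strptime(text.strip(), "%Y-%m-%d")
        return True
    except ValueError:
        return False


def is_valid_amount(text: str) -> bool:
    """Assumed format: digits with an optional two-digit decimal part."""
    return re.fullmatch(r"\d+(\.\d{2})?", text.strip()) is not None


# Hypothetical mapping from field type to its validator.
VALIDATORS = {"date": is_valid_date, "amount": is_valid_amount}


def validate_field(expected_type: str, text: str) -> bool:
    """Check labeled text against the validator for its expected type."""
    return VALIDATORS[expected_type](text)


print(validate_field("date", "2024-03-15"))  # True
print(validate_field("date", "15 March"))    # False
print(validate_field("amount", "1250.00"))   # True
print(validate_field("amount", "1,250"))     # False
```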

Training Data Balance

Build robust datasets across document varieties:

  • Document Types - Include examples from all document categories
  • Content Variations - Label different layouts, formats, and content styles
  • Edge Cases - Include challenging examples and unusual formats
  • Sufficient Volume - Accumulate adequate examples for reliable training
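
To spot gaps in coverage, it can help to count marked tasks per document type and flag the types that fall short. This sketch assumes you can export a list of document types for marked tasks; the underrepresented_types function and the minimum of 5 examples are illustrative choices, not product recommendations.

```python
from collections import Counter


def underrepresented_types(doc_types, min_examples):
    """Return document types with fewer marked examples than the minimum."""
    counts = Counter(doc_types)
    return {doc_type: n for doc_type, n in counts.items() if n < min_examples}


# Hypothetical export: one entry per task marked for training.
marked = ["invoice"] * 12 + ["receipt"] * 3 + ["contract"] * 1
print(underrepresented_types(marked, min_examples=5))  # {'receipt': 3, 'contract': 1}
```

Document types with no marked tasks at all never appear in the input, so they need to be checked against the full list of categories separately.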

Continuous Improvement

Iteratively refine training data quality:

  • Review Predictions - Regularly examine AI predictions for accuracy
  • Correct Errors - Fix inaccurate predictions and mark corrected versions
  • Update Labels - Refine label definitions based on prediction performance
  • Monitor Performance - Track prediction accuracy over time
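
One simple way to track prediction accuracy over time is to compare each batch of predictions with the labels that remain after review. The sketch below assumes one label per task and a manually recorded history; the accuracy function and the sample data are hypothetical.

```python
def accuracy(predicted, corrected):
    """Share of predictions that the reviewer left unchanged."""
    if len(predicted) != len(corrected):
        raise ValueError("predicted and corrected must have the same length")
    if not predicted:
        return 0.0
    matches = sum(p == c for p, c in zip(predicted, corrected))
    return matches / len(predicted)


reviewed = ["Invoice", "Invoice", "Invoice", "Contract"]

history = []
# First review round: one of four predictions needed correction.
history.append(accuracy(["Invoice", "Receipt", "Invoice", "Contract"], reviewed))
# Later round, after more tasks were marked for training.
history.append(accuracy(["Invoice", "Invoice", "Invoice", "Contract"], reviewed))
print(history)  # [0.75, 1.0]
```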

Keyboard Shortcuts

Universal Actions

Available across all step types:

  • Alt + A - Mark/unmark task for training
  • Alt + P - Trigger prediction on current task
  • Alt + C - Erase all labels and predictions from current task

Classification Shortcuts

Step-specific keyboard combinations:

  • Ctrl + [Key] - Apply assigned classification label
  • Custom Keys - Configure in classification label settings

Troubleshooting

Training Mark Issues

Common problems and solutions:

  • Button Disabled - Ensure task has required labels and processing is complete
  • Missing Badge - Verify task status and refresh page if necessary
  • Unmarking Failed - Check task permissions and processing status

Prediction Problems

Address prediction accuracy issues:

  • No Predictions - Verify model is configured and has sufficient training data
  • Inaccurate Results - Review and correct predictions, then mark for training
  • Model Errors - Check label mapping between project and model configurations

Labeling Interface

Resolve labeling interface problems:

  • Right-Click Not Working - Confirm the page has finished loading and that browser settings allow the context menu
  • Shortcuts Disabled - Check that the browser or operating system does not reserve the same key combinations
  • Visual Issues - Refresh page or check browser compatibility