AI Models
Import, configure, and manage AI models for automated prediction across your document processing pipeline.
Model Import
Access AI model configuration through Pipeline settings to enhance automated processing capabilities.
Import Process
Add AI models to your processing steps:
- Navigate to Pipeline Settings - Access via Project Settings → Pipeline
- Select Processing Step - Choose the step requiring AI model configuration
- Automation Settings - Locate the "Model" section within the step configuration
- Import Model - Choose from available pre-trained models
- Configure Mapping - Connect project labels to model labels
Model Selection
Choose appropriate models based on processing requirements:
- Available Models - Browse models compatible with your step type
- Model Information - Review model descriptions, capabilities, and requirements
- Compatibility Check - Ensure the model supports your document types and languages
- Performance Metrics - Consider accuracy and processing speed characteristics
Label Mapping
Connect your project's label structure to AI model capabilities for accurate predictions.
Mapping Process
Ensure proper label alignment:
- Project Labels - Review labels defined in your processing step
- Model Labels - Examine the labels available from the imported AI model
- Create Mapping - Connect corresponding project and model labels
- Validate Alignment - Verify mappings make logical sense for your use case
- Test Configuration - Run sample predictions to validate mapping accuracy
Mapping Best Practices
Optimize label connections:
- Exact Matches - Prefer direct label name matches when available
- Semantic Alignment - Map labels with similar meanings and purposes
- Complete Coverage - Ensure all project labels have corresponding model labels
- Avoid Conflicts - Prevent multiple project labels from mapping to a single model label
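The coverage and conflict rules above can be checked mechanically before a mapping is saved. The following is a minimal sketch, not the product's own validator: the function name `validate_mapping` and the plain-dict representation of a mapping (project label to model label) are assumptions made for illustration.

```python
from collections import defaultdict


def validate_mapping(project_labels, model_labels, mapping):
    """Return a list of problems found in a project-to-model label mapping.

    Hypothetical helper: `mapping` is assumed to be a dict of
    project label -> model label.
    """
    problems = []

    # Complete coverage: every project label needs a model counterpart.
    for label in project_labels:
        if label not in mapping:
            problems.append(f"unmapped project label: {label}")

    # Every mapping target must be a label the model actually offers.
    for model_label in mapping.values():
        if model_label not in model_labels:
            problems.append(f"unknown model label: {model_label}")

    # Avoid conflicts: no two project labels may share one model label.
    sources_by_target = defaultdict(list)
    for project_label, model_label in mapping.items():
        sources_by_target[model_label].append(project_label)
    for model_label, sources in sources_by_target.items():
        if len(sources) > 1:
            problems.append(f"conflict on {model_label}: {sorted(sources)}")

    return problems
```

An empty result means the mapping satisfies both rules; semantic alignment still has to be judged by a person.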
Label Import
Copy labels from existing models:
- Import from Another Model - Use labels from previously configured models
- Duplicate Prevention - The system automatically prevents duplicate label IDs
- Batch Import - Import multiple labels simultaneously from source models
- Label Validation - Verify imported labels meet project requirements
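As an illustration of batch import with duplicate prevention, the sketch below merges labels from a source model into an existing label set. It assumes each label is a dict with an `id` key and that duplicates are skipped rather than rejected; both are assumptions, since the exact behavior of the system is not described here.

```python
def import_labels(existing, incoming):
    """Batch-import labels, skipping any whose ID is already present.

    Hypothetical helper: labels are assumed to be dicts with an "id" key.
    Returns the merged list and the IDs that were skipped as duplicates.
    """
    merged = list(existing)
    seen = {label["id"] for label in existing}
    skipped = []
    for label in incoming:
        if label["id"] in seen:
            # Duplicate ID: keep the existing label untouched.
            skipped.append(label["id"])
            continue
        seen.add(label["id"])
        merged.append(label)
    return merged, skipped
```

Returning the skipped IDs makes it easy to review which imported labels still need a manual decision.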
Model Types
Different AI model categories serve specific processing needs across your pipeline.
Named Entity Recognition (NER)
Extract specific data fields from document text:
Proprietary NER Models
Custom-trained models for specific extraction needs:
- Proprietary:NER - Traditional named entity recognition
- Proprietary:RAG-NER - Retrieval-Augmented Generation enhanced extraction
- Custom Training - Models trained on your specific document types
- High Accuracy - Optimized for your particular data extraction requirements
Model Configuration
NER-specific settings:
- Max Characters - Maximum text length per processing chunk
- Chunk Size - Text segment size for processing (when applicable)
- Chunk Overlap - Overlap between text segments to maintain context
- Entity Labels - Data fields the model can identify and extract
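To show how Chunk Size and Chunk Overlap relate, here is a minimal sketch of overlapping character-based chunking. It is an assumption about how such settings typically work, not a description of the model's internal splitter, and it leaves Max Characters out because its interaction with chunking is not specified here. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into chunks of at most `chunk_size` characters.

    Consecutive chunks share `chunk_overlap` characters so that an entity
    straddling a boundary appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With a chunk size of 1,000 and an overlap of 200, each new chunk starts 800 characters after the previous one, so a 2,500-character text yields three chunks.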
Classification Models
Categorize documents and route them through processing workflows:
Wordvector Classification
Pattern-based document categorization:
- Pattern Matching - Classify based on text patterns and keywords
- Word Vectors - Use semantic text analysis for categorization
- Custom Patterns - Define specific classification rules and conditions
- Routing Logic - Direct documents to appropriate processing steps
Proprietary Classification
Advanced machine learning classification:
- ML-Based - Deep learning models for document categorization
- Multi-Class - Support multiple document types simultaneously
- Confidence Scoring - Provides prediction confidence levels
- Adaptive Learning - Improves accuracy with additional training data
Object Detection Models
Identify visual elements and structures within documents:
Segmentation Models
Divide document pages into logical regions:
- Content Segmentation - Identify headers, paragraphs, tables, images
- Layout Analysis - Understand document structure and organization
- Boundary Detection - Define precise content region boundaries
- Visual Processing - Analyze document images for structural elements
Signature Detection
Identify and locate signatures within documents:
- Signature Recognition - Detect handwritten signatures
- Location Mapping - Provide precise signature coordinates
- Verification Support - Assist in signature validation workflows
- Multiple Signatures - Handle documents with multiple signature areas
Checkbox Detection
Identify and read checkbox states:
- Checkbox Recognition - Detect checkbox elements in forms
- State Analysis - Determine checked/unchecked status
- Form Processing - Extract form data including checkbox responses
- Layout Independence - Handle various checkbox styles and positions
Context Grouping Models
Organize document pages into meaningful groups:
Grouper-CTX Models
Context-aware page grouping:
- Content Analysis - Group pages based on semantic content similarity
- Document Structure - Understand natural document boundaries and sections
- Multi-Page Documents - Handle complex documents with multiple sections
- Relationship Detection - Identify pages belonging to common topics or processes
Model Management
Model Status
Monitor AI model states and training progress:
- Untrained - Model imported but not yet trained on your data
- Training - Model currently learning from your training dataset
- Trained - Model ready for prediction use
- Failed - Model training encountered errors requiring attention
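The four states can be thought of as a small state machine. The sketch below models them with an enum; the allowed transitions (for example, retraining a trained model or retrying after a failure) are assumptions inferred from the descriptions above, not documented behavior.

```python
from enum import Enum


class ModelStatus(Enum):
    UNTRAINED = "Untrained"
    TRAINING = "Training"
    TRAINED = "Trained"
    FAILED = "Failed"


# Assumed transitions; the product may allow others.
ALLOWED_TRANSITIONS = {
    ModelStatus.UNTRAINED: {ModelStatus.TRAINING},
    ModelStatus.TRAINING: {ModelStatus.TRAINED, ModelStatus.FAILED},
    ModelStatus.TRAINED: {ModelStatus.TRAINING},  # retraining
    ModelStatus.FAILED: {ModelStatus.TRAINING},   # retry after fixing data
}


def can_transition(current, target):
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS[current]


def is_ready(status):
    """Only a trained model should be used for predictions."""
    return status is ModelStatus.TRAINED
```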
Training Models
Prepare models for production use:
- Training Data - Accumulate sufficient labeled examples
- Initiate Training - Start model training process with your dataset
- Monitor Progress - Track training status and completion
- Validation - Test trained model accuracy on sample documents
- Deployment - Activate trained model for automated predictions
Model Maintenance
Keep models performing optimally:
- Performance Monitoring - Track prediction accuracy over time
- Retraining - Update models with new training examples
- Version Management - Handle model updates and rollbacks
- Quality Assurance - Regular validation of model performance
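Performance monitoring can be as simple as scoring predictions against reviewed labels and watching the trend. The sketch below is one possible approach under stated assumptions: the 0.9 threshold, the three-run window, and both function names are hypothetical, not product defaults.

```python
def accuracy(pairs):
    """Fraction of (predicted_label, expected_label) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no validation samples")
    correct = sum(1 for predicted, expected in pairs if predicted == expected)
    return correct / len(pairs)


def needs_retraining(history, threshold=0.9, window=3):
    """Return True if the mean of the last `window` scores is below threshold.

    `history` is a chronological list of accuracy scores, oldest first.
    """
    recent = history[-window:]
    if not recent:
        return False
    return sum(recent) / len(recent) < threshold
```

Averaging over a window avoids triggering retraining on a single unusually hard batch of documents.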
Configuration Examples
Extraction Step Setup
Configure NER model for data extraction:
Processing Step: Extract Customer Information
Model Type: Proprietary:RAG-NER
Max Characters: 10,000
Chunk Size: 1,000
Chunk Overlap: 200
Label Mapping:
- Project Label: "customer_name" → Model Label: "person_name"
- Project Label: "customer_email" → Model Label: "email_address"
- Project Label: "phone_number" → Model Label: "phone"
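Because the model predicts its own labels, results have to be translated back into project labels through the mapping. The following sketch shows that translation for the mapping above; the prediction format (a dict with `label` and `text` keys) and the function name are assumptions made for illustration.

```python
# Project label -> model label, as configured in the example above.
LABEL_MAPPING = {
    "customer_name": "person_name",
    "customer_email": "email_address",
    "phone_number": "phone",
}


def to_project_labels(predictions, mapping):
    """Rename model labels in predictions to their project labels.

    Predictions whose model label is not mapped are dropped.
    """
    model_to_project = {model: project for project, model in mapping.items()}
    return [
        {**prediction, "label": model_to_project[prediction["label"]]}
        for prediction in predictions
        if prediction["label"] in model_to_project
    ]
```

Inverting the mapping like this only works when no two project labels share a model label, which is one reason conflicting mappings should be avoided.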
Classification Step Setup
Configure classification model for document routing:
Processing Step: Document Type Classification
Model Type: Proprietary:Classification
Label Mapping:
- Project Label: "invoice" → Model Label: "financial_document"
- Project Label: "contract" → Model Label: "legal_document"
- Project Label: "receipt" → Model Label: "purchase_record"
Routing Rules:
- Invoice → Forward to "Invoice Processing" step
- Contract → Forward to "Legal Review" step
- Receipt → Forward to "Expense Processing" step
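The routing rules above amount to a lookup from predicted document type to the next step. A minimal sketch follows; the confidence threshold and the "Manual Review" fallback step are hypothetical additions, included to show how confidence scores might guard against routing on weak predictions.

```python
# Project label -> next processing step, from the routing rules above.
ROUTES = {
    "invoice": "Invoice Processing",
    "contract": "Legal Review",
    "receipt": "Expense Processing",
}


def route(label, confidence, threshold=0.8, fallback="Manual Review"):
    """Return the name of the step a classified document should go to.

    `threshold` and `fallback` are illustrative assumptions.
    """
    if confidence < threshold:
        # Low-confidence predictions are not trusted for routing.
        return fallback
    # Unknown labels also go to the fallback step.
    return ROUTES.get(label, fallback)
```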
Troubleshooting
Import Issues
Resolve model import problems:
- Model Unavailable - Verify model compatibility with step type
- Permission Errors - Check team access to model library
- Import Failures - Retry the import or contact your system administrator
Mapping Problems
Fix label mapping issues:
- Missing Labels - Ensure all project labels have model counterparts
- Conflicting Mappings - Resolve cases where multiple project labels map to a single model label
- Performance Issues - Review mapping accuracy and adjust as needed
Training Problems
Address model training difficulties:
- Insufficient Data - Accumulate more labeled training examples
- Training Failures - Check training data quality and label consistency
- Poor Performance - Review label mappings and training data balance
Related Documentation
- Prediction Overview - Core prediction concepts and workflow
- Training - Creating and managing training data
- Project Configuration - Step configuration and processing workflows
The AI model system is actively under development. New model types, configuration options, and management features are regularly added.