Scientific Publications
Research & Benchmarks
Review the compiler papers, accuracy benchmarks, and architectural designs written by the Devorise AI engineering research units.
Research Papers Index
Publication Abstract & Telemetry
Context-Aware Arabic Legal OCR via Fine-Tuned Layout Transformers
Devorise AI Foundation Unit
Published: March 2026
Abstract Overview
Traditional optical character recognition (OCR) fails on complex Arabic legal scans that contain varied font sizes, official stamps, and signature marks. This paper proposes a fine-tuned layout transformer that extracts document text blocks while preserving their logical hierarchy. The pipeline integrates a language-model alignment checker that corrects parsing discrepancies before any changes are committed to the database.
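The check-before-commit step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names TextBlock, check_alignment, and commit_if_aligned, the per-block confidence score, and the 0.9 threshold are all hypothetical, and a simple rule-based check stands in for the language-model alignment checker.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    level: int         # hypothetical hierarchy depth: 0 = title, 1 = article, 2 = clause
    confidence: float  # hypothetical per-block score from the layout model


def check_alignment(blocks, min_confidence=0.9):
    """Return the indices of blocks that need correction before a commit.

    Rule-based stand-in for the alignment checker: a block is flagged when
    its confidence is low or when it skips a hierarchy level relative to
    the previous block.
    """
    flagged = []
    prev_level = 0
    for i, block in enumerate(blocks):
        skipped_level = block.level > prev_level + 1
        if block.confidence < min_confidence or skipped_level:
            flagged.append(i)
        prev_level = block.level
    return flagged


def commit_if_aligned(blocks, database):
    """Append blocks to the database only when no block is flagged."""
    flagged = check_alignment(blocks)
    if flagged:
        return flagged  # nothing is written; the caller corrects and retries
    database.extend(blocks)
    return []
```

A caller would pass the extracted blocks and a list standing in for the database; a non-empty return value means the page was held back for correction and nothing was written.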
Performance Benchmarks
Layout Accuracy: 99.42%
Parse Speed (Per Page): 115 ms
Error Reduction vs. Baseline: 14.2x
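The three figures can be related with some quick arithmetic. The sketch below rests on two assumptions that the page does not state: that the error reduction is the ratio of the baseline's layout error rate to the model's, measured on the same metric, and that pages are parsed one at a time. The baseline values are implied by those assumptions, not reported.

```python
layout_accuracy_pct = 99.42   # reported
parse_ms_per_page = 115       # reported
error_reduction = 14.2        # reported

# Model error rate follows directly from the reported accuracy.
model_error_pct = 100 - layout_accuracy_pct

# Assumption: error_reduction = baseline error rate / model error rate.
implied_baseline_error_pct = model_error_pct * error_reduction
implied_baseline_accuracy_pct = 100 - implied_baseline_error_pct

# Assumption: sequential, single-stream parsing.
pages_per_second = 1000 / parse_ms_per_page

print(f"model error: {model_error_pct:.2f}%")                          # 0.58%
print(f"implied baseline error: {implied_baseline_error_pct:.2f}%")    # 8.24%
print(f"implied baseline accuracy: {implied_baseline_accuracy_pct:.2f}%")  # 91.76%
print(f"throughput: {pages_per_second:.2f} pages/s")                   # 8.70 pages/s
```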
spec_manifest.yaml
model_metadata:
  architecture: "LayoutLMv3-Arabic-v2"
  fine_tuned_epochs: 18
  context_window: 2048
  validation_strategy: "checksum_verification"
  dataset_size: "124k_Arabic_Legal_Scans"
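As a sketch of how such a manifest might be read into typed fields, the example below parses the five values with the standard library only. It assumes the fields are nested one level under model_metadata; ModelMetadata and parse_manifest are hypothetical names, and the parser handles only this flat key/value shape, not general YAML.

```python
from dataclasses import dataclass

# Manifest text, assuming the fields nest under model_metadata.
MANIFEST = '''\
model_metadata:
  architecture: "LayoutLMv3-Arabic-v2"
  fine_tuned_epochs: 18
  context_window: 2048
  validation_strategy: "checksum_verification"
  dataset_size: "124k_Arabic_Legal_Scans"
'''


@dataclass(frozen=True)
class ModelMetadata:
    architecture: str
    fine_tuned_epochs: int
    context_window: int
    validation_strategy: str
    dataset_size: str


def parse_manifest(text: str) -> ModelMetadata:
    """Parse the key/value lines indented under model_metadata."""
    fields = {}
    in_section = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line.startswith(" "):
            # A top-level key opens or closes the section of interest.
            in_section = line.strip() == "model_metadata:"
            continue
        if in_section:
            key, _, value = line.strip().partition(":")
            fields[key.strip()] = value.strip().strip('"')
    return ModelMetadata(
        architecture=fields["architecture"],
        fine_tuned_epochs=int(fields["fine_tuned_epochs"]),
        context_window=int(fields["context_window"]),
        validation_strategy=fields["validation_strategy"],
        dataset_size=fields["dataset_size"],
    )


meta = parse_manifest(MANIFEST)
```

Converting the numeric fields to int at load time means a malformed epoch count or context window raises an error immediately instead of surfacing later in the pipeline.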