Scientific Publications

Research & Benchmarks

Review the compiler papers, accuracy benchmarks, and architectural designs written by the Devorise AI engineering research units.

Research Papers Index

Publication Abstract & Telemetry

Context-Aware Arabic Legal OCR via Fine-Tuned Layout Transformers

Devorise AI Foundation Unit
Published: March 2026
Abstract Overview

Traditional optical character recognition (OCR) fails on complex Arabic legal scans that contain varied font sizes, official stamps, and signature marks. This paper proposes a fine-tuned layout transformer that extracts text blocks from a document while preserving their logical hierarchy. The pipeline integrates a language-model alignment checker that corrects parsing discrepancies before any changes are committed to the database.
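The check-before-commit flow described in the abstract can be sketched roughly as follows. This is an illustration only, not the paper's implementation: `TextBlock`, `alignment_check`, `commit_if_clean`, the `level` hierarchy scheme, the 0.8 threshold, and the `toy_score` stand-in for a language model are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """One extracted block; `level` is its depth in the document hierarchy."""
    text: str
    level: int  # hypothetical scheme: 0 = title, 1 = article, 2 = clause


def toy_score(text):
    """Stand-in for a language model: fraction of letters and spaces in the text."""
    if not text:
        return 0.0
    return sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)


def alignment_check(blocks, score_fn, threshold=0.8):
    """Split blocks into (accepted, flagged) by comparing each score to the threshold."""
    accepted, flagged = [], []
    for block in blocks:
        if score_fn(block.text) >= threshold:
            accepted.append(block)
        else:
            flagged.append(block)
    return accepted, flagged


def commit_if_clean(blocks, score_fn, database, threshold=0.8):
    """Append blocks to `database` only if none are flagged; return the flagged blocks."""
    accepted, flagged = alignment_check(blocks, score_fn, threshold)
    if flagged:
        return flagged  # nothing is written when any block fails the check
    database.extend(accepted)
    return []


if __name__ == "__main__":
    db = []
    page = [TextBlock("المادة الأولى", 1), TextBlock("#@!%", 2)]
    print(len(commit_if_clean(page, toy_score, db)), len(db))
```

The all-or-nothing commit is a design choice made for this sketch: a page with even one suspicious block is held back for correction rather than partially written.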

Performance Benchmarks
Layout Accuracy: 99.42%
Parse Speed (Per Page): 115 ms
Error Reduction vs. Baseline: 14.2x
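To put the reported figures in more familiar terms, the short sketch below converts the per-page parse time into throughput and derives the baseline error rate the numbers would imply. The second calculation rests on an assumption the page does not confirm: that the 14.2x reduction applies to the layout error rate (1 minus layout accuracy). The function names are illustrative.

```python
PARSE_MS_PER_PAGE = 115    # reported parse speed per page
LAYOUT_ACCURACY = 0.9942   # reported layout accuracy
ERROR_REDUCTION = 14.2     # reported error reduction vs. baseline


def pages_per_second(ms_per_page):
    """Sequential throughput of a single worker."""
    return 1000.0 / ms_per_page


def implied_baseline_error(accuracy, reduction):
    """Baseline error rate, ASSUMING the reduction factor applies to layout error."""
    return (1.0 - accuracy) * reduction


if __name__ == "__main__":
    print(f"throughput: {pages_per_second(PARSE_MS_PER_PAGE):.2f} pages/s")
    baseline = implied_baseline_error(LAYOUT_ACCURACY, ERROR_REDUCTION)
    print(f"implied baseline layout error: {baseline:.2%}")
```

Under that assumption, a 0.58% layout error rate would correspond to a baseline error rate of roughly 8.2%, and 115 ms per page works out to about 8.7 pages per second on a single sequential worker.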
spec_manifest.yaml
model_metadata:
  architecture: "LayoutLMv3-Arabic-v2"
  fine_tuned_epochs: 18
  context_window: 2048
  validation_strategy: "checksum_verification"
  dataset_size: "124k_Arabic_Legal_Scans"
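A manifest like this could be loaded and sanity-checked before use. The sketch below is one possible approach using only the Python standard library. `ModelMetadata`, `parse_flat_manifest`, and `validate` are hypothetical names, and the parser handles only the flat `key: value` shape shown above; it is not a general YAML parser.

```python
from dataclasses import dataclass

MANIFEST_TEXT = '''\
model_metadata:
  architecture: "LayoutLMv3-Arabic-v2"
  fine_tuned_epochs: 18
  context_window: 2048
  validation_strategy: "checksum_verification"
  dataset_size: "124k_Arabic_Legal_Scans"
'''


@dataclass(frozen=True)
class ModelMetadata:
    architecture: str
    fine_tuned_epochs: int
    context_window: int
    validation_strategy: str
    dataset_size: str


def parse_flat_manifest(text):
    """Read the indented `key: value` lines under `model_metadata:`."""
    fields = {}
    in_section = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line.startswith(" "):
            # A top-level key opens or closes the section of interest.
            in_section = line.strip() == "model_metadata:"
            continue
        if in_section:
            key, _, raw = line.strip().partition(":")
            raw = raw.strip()
            # Quoted values stay strings; bare values are treated as integers.
            fields[key] = raw.strip('"') if raw.startswith('"') else int(raw)
    return ModelMetadata(**fields)


def validate(meta):
    """Return a list of human-readable problems; an empty list means the manifest looks sane."""
    problems = []
    if not meta.architecture:
        problems.append("architecture is empty")
    if meta.fine_tuned_epochs <= 0:
        problems.append("fine_tuned_epochs must be positive")
    if meta.context_window <= 0:
        problems.append("context_window must be positive")
    return problems


if __name__ == "__main__":
    meta = parse_flat_manifest(MANIFEST_TEXT)
    print(meta.architecture, validate(meta))
```

In a real project a full YAML library would be the more robust choice; the hand-rolled parser here only keeps the example dependency-free.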