Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
Added
- Support for UI languages (English en, Spanish es, French fr, German de) with a language switcher.
- Multilingual MkDocs documentation (English, Spanish, French, German) under /api/docs/site/<locale>/.
- Category labels in the dropzone panel (Documents, Web, Images, Data) are now fully internationalized.
Fixed
- Documentation navigation now shows fully localized page names in all supported languages.
- File format category labels in the dropzone panel are now translated correctly based on the selected language.
- Improved extraction of documentation page titles, with better fallback to translated names.
Changed
- Backend entry point renamed from app.py to duckling.py for clarity.
- Flask application name changed to "duckling" (shows "Serving Flask app 'duckling'").
[0.0.10] - 2026-02-24
Security
- Fixed frontend security vulnerabilities (esbuild GHSA-67mh-4wv8-2f99): updated Vite 5→7, Vitest 1→4, and related dependencies.
Fixed
- Updated vitest.config.ts for Vitest 4 compatibility.
- Updated CI/CD Node.js version requirement to 22 (required for Vite 7).
[0.0.9] - 2026-01-08
Added
- Custom Branding: Duckling logo and version display in the header.
- URL-Based Document Conversion: Convert documents from URLs, with automatic image extraction for HTML.
- Document Enrichment Options: Code, formula, image classification, and image description.
- Enrichment Model Pre-Download: Download AI models before processing.
- Image Preview Gallery: Visual thumbnails with a lightbox viewer.
- OCR Backend Auto-Installation: One-click installation for pip-installable backends.
- Format-Specific Preview: The preview panel shows content in the selected export format.
- Rendered vs. Raw Preview Mode: Toggle for HTML and Markdown.
- Enhanced Docker Support: Multi-stage Dockerfiles, docker-compose variants, multi-platform builds.
Fixed
- Multi-worker content retrieval (images, tables, results).
- HTML preview in the UI.
- URL image extraction for unquoted src attributes.
- Documentation panel now serves the prebuilt MkDocs site.
- Environment variables and .env loading.
- Case-insensitive file extensions.
- Confidence score and OCR backend selection.
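As an illustration of the unquoted src attribute fix listed above, here is a minimal sketch of extracting image sources from HTML whether the attribute value is double-quoted, single-quoted, or unquoted. The pattern and the function name `extract_image_sources` are hypothetical and are not taken from the Duckling codebase; a real implementation may well use an HTML parser instead of a regular expression.

```python
import re

# Hypothetical pattern (an assumption, not Duckling's actual code).
# Matches src="...", src='...', and unquoted src=... inside <img> tags.
# The lookbehind keeps attributes such as data-src from being picked up.
IMG_SRC = re.compile(
    r"""<img\b[^>]*?(?<![\w-])src\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""",
    re.IGNORECASE,
)


def extract_image_sources(html: str) -> list[str]:
    """Return the src value of every <img> tag, in document order."""
    sources = []
    for match in IMG_SRC.finditer(html):
        # Exactly one of the three groups participates in each match.
        sources.append(next(g for g in match.groups() if g is not None))
    return sources
```

For example, `extract_image_sources('<img src=a.png><img src="b.png">')` would yield both `a.png` and `b.png`.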
[0.0.8] - 2026-01-07
Changed
- Renamed: Project renamed from "Docling UI" to "Duckling"
- Updated all documentation, code, and configuration files
- Branding updated throughout the application
[0.0.7] - 2026-01-07
Added
- MkDocs Documentation: Migrated documentation to MkDocs with Material theme
- Modern, searchable documentation site
- Dark/light theme toggle
- Mermaid diagram support
- Improved navigation and organization
Changed
- Documentation structure reorganized for better navigation
- All diagrams converted to Mermaid format for live rendering
[0.0.6] - 2025-12-11
Security
- CRITICAL: Fixed Flask debug mode enabled by default in production
- Debug mode now controlled by FLASK_DEBUG environment variable (default: false)
- Host binding defaults to 127.0.0.1 instead of 0.0.0.0
- HIGH: Updated vulnerable dependencies
- flask-cors: 4.0.0 → 6.0.0 (CVE-2024-1681, CVE-2024-6844, CVE-2024-6866, CVE-2024-6839)
- gunicorn: 21.2.0 → 23.0.0 (CVE-2024-1135, CVE-2024-6827)
- werkzeug: 3.0.1 → 3.1.4 (CVE-2024-34069, CVE-2024-49766, CVE-2024-49767, CVE-2025-66221)
- MEDIUM: Added path traversal protection to file serving endpoints
- Image serving validates that paths don't escape allowed directories
- Blocks directory traversal sequences (..)
- MEDIUM: Enhanced SQL query sanitization
- Search queries now escape LIKE wildcards
- Added query length limits
- Added comprehensive SECURITY.md with:
- Security audit summary
- Production deployment checklist
- Environment variable documentation
- Vulnerability reporting guidelines
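The two MEDIUM items above (path traversal protection and LIKE wildcard escaping) can be sketched roughly as follows. This is an assumption-laden illustration, not the project's actual code: the function names `resolve_inside` and `escape_like`, the 200-character limit, and the choice to reject over-long queries rather than truncate them are all hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Resolve `requested` under `base_dir`; raise if it escapes that directory.

    Hypothetical helper. Resolving first means ".." segments, absolute
    paths, and symlinks are all judged by where they actually point.
    """
    base = Path(base_dir).resolve()
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes allowed directory: {requested!r}")
    return candidate


def escape_like(term: str, max_length: int = 200, escape: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally.

    Hypothetical helper; the length limit of 200 is an assumed value.
    """
    if len(term) > max_length:
        raise ValueError("search query too long")
    # Escape the escape character first, then the two LIKE wildcards.
    return (
        term.replace(escape, escape * 2)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )
```

With SQLite, the escaped term would be passed as a bound parameter alongside an explicit escape clause, for example `WHERE name LIKE ? ESCAPE '\'` with the parameter `"%" + escape_like(query) + "%"`.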
Changed
- Backend now uses environment variables for all security-sensitive configuration
- Default host changed from 0.0.0.0 to 127.0.0.1 for safer defaults
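A minimal sketch of what environment-driven, safe-by-default configuration could look like. FLASK_DEBUG and the 127.0.0.1 default come from the entries above; the variable name `FLASK_HOST`, the helper names, and the set of accepted truthy strings are assumptions made for illustration only.

```python
import os
from typing import Mapping


def env_flag(name: str, default: bool = False,
             environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    raw = environ.get(name)
    if raw is None:
        return default
    # Assumed set of truthy spellings; anything else counts as false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def server_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Build server settings that stay safe when nothing is configured."""
    return {
        # Debug stays off unless FLASK_DEBUG is explicitly enabled.
        "debug": env_flag("FLASK_DEBUG", default=False, environ=environ),
        # FLASK_HOST is a hypothetical variable name; loopback is the default.
        "host": environ.get("FLASK_HOST", "127.0.0.1"),
    }
```

The resulting dictionary could then be passed to Flask's `app.run(host=..., debug=...)`, so that exposing the server on all interfaces requires an explicit opt-in.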
[0.0.5] - 2025-12-10
Added
- Batch Processing: Upload and convert multiple files at once
- Toggle batch mode in the header
- Process multiple documents simultaneously
- Image & Table Extraction:
- Extract embedded images from documents
- Download individual images
- Extract tables with full data preservation
- Export tables to CSV format
- View table previews in the UI
- RAG/Chunking Support:
- Document chunking for RAG applications
- Configurable max tokens per chunk (64-8192)
- Merge peers option for undersized chunks
- Download chunks as JSON
- Additional Export Formats:
- Document Tokens (.tokens.json)
- RAG Chunks (.chunks.json)
- Enhanced DocTags export
- Advanced OCR Options:
- Multiple OCR backends: EasyOCR, Tesseract, macOS Vision, RapidOCR
- GPU acceleration support (EasyOCR)
- Configurable confidence threshold (0-1)
- Bitmap area threshold control
- Support for 28+ languages
- Table Structure Options:
- Fast vs Accurate detection modes
- Cell matching configuration
- Structure extraction toggle
- Image Generation Options:
- Generate page images
- Extract picture images
- Extract table images
- Configurable image scale (0.1x - 4.0x)
- Performance/Accelerator Options:
- Device selection: Auto, CPU, CUDA, MPS (Apple Silicon)
- Thread count configuration (1-32)
- Document timeout setting
- New API Endpoints:
- POST /api/convert/batch - Batch conversion
- GET /api/convert/<job_id>/images - List extracted images
- GET /api/convert/<job_id>/images/<id> - Download image
- GET /api/convert/<job_id>/tables - List extracted tables
- GET /api/convert/<job_id>/tables/<id>/csv - Download table CSV
- GET /api/convert/<job_id>/tables/<id>/image - Download table image
- GET /api/convert/<job_id>/chunks - Get document chunks
- GET/PUT /api/settings/performance - Performance settings
- GET/PUT /api/settings/chunking - Chunking settings
- GET /api/settings/formats - List all supported formats
Changed
- Settings Panel: Completely redesigned with all new options
- Export Options: Enhanced with tabs for different content types
- DropZone: Updated with format categories and batch mode support
- Converter Service: Major refactoring for dynamic pipeline options
Fixed
- Confidence score calculation now uses cluster-level predictions
- Better handling of partial conversion success
[0.0.4] - 2025-12-10
Added
- OCR Support: Full OCR integration using EasyOCR
- Support for 14+ languages
- Force Full Page OCR option
- Configurable OCR settings
- Improved Confidence Calculation: Average confidence from layout predictions
Changed
- Updated converter service to use configurable pipeline options
- Enhanced settings panel with OCR options
[0.0.3] - 2025-12-10
Added
- Initial release of Duckling
- Frontend Features:
- Drag-and-drop file upload
- Real-time conversion progress
- Multi-format export options
- Settings panel
- Conversion history panel
- Dark theme with teal accent
- Responsive design
- Animated transitions
- Backend Features:
- Flask REST API with CORS
- Async document conversion
- SQLite-based history
- File upload management
- Configurable settings
- Health check endpoint
- Supported Input Formats:
- PDF, Word, PowerPoint, Excel
- HTML, Markdown, CSV
- Images (PNG, JPG, TIFF, etc.)
- AsciiDoc, XML
- Export Formats:
- Markdown, HTML, JSON
- DocTags, Plain Text
- Developer Experience:
- Comprehensive test suites
- Docker support
- TypeScript
- ESLint configuration
Security
- Input validation for file uploads
- File type restrictions
- Maximum file size limits
- Secure filename handling
[Unreleased]
Planned
- User authentication
- Cloud storage integration
- Conversion templates
- API rate limiting
- WebSocket for real-time updates
- Dark/light theme toggle
- Keyboard shortcuts
- Accessibility improvements (WCAG 2.1)