Andrejus Baranovski
Structured Data Retrieval with Sparrow using OCR and Vision LLM [Improved Accuracy]
I explain the improvements I'm adding to Sparrow to achieve better accuracy for structured data retrieval. The method runs an OCR step first, then constructs an advanced prompt with the OCR data injected. This prompt is sent along with the image to a Vision LLM for structured data retrieval. All of this happens as part of a single pipeline.
Ollama and MLX-VLM Accuracy Review (Qwen3-VL and Mistral Small 3.2)
I ran detailed tests to compare accuracy for the same models (Qwen3-VL and Mistral Small 3.2) running on Ollama and MLX-VLM (recent 0.3.7 version). MLX-VLM runs faster, but with lower accuracy. The same holds across different models.
Comparing Qwen3-VL AI Models for OCR Task
I'm comparing the Qwen3-VL 8B BF16 and Qwen3-VL 30B Q8 models for OCR and structured data extraction tasks. Based on my findings, the quantized 30B model runs faster and with better accuracy than the 8B BF16 model, despite using more memory.
Qwen3-VL Accuracy Differences on Ollama vs MLX
I ran a couple of structured data extraction tests using the newest Qwen3-VL model on a Mac Mini M4 Pro with 64GB. I discovered that the same Qwen3-VL model with the same level of quantization performs differently on Ollama vs. MLX. It seems the model conversion step is crucial, and we must evaluate model performance on different platforms before going to production.
Qwen3-VL New Models Comparison and Performance on Mac Mini M4
I run and compare the newest Qwen3-VL models in Sparrow. Qwen3-VL models run fast and provide good accuracy.
Ollama Support in Sparrow and Update to Latest MLX
I explain what's new in Sparrow and what was updated in the recent version.
Ollama vs MLX Inference Speed on Mac Mini M4 Pro 64GB
MLX runs faster on the first inference, but thanks to model caching or other optimizations in Ollama, the second and subsequent inferences run faster on Ollama.
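To make the first-run vs. later-run comparison concrete, here is a minimal timing sketch. It is not the benchmark from the video: `time_runs` and `fake_infer` are hypothetical names, and `fake_infer` is a stand-in that only simulates a one-time model load with `time.sleep`. In a real test you would pass a function that calls your Ollama or MLX backend.

```python
import time
from typing import Callable


def time_runs(infer: Callable[[], object], runs: int = 3) -> list[float]:
    """Call infer() several times and return the duration of each call in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical stand-in for a real backend call: the first call pays a
# simulated one-time loading cost, later calls do not.
state = {"loaded": False}


def fake_infer() -> str:
    if not state["loaded"]:
        time.sleep(0.1)   # simulated model load on the first call
        state["loaded"] = True
    time.sleep(0.01)      # simulated inference work
    return "{}"


timings = time_runs(fake_infer)
first, later = timings[0], timings[1:]
print(f"first: {first:.3f}s, later: {[round(t, 3) for t in later]}")
```

Reporting the first run separately from the later ones keeps a one-time loading cost from hiding in an average.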
Advanced Structured Data Processing in Sparrow
I added instruction and validation functionality to Sparrow. This makes it possible to apply business logic to document data directly through a Sparrow query. For example, it can check whether given fields are present in the document.
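The field-presence check mentioned above can be illustrated with a small sketch. This is not Sparrow's implementation or query syntax: `missing_fields` and the sample field names are hypothetical, and the check runs on an already extracted dictionary.

```python
def missing_fields(document: dict, required: list[str]) -> list[str]:
    """Return the required field names that are absent or empty in the extracted data."""
    return [name for name in required if document.get(name) in (None, "", [])]


# Hypothetical extraction result, as a Vision LLM might return it.
extracted = {"invoice_number": "INV-001", "total": "56,000", "due_date": ""}

missing = missing_fields(extracted, ["invoice_number", "total", "due_date", "currency"])
print(missing)
```

Here an empty string counts as missing, which is a design choice: a field the model returned without a value is usually as unhelpful as a field it skipped.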
My Experience with PyCharm AI Assistant
I explain my experience with PyCharm AI Assistant and show an example of how code changes can be reviewed one by one before they are accepted into your codebase.
Financial Table Structure Analysis with Computer Vision
I explain new functionality I'm implementing in Sparrow to pre-process tables with a grid structure. This greatly improves table data extraction by Vision LLMs.
PaddleOCR 3.1 Setup in FastAPI
I explain how to run PaddleOCR 3.1 from a FastAPI app.
Structured Data Query with Sparrow AI Agent
Sparrow comes with an option to extract structured data with a query. In this video I explain how you can define such a query to fetch array and field data.
Vision LLM with MLX: Extracting Electric Meter Data in Production
In this video, I share my experience using the MLX backend to run Vision LLM (with MLX-VLM) for structured data extraction in a production environment. See how I used Sparrow to accurately read electric meter data and learn practical tips for deploying similar solutions.
Solving Upwork Client Task with Sparrow
I show how Sparrow can be used to handle a complex Upwork task with accurate table data extraction. The key requirement is to prevent Vision LLM hallucinations, which is achieved by Sparrow's hybrid data processing approach.
How to Extract Financial Statement Data with Sparrow & Vision LLM
Extract financial statement data with Sparrow and Vision LLM in this quick tutorial! Sparrow auto-detects tables, builds clear grids, and uses OCR for accurate Vision LLM results, preventing errors. Runs locally with no cloud dependency, making it great for private financial documents. Perfect for anyone handling sensitive financial data.
Boost Vision LLM Accuracy with OCR Text Integration
I show an interesting approach where I send both an image and OCR text to a Vision LLM. The prompt is constructed to instruct the Vision LLM to prioritize the OCR text. This allows the use of a Vision LLM for structured output construction while relying on external OCR text, giving you more control over the results.
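As a rough illustration of the prompt construction described above, here is a minimal sketch. The wording of the instructions, the `build_prompt` name, and the sample query are assumptions for illustration and are not the prompt Sparrow actually sends.

```python
def build_prompt(query: str, ocr_lines: list[str]) -> str:
    """Combine an extraction query with OCR text that the model is told to prioritize."""
    # Drop empty OCR lines and surrounding whitespace before injecting the text.
    ocr_block = "\n".join(line.strip() for line in ocr_lines if line.strip())
    return (
        "Extract the data described by the query and answer with JSON only.\n"
        "OCR text for the same image is given below. When the image and the "
        "OCR text disagree, prefer the OCR text.\n\n"
        f"Query:\n{query}\n\n"
        f"OCR text:\n{ocr_block}\n"
    )


# Hypothetical query and OCR output.
prompt = build_prompt('{"invoice_total": "str"}', ["Invoice total", " 56,000 ", ""])
print(prompt)
```

The resulting string would be sent to the Vision LLM together with the image, so the model still sees the layout but takes the characters from the OCR text.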
Solving Vision LLM Number Formatting Issues Using PaddleOCR and Sparrow
Discover how to fix number formatting errors in Vision LLMs like Mistral! In this video, I show how Mistral misreads "56,000" as "56000" and how combining PaddleOCR’s text extraction with Sparrow’s spatial data processing solves this hallucination issue.
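One simple way to picture the fix is to match the digits in the LLM output against the OCR tokens and keep the OCR formatting. The sketch below is a simplified, text-only illustration and not Sparrow's spatial processing; `digits_only` and `restore_formatting` are hypothetical helpers.

```python
import re


def digits_only(text: str) -> str:
    """Strip everything except digits, so '56,000' and '56000' compare equal."""
    return re.sub(r"\D", "", text)


def restore_formatting(llm_value: str, ocr_tokens: list[str]) -> str:
    """Return the OCR token with the same digits as llm_value.

    Falls back to llm_value when there is no match or when several
    differently formatted tokens match, since the choice would be a guess.
    """
    target = digits_only(llm_value)
    if not target:
        return llm_value
    matches = {token for token in ocr_tokens if digits_only(token) == target}
    return matches.pop() if len(matches) == 1 else llm_value


# Hypothetical OCR tokens from the same document.
fixed = restore_formatting("56000", ["Total", "56,000", "12.5"])
print(fixed)
```

Comparing digits only is a deliberate simplification: it also ignores decimal points, so a production version would need stricter matching for values such as "5.60" and "560".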
PaddleOCR 3.0: Supercharge Your AI
I upgraded to PaddleOCR 3.0 and explain the new PaddleOCR API integration. My goal is to integrate OCR result output with Vision LLM processing to enhance large-scale, structured table data output.
Box Annotations in Sparrow for Structured Data Extraction
Check out my video on Box Annotations in Sparrow for Structured Data Extraction! I’ll show you how the Qwen2.5 vision model pulls bounding box annotations from images based on what you need. Plus, I show how to create boxes with simple descriptions and confidence scores.
Structured Data Annotation with Qwen2.5 VL and MLX-VLM
Qwen2.5 VL can provide bounding box coordinates and confidence values for extracted structured data. This is useful for visual data review and reporting. I will explain with a practical example what prompt should be used to ensure Qwen2.5 returns this data.


