Andrejus Baranovski
Secure and Private: On-Premise Invoice Processing with LangChain and Ollama RAG
The Ollama desktop tool helps run LLMs locally on your machine. This tutorial explains how I implemented a pipeline with LangChain and Ollama for on-premise invoice processing. Running an LLM on-premise provides many advantages in terms of security and privacy. Ollama works similarly to Docker; you can think of it as Docker for LLMs. You can pull and run multiple LLMs, which makes it possible to switch between them without changing the RAG pipeline.
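A minimal sketch of the model-switching idea (not the exact code from the video), assuming Ollama is running locally and the models were pulled beforehand:

```python
# Point LangChain at a locally running Ollama server; switching models
# is just a matter of changing the model name.
from langchain.llms import Ollama

# Assumes `ollama pull llama2` (or `ollama pull mistral`) was run first
llm = Ollama(model="llama2")  # swap to model="mistral" without touching the pipeline
print(llm("Extract the invoice number from: Invoice No. INV-2024-001"))
```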
Easy-to-Follow RAG Pipeline Tutorial: Invoice Processing with ChromaDB & LangChain
I explain the implementation of the pipeline to process invoice data from PDF documents. The data is loaded into Chroma DB's vector store. Through the LangChain API, the data from the vector store is ready to be consumed by the LLM as part of the RAG infrastructure.
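A sketch of what such a pipeline can look like, assuming a local `invoice.pdf` and a running Ollama instance; file names and chunking parameters are assumptions, not the author's exact code:

```python
# Load a PDF invoice, split it into chunks, embed into Chroma,
# and answer questions over it with a local LLM via RetrievalQA.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import Ollama
from langchain.chains import RetrievalQA

docs = PyPDFLoader("invoice.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
vectordb = Chroma.from_documents(chunks, HuggingFaceEmbeddings())
qa = RetrievalQA.from_chain_type(llm=Ollama(model="mistral"), retriever=vectordb.as_retriever())
print(qa.run("What is the invoice total?"))
```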
Vector Database Impact on RAG Efficiency: A Simple Overview
I explain the importance of the Vector DB for RAG implementation. With a simple example, I show how data retrieval from the Vector DB can affect LLM performance. Before data is sent to the LLM, you should verify that quality data was fetched from the Vector DB.
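A sketch of that verification step, assuming `vectordb` is a populated LangChain vector store (e.g. the Chroma store from the previous sketch); the query text is illustrative:

```python
# Inspect what the Vector DB actually returns before it reaches the LLM.
query = "What is the invoice total?"
results = vectordb.similarity_search_with_score(query, k=3)
for doc, score in results:
    print(f"score={score:.4f}  chunk={doc.page_content[:80]!r}")
# If none of the top chunks contain the invoice total, the LLM answer
# will be wrong regardless of how good the model is.
```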
JSON Output from Mistral 7B LLM [LangChain, Ctransformers]
I explain how to compose a prompt for the Mistral 7B model running with LangChain and Ctransformers to retrieve the output as a JSON string, without any additional text.
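A sketch of the idea, not the exact prompt from the video; the model repo, file name, and field list are assumptions:

```python
# Run a quantized Mistral 7B on CPU via Ctransformers and pin it to
# JSON-only output with an explicit instruction in the prompt.
from langchain.llms import CTransformers

llm = CTransformers(
    model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
)
prompt = """[INST] Extract the fields below from the invoice text.
Respond with a JSON string only, no explanations or extra text.
Fields: invoice_number, invoice_date, total.
Invoice text: {text} [/INST]"""
print(llm(prompt.format(text="Invoice No. 123, dated 2023-10-01, total 99.50 EUR")))
```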
Structured JSON Output from LLM RAG on Local CPU [Weaviate, Llama.cpp, Haystack]
I explain how to get structured JSON output from LLM RAG running with the Haystack API on top of Llama.cpp. Vector embeddings are stored in a Weaviate database, the same as in my previous video. When extracting data, a structured JSON response is preferred because we are not interested in additional descriptions.
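A sketch of the JSON-only approach (the exact Haystack/Llama.cpp wiring is not shown here; keys and wording are assumptions):

```python
import json

# The instruction pins the model to JSON-only output, and the response is
# parsed immediately to catch any extra prose the model might still add.
prompt = """Given the invoice context below, extract the requested fields.
Return a valid JSON string only, with no additional description.
Keys: invoice_number, date, client, total.
Context: {context}
JSON:"""

def parse_llm_json(raw: str) -> dict:
    # Fails loudly if the model wrapped the JSON in extra text
    return json.loads(raw.strip())
```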
Invoice Data Processing with Llama2 13B LLM RAG on Local CPU [Weaviate, Llama.cpp, Haystack]
I explain how to set up a local LLM RAG pipeline to process invoice data with Llama2 13B. Based on my experiments, Llama2 13B works better with tabular data than the Mistral 7B model. This example presents a production LLM RAG setup with a Weaviate database for vector embeddings, Haystack for the LLM API, and Llama.cpp to run Llama2 13B on a local CPU.
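A sketch of the retrieval side of such a setup using the Haystack 1.x API; hosts, the embedding model name, and the Llama.cpp wiring are assumptions:

```python
# Weaviate stores the invoice embeddings; Haystack retrieves the
# most relevant chunks before they are passed to the LLM.
from haystack.document_stores import WeaviateDocumentStore
from haystack.nodes import EmbeddingRetriever

document_store = WeaviateDocumentStore(host="http://localhost", port=8080)
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
)
# Top-k invoice chunks retrieved here go into the Llama2 13B prompt
docs = retriever.retrieve("invoice line items and totals", top_k=3)
```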
Invoice Data Processing with Mistral LLM on Local CPU
I explain the solution to extract invoice document fields with the open-source LLM Mistral. It runs on a CPU and doesn't require a Cloud machine. I'm using the Mistral 7B model, LangChain, Ctransformers, and a FAISS vector store to run it on a local CPU machine. This approach gives a great advantage for enterprise systems, where running ML models on the Cloud is not allowed for privacy reasons.
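A minimal sketch of the CPU-only stack, assuming a local GGUF model file and pre-split invoice text; paths and sample text are placeholders:

```python
# FAISS holds the invoice chunks in memory; a quantized Mistral 7B
# answers questions over them, entirely on a local CPU.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import RetrievalQA

chunks = ["Invoice No. 123 issued 2023-10-01", "Total due: 99.50 EUR"]  # pre-split invoice text
index = FAISS.from_texts(chunks, HuggingFaceEmbeddings())
llm = CTransformers(model="path/to/mistral-7b-instruct.Q4_K_M.gguf", model_type="mistral")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever())
print(qa.run("What is the total due?"))
```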
Skipper MLOps Debugging and Development on Your Local Machine
I explain how to stop some of the Skipper MLOps services running in Docker and debug/develop their code locally. This improves the development workflow: there is no need to deploy a code change to a Docker container, as it can be tested locally. A service that runs locally connects to the Skipper infra through a RabbitMQ queue.
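A heavily hedged sketch of the idea: a locally run service attaching to the same RabbitMQ broker the Dockerized services use. The queue name, host, and message handling are placeholders, not Skipper's actual configuration:

```python
# Consume messages from the shared queue so the service logic can be
# debugged locally instead of inside a Docker container.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="skipper.workflow", durable=True)  # hypothetical queue name

def on_message(ch, method, properties, body):
    print("received:", body)  # set breakpoints here during local debugging
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="skipper.workflow", on_message_callback=on_message)
channel.start_consuming()
```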
Pros and Cons of Developing Your Own ChatGPT Plugin
I've been running a ChatGPT plugin in production for a month and share my thoughts about the pros and cons of developing it. Would I build a new ChatGPT plugin?
LLama 2 LLM for PDF Invoice Data Extraction
I show how you can extract data from a text PDF invoice using the LLama 2 model running on a free Colab GPU instance. Specifically, I explain how you can improve data retrieval using carefully crafted prompts.
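A sketch of the prompt-crafting idea, assuming a Llama 2 chat model is already loaded (e.g. via transformers on a Colab GPU); this is not the exact prompt from the video:

```python
# Constrain the model: use only the document text, handle missing fields
# explicitly, and fix the output format so parsing stays simple.
prompt = """<s>[INST] You are an assistant that extracts invoice fields.
Use only the document text; if a field is missing, return "N/A".
Return the fields as "field: value" lines, nothing else.
Document: {document} [/INST]"""
```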
Data Filtering and Aggregation with Receipt Assistant Plugin for ChatGPT
I explain the Receipt Assistant plugin for ChatGPT from a user perspective. I show how to fetch previously processed and saved receipt data, including filtering and aggregation. I also show how you can fix spelling mistakes in Lithuanian-language receipt items. At the end, numeric data is visualized with the WizeCharts plugin for ChatGPT.
Computer Vision with ChatGPT - Receipt Assistant Plugin
Our plugin, Receipt Assistant, was approved for the ChatGPT plugin store. I explain how it works and how to use it in combination with other plugins, for example, to display charts. Receipt Assistant provides vision and storage options for ChatGPT. It is primarily tuned to work with receipts, but it can handle any structured info of medium complexity.
How to Host FastAPI from Your Computer with ngrok
With ngrok you can host your FastAPI app from your computer. This can be a handy and cheaper option for some projects. In this video, I explain my experience running FastAPI apps from my very own Cloud with ngrok :)
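A minimal sketch of the setup; the route and port are illustrative:

```python
# Serve a FastAPI app locally, then expose it with `ngrok http 8000`
# from a second terminal to get a public HTTPS URL.
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
```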
ChatGPT Plugin OAuth with Logto
You can set up OAuth for a ChatGPT plugin to be able to get user info. This is needed when the plugin works with user data and you want to keep that data across sessions. With OAuth you can authenticate users. I explain how to set it up with Logto. Logto is an Auth0 alternative for building modern customer identity infrastructure with minimal effort, for both your customers and their organizations.
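A sketch of the OAuth section of the plugin's ai-plugin.json manifest, shown here as a Python dict; the Logto endpoint paths and scopes are assumptions, not verified configuration:

```python
# Mirrors the "auth" block of ai-plugin.json for an OAuth-enabled plugin.
auth_section = {
    "type": "oauth",
    "client_url": "https://<your-logto-domain>/oidc/auth",          # authorization endpoint (assumed)
    "authorization_url": "https://<your-logto-domain>/oidc/token",  # token exchange endpoint (assumed)
    "scope": "openid profile",
    "authorization_content_type": "application/x-www-form-urlencoded",
    "verification_tokens": {"openai": "<token issued by OpenAI>"},
}
```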
Deploy Local ML Apps with Ngrok
Ngrok helps you run your local apps online with access on the Web. It provides HTTPS with auto-renewal and content compression. With Ngrok you can serve ML apps running on local infra to external users, just as if they were running on the Cloud. The main advantage of this approach is that it reduces infra cost.
ChatGPT Plugin Backend with FastAPI
This tutorial explains how to integrate a FastAPI backend with a ChatGPT plugin implemented in Python. The backend stores data from ChatGPT in MongoDB, making it persistent and available across sessions.
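A minimal sketch of the endpoint shape, assuming a local MongoDB instance; the database, collection, and route names are illustrative:

```python
# A FastAPI endpoint that accepts JSON from the ChatGPT plugin and
# persists it in MongoDB via the async Motor driver.
from fastapi import FastAPI
from motor.motor_asyncio import AsyncIOMotorClient

app = FastAPI()
client = AsyncIOMotorClient("mongodb://localhost:27017")
db = client["plugin_db"]

@app.post("/records")
async def save_record(record: dict):
    result = await db["records"].insert_one(record)
    return {"id": str(result.inserted_id)}
```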
ChatGPT Plugin with Persistent Storage
Receipt Assistant is our ChatGPT plugin with persistent storage support. I show how it works: upload a scanned receipt, store the OCR result converted to key/value pairs by ChatGPT, then load the data back into ChatGPT, review it, and produce insights. In my future videos, I will explain how it works from a technical point of view.
FastAPI, Pydantic and MongoDB for Beginners
I show how to initialize a connection to MongoDB from a FastAPI endpoint with a startup event. Before a new record is pushed to the MongoDB collection, it is validated with Pydantic. I like the flexibility of the MongoDB Motor async library; it helps to implement seamless communication from FastAPI to MongoDB.
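A sketch of that pattern: the Motor client created in a startup event, and a Pydantic model validating records before insert. Field names are assumptions:

```python
# Startup event initializes the MongoDB connection once; Pydantic
# rejects malformed payloads before they reach the collection.
from fastapi import FastAPI
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel

app = FastAPI()

class Receipt(BaseModel):
    user_id: str
    items: list[str]
    total: float

@app.on_event("startup")
async def startup_db():
    app.state.client = AsyncIOMotorClient("mongodb://localhost:27017")
    app.state.db = app.state.client["receipts_db"]

@app.post("/receipts")
async def create_receipt(receipt: Receipt):
    # Pydantic has already validated the payload at this point
    result = await app.state.db["receipts"].insert_one(receipt.dict())
    return {"id": str(result.inserted_id)}
```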
File Upload App for ChatGPT
ChatGPT doesn't provide a file upload option out of the box. I explain the app I built with Streamlit to handle file upload and allow ChatGPT to fetch the file content through the plugin and a unique key.
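A sketch of the idea only: Streamlit accepts the upload and issues a unique key the user pastes into ChatGPT, and the plugin fetches the content by that key. The in-memory storage here is purely illustrative; a real setup would write to a store the plugin backend can read:

```python
# Accept an upload and hand back a unique key for ChatGPT to use.
import uuid
import streamlit as st

uploaded = st.file_uploader("Upload a file for ChatGPT")
if uploaded is not None:
    key = str(uuid.uuid4())
    st.session_state[key] = uploaded.getvalue()  # illustrative; use shared storage in practice
    st.write(f"Your file key: {key}")  # paste this key into ChatGPT
```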
Building Your Own ChatGPT Plugin
I explain how to get started with ChatGPT plugin development. It is essential to understand how to define the OpenAPI specification to match your endpoints. In this example, you will see a working use case with endpoints for uploading a file and then fetching the file data into ChatGPT.
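A sketch of the fetch side, assuming FastAPI as the backend; route and summary text are illustrative. FastAPI generates the OpenAPI spec that ChatGPT reads, and the endpoint summaries matter because the model chooses endpoints based on them:

```python
# ChatGPT reads the generated spec from /openapi.json and picks
# endpoints by their descriptions.
from fastapi import FastAPI

app = FastAPI(title="File Plugin", version="0.1.0")

@app.get("/files/{key}", summary="Fetch previously uploaded file content by key")
async def get_file(key: str):
    return {"key": key, "content": "..."}  # content lookup omitted
```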