Think Different: ollama

Thursday

OLLama and Gemma3 Tiny Test On CPU

Have you ever tested the tiny LLM Gemma3:1B with OLLama on your laptop or system that lacks a GPU?

You can build a fairly powerful GenAI application; however, it can be a little slow due to CPU processing.

Steps:

Download and install ollama if not already there in your system:

go to https://ollama.com/download and get the installation command
Check the ollama running by `ollama --version`

Now pull the Gemma LLM:

Go to https://ollama.com/library/gemma3
Run: `ollama pull gemma3:1b`

Run Ollama server with LLM if not already running

Check the list: `ollama list`
Run: `ollama serve`

Install the pip lib

Run: `pip install ollama`
Run: `pip install "jupyter-ai[ollama]`

To stop the ollama server

Run: `ps aux | grep ollama`
Run: `kill <PID>`
Run: `sudo systemctl stop ollama`

That all. Now got to your jupyter notebook. If not running run by command: `jupyter lab` or `jupyter notebook`

You can test by running my eg. notebook here https://github.com/dhirajpatra/jupyter_notebooks/blob/main/LLM/gemma3-1b-test.ipynb

Now it is your turn to configure, tuneing and develop many different application from RAG to Agentic AI. You can find out more code in my Github repos and also get the quick start guide here in blog. Thank you.

Saturday

Introducing the Local Copilot Chatbot Application: Your Ultimate Document-Based Query Assistant

actual screenshot taken of the knowledge bot

Introducing the Local Copilot Chatbot Application: Your Ultimate Document-Based Query Assistant

In today's fast-paced world, finding precise information quickly can make a significant difference. Our Local Copilot Chatbot Application offers a cutting-edge solution for accessing and querying document-based knowledge with remarkable efficiency. This Flask-based application utilizes the powerful Ollama and Phi3 models to deliver an interactive, intuitive chatbot experience. Here's a deep dive into what our application offers and how it leverages modern technologies to enhance your productivity.

What is the Local Copilot Chatbot Application?

The Local Copilot Chatbot Application is designed to serve as your personal assistant for document-based queries. Imagine having a copilot that understands your documents, provides precise answers, and adapts to your needs. That's exactly what our application does. It transforms your document uploads into a dynamic knowledge base that you can query using natural language.

Key Features

- Interactive Chatbot Interface: Engage with a responsive chatbot that provides accurate answers based on your document content.

- Document Upload and Processing: Upload your documents, and our system processes them into a searchable knowledge base.

- Vector Knowledge Base with RAG System: Utilize a sophisticated Retrieval-Augmented Generation (RAG) system that combines vector embeddings and document retrieval to deliver precise responses.

- Microservices Architecture: Our application uses a microservices approach, keeping the front-end and back-end isolated for greater flexibility and scalability.

- Session Management: Each user's interaction is managed through unique sessions, allowing for individualized queries and responses.

- Redis Cache with KNN: Used KNN algorithm with Redis cache to find similar questions already asked in session to get a faster response back.

Technologies Used

1. Flask: The back-end of our application is powered by Flask, a lightweight web framework that facilitates smooth interaction between the front-end and the chatbot service.

2. Ollama and Phi3 Models: These models form the core of our chatbot’s capabilities, enabling sophisticated language understanding and generation.

3. Chroma and Sentence Transformers: Chroma handles the vector database for document retrieval, while Sentence Transformers provide embeddings to compare and find relevant documents.

4. Redis: Used for caching responses to improve performance and reduce query times.

5. Docker: The entire application, including all its components, runs within Docker containers. This approach ensures consistent development and deployment environments, making it easy to manage dependencies and run the application locally.

6. Asynchronous Processing: Handles multiple user requests simultaneously, ensuring a smooth and efficient user experience.

How It Works

1. Document Upload: Start by uploading your documents through the front-end application. These documents are processed and stored in a vector knowledge base.

2. Knowledge Base Creation: Our system converts the document content into vector embeddings, making it searchable through the Chroma database.

3. Query Handling: When you pose a question, the chatbot uses the RAG system to retrieve relevant documents and generate a precise response.

4. Caching and Performance Optimization: Responses are cached in Redis to speed up future queries and enhance the overall performance of the system.

5. Session Management: Each session is tracked independently, ensuring personalized interactions and allowing multiple users to operate concurrently without interference.

What Can You Expect?

- Accurate Responses: The combination of advanced models and efficient retrieval systems ensures that you receive relevant and accurate answers.

- Flexible Integration: The microservices architecture allows for easy integration with various front-end frameworks and other back-end services.

- Enhanced Productivity: Quickly find and retrieve information from large volumes of documents, saving time and improving decision-making.

- Local Development: With all components running in Docker containers, you can easily set up and run the application on your local system.

Get Started

To explore the Local Copilot Chatbot Application, follow the setup instructions provided in our GitHub repository. Experience the power of a well-integrated chatbot system that understands your documents and delivers insightful answers at your fingertips.

System Used:

Medium power low RAM. However, if you can use 32GB RAM with Nvidia GPU and i7 CPU would be great and run after the first compilation.

GitHub Repo

https://github.com/dhirajpatra/ollama-langchain-streamlit

Thursday

OLLama and Gemma3 Tiny Test On CPU

Saturday

Introducing the Local Copilot Chatbot Application: Your Ultimate Document-Based Query Assistant

House Based Manufacturing Micro Clustering