

Introducing the Local Copilot Chatbot Application: Your Ultimate Document-Based Query Assistant



                                        
[Actual screenshot of the knowledge bot]




In today's fast-paced world, finding precise information quickly can make a significant difference. Our Local Copilot Chatbot Application offers a cutting-edge solution for accessing and querying document-based knowledge with remarkable efficiency. This Flask-based application utilizes the powerful Ollama and Phi3 models to deliver an interactive, intuitive chatbot experience. Here's a deep dive into what our application offers and how it leverages modern technologies to enhance your productivity.


What is the Local Copilot Chatbot Application?


The Local Copilot Chatbot Application is designed to serve as your personal assistant for document-based queries. Imagine having a copilot that understands your documents, provides precise answers, and adapts to your needs. That's exactly what our application does. It transforms your document uploads into a dynamic knowledge base that you can query using natural language.


Key Features


- Interactive Chatbot Interface: Engage with a responsive chatbot that provides accurate answers based on your document content.

- Document Upload and Processing: Upload your documents, and our system processes them into a searchable knowledge base.

- Vector Knowledge Base with RAG System: Utilize a sophisticated Retrieval-Augmented Generation (RAG) system that combines vector embeddings and document retrieval to deliver precise responses.

- Microservices Architecture: Our application uses a microservices approach, keeping the front-end and back-end isolated for greater flexibility and scalability.

- Session Management: Each user's interaction is managed through unique sessions, allowing for individualized queries and responses.

- Redis Cache with KNN: A K-nearest-neighbour (KNN) search over a Redis cache finds questions similar to ones already asked in the session, so cached answers can be returned faster (see the sketch below).
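
For illustration, here is a minimal sketch of such a semantic cache using redis-py's vector search. It assumes a Redis Stack instance with the search module loaded and 384-dimensional embeddings (e.g., from all-MiniLM-L6-v2); it is not the application's exact code.

```python
# Semantic question cache: KNN vector search over cached Q&A pairs in Redis.
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

# Index hashes with prefix "qa:"; each entry holds a question, its answer,
# and the question's embedding vector.
schema = (
    TextField("question"),
    TextField("answer"),
    VectorField("embedding", "FLAT", {
        "TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE",
    }),
)
try:
    r.ft("qa_cache").create_index(
        schema, definition=IndexDefinition(prefix=["qa:"], index_type=IndexType.HASH)
    )
except redis.ResponseError:
    pass  # index already exists

def cached_answer(query_vec: np.ndarray, threshold: float = 0.15):
    """Return a cached answer if a semantically close question exists."""
    q = (
        Query("*=>[KNN 1 @embedding $vec AS score]")
        .sort_by("score")
        .return_fields("answer", "score")
        .dialect(2)
    )
    res = r.ft("qa_cache").search(
        q, query_params={"vec": query_vec.astype(np.float32).tobytes()}
    )
    if res.docs and float(res.docs[0].score) < threshold:
        return res.docs[0].answer  # cache hit: skip the LLM call
    return None  # cache miss: fall through to the RAG pipeline
```

The distance threshold decides when a cached question is "close enough" to reuse; it should be tuned for the embedding model in use.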


Technologies Used


1. Flask: The back-end of our application is powered by Flask, a lightweight web framework that facilitates smooth interaction between the front-end and the chatbot service.

2. Ollama and Phi3: Ollama serves the Phi3 model locally; together they form the core of our chatbot’s capabilities, enabling sophisticated language understanding and generation.

3. Chroma and Sentence Transformers: Chroma handles the vector database for document retrieval, while Sentence Transformers provide embeddings to compare and find relevant documents.

4. Redis: Used for caching responses to improve performance and reduce query times.

5. Docker: The entire application, including all its components, runs within Docker containers. This approach ensures consistent development and deployment environments, making it easy to manage dependencies and run the application locally.

6. Asynchronous Processing: Handles multiple user requests simultaneously, ensuring a smooth and efficient user experience.


How It Works


1. Document Upload: Start by uploading your documents through the front-end application. These documents are processed and stored in a vector knowledge base.

2. Knowledge Base Creation: Our system converts the document content into vector embeddings, making it searchable through the Chroma database.

3. Query Handling: When you pose a question, the chatbot uses the RAG system to retrieve relevant documents and generate a precise response.

4. Caching and Performance Optimization: Responses are cached in Redis to speed up future queries and enhance the overall performance of the system.

5. Session Management: Each session is tracked independently, ensuring personalized interactions and allowing multiple users to operate concurrently without interference.
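
As a rough sketch of the query flow above (not the repository's exact code; the model names, storage path, and local Ollama endpoint are assumptions):

```python
# Minimal RAG query flow: embed the question, retrieve context from
# Chroma, then ask the Phi3 model served by a local Ollama instance.
import chromadb
import requests
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="./kb")
collection = client.get_or_create_collection("documents")

def answer(question: str) -> str:
    # Embed the question and fetch the closest document chunks.
    q_emb = embedder.encode(question).tolist()
    hits = collection.query(query_embeddings=[q_emb], n_results=3)
    context = "\n".join(hits["documents"][0])

    # Ask Ollama (default port 11434) for an answer grounded in the context.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "phi3",
            "prompt": f"Context:\n{context}\n\nQuestion: {question}\nAnswer:",
            "stream": False,
        },
        timeout=120,
    )
    return resp.json()["response"]
```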


What Can You Expect?


- Accurate Responses: The combination of advanced models and efficient retrieval systems ensures that you receive relevant and accurate answers.

- Flexible Integration: The microservices architecture allows for easy integration with various front-end frameworks and other back-end services.

- Enhanced Productivity: Quickly find and retrieve information from large volumes of documents, saving time and improving decision-making.

- Local Development: With all components running in Docker containers, you can easily set up and run the application on your local system.


Get Started


To explore the Local Copilot Chatbot Application, follow the setup instructions provided in our GitHub repository. Experience the power of a well-integrated chatbot system that understands your documents and delivers insightful answers at your fingertips.


System Used:

A medium-powered, low-RAM system is sufficient. However, 32 GB of RAM with an Nvidia GPU and an i7 CPU would be ideal, and the application runs faster after the first compilation.



GitHub Repo

https://github.com/dhirajpatra/ollama-langchain-streamlit


Introduction to Django, Celery, Nginx, Redis and Docker

 




Django: A High-Level Web Framework


Django is a high-level web framework for building robust web applications quickly and efficiently. Written in Python, it follows the Model-View-Template (MVT) architectural pattern, Django's take on MVC, and emphasizes the principle of DRY (Don't Repeat Yourself). Django provides an ORM (Object-Relational Mapping) system for database interactions, an admin interface for easy content management, and a powerful templating engine.


When to Use Django:


- Building web applications with complex data models.

- Rapid development of scalable and maintainable web projects.

- Emphasizing clean and pragmatic design.
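
As a tiny illustration of the ORM and view layers (a hypothetical `Task` model, not tied to any particular project):

```python
# models.py — a hypothetical model; Django's ORM maps it to a database table
from django.db import models
from django.shortcuts import render

class Task(models.Model):
    title = models.CharField(max_length=200)
    done = models.BooleanField(default=False)
    created = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.title

# views.py — a view that queries the ORM and renders a template
def task_list(request):
    tasks = Task.objects.order_by("-created")
    return render(request, "tasks/list.html", {"tasks": tasks})
```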


Docker: Containerization for Seamless Deployment


Docker is a platform that enables developers to automate the deployment of applications inside lightweight, portable containers. Containers encapsulate the application and its dependencies, ensuring consistency across different environments. Docker simplifies the deployment process, making it easier to move applications between development, testing, and production environments.


When to Use Docker:


- Achieving consistency in different development and production environments.

- Isolating applications and dependencies for portability.

- Streamlining the deployment process with containerization.
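
For example, a minimal Dockerfile for a Django project might look like this (the base image, project name "myproject", and use of gunicorn are assumptions):

```dockerfile
# Illustrative Dockerfile for a Django application
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000"]
```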


Celery: Distributed Task Queue for Asynchronous Processing


Celery is an asynchronous distributed task queue system that allows you to run tasks asynchronously in the background. It's particularly useful for handling time-consuming operations, such as sending emails, processing data, or running periodic tasks. Celery supports task scheduling, result storage, and can be integrated with various message brokers.


When to Use Celery:


- Handling background tasks to improve application responsiveness.

- Performing periodic or scheduled tasks.

- Scaling applications by offloading resource-intensive processes.
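
A minimal sketch of a Celery task queue backed by Redis (the broker URLs and the task body are illustrative):

```python
# tasks.py — a minimal Celery app using Redis as broker and result backend
from celery import Celery

app = Celery(
    "tasks",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1",
)

@app.task
def send_welcome_email(user_id: int) -> str:
    # Stand-in for a slow operation such as rendering and sending an email.
    return f"welcome email queued for user {user_id}"
```

Start a worker with `celery -A tasks worker`; a caller (e.g., a Django view) then enqueues work without blocking via `send_welcome_email.delay(42)`.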


Redis: In-Memory Data Store for Performance


Redis is an open-source, in-memory data structure store that can be used as a cache, message broker, or real-time analytics database. It provides fast read and write operations, making it suitable for scenarios where low-latency access to data is crucial. Redis is often used as a message broker for Celery in Django applications.


When to Use Redis:


- Caching frequently accessed data for faster retrieval.

- Serving as a message broker for distributed systems.

- Handling real-time analytics and data processing.
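
A common Redis caching pattern is cache-aside: check the cache first and fall back to the slower store on a miss. A minimal sketch (the `load_profile_from_db` helper is hypothetical):

```python
# Cache-aside with redis-py: read through the cache, write back with a TTL.
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_profile(user_id: int) -> dict:
    key = f"profile:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # cache hit
    profile = load_profile_from_db(user_id)   # hypothetical DB helper
    r.setex(key, 300, json.dumps(profile))    # cache for 5 minutes
    return profile
```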


Nginx: The Versatile Web Server and Reverse Proxy


Nginx is a versatile web server and reverse proxy server known for its efficiency and scalability. It excels in handling concurrent connections and balancing loads. In Django applications, Nginx often acts as a reverse proxy, forwarding requests to the Django server.


When to Incorporate Nginx:


- Enhancing performance by serving static files and handling concurrent connections.

- Acting as a reverse proxy to balance loads and forward requests to the Django server.
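
A typical, illustrative Nginx server block for this setup (it assumes the Django app listens on port 8000 and static files are collected under /var/www/app/static/):

```nginx
server {
    listen 80;
    server_name example.com;

    # Serve static files directly, bypassing Django.
    location /static/ {
        alias /var/www/app/static/;
    }

    # Proxy everything else to the Django application server.
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```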


Sample Application: Django ToDo App


I have created a beginner-level ToDo application using Django, Docker, Celery, and Redis. You can find the source code on [GitHub](https://github.com/dhirajpatra/docker-django-celery-postgres). The application demonstrates the integration of these technologies to build a simple yet powerful task management system.


Future Updates:


Feel free to explore the provided GitHub repository, and I encourage you to contribute or extend the application. I will be creating new branches to introduce additional features and improvements. Stay tuned for updates!


GitHub Repository: https://github.com/dhirajpatra/docker-django-celery-postgres

I also have other similar repositories from a few years back.

Azure Session Management

 

Photo by SHVETS production

Say we are going to create an application for customer management, which requires fast interaction
between the customer and the application, so we need to manage sessions with a cache. Users can log in
from any device and continue seamlessly on any other device without issues.




Below is an end-to-end solution for user login and session management using Azure Redis Cache,
an API Gateway, a Load Balancer, Azure App Service, and serverless Azure Functions, with a Flask
application in the backend.

In Azure, maintaining distributed session data typically involves using a combination of Azure services
and technologies.
Here are some best practices and technologies you can use to keep and manage distributed session data:

1. Azure Cache for Redis:
   - Description: Azure Cache for Redis is a fully managed, in-memory data store service built on the
popular open-source Redis. It is commonly used to store and manage session data for web applications.

   - Key Features:
     - In-Memory Storage
     - High Throughput
     - Support for Advanced Data Structures
     - Redis Pub/Sub for Messaging

   - Usage Example:
     ```python
     # Azure Cache for Redis is accessed with the standard redis-py client;
     # the host name and access key come from the Azure portal.
     import redis

     redis_cache = redis.Redis(
         host='your-cache-name.redis.cache.windows.net',
         port=6380,
         password='your_access_key',
         ssl=True,
         decode_responses=True,
     )

     # Set session data (with a one-hour expiry)
     redis_cache.set('session_key', 'session_value', ex=3600)

     # Get session data
     session_data = redis_cache.get('session_key')
     ```

2. Azure Cosmos DB:
   - Description: Azure Cosmos DB is a multi-model, globally distributed database service.
It can be used to store and manage session data, offering high availability and low-latency access.

   - Key Features:
     - Multi-Model (Document, Graph, Table, etc.)
     - Global Distribution
     - Automatic and Instantaneous Scaling

   - Usage Example:
     ```python
     # Using the Azure SDK for Python (azure-cosmos) with Azure Cosmos DB
     from azure.identity import DefaultAzureCredential
     from azure.cosmos import CosmosClient

     credential = DefaultAzureCredential()
     cosmos_client = CosmosClient('https://your-account.documents.azure.com:443/',
                                  credential=credential)

     # Access and manipulate session data as JSON documents
     # (assumes a container partitioned on /id)
     container = cosmos_client.get_database_client('sessions_db') \
                              .get_container_client('sessions')
     container.upsert_item({'id': 'session_id', 'user_id': 'user_42'})
     session_doc = container.read_item(item='session_id', partition_key='session_id')
     ```

3. Azure SQL Database:
   - Description: Azure SQL Database is a fully managed relational database service. It can be used
to store session data, especially if your application relies on relational data models.

   - Key Features:
     - Fully Managed
     - High Availability
     - Scalability

   - Usage Example:
     ```python
     # Azure SQL Database is accessed with a standard SQL driver such as pyodbc
     import pyodbc

     conn = pyodbc.connect(
         'Driver={ODBC Driver 18 for SQL Server};'
         'Server=tcp:your-server.database.windows.net,1433;'
         'Database=your_db;Uid=your_user;Pwd=your_password;'
         'Encrypt=yes;TrustServerCertificate=no;'
     )
     cursor = conn.cursor()

     # Access and manipulate session data using SQL queries
     cursor.execute('SELECT data FROM sessions WHERE session_id = ?', 'session_id')
     row = cursor.fetchone()
     ```

4. Azure Blob Storage:
   - Description: Azure Blob Storage can be used to store session data in a distributed manner. It's
suitable for scenarios where you need to store large amounts of unstructured data.

   - Key Features:
     - Scalable and Durable
     - Cost-Effective
     - Blob Storage Tiers for Data Lifecycle Management

   - Usage Example:
     ```python
     # Using the Azure SDK for Python (azure-storage-blob) with Blob Storage
     from azure.storage.blob import BlobServiceClient

     blob_service_client = BlobServiceClient.from_connection_string(
         'your_blob_storage_connection_string'
     )

     # Store and retrieve session data as blobs
     container_client = blob_service_client.get_container_client('sessions')
     container_client.upload_blob('session_id', b'session_value', overwrite=True)
     session_data = container_client.download_blob('session_id').readall()
     ```

5. Azure Functions for Session Management:
   - Description: Azure Functions can be used to handle session-related logic. You can trigger functions
based on events such as user authentication or session expiration.

   - Key Features:
     - Serverless Architecture
     - Event-Driven
     - Scale Automatically

   - Usage Example:
     - Implement Azure Functions that respond to authentication or session-related events and interact with the chosen data storage solution; a sketch follows below.
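
A minimal sketch using the Python v2 programming model for Azure Functions (the cache host name, access key, and route are assumptions, not part of any specific deployment):

```python
# function_app.py — an HTTP-triggered Azure Function that clears a
# session key in Azure Cache for Redis when a user logs out.
import azure.functions as func
import redis

app = func.FunctionApp()

# Hypothetical cache endpoint; Azure Cache for Redis uses TLS on port 6380.
r = redis.Redis(
    host="your-cache-name.redis.cache.windows.net",
    port=6380,
    password="your_access_key",
    ssl=True,
)

@app.route(route="logout", auth_level=func.AuthLevel.FUNCTION)
def logout(req: func.HttpRequest) -> func.HttpResponse:
    session_id = req.params.get("session_id")
    if not session_id:
        return func.HttpResponse("session_id required", status_code=400)
    r.delete(f"session:{session_id}")  # remove the session entry
    return func.HttpResponse("session cleared", status_code=200)
```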

Best Practices:
- Use Secure Connections: Ensure that connections to your chosen data storage solutions are secured
using appropriate protocols and authentication mechanisms.

- Implement Session Expiry: Set policies for session data expiry and cleanup to manage resource usage
effectively.

- Consider Data Sharding: Depending on the size and nature of your application, consider sharding or
partitioning your data for better performance and scalability.

- Monitor and Optimize: Regularly monitor and optimize your chosen solution based on usage patterns
and requirements.

Select the technology based on your application's needs, scalability requirements, and the type of data
you are managing in the session. The examples provided use Python, but Azure SDKs are available
for various programming languages.

1. Azure Components Setup:

Azure Redis Cache:
   - Create an Azure Redis Cache instance.
   - Obtain the connection string for the Redis Cache.

Azure App Service (Web App):
   - Create an Azure App Service (Web App) to host your Flask application.
   - Configure the Flask application to connect to the Azure Redis Cache for session storage.

Azure API Management (Optional):
   - Set up Azure API Management if you want to manage and secure your API.

2. Flask Application Setup: [below we will see different use cases]

Flask App with Flask-Session:
   - Create a Flask application with user authentication.
   - Use the `Flask-Session` extension to store session data in Redis.
   - Install required packages:
     ```bash
     pip install Flask Flask-Session redis
     ```
   - Sample Flask App Code:
     ```python
     from flask import Flask, session
     from flask_session import Session
     import os
     import redis

     app = Flask(__name__)

     # Configure the session extension to store data in Redis.
     # Note: SESSION_REDIS expects a Redis client instance, not a bare
     # connection string.
     app.config['SESSION_TYPE'] = 'redis'
     app.config['SESSION_PERMANENT'] = False
     app.config['SESSION_USE_SIGNER'] = True
     app.config['SESSION_KEY_PREFIX'] = 'your_prefix'
     app.config['SESSION_REDIS'] = redis.from_url('your_redis_cache_connection_string')

     # Initialize the session extension
     Session(app)

     @app.route('/')
     def index():
         if 'username' in session:
             return f'Logged in as {session["username"]}'
         return 'You are not logged in'

     @app.route('/login/<username>')
     def login(username):
         session['username'] = username
         return f'Logged in as {username}'

     @app.route('/logout')
     def logout():
         session.pop('username', None)
         return 'Logged out'

     if __name__ == '__main__':
         app.secret_key = os.urandom(24)
         app.run(debug=True)
     ```

3. Azure App Service Deployment:
   - Deploy your Flask application to the Azure App Service.

4. API Gateway and Load Balancer (Optional):
   - If using Azure API Management or a Load Balancer, configure them to route traffic to your App Service.

5. JWT Token Management:
   - Description: JSON Web Tokens (JWT) can be used for secure communication and session management between the front end, the Flask API, and other Azure services.
   - Configuration: Use a library like PyJWT in your Flask API to generate and validate JWT tokens. Include the JWT token in requests from the front end to the Flask API and from the Flask API to other Azure services.

6. Testing:
   - Test user login and session management functionality by accessing the endpoints of your Flask application.

7. Secure Communication:
   - Ensure that all communication between the client and your Flask application, and between the application and Azure Redis Cache, is secured using HTTPS.

Note:
   - Replace `'your_redis_cache_connection_string'` with the actual connection string of your Azure Redis Cache.
   - If using Azure API Management or a Load Balancer, configure them based on your specific requirements.
   - Adjust session management settings (e.g., session timeout, secure cookie settings) based on your application's needs.

Managing sessions using an API Gateway and Redis involves configuring the API Gateway to
handle user requests, directing traffic to backend services (like a Flask application), and using
Redis to store and manage session data. Below is a high-level guide on how this can be achieved:

1. API Gateway Configuration:

Azure API Management:
   - Create an Azure API Management instance.
   - Define API operations that correspond to your user authentication and session management
endpoints.
   - Configure policies in API Management to manage session tokens, validate tokens, and route requests.

Policies Example (in API Management):
   ```xml
   <!-- Validate JWT token policy -->
   <inbound>
      <base />
      <validate-jwt header-name="Authorization" failed-validation-httpcode="401"
                    failed-validation-error-message="Unauthorized. Access token is missing or invalid."
                    require-expiration-time="true" require-signed-tokens="false">
         <openid-config url="https://your-authorization-server/.well-known/openid-configuration" />
         <audiences>
            <audience>your-audience</audience>
         </audiences>
      </validate-jwt>
   </inbound>

   <!-- Set backend service and add session ID to request -->
   <backend>
      <base />
      <set-backend-service base-url="https://your-backend-service" />
      <rewrite-uri template="/{session-id}/{path}" />
   </backend>
   ```

2. Flask Application with Redis:

   - Configure your Flask application to use Redis for session storage.
   - Adjust the Flask app code provided in the previous example to handle requests from the API
Gateway.

3. Redis Session Management:

   - Use Redis to store and manage session data. When a user logs in, store their session information
(e.g., user ID, session token) in Redis.

   - Example Redis Commands (Python with `redis` library):
     ```python
     import redis

     # Connect to Redis
     redis_client = redis.StrictRedis(host='your-redis-host', port=6379, decode_responses=True)

     # Set session data with a TTL so stale sessions expire automatically
     redis_client.set('session_id', 'user_data', ex=3600)

     # Get session data
     session_data = redis_client.get('session_id')
     ```

4. Secure Communication:

   - Ensure that communication between the API Gateway, Flask application, and Redis is secured.
Use HTTPS, secure Redis connections, and validate tokens in the API Gateway.

5. Testing:

   - Test the integration by making requests to the API Gateway, which in turn routes requests to the
Flask application and manages sessions using Redis.

6. Load Balancer (Optional):

   - If you have multiple instances of your Flask application, consider using a load balancer to distribute
traffic evenly.

7. JWT Token Management:

```python
import jwt
from flask import Flask, request

app = Flask(__name__)

SECRET_KEY = 'your_secret_key'

@app.route('/generate_token/<user_id>/<app_id>')
def generate_token(user_id, app_id):
    # Check if the provided app_id is associated with the user_id.
    # You can implement your logic here to verify the association.
    if is_valid_association(user_id, app_id):
        token_payload = {'user_id': user_id, 'app_id': app_id}
        token = jwt.encode(token_payload, SECRET_KEY, algorithm='HS256')
        return {'token': token}
    else:
        return {'error': 'Invalid association between user_id and app_id'}

@app.route('/verify_token', methods=['POST'])
def verify_token():
    token = request.json['token']
    try:
        decoded_token = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])
        return {'user_id': decoded_token['user_id'], 'app_id': decoded_token['app_id']}
    except jwt.ExpiredSignatureError:
        return {'error': 'Token has expired'}
    except jwt.InvalidTokenError:
        return {'error': 'Invalid token'}

def is_valid_association(user_id, app_id):
    # Implement your logic to check whether the provided app_id is
    # associated with the user_id (e.g., via a lookup table in your DB).
    # Return True if the association is valid, otherwise False.
    return True

if __name__ == '__main__':
    app.run(debug=True)
```


In this example, the `/generate_token/<user_id>/<app_id>` endpoint takes both `user_id` and `app_id` as parameters. The `is_valid_association` function, which you should implement, checks whether the provided `app_id` is associated with the given `user_id`, and `generate_token` issues a JWT only when that association is valid. The `/verify_token` endpoint returns both `user_id` and `app_id` from the decoded token. Remember to replace `'your_secret_key'` with your actual secret key and to customize `is_valid_association` to your application's logic for associating users with app IDs.
