Showing posts from April 28, 2025

Local Gemma3 as VSCode Code Generation Extension

To use the #Gemma3 1B model directly in #VSCode as a #codeassistant, you'll need to set up a local inference server or use an API that integrates with VS Code. Here's a step-by-step guide:

**Option 1: Run Gemma Locally & Integrate with VS Code**

**1. Install Required Dependencies**

Ensure you have Python (≥3.9) and `pip` installed. Then, install the necessary packages:

```bash
pip install transformers torch sentencepiece
```

**2. Load Gemma 3 1B in a Python Script**

Create a Python script (`gemma_inference.py`) to load the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/gemma-3-1b-it"  # or "google/gemma-3-4b-it" if you have more resources

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def generate_code(prompt):
    # Tokenize the prompt and move tensors to whatever device the model landed on
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```
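One detail worth knowing before wiring this into an editor: the instruction-tuned Gemma checkpoints expect prompts wrapped in Gemma's chat-turn markers. In practice you would call `tokenizer.apply_chat_template(...)` and let the tokenizer handle this, but a minimal manual sketch (the helper name `format_gemma_prompt` is just for illustration) shows the structure the template produces for a single user turn:

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat-turn markers.

    Roughly what tokenizer.apply_chat_template(messages,
    add_generation_prompt=True) emits for one user turn.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Write a Python function that reverses a string.")
```

Passing the raw, unwrapped string to `generate_code` will still run, but the instruction-tuned model tends to follow requests more reliably when they arrive in this turn format.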