In this guide, we’ll walk through setting up a local AI-powered coding assistant within Visual Studio Code (VS Code). By leveraging tools such as Ollama, CodeStral, and the Continue extension, we can enhance our development workflow, get intelligent code suggestions, and even automate parts of the coding process.
Installation and Environment Configuration:
Step 1: Installing the Required Extensions
- Install CodeStral: Open the Extensions panel in VS Code, search for “CodeStral,” and click “Install.” This extension helps with managing your local AI models for code assistance.
- Configure CodeStral: Once installed, follow the extension’s configuration guide. You will need to install additional components, such as Ollama, which sets up a local server to run AI models.
Step 2: Setting Up Ollama for Local AI Models
- Install Ollama: Download and install Ollama, a server that lets your system host AI models locally. Once installed, you should be able to run the `ollama run` command from your terminal.
- Use the Granite 8B Model: For code suggestions, use the Granite 8B model from IBM, which is optimized for code-related tasks. Note that the first load may take some time, as the model is about 5GB in size.
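As a concrete sketch (the exact model tag is an assumption here: `granite-code:8b` is the IBM Granite code model's tag on the Ollama registry at the time of writing, so check the Ollama library or `ollama list` if yours differs):

```bash
# Download the IBM Granite code model (~5GB, one-time download)
ollama pull granite-code:8b

# Start an interactive session to confirm the model loads
ollama run granite-code:8b

# Optional: verify the local Ollama server is responding (default port 11434)
curl http://localhost:11434
```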
Step 3: Working with the Continue Extension
- Installing Continue: The Continue extension integrates with models running on Ollama and helps provide code assistance based on context.
- Configure Privacy Settings: If you want to work offline or avoid telemetry, open the extension's configuration file (`config.json`) and set the `allowAnonymousTelemetry` parameter to `false`.
- Configure the Continue Extension: Open Quick Open in VS Code (Ctrl+P), search for `config.json`, and add the following configuration:
```json
{
  "models": [
    {
      "title": "Granite 8B",
      "provider": "ollama",
      "model": "granite-code:8b"
    }
  ],
  "allowAnonymousTelemetry": false
}
```

Note that the `model` value must match the tag the model was pulled under in Ollama (here `granite-code:8b`).
- Using the Extension: You can highlight specific parts of your code, press `Ctrl+L`, and interact with the AI. For instance, if you highlight a few lines of code and ask, “How can I improve this code?” the AI will analyze the snippet and suggest improvements.
- Accepting Code Changes: Once the AI provides suggestions, you can either accept the changes directly or refine them by asking follow-up questions.
Step 4: Advanced Features
- Context Detection: Continue supports context detection across files, repositories, and even web URLs. It can analyze your entire project and provide suggestions based on your overall code structure (see the configuration sketch after this list).
- Working Offline with Privacy: If privacy is important, Continue can be configured to keep everything offline, unlike some extensions that send your code off-machine for research purposes.
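Context sources are declared in the same `config.json`. The snippet below is a sketch rather than a definitive setup: the provider names (`codebase`, `open`, `url`) follow Continue's context-provider scheme, but the exact set available depends on your extension version, so consult its documentation:

```json
{
  "contextProviders": [
    { "name": "codebase" },
    { "name": "open" },
    { "name": "url" }
  ]
}
```

With providers like these enabled, you can reference them explicitly in the chat panel (for example, by typing `@codebase`) to pull the corresponding context into your prompt.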
Step 5: Hardware Considerations and Speed Optimization for Ollama
- Optimizing GPU Usage: If you have multiple Nvidia GPUs, use the `nvidia-smi -L` command to identify the unique ID of each card. You can then set the `CUDA_VISIBLE_DEVICES` environment variable to ensure the AI model utilizes the right GPU for faster performance; a sketch follows this list.
- Check Logs: Periodically check Ollama’s logs to troubleshoot any issues, such as problems initializing the server or GPU.
- Hardware Recommendations: If possible, use a more powerful GPU, such as an RTX 4070 or RTX 3090, for faster model performance, especially when running large models.
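As an illustration (the device index `0` and the systemd service name `ollama` are assumptions; substitute the values reported on your machine), pinning Ollama to one card looks like this:

```bash
# List installed GPUs with their indices and UUIDs
nvidia-smi -L
# e.g. "GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-xxxxxxxx-...)"

# Restrict CUDA to the chosen card (index or full UUID), then start the server
export CUDA_VISIBLE_DEVICES=0
ollama serve

# On a Linux systemd install, inspect the server logs for GPU init problems
journalctl -u ollama -e
```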
Use Cases for the Cursor Editor or the Cody VS Code Extension
- Highlight the relevant code and use the AI tool to suggest changes. With the code still highlighted, pressing Ctrl+K in Cursor even lets you ask the assistant to rewrite the code, updating it in place automatically.
- By using the chat menu's option to scan the entire codebase, the AI can add the whole project to its context and suggest improvements.
- You can switch between different AI models, such as Claude or Gemini, depending on your coding needs. Each model has strengths in areas like code generation or identifying code smells.
- If the model generates incorrect suggestions, you can refine your query or switch models to get a better response. Always test and review changes before fully integrating them into your codebase.
Useful videos: