Using LM Studio with 120 AI Chat
Run your AI assistant locally with LM Studio on macOS
What You'll Need
- A Mac computer (Apple silicon or Intel chip)
- 120 AI Chat app
- An internet connection (needed to download LM Studio and the models)
1. Install LM Studio (your local AI engine)
LM Studio is a user-friendly desktop application that lets you run AI models directly on your Mac through a graphical user interface (GUI), with no command-line work required.
- Go to https://lmstudio.ai
- Click Download for macOS.
- Open the downloaded .dmg file.
- Drag LM Studio into your Applications folder.
- Double-click the app to launch it for the first time.
2. Download a Local AI Model
LM Studio has a built-in browser for discovering and downloading models.
- In the LM Studio app, open the Model Search (Discover) tab from the left sidebar.
- Browse the available models or use the search bar to find a specific one.
- Recommended starter model: Mistral 7B Instruct
- Select the model you want, and click the Download button. LM Studio will handle the download and save the model to your Mac.
3. Start a Local Server in LM Studio
- In LM Studio, go to the Developer tab (left sidebar).
- Select the model you want to load (e.g., Mistral-7B-Instruct).
- Click Start Server. The server listens on port 1234 by default (http://localhost:1234); a quick way to verify it is running is sketched after this list.
- Leave LM Studio running in the background — it will act as your local AI engine.
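To confirm the server is running before you open 120 AI Chat, you can query its OpenAI-compatible API. This is a minimal sketch, assuming the default port 1234 and the third-party `requests` package (`pip install requests`):

```python
import requests

# LM Studio's local server exposes an OpenAI-compatible API.
# GET /v1/models lists the models currently loaded.
resp = requests.get("http://localhost:1234/v1/models", timeout=5)
resp.raise_for_status()

for model in resp.json()["data"]:
    print(model["id"])  # e.g. "mistral-7b-instruct" (depends on your download)
```

If this prints your model's identifier, the server is ready for 120 AI Chat to connect to.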
4. Open 120 AI Chat
- Open the 120 AI Chat app.
- Choose your downloaded model, for example, Mistral.
5. Start Chatting!
- All answers are generated by the model on your Mac.
- Your data stays private: prompts and replies never leave your Mac.
LM Studio primarily supports models in the GGUF format, which are optimized to run efficiently on consumer hardware.
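Under the hood, a chat app pointed at LM Studio sends standard OpenAI-style requests to the local server. The sketch below shows such a request, assuming the default port; the model identifier is illustrative and should match whatever the server reports under /v1/models:

```python
import requests

# POST /v1/chat/completions follows the OpenAI chat-completions format.
payload = {
    "model": "mistral-7b-instruct",  # illustrative id; use your loaded model's
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the GGUF format in one sentence."},
    ],
    "temperature": 0.7,
}

resp = requests.post(
    "http://localhost:1234/v1/chat/completions", json=payload, timeout=120
)
resp.raise_for_status()

print(resp.json()["choices"][0]["message"]["content"])
```

Because the request never leaves localhost, both the prompt and the reply stay on your machine.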
A quick guide to understanding LLMs for 120 AI Chat
Most Popular Local LLMs
Mistral (7B): A fast, balanced 4GB model that excels at general chat and quick answers, working well on most modern laptops but with limited coding capabilities.
LLaMA 3 (8B): A smart and accurate 4.5GB general-purpose assistant that delivers high-quality responses but requires 8GB+ RAM and preferably GPU acceleration.
Mixtral (8x7B): A highly intelligent 12GB+ model designed for deep reasoning and long context tasks, but only suitable for high-end systems due to its heavy resource requirements.
Gemma (2B/7B): Google's efficient 1.5-4GB lightweight model that runs anywhere but sacrifices some intelligence compared to larger alternatives.
Phi-3 (3.8B): A compact 2.5GB model that's surprisingly capable for its size, perfect for quick answers on low-end systems but limited in deep reasoning tasks.
Nous Hermes/OpenHermes: Community-tuned 4-7GB models optimized for helpful chat responses and writing, fine-tuned from base models like Mistral and LLaMA and with similar system requirements.
Minimum System Requirements
For basic models (2B–4B parameters), you'll need 4–8 GB RAM and a modern CPU from 2015 or later.
Mid-range models (7B–8B parameters) require 8–16 GB RAM with an SSD recommended for optimal performance.
Heavy models like Mixtral need 24 GB+ RAM or a dedicated GPU to run effectively.
For the best experience overall, use Apple M1/M2/M3 chips or a GPU with 6GB+ VRAM.
Tip: If your machine feels slow, choose smaller models like Phi-3, Gemma, or a quantized Mistral.
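A rough rule of thumb behind these figures: a quantized model needs about (parameter count × bits per weight ÷ 8) bytes of memory, plus headroom for context and runtime buffers. This sketch estimates footprints at 4-bit (Q4) quantization, the common default for GGUF downloads; the 20% overhead factor is a loose assumption, not a measured value:

```python
# Approximate memory needed to load a model at 4-bit quantization.
MODELS = {
    "Phi-3 (3.8B)": 3.8e9,
    "Mistral (7B)": 7e9,
    "LLaMA 3 (8B)": 8e9,
    "Mixtral (8x7B)": 46.7e9,  # ~46.7B total parameters across experts
}

BITS_PER_WEIGHT = 4   # Q4 quantization
OVERHEAD = 1.2        # assumed ~20% extra for context/KV cache/buffers

for name, params in MODELS.items():
    gb = params * BITS_PER_WEIGHT / 8 / 2**30 * OVERHEAD
    print(f"{name}: ~{gb:.1f} GB")
```

These estimates land close to the sizes listed above (roughly 4 GB for Mistral 7B, around 26 GB for Mixtral), which is why Mixtral calls for a 24 GB+ machine while Phi-3 fits comfortably on low-end systems.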