Run your AI assistant locally with LM Studio on macOS
LM Studio is a user-friendly desktop application that allows you to run AI models directly on your Mac. It provides a Graphical User Interface (GUI) for easy interaction.
To install it, download the .dmg file, open it, and drag LM Studio into your Applications folder. LM Studio has a built-in browser for discovering and downloading models.
LM Studio primarily supports models in the GGUF format, which are optimized to run efficiently on consumer hardware.
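Once a model is downloaded, you can chat with it in the app, or turn on LM Studio's local server, which speaks an OpenAI-compatible API (by default on port 1234) so you can script against it. Below is a minimal Python sketch assuming the server is running with a model loaded and the openai package installed; the model name and prompt are placeholders.

from openai import OpenAI

# LM Studio's local server is OpenAI-compatible; the default address is assumed here.
# No real API key is needed locally, but the client requires a non-empty string.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder: LM Studio answers with whichever model is loaded
    messages=[{"role": "user", "content": "Explain the GGUF format in one sentence."}],
)
print(response.choices[0].message.content)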
A quick guide to understanding LLMs for 120 AI Chat
Mistral (7B): A fast, balanced 4 GB model that excels at general chat and quick answers, working well on most modern laptops but with limited coding capabilities.
LLaMA 3 (8B): A smart and accurate 4.5 GB general-purpose assistant that delivers high-quality responses but requires 8 GB+ RAM and preferably GPU acceleration.
Mixtral (8x7B): A highly intelligent 12 GB+ model designed for deep reasoning and long-context tasks, but only suitable for high-end systems due to its heavy resource requirements.
Gemma (2B/7B): Google's efficient 1.5–4 GB lightweight model that runs anywhere but sacrifices some intelligence compared to larger alternatives.
Phi-3 (3.8B): A compact 2.5 GB model that's surprisingly capable for its size, perfect for quick answers on low-end systems but limited in deep reasoning tasks.
Nous Hermes/OpenHermes: A community-tuned 4–7 GB model optimized for helpful chat responses and writing, built on existing model foundations with similar system requirements to Mistral/LLaMA.
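If you want to confirm which of these models your local server currently exposes, you can ask it for its model list. A small sketch, assuming the server is running on the default port and the requests package is installed:

import requests

# List the models the local LM Studio server has loaded (default port assumed).
resp = requests.get("http://localhost:1234/v1/models", timeout=5)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])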
For basic models (2B–4B parameters), you'll need 4–8 GB RAM and a modern CPU from 2015 or later.
Mid-range models (7B–8B parameters) require 8–16 GB RAM with an SSD recommended for optimal performance.
Heavy models like Mixtral need 24 GB+ RAM or a dedicated GPU to run effectively.
For the best experience overall, use Apple M1/M2/M3 chips or a GPU with 6 GB+ VRAM.
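Before downloading a heavy model, it's worth checking how much RAM your Mac actually has and matching it against the tiers above. A quick sketch using macOS's sysctl; the tier cutoffs simply mirror the guidance above:

import subprocess

# hw.memsize reports installed RAM in bytes on macOS.
ram_gb = int(subprocess.check_output(["sysctl", "-n", "hw.memsize"])) / 1024**3

# Rough cutoffs mirroring the tiers above.
if ram_gb >= 24:
    tier = "heavy models such as Mixtral (8x7B)"
elif ram_gb >= 8:
    tier = "mid-range 7B-8B models such as Mistral or LLaMA 3"
else:
    tier = "basic 2B-4B models such as Gemma (2B) or Phi-3"
print(f"~{ram_gb:.0f} GB RAM: comfortable with {tier}")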
Tip: If your machine feels slow, choose smaller models like Phi-3, Gemma, or Mistral (quantized).
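Quantization is why those smaller downloads fit: a model's weight footprint is roughly parameter count × bits per weight / 8, plus runtime overhead for the KV cache and buffers. A back-of-the-envelope sketch; the 1.2 overhead factor is an assumption rather than an official figure, but at 4 bits per weight the estimates roughly line up with the download sizes quoted earlier:

def estimate_ram_gb(params_billion: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    # Weights take params * bits / 8 bytes; overhead is a rough allowance for KV cache and buffers.
    return params_billion * bits_per_weight / 8 * overhead

for name, params in [("Phi-3 (3.8B)", 3.8), ("Mistral (7B)", 7.0), ("LLaMA 3 (8B)", 8.0)]:
    print(f"{name}: ~{estimate_ram_gb(params, 4):.1f} GB at 4-bit, "
          f"~{estimate_ram_gb(params, 16):.1f} GB at 16-bit")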