Using Local LLMs with 120 AI Chat

Run your AI assistant locally with Ollama on macOS

What You’ll Need

  • A Mac computer (Apple silicon or Intel chip)
  • 120 AI Chat app
  • An internet connection (to download Ollama and the model; after that, chatting works offline)

1. Install Ollama (your local AI engine)

Ollama allows you to run AI models directly on your Mac.

  1. Go to https://ollama.com/download
  2. Click Download for macOS
  3. Open the downloaded .dmg file
  4. Drag Ollama into your Applications folder
  5. Launch Ollama once by double-clicking it; this installs the background service

Ollama doesn't open a full app window; it runs quietly in the background (you may only see a small llama icon in your menu bar). You'll use Terminal to interact with it.

2. Open the Terminal App on macOS

This is where you’ll tell your Mac to download an AI model.

  1. Press Cmd + Space to open Spotlight Search
  2. Type Terminal, then press Enter
  3. A window with a text prompt opens; this is the Terminal, where you'll type commands
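
Before moving on, it's worth checking that the ollama command is available. Type the line below and press Enter; the version number on your Mac will differ from the one in the comment.

  # Print the installed Ollama version (your number will vary, e.g. "ollama version is 0.1.32")
  ollama --version

If Terminal says "command not found", open the Ollama app from Applications once so it can finish setting up, then try again.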

3. Download a Local AI Model Using Ollama

Recommended starter model: mistral

In the Terminal window, type: ollama run mistral

Then press Enter.

  • Ollama will automatically download the mistral model (about 4 GB)
  • When the download finishes, the model is ready to run on your Mac.

You only need to do this once. After that, the model is saved and ready to use.
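
When the download finishes, ollama run drops you straight into a chat prompt in the Terminal, so you can give the model a quick test there before opening the app; type /bye to leave the chat. You can also list everything stored on your Mac with ollama list. The output below is only an example; your sizes and dates will differ.

  # Inside "ollama run mistral" you get an interactive prompt; type /bye to exit
  >>> What is the capital of France?

  # Back at the normal Terminal prompt, list the models saved locally
  ollama list
  # Example output (yours will differ):
  # NAME             ID              SIZE      MODIFIED
  # mistral:latest   1ab49cc0ce7e    4.1 GB    2 minutes ago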

4. Open 120 AI Chat

  1. Open the 120 AI Chat app
  2. Choose your downloaded model — for example, mistral
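
If mistral doesn't appear in the model list, the usual cause is that the Ollama background service isn't running. Ollama exposes your downloaded models through a small local API (by default at http://localhost:11434), which is how apps such as 120 AI Chat typically discover them; the exact integration may differ. You can check the service yourself from Terminal:

  # Ask the local Ollama service which models it knows about (default port 11434)
  curl http://localhost:11434/api/tags
  # A running service replies with JSON that lists your models, e.g. "mistral:latest"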

5. Start chatting!

  • All answers are generated by the model running on your Mac, even without an internet connection
  • Your data stays private on your Mac.
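
If you're curious what "local" means in practice, the sketch below sends a prompt straight to Ollama's local API from Terminal and reads the answer back, with nothing leaving your Mac. This is only an illustration of the setup, not necessarily the exact request 120 AI Chat makes under the hood.

  # Send a prompt directly to the local model and print the JSON reply
  curl http://localhost:11434/api/generate -d '{
    "model": "mistral",
    "prompt": "Say hello in one short sentence.",
    "stream": false
  }'
  # The reply contains a "response" field with the model's answer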

Summary

  1. Install Ollama
  2. Open Terminal (Cmd + Space → “Terminal”)
  3. Type: ollama run mistral
  4. Open 120 AI Chat → Select mistral
  5. Start chatting!
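
A few everyday Ollama commands are worth keeping handy (all typed in Terminal):

  ollama run mistral    # download the model on first use, then chat with it
  ollama list           # show the models stored on your Mac
  ollama rm mistral     # delete a model you no longer need and free up disk space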

A quick guide to understanding LLMs for 120 AI Chat

Most Popular Local LLMs

Mistral (7B): A fast, balanced 4GB model that excels at general chat and quick answers, working well on most modern laptops but with limited coding capabilities.

Llama 3 (8B): A smart and accurate 4.5GB general-purpose assistant that delivers high-quality responses but requires 8GB+ RAM and preferably GPU acceleration.

Mixtral (8x7B): A highly intelligent 12GB+ model designed for deep reasoning and long context tasks, but only suitable for high-end systems due to its heavy resource requirements.

Gemma (2B/7B): Google's efficient 1.5-4GB lightweight model that runs anywhere but sacrifices some intelligence compared to larger alternatives.

Phi-3 (3.8B): A compact 2.5GB model that's surprisingly capable for its size, perfect for quick answers on low-end systems but limited in deep reasoning tasks.

Nous Hermes / OpenHermes: Community-tuned 4-7GB models optimized for helpful chat and writing, fine-tuned on top of existing base models (Mistral and Llama), with system requirements similar to Mistral and Llama 3.
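
Each of these can be fetched with a single ollama run command. The tags below are the ones commonly used in the Ollama library at the time of writing; names and available sizes change, so check https://ollama.com/library for the current list.

  ollama run mistral       # Mistral 7B
  ollama run llama3        # Llama 3 8B
  ollama run mixtral       # Mixtral 8x7B (high-end machines only)
  ollama run gemma:2b      # Gemma 2B (gemma:7b for the larger version)
  ollama run phi3          # Phi-3 3.8B
  ollama run openhermes    # OpenHermes (Nous Hermes variants are also available)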


Minimum System Requirements

For basic models (2B–4B parameters), you'll need 4–8 GB RAM and a modern CPU from 2015 or later.

Mid-range models (7B–8B parameters) require 8–16 GB RAM with an SSD recommended for optimal performance.

Heavy models like Mixtral need 24 GB+ RAM (or a dedicated GPU with a comparable amount of VRAM) to run effectively.

For the best experience overall, use Apple M1/M2/M3 chips or a GPU with 6GB+ VRAM.

Tip: If your machine feels slow, choose smaller models like phi-3, gemma, or mistral (quantized).
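
Not sure how much memory your Mac has? You can check from Terminal (or via the Apple menu → About This Mac) and pick a model accordingly. Note that the standard tags in the Ollama library are generally already quantized, so the sizes quoted above are roughly what gets downloaded.

  # Show your Mac's chip/processor and memory (grep keeps just the relevant lines)
  system_profiler SPHardwareDataType | grep -E "Chip|Processor|Memory"
  # Example output on an Apple silicon Mac: "Chip: Apple M1" and "Memory: 16 GB"

  # If the machine still feels slow, step down to a smaller model
  ollama run phi3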