Latest: v0.3.5 · April 17, 2026

Local LLMs on Apple Silicon,
done the macOS-native way.

Native SwiftUI app. A proper CLI. OpenAI-compatible API — always on. Powered by Apple's MLX. Zero cloud, zero telemetry, zero Electron.

Requires macOS 14 (Sonoma) or later · Apple Silicon only · Apache-2.0 licensed

Why macMLX

Not another Electron wrapper.

The only tool that gives newcomers a real SwiftUI app AND gives developers a real CLI — both talking to the same in-process MLX engine.

Feature                           macMLX           LM Studio    Ollama       oMLX
Native macOS GUI                  SwiftUI          Electron                  Web UI
MLX-native inference              ✓                GGUF only    GGUF only    ✓
Command-line interface            ✓ Swift-native
Resumable downloads + HF mirrors  ✓                partial      partial
OpenAI-compatible API             ✓ always on
Zero Python required              ✓
Three surfaces, one core

Built like a macOS app should be.

MacMLXCore is the Swift SPM package that owns all inference. The GUI, the CLI, and the HTTP server are thin shells over the same protocol.

macMLX.app

SwiftUI, macOS 14+, Apple Silicon only.

  • Onboarding wizard picks engine + model directory
  • HuggingFace browser with resumable downloads
  • Conversation sidebar: rename, delete, rewind-to-here
  • Parameters Inspector (⌘⌥I) — per-model persistence
  • Benchmark tab, Logs tab, Menu bar extra
  • Sparkle EdDSA-signed auto-update

macmlx (CLI)

swift-argument-parser · native ANSI dashboards.

  • pull · list · run · serve · ps · stop
  • Honours preferredEngine + per-model parameters from GUI
  • Unicode progress bars with sub-cell precision
  • Boxed startup banner · coloured REPL prompt
  • PIDFile coordination · graceful SIGTERM
  • JSON output on every command for scripting
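
The JSON output makes every subcommand scriptable. A sketch of the idea — note that the `--json` flag name and the `id` / `size_bytes` field names are assumptions for illustration, not documented CLI behaviour:

```shell
# Assumed sample of what `macmlx list --json` might emit; with the real CLI,
# pipe its output through the same jq filter.
sample='[
  {"id": "Qwen3-8B-4bit",          "size_bytes": 5100000000},
  {"id": "gemma-3-4b-it-qat-4bit", "size_bytes": 2600000000}
]'
# Print the ids of local models larger than 4 GB
echo "$sample" | jq -r '.[] | select(.size_bytes > 4e9) | .id'
```

With a live install this collapses to `macmlx list --json | jq -r '…'` in any script.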

OpenAI-compatible API

Hummingbird 2 · localhost:8000/v1 · SSE streaming.

  • POST /v1/chat/completions · GET /v1/models · GET /x/status
  • Cold-swap loads any local model on demand (v0.3.3)
  • Concurrent swap requests serialised actor-side
  • Drop-in for Cursor, Continue, Cline, Raycast, Zed, Open WebUI
  • Real RSS reported on /x/status (no fake 0 bytes)
Current release

v0.3.5 — native ANSI dashboards, cleaner deps.

Released April 17, 2026. Removed SwiftTUI (unmaintained, Swift 6 strict-concurrency incompatible) and PulseUI (its ConsoleView is unavailable on macOS), and rebuilt the three CLI dashboards (pull, serve, run) on a tiny in-house ANSI toolkit.

v0.3.5 2026-04-17
Added
  • CLITerm ANSI toolkit: colour, TTY detection, U+258x sub-cell progress bars, box-drawing headers
  • Unicode progress bar on macmlx pull with speed + ETA
  • Boxed startup banner on macmlx serve; tidier REPL header on macmlx run
Removed
  • SwiftTUI (unmaintained, Swift 6 strict-concurrency incompatible)
  • _PullDashboardView / _ServeDashboardView / _ChatTUIView stub types
  • PulseUI + PulseProxy (ConsoleView is #if !os(macOS)-gated)
Status
  • CLI tests: 16/16 green
  • MacMLXCore tests: 90/90 green
  • Issue #18 closed — real live CLI dashboards landed via CLITerm
Roadmap

Shipped. Shipping. Next.

Two products, one shared MacMLXCore. Twelve releases since v0.1. Each row below links to the actual tag or plan document.

Shipped
v0.1.0

Initial MVP

Native SwiftUI GUI · menu bar · CLI (serve / pull / run / list / ps / stop) · HuggingFace downloader · OpenAI-compatible API · Sparkle auto-update · memory-aware onboarding.

v0.2.0

Download + chat polish (10 issues)

Resumable downloads survive cancel + app quit · HF mirrors · Markdown rendering · message edit/regenerate · Parameters Inspector (⌘⌥I).

v0.3.0

Benchmark feature + cross-cutting gap-fix

Local benchmark tab (prefill + generation TPS, TTFT, peak RSS, history, Share-to-Community issue template) · 4 CRITICAL + 3 HIGH + 3 MEDIUM gap-fixes from an independent code review · bilingual README.

v0.3.1

Five UX fixes

macmlx list segfault fixed · chat banner flicker fixed · Markdown paragraph breaks preserved · manually-copied models auto-appear · chat toolbar model switcher actually works · max-tokens TextField replaces click-heavy Stepper.

v0.3.2

Conversation sidebar + rewind-to-here

Collapsible sidebar lists saved conversations · inline rename · delete with confirmation · right-click any message → Rewind drops every later message.

v0.3.3

API cold-swap model loading

/v1/chat/completions now auto-loads any locally-downloaded model by ID · concurrent swaps serialised actor-side · OpenAI-style 404 model_not_found error shape.

v0.3.4

Logs tab (native over Pulse LoggerStore)

SwiftUI Table with time / level badge / category / message · search field + level picker · Clear button wipes the on-disk store.

v0.3.5

Native ANSI CLI dashboards; SwiftTUI + PulseUI removed

In-house CLITerm toolkit replaces stub-linked SwiftTUI · PulseUI dropped (ConsoleView is iOS/iPadOS-only) · Logs tab keeps working via direct LoggerStore access.

In progress
v0.3.6

Small maintenance patch (~2h)

  1. macmlx --version auto-bumped via release.yml sed step (fixes perpetual 0.1.0 bug)
  2. macmlx search <query> subcommand — reuses HFDownloader.search, --sort=downloads|likes|recent, --limit, --json
  3. Binary slim-down: strip -S + dynamic Swift stdlib (~60 MB → ~45 MB)
  4. CLI --log-level + --log-stderr flags so Pulse logging is visible from the terminal
Next minor
Later
v0.5

LoRA adapters + conversation/dataset export

Drop in existing HuggingFace LoRA adapters (no training UI) · export conversations to JSONL for fine-tuning datasets.

v0.6

Speech I/O — ASR + TTS

WhisperKit (Core ML) for mic input in chat — upstream mlx-swift-lm doesn't ship audio models yet · AVSpeechSynthesizer for reading assistant replies aloud.

v0.7

Community Benchmarks service NEW

Today the Benchmark tab's Share to Community button pre-fills a GitHub issue. Tomorrow: an opt-in remote endpoint receives submissions, aggregates them by chip × model × quant × macOS version, and serves a public leaderboard — surfaced both inside the app and on this website. Inspired by omlx.ai's community benchmarks.

  • Submission: POST /v1/benchmarks with BenchmarkResult JSON + anonymised HardwareInfo
  • Opt-in — no data leaves the Mac unless the user explicitly clicks Share
  • Public browsable leaderboard on this website — filter by chip family, memory, model family, quant
  • GitHub-issue submission continues as a fallback for users who prefer not to run the remote service
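
What a v0.7 submission might look like on the wire — a sketch only: the `/v1/benchmarks` path comes from the plan above, but the host, the payload field names, and the `BenchmarkResult` shape are assumptions until the service ships:

```shell
# Hypothetical BenchmarkResult payload (all field names assumed)
payload='{
  "model": "Qwen3-8B-4bit",
  "prefill_tps": 182.4,
  "gen_tps": 45.1,
  "ttft_s": 0.32,
  "peak_rss_gb": 18.2,
  "hardware": {"chip": "Apple M3 Max", "memory_gb": 64, "macos": "15.4"}
}'
# Sanity-check the JSON locally -- nothing leaves the Mac at this point
echo "$payload" | jq -r '.hardware.chip'
# Then, once the (not-yet-existing) service is live:
# curl -X POST https://<benchmarks-host>/v1/benchmarks \
#   -H "Content-Type: application/json" -d "$payload"
```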
Deferred / blocked
#19

Signed + notarized DMG

Needs a paid Apple Developer account. Until then, the DMG is unsigned — Gatekeeper blocks first launch until users run xattr -cr on the app.

#12 / #13

Subprocess-based engines (SwiftLM, Python mlx-lm)

Closed as not planned — App Sandbox policy blocks spawning external binaries. Reopenable if sandbox policy is revisited or a Swift-native 100B+ MoE path appears.

#20

Homebrew tap for the CLI

Scheduled around v0.3.6–v0.4 once the CLI tarball lands as a release asset.

Benchmarks — today & tomorrow

From shared-issue to live leaderboard.

The Benchmark tab already measures prefill + generation tok/s, TTFT, peak RSS, load time, and stores history locally. Sharing is a one-click GitHub-issue pre-fill today. v0.7 plans to turn that submission into data you can query.

Today — v0.3.0

Share to Community

The result is encoded into a pre-filled GitHub issue using benchmark_submission.yml. Review it before submitting; nothing leaves the Mac until you click Create Issue.

[benchmark] Benchmark · Qwen3-8B-4bit · M3 Max 64GB
## System
Chip: Apple M3 Max
Memory: 64 GB
macOS: 15.4
Engine: MLX Swift (mlx-swift-lm 3.31.3)

## Result
Model:       Qwen3-8B-4bit
Prefill TPS: 182.4
Gen TPS:     45.1
TTFT:        0.32 s
Peak RSS:    18.2 GB
Load time:   4.8 s
Runs:        3 (median)
v0.7 — Community Benchmarks

Browsable leaderboard (preview)

Remote endpoint stores your opt-in submission, aggregates globally by chip × model × quantisation × macOS version, and publishes a filterable table on this site — plus inside the app so the Benchmark tab can show you how your Mac compares.

Chip     Mem    Model           Gen TPS   TTFT     N
M4 Max   128G   Qwen3-8B-4bit   62.3      0.21 s   47
M3 Max   64G    Qwen3-8B-4bit   45.1      0.32 s   118
M3 Pro   36G    Qwen3-8B-4bit   31.0      0.45 s   72
M2 Max   64G    Qwen3-8B-4bit   28.4      0.51 s   34
M1 Max   32G    Qwen3-8B-4bit   22.1      0.64 s   28

Mockup. Real data once v0.7 ships.

Architecture

One protocol. Three consumers. No leaky abstractions.

macMLX.app (SwiftUI)  ·  macmlx (CLI)  ·  HTTP clients (Cursor · Continue · curl)
        │                    │                        │
        └────────────────────┴────────────────────────┘
                             │
MacMLXCore — Swift SPM · @MainActor / actors · Swift 6 strict concurrency
  ├─ InferenceEngine protocol
  ├─ HummingbirdServer — localhost:8000/v1
  └─ MLXSwiftEngine — mlx-swift-lm 3.31.3 · MLXLLM + MLXVLM (v0.4) · in-process
                             │
Apple Silicon — Metal · ANE · Unified Memory
Quickstart

Running a 4-bit 8B model in 60 seconds.

  1. Install

     Download the DMG from Releases and drag macMLX.app to /Applications. On first launch, run xattr -cr /Applications/macMLX.app in Terminal (the DMG is not notarized yet — issue #19).

  2. Onboard

     The setup wizard picks ~/.mac-mlx/models as the default model directory and selects the MLX Swift engine. A memory check warns if your Mac has less than the model's recommended RAM.

  3. Download + chat

     Open the Models tab, switch to Hugging Face, search for a model (try mlx-community/Qwen3-8B-4bit), and click Download — the progress bar shows live speed and ETA, and downloads resume across app quits. Load the model from the Local tab, then head to Chat.

# install dev tools, clone, build the CLI
git clone https://github.com/magicnight/mac-mlx && cd mac-mlx
brew bundle
swift build --package-path macmlx-cli -c release

# download, run, serve
macmlx pull mlx-community/Qwen3-8B-4bit      # resumable
macmlx list                                  # local models
macmlx run Qwen3-8B-4bit "Hello, world"      # single prompt
macmlx run Qwen3-8B-4bit                     # interactive REPL
macmlx serve                                 # OpenAI API on :8000
macmlx ps                                    # is serve running?
macmlx stop                                  # graceful SIGTERM

# v0.3.6 preview
macmlx search qwen3 --sort likes --limit 10  # new in v0.3.6
macmlx serve --log-level debug --log-stderr  # new in v0.3.6
# anything OpenAI-compatible works. API key is ignored.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-8B-4bit",
    "messages": [{"role": "user", "content": "Hi"}],
    "stream": true
  }'

# cold-swap: ask for any locally-downloaded model by ID
# server loads it on demand (v0.3.3+), concurrent swaps serialised
curl http://localhost:8000/v1/chat/completions \
  -d '{"model":"gemma-3-4b-it-qat-4bit","messages":[...]}'

# real RSS reported on status
curl http://localhost:8000/x/status | jq
# { "state": "ready", "model": "Qwen3-8B-4bit", "rss_gb": 18.2, ... }
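
For scripting against the streaming endpoint, the SSE events can be reduced to plain text. A sketch assuming OpenAI-style `data: {...}` lines carrying `choices[0].delta.content` — the shape the /v1 route mimics, though the exact event format here is an assumption:

```shell
# A captured-style SSE transcript (shape assumed); with a live server, replace
# the printf with: curl -sN http://localhost:8000/v1/chat/completions ...
sse='data: {"choices":[{"delta":{"content":"Hel"}}]}
data: {"choices":[{"delta":{"content":"lo"}}]}
data: [DONE]'
printf '%s\n' "$sse" \
  | sed -n 's/^data: //p' \
  | grep -v '^\[DONE\]' \
  | jq -j '.choices[0].delta.content // empty'
echo
```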
Special thanks

Standing on shoulders.

macMLX wouldn't exist without these open-source projects. Click through and star them.

Apple · ml-explore

MLX

Apple's array framework for Apple Silicon. The engine under everything macMLX does.

Apple · ml-explore

mlx-swift-lm

Swift bindings + LLM/VLM model zoo. Pinned at 3.31.3. Ships MLXLLM (text) and MLXVLM (16 vision-language architectures, used for v0.4).

Hugging Face

swift-transformers

Tokenizers, Hub helpers, chat-template application in Swift. 1.3.x series (avoids argparse version conflicts).

hummingbird-project

Hummingbird

Swift-native, NIO-based HTTP server. Powers localhost:8000 and its OpenAI-compatible routes. 2.22.x.

sparkle-project

Sparkle

EdDSA-signed auto-update framework for Mac apps. Drives the Check for Updates… menu item and the appcast the release workflow pushes.

kean

Pulse

Structured logging framework with a Core Data-backed store. Backs LogManager and the native Logs tab. (PulseUI removed in v0.3.5 — ConsoleView is iOS/iPadOS-only; we read the store directly instead.)

Apple

swift-argument-parser

Every macmlx subcommand + flag is declared in pure Swift through ArgumentParser. 1.7.1.

Trans-N-ai

Swama

Swift-native MLX inference CLI that pioneered the in-process mlx-swift-lm pattern. macMLX took the architectural approach and added the GUI + OpenAI server layers.

jundot

oMLX

Reference for feature depth, community benchmark presentation, and MLX-ecosystem tool UX. Direct inspiration for the v0.7 Community Benchmarks plan.

SharpAI

SwiftLM

100B+ MoE inference path. Sandbox blocked the subprocess integration for now (issues #12/#13) — kept in the credits for pointing the way.

argmax

WhisperKit

Planned for v0.6 speech input — upstream mlx-swift-lm doesn't ship audio models yet, so WhisperKit's Core ML Whisper covers the UX in the meantime.

rensbreur · historical

SwiftTUI

Early CLI-dashboard candidate. Swift 6 strict-concurrency incompatibility led to an in-house ANSI toolkit in v0.3.5 (issue #18). Retained in credits; reopenable if upstream revives.

Full BibTeX citations in CITATIONS.bib.