# macMLX

> macMLX is a native macOS application for running local large language models on Apple Silicon.

macMLX ships three surfaces over one shared Swift core: a SwiftUI GUI app, a Swift-native command-line tool called `macmlx`, and an always-on OpenAI-compatible HTTP API at `localhost:8000/v1`. All three are powered by Apple's MLX framework via the `mlx-swift-lm` Swift package (pinned at 3.31.x). Open source under Apache 2.0.

macMLX is **not** a wrapper around llama.cpp or GGUF. It runs MLX-format models (safetensors + config.json) directly in-process. There is no Python runtime, no Electron, no cloud inference, and no telemetry. The only outbound network activity is model downloads from Hugging Face (or a user-configured mirror such as `hf-mirror.com`).

## Key facts

- **Repository**: https://github.com/magicnight/mac-mlx
- **Website**: https://macmlx.app/
- **Latest release**: v0.3.5 (2026-04-17)
- **License**: Apache 2.0
- **Language**: Swift 6 (strict concurrency enabled)
- **Primary inference engine**: `mlx-swift-lm` 3.31.x in-process (`MLXLLM` for text, `MLXVLM` in v0.4 for vision-language)
- **HTTP server**: Hummingbird 2 at `localhost:8000/v1` (OpenAI-compatible; cold-swap model loading since v0.3.3)
- **Data directory**: `~/.mac-mlx/` (models, conversations, model-params, downloads, benchmarks, logs, settings.json, macmlx.pid)
- **Platform**: macOS 14.0 (Sonoma) or later; Apple Silicon only (M1 / M2 / M3 / M4). Does **not** run on Intel Macs.
- **Distribution**: GitHub Releases DMG (currently unsigned — issue #19 tracks signing)

## Shipped versions

- **v0.1.0** — native SwiftUI GUI, menu bar, CLI (`serve` / `pull` / `run` / `list` / `ps` / `stop`), HuggingFace downloader, OpenAI-compatible API, Sparkle auto-update, memory-aware onboarding.
- **v0.2.0** — download + chat polish (resumable downloads across cancel/quit, HF mirrors, Markdown rendering, message edit/regenerate, Parameters Inspector with per-model persistence).
- **v0.3.0** — local Benchmark tab (prefill + generation TPS, TTFT, peak RSS, history, Share-to-Community GitHub issue), ten cross-cutting gap fixes (4 critical, 3 high, 3 medium), bilingual README.
- **v0.3.1** — five UX fixes: `macmlx list` segfault, chat banner flicker, Markdown paragraph breaks, manually-copied models auto-appearing, chat toolbar model switcher.
- **v0.3.2** — conversation sidebar with rename, delete, and rewind-to-here.
- **v0.3.3** — API cold-swap model loading (`/v1/chat/completions` auto-loads any locally-downloaded model by ID).
- **v0.3.4** — Logs tab (native SwiftUI Table backed by Pulse `LoggerStore`).
- **v0.3.5** — native ANSI CLI dashboards; SwiftTUI + PulseUI removed (both Swift-6-incompatible on macOS).

## Roadmap

- **v0.3.6** — small maintenance patch (in progress): `macmlx --version` auto-bumped via a `release.yml` sed step; a `macmlx search` subcommand reusing `HFDownloader.search`; binary slim-down (`strip -S` + dynamic Swift stdlib, ~60 MB → ~45 MB); CLI `--log-level` and `--log-stderr` flags so Pulse logging is visible from the terminal.
- **v0.4.0** (next minor) — Vision-Language Model support (issue #23) via MLXVLM. Supports 16 VLM architectures out of the box, including Qwen2.5-VL, Qwen3-VL, Gemma-3 (4B/12B/27B), SmolVLM / SmolVLM2, Paligemma, Pixtral, Idefics3, FastVLM, LFM2-VL, glm_ocr, and mistral3. Image picker in ChatInputView (NSOpenPanel + drag-drop + paste); OpenAI multimodal `content`-array parsing in HummingbirdServer; images persisted to `~/.mac-mlx/conversations//images/`.
- **v0.5** — LoRA adapter loading (drop in existing HuggingFace adapters, no training UI) + conversation/dataset export to JSONL.
- **v0.6** — Speech I/O: WhisperKit (Core ML) for mic input in chat; AVSpeechSynthesizer for assistant reply read-back. Native MLX Whisper deferred until upstream `mlx-swift-lm` ships audio models.
- **v0.7** — Community Benchmarks service. Today the Benchmark tab's *Share to Community* button pre-fills a GitHub issue.
v0.7 plans an opt-in remote endpoint (`POST /v1/benchmarks`) that receives anonymised `BenchmarkResult` + `HardwareInfo` submissions, aggregates by chip × model × quantisation × macOS version, and serves a public leaderboard both on this website and inside the app. Inspired by omlx.ai's community benchmarks.

## Deferred / blocked

- **#19** — signed + notarized DMG. Needs a paid Apple Developer account.
- **#12, #13** — subprocess-based engines (SwiftLM, Python `mlx-lm`). Closed as not planned: the macOS App Sandbox blocks spawning external binaries. Reopenable if sandbox policy changes.
- **#20** — Homebrew tap for the CLI. Scheduled around v0.3.6–v0.4 once the CLI tarball lands as a release asset.

## Comparison to similar tools

| Tool       | Native macOS GUI | MLX inference | CLI         | Runtime             |
|------------|------------------|---------------|-------------|---------------------|
| **macMLX** | Yes (SwiftUI)    | Yes           | Yes (Swift) | None (Swift-native) |
| LM Studio  | No (Electron)    | No (GGUF)     | No          | Electron            |
| Ollama     | No               | No (GGUF)     | Yes         | Go                  |
| oMLX       | No (web UI)      | Yes           | Yes         | Python              |

## OpenAI-compatible API

macMLX runs an always-on OpenAI-compatible server at `http://localhost:8000/v1` whenever a model is loaded or whenever `macmlx serve` is running. Any OpenAI-compatible client works with a custom base URL: Cursor, Continue, Cline, Raycast, Zed, Open WebUI, Aider, and so on. The API key can be any non-empty string.

Endpoints:

- `POST /v1/chat/completions` — streaming (SSE) and non-streaming chat completions. Cold-swap: any locally-downloaded model ID auto-loads on request (v0.3.3+).
- `GET /v1/models` — returns the currently-loaded model (compatibility surface).
- `GET /x/status` — reports real resident set size (RSS) and engine state.

## Quickstart

### GUI

1. Download `macMLX-vX.X.X.dmg` from https://github.com/magicnight/mac-mlx/releases
2. Drag `macMLX.app` to `/Applications`
3. First launch: `xattr -cr /Applications/macMLX.app && open /Applications/macMLX.app` (DMG is not notarized — issue #19)
4. Onboarding wizard sets `~/.mac-mlx/models` and selects the MLX Swift engine
5. Download a model in the Models tab (try `mlx-community/Qwen3-8B-4bit`), load it, and chat

### CLI

```bash
macmlx pull mlx-community/Qwen3-8B-4bit   # resumable download
macmlx list                               # local models
macmlx run Qwen3-8B-4bit "Hello, world"   # single-prompt
macmlx run Qwen3-8B-4bit                  # interactive REPL
macmlx serve                              # OpenAI API on :8000
macmlx ps                                 # is serve running?
macmlx stop                               # graceful SIGTERM
```

## Dependencies (open-source projects macMLX builds on)

- **MLX** — Apple's array framework for Apple Silicon. https://github.com/ml-explore/mlx
- **mlx-swift-lm** — Swift bindings and LLM/VLM model zoo (pinned 3.31.x). https://github.com/ml-explore/mlx-swift-examples
- **swift-transformers** — Hugging Face tokenizers and Hub helpers in Swift (1.3.x). https://github.com/huggingface/swift-transformers
- **Hummingbird** — Swift-native, NIO-based HTTP server (2.22.x). https://github.com/hummingbird-project/hummingbird
- **Sparkle** — EdDSA-signed auto-update framework for Mac apps. https://github.com/sparkle-project/Sparkle
- **Pulse** — structured logging with a Core Data–backed store. https://github.com/kean/Pulse
- **swift-argument-parser** — Apple's declarative CLI framework (1.7.1). https://github.com/apple/swift-argument-parser
- **Swama** — Swift-native MLX inference CLI; architectural inspiration for macMLX's in-process approach. https://github.com/Trans-N-ai/swama
- **oMLX** — feature-depth reference and inspiration for the v0.7 Community Benchmarks plan. https://github.com/jundot/omlx
- **SwiftLM** — 100B+ MoE inference path (subprocess integration blocked by sandbox policy). https://github.com/SharpAI/SwiftLM
- **WhisperKit** — Core ML Whisper implementation planned for v0.6 speech input.
  https://github.com/argmaxinc/WhisperKit

## Docs

- [README — English](https://github.com/magicnight/mac-mlx/blob/main/README.md)
- [README — 简体中文](https://github.com/magicnight/mac-mlx/blob/main/README.zh-CN.md)
- [CHANGELOG](https://github.com/magicnight/mac-mlx/blob/main/CHANGELOG.md)
- [Contributing](https://github.com/magicnight/mac-mlx/blob/main/CONTRIBUTING.md)
- [Security policy](https://github.com/magicnight/mac-mlx/blob/main/SECURITY.md)
- [Citations (BibTeX)](https://github.com/magicnight/mac-mlx/blob/main/CITATIONS.bib)
- [Code of Conduct](https://github.com/magicnight/mac-mlx/blob/main/CODE_OF_CONDUCT.md)
- [License (Apache 2.0)](https://github.com/magicnight/mac-mlx/blob/main/LICENSE)

## Optional

- [Releases — Atom feed](https://github.com/magicnight/mac-mlx/releases.atom)
- [Issues](https://github.com/magicnight/mac-mlx/issues)
- [Discussions](https://github.com/magicnight/mac-mlx/discussions)
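## Appendix: talking to the API

The streaming mode of `POST /v1/chat/completions` described above uses the standard OpenAI server-sent-events framing (`data: {…}` chunks terminated by `data: [DONE]`), so any generic client code applies. The sketch below is illustrative, not part of macMLX: the function names `build_chat_request` and `iter_sse_content` are made up for this example, it uses only the Python standard library, and it assumes macMLX follows the stock OpenAI streaming schema (which is what OpenAI-compatible clients such as Cursor or Aider expect).

```python
import json

def build_chat_request(model: str, prompt: str, stream: bool = True) -> dict:
    """Request body for POST /v1/chat/completions (OpenAI schema).

    `model` can be any locally-downloaded model ID; macMLX cold-swaps
    it in on demand (v0.3.3+).
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def iter_sse_content(lines):
    """Yield assistant-text deltas from OpenAI-style SSE lines.

    Content chunks look like:
        data: {"choices":[{"delta":{"content":"…"}}]}
    Role-only deltas are skipped; `data: [DONE]` ends the stream.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore comments, event names, keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

if __name__ == "__main__":
    # Canned transcript standing in for a live SSE response body.
    sample = [
        'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        'data: {"choices":[{"delta":{"content":"Hello"}}]}',
        'data: {"choices":[{"delta":{"content":", world"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_sse_content(sample)))  # Hello, world
```

In real use you would POST `json.dumps(build_chat_request(...))` to `http://localhost:8000/v1/chat/completions` with any HTTP client, send `Authorization: Bearer <anything>` (the API key only needs to be non-empty), and feed the response body lines to `iter_sse_content`.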