# macMLX

> macMLX is a native macOS application for running local large language models on Apple Silicon.

macMLX ships three surfaces over one shared Swift core: a SwiftUI GUI app, a Swift-native command-line tool called `macmlx`, and an always-on OpenAI-compatible HTTP API at `localhost:8000/v1`. All three are powered by Apple's MLX framework via the `mlx-swift-lm` Swift package (pinned at 3.31.x). Open source under Apache 2.0.

macMLX is **not** a wrapper around llama.cpp or GGUF. It runs MLX-format models (safetensors + config.json) directly in-process. There is no Python runtime, no Electron, no cloud inference, and no telemetry. The only outbound network activity is model downloads from Hugging Face (or a user-configured mirror such as `hf-mirror.com`).

## Key facts

- **Repository**: https://github.com/magicnight/mac-mlx
- **Website**: https://macmlx.app/
- **Latest release**: v0.3.5 (2026-04-17)
- **License**: Apache 2.0
- **Language**: Swift 6 (strict concurrency enabled)
- **Primary inference engine**: `mlx-swift-lm` 3.31.x in-process (`MLXLLM` for text, `MLXVLM` in v0.4 for vision-language)
- **HTTP server**: Hummingbird 2 at `localhost:8000/v1` (OpenAI-compatible; cold-swap model loading since v0.3.3)
- **Data directory**: `~/.mac-mlx/` (models, conversations, model-params, downloads, benchmarks, logs, settings.json, macmlx.pid)
- **Platform**: macOS 14.0 (Sonoma) or later; Apple Silicon only (M1 / M2 / M3 / M4). Does **not** run on Intel Macs.
- **Distribution**: GitHub Releases DMG (currently unsigned — issue #19 tracks signing)

## Shipped versions

- **v0.1.0** — native SwiftUI GUI, menu bar, CLI (`serve` / `pull` / `run` / `list` / `ps` / `stop`), HuggingFace downloader, OpenAI-compatible API, Sparkle auto-update, memory-aware onboarding.
- **v0.2.0** — download + chat polish (resumable downloads across cancel/quit, HF mirrors, Markdown rendering, message edit/regenerate, Parameters Inspector with per-model persistence).
- **v0.3.0** — local Benchmark tab (prefill + generation TPS, TTFT, peak RSS, history, Share-to-Community GitHub issue), ten cross-cutting gap fixes (4 critical, 3 high, 3 medium), bilingual README.
- **v0.3.1** — five UX fixes: `macmlx list` segfault, chat banner flicker, Markdown paragraph breaks, manually-copied models auto-appearing, chat toolbar model switcher.
- **v0.3.2** — conversation sidebar with rename, delete, and rewind-to-here.
- **v0.3.3** — API cold-swap model loading (`/v1/chat/completions` auto-loads any locally-downloaded model by ID).
- **v0.3.4** — Logs tab (native SwiftUI Table backed by Pulse `LoggerStore`).
- **v0.3.5** — native ANSI CLI dashboards; SwiftTUI + PulseUI removed (both Swift-6-incompatible on macOS).

## Roadmap

- **v0.3.6** — small maintenance patch (in progress): `macmlx --version` auto-bumped via a `release.yml` sed step; a `macmlx search` subcommand reusing `HFDownloader.search`; binary slim-down (`strip -S` + dynamic Swift stdlib, ~60 MB → ~45 MB); CLI `--log-level` and `--log-stderr` flags so Pulse logging is visible from the terminal.
- **v0.4.0** (next minor) — Vision-Language Model support (issue #23) via MLXVLM. Supports 16 VLM architectures out of the box, including Qwen2.5-VL, Qwen3-VL, Gemma-3 (4B/12B/27B), SmolVLM / SmolVLM2, Paligemma, Pixtral, Idefics3, FastVLM, LFM2-VL, glm_ocr, and mistral3. Image picker in ChatInputView (NSOpenPanel + drag-drop + paste); OpenAI multimodal `content`-array parsing in HummingbirdServer; images persisted to `~/.mac-mlx/conversations//images/`.
- **v0.5** — LoRA adapter loading (drop in existing HuggingFace adapters, no training UI) + conversation/dataset export to JSONL.
- **v0.6** — Speech I/O: WhisperKit (Core ML) for mic input in chat; AVSpeechSynthesizer for assistant reply read-back. Native MLX Whisper deferred until upstream `mlx-swift-lm` ships audio models.
- **v0.7** — Community Benchmarks service. Today the Benchmark tab's *Share to Community* button pre-fills a GitHub issue.
v0.7 plans an opt-in remote endpoint (`POST /v1/benchmarks`) that receives anonymised `BenchmarkResult` + `HardwareInfo` submissions, aggregates by chip × model × quantisation × macOS version, and serves a public leaderboard both on this website and inside the app. Inspired by omlx.ai's community benchmarks.

## Deferred / blocked

- **#19** — signed + notarized DMG. Needs a paid Apple Developer account.
- **#12, #13** — subprocess-based engines (SwiftLM, Python `mlx-lm`). Closed as not planned: the macOS App Sandbox blocks spawning external binaries. Reopenable if sandbox policy changes.
- **#20** — Homebrew tap for the CLI. Scheduled around v0.3.6–v0.4 once the CLI tarball lands as a release asset.

## Comparison to similar tools

| Tool       | Native macOS GUI | MLX inference | CLI         | Runtime             |
|------------|------------------|---------------|-------------|---------------------|
| **macMLX** | Yes (SwiftUI)    | Yes           | Yes (Swift) | None (Swift-native) |
| LM Studio  | No (Electron)    | No (GGUF)     | No          | Electron            |
| Ollama     | No               | No (GGUF)     | Yes         | Go                  |
| oMLX       | No (web UI)      | Yes           | Yes         | Python              |

## OpenAI-compatible API

macMLX runs an always-on OpenAI-compatible server at `http://localhost:8000/v1` whenever a model is loaded or whenever `macmlx serve` is running. Any OpenAI-compatible client works with a custom base URL: Cursor, Continue, Cline, Raycast, Zed, Open WebUI, Aider, and so on. The API key can be any non-empty string.

Endpoints:

- `POST /v1/chat/completions` — streaming (SSE) and non-streaming chat completions. Cold-swap: any locally-downloaded model ID auto-loads on request (v0.3.3+).
- `GET /v1/models` — returns the currently-loaded model (compatibility surface).
- `GET /x/status` — reports real resident set size (RSS) and engine state.

## Quickstart

### GUI

1. Download `macMLX-vX.X.X.dmg` from https://github.com/magicnight/mac-mlx/releases
2. Drag `macMLX.app` to `/Applications`
3. First launch: `xattr -cr /Applications/macMLX.app && open /Applications/macMLX.app` (DMG is not notarized — issue #19)
4. Onboarding wizard sets `~/.mac-mlx/models` and selects the MLX Swift engine
5. Download a model in the Models tab (try `mlx-community/Qwen3-8B-4bit`), load it, and chat

### CLI

```bash
macmlx pull mlx-community/Qwen3-8B-4bit   # resumable download
macmlx list                               # local models
macmlx run Qwen3-8B-4bit "Hello, world"   # single-prompt
macmlx run Qwen3-8B-4bit                  # interactive REPL
macmlx serve                              # OpenAI API on :8000
macmlx ps                                 # is serve running?
macmlx stop                               # graceful SIGTERM
```

## Dependencies (open-source projects macMLX builds on)

- **MLX** — Apple's array framework for Apple Silicon. https://github.com/ml-explore/mlx
- **mlx-swift-lm** — Swift bindings and LLM/VLM model zoo (pinned 3.31.x). https://github.com/ml-explore/mlx-swift-examples
- **swift-transformers** — Hugging Face tokenizers and Hub helpers in Swift (1.3.x). https://github.com/huggingface/swift-transformers
- **Hummingbird** — Swift-native, NIO-based HTTP server (2.22.x). https://github.com/hummingbird-project/hummingbird
- **Sparkle** — EdDSA-signed auto-update framework for Mac apps. https://github.com/sparkle-project/Sparkle
- **Pulse** — structured logging with a Core Data–backed store. https://github.com/kean/Pulse
- **swift-argument-parser** — Apple's declarative CLI framework (1.7.1). https://github.com/apple/swift-argument-parser
- **Swama** — Swift-native MLX inference CLI; architectural inspiration for macMLX's in-process approach. https://github.com/Trans-N-ai/swama
- **oMLX** — feature-depth reference and inspiration for the v0.7 Community Benchmarks plan. https://github.com/jundot/omlx
- **SwiftLM** — 100B+ MoE inference path (subprocess integration blocked by sandbox policy). https://github.com/SharpAI/SwiftLM
- **WhisperKit** — Core ML Whisper implementation planned for v0.6 speech input.
  https://github.com/argmaxinc/WhisperKit

## Docs

- [README — English](https://github.com/magicnight/mac-mlx/blob/main/README.md)
- [README — 简体中文](https://github.com/magicnight/mac-mlx/blob/main/README.zh-CN.md)
- [CHANGELOG](https://github.com/magicnight/mac-mlx/blob/main/CHANGELOG.md)
- [Contributing](https://github.com/magicnight/mac-mlx/blob/main/CONTRIBUTING.md)
- [Security policy](https://github.com/magicnight/mac-mlx/blob/main/SECURITY.md)
- [Citations (BibTeX)](https://github.com/magicnight/mac-mlx/blob/main/CITATIONS.bib)
- [Code of Conduct](https://github.com/magicnight/mac-mlx/blob/main/CODE_OF_CONDUCT.md)
- [License (Apache 2.0)](https://github.com/magicnight/mac-mlx/blob/main/LICENSE)

## Optional

- [Releases — Atom feed](https://github.com/magicnight/mac-mlx/releases.atom)
- [Issues](https://github.com/magicnight/mac-mlx/issues)
- [Discussions](https://github.com/magicnight/mac-mlx/discussions)
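## Appendix: talking to the API

The streaming mode of `POST /v1/chat/completions` described above uses the standard OpenAI server-sent-events framing (`data: {…}` chunks terminated by `data: [DONE]`), so any generic client code applies. The sketch below is illustrative, not part of macMLX: the function names `build_chat_request` and `iter_sse_content` are made up for this example, it uses only the Python standard library, and it assumes macMLX follows the stock OpenAI streaming schema (which is what OpenAI-compatible clients such as Cursor or Aider expect).

```python
import json

def build_chat_request(model: str, prompt: str, stream: bool = True) -> dict:
    """Request body for POST /v1/chat/completions (OpenAI schema).

    `model` can be any locally-downloaded model ID; macMLX cold-swaps
    it in on demand (v0.3.3+).
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def iter_sse_content(lines):
    """Yield assistant-text deltas from OpenAI-style SSE lines.

    Content chunks look like:
        data: {"choices":[{"delta":{"content":"…"}}]}
    Role-only deltas are skipped; `data: [DONE]` ends the stream.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore comments, event names, keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

if __name__ == "__main__":
    # Canned transcript standing in for a live SSE response body.
    sample = [
        'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        'data: {"choices":[{"delta":{"content":"Hello"}}]}',
        'data: {"choices":[{"delta":{"content":", world"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_sse_content(sample)))  # Hello, world
```

In real use you would POST `json.dumps(build_chat_request(...))` to `http://localhost:8000/v1/chat/completions` with any HTTP client, send `Authorization: Bearer <anything>` (the API key only needs to be non-empty), and feed the response body lines to `iter_sse_content`.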