# Quick Start: Shimmy in 30 Seconds

## 1. Download

```bash
# Linux/macOS
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy -o shimmy
chmod +x shimmy

# Windows
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy.exe -o shimmy.exe
```

## 2. Get a Model

Point Shimmy at any `.gguf` file in one of these ways:

- Drop it in `./models/your-model.gguf`
- Set `SHIMMY_BASE_GGUF=/path/to/your-model.gguf`
- Or just put it in `~/Downloads/` and Shimmy will find it

Don't have a model? Try `microsoft/Phi-3-mini-4k-instruct-gguf` on Hugging Face.
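
Not sure how to fetch that file? Here is a minimal sketch, assuming the quantized GGUF filename published in that Hugging Face repo (check the repo's file list for the exact name):

```bash
# Fetch the suggested Phi-3 GGUF into ./models (the exact filename in the
# Hugging Face repo is an assumption - verify it on the repo's file list)
mkdir -p ./models
curl -L "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf" \
  -o ./models/Phi-3-mini-4k-instruct-q4.gguf

# Or keep the model wherever it already lives and point Shimmy at it
export SHIMMY_BASE_GGUF=/path/to/your-model.gguf
```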

## 3. Start Shimmy

```bash
./shimmy serve
```

That's it! Shimmy is now running on http://localhost:11435.
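
To confirm the server is up before wiring in your tools, a quick check like the one below should work; note that the `/v1/models` route is an assumption based on the OpenAI-compatible endpoints used elsewhere in this guide (you can also just run `./shimmy list`):

```bash
# Assumes Shimmy exposes the standard OpenAI-compatible model listing route
curl http://localhost:11435/v1/models
```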

## 4. Connect Your Tools

**VSCode Copilot:**

```jsonc
// settings.json
{
  "github.copilot.advanced": {
    "serverUrl": "http://localhost:11435"
  }
}
```

**Continue.dev:**

```json
{
  "models": [{
    "title": "Local Shimmy",
    "provider": "openai",
    "model": "your-model-name",
    "apiBase": "http://localhost:11435/v1"
  }]
}
```

**Cursor:** Set the custom endpoint to http://localhost:11435

## 5. Test It

```bash
# List available models
./shimmy list

# Test generation
./shimmy generate --name your-model --prompt "Hello!" --max-tokens 10

# Or use curl
curl -X POST http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 10
  }'
```
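
If you'd rather see tokens as they arrive, the OpenAI-style chat endpoint conventionally accepts a `stream` flag; treat Shimmy's support for it as an assumption to verify:

```bash
# Streaming variant of the request above ("stream": true follows OpenAI API
# convention; whether Shimmy streams responses this way is an assumption)
curl -N -X POST http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 10,
    "stream": true
  }'
```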

## Troubleshooting

**No models found?**

- Make sure your `.gguf` file is in `./models/` or set `SHIMMY_BASE_GGUF`
- Run `./shimmy discover` to see what Shimmy can find

**Port already in use?**

```bash
./shimmy serve --bind 127.0.0.1:11436
```
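
To see what is already holding the default port before picking another, the usual OS tools apply (nothing Shimmy-specific):

```bash
# Linux/macOS: show the process bound to port 11435
lsof -i :11435

# Windows (PowerShell or cmd)
netstat -ano | findstr 11435
```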

**Need help?** Open an issue on the Shimmy GitHub repository.

**Next:** Check out the integrations docs for more examples!