Quick Start: Shimmy in 30 Seconds
1. Download
# Linux/macOS
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy -o shimmy
chmod +x shimmy
# Windows
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy.exe -o shimmy.exe
2. Get a Model
Place any .gguf file in one of these locations:
- ./models/your-model.gguf
- Set SHIMMY_BASE_GGUF=/path/to/your-model.gguf
- Or just put it in ~/Downloads/ and Shimmy will find it
Don't have a model? Try microsoft/Phi-3-mini-4k-instruct-gguf
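If you'd rather fetch that model from the terminal, here's a minimal sketch using the Hugging Face CLI. The quantized filename (Phi-3-mini-4k-instruct-q4.gguf) is an assumption; check the repo's file list for the exact name.
# Install the Hugging Face CLI, then pull a quantized build into ./models/
# (filename below is assumed; use whichever .gguf the repo actually ships)
pip install -U "huggingface_hub[cli]"
mkdir -p models
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-gguf \
  Phi-3-mini-4k-instruct-q4.gguf --local-dir ./models
# Alternatively, point Shimmy at a file anywhere on disk
export SHIMMY_BASE_GGUF=$HOME/models/Phi-3-mini-4k-instruct-q4.gguf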
3. Start Shimmy
./shimmy serve
That's it! Shimmy is now running on http://localhost:11435
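Before wiring up any tools, you can sanity-check that the server is answering. The /health path is an assumption here; if it isn't exposed, the ./shimmy list and curl calls in step 5 work just as well.
# Quick liveness check (assumes a /health endpoint)
curl http://localhost:11435/health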
4. Connect Your Tools
VSCode Copilot:
// settings.json
{
"github.copilot.advanced": {
"serverUrl": "http://localhost:11435"
}
}
Continue.dev:
{
"models": [{
"title": "Local Shimmy",
"provider": "openai",
"model": "your-model-name",
"apiBase": "http://localhost:11435/v1"
}]
}
Cursor:
Set custom endpoint to http://localhost:11435
5. Test It
# List available models
./shimmy list
# Test generation
./shimmy generate --name your-model --prompt "Hello!" --max-tokens 10
# Or use curl
curl -X POST http://localhost:11435/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "your-model",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 10
}'
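If your client expects token-by-token output, the standard OpenAI-style stream flag should work against the same endpoint; that it is honored is an assumption here, so fall back to the non-streaming call above if the response looks off.
# Streaming variant (assumes the OpenAI-style "stream" flag is supported)
curl -N -X POST http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'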
Troubleshooting
No models found?
- Make sure your .gguf file is in ./models/ or set SHIMMY_BASE_GGUF
- Run ./shimmy discover to see what Shimmy can find
Port already in use?
./shimmy serve --bind 127.0.0.1:11436
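To see what is already listening on 11435 before moving Shimmy to another port, standard OS tools are enough (nothing Shimmy-specific here):
# Linux/macOS: which process holds the port?
lsof -i :11435
# Windows
netstat -ano | findstr 11435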
Need help?
- Open an issue
- Check existing discussions
Next: Check out integrations for more examples!