# Shimmy Cloud Deployment
One-click deployment configurations for popular cloud platforms.
## Quick Deploy Buttons
### Railway

[Deploy on Railway button]

### Render

[Deploy to Render button]

### Fly.io
```bash
# Install flyctl and deploy
curl -L https://fly.io/install.sh | sh
fly deploy
```
### Docker (Any Platform)
```bash
# Local development
docker-compose up

# Production with Nginx
docker-compose --profile production up
```
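The commands above assume the repository's `docker-compose.yml`. As a rough sketch of what such a file contains (the service name, port, and restart policy here are illustrative assumptions, not the repository's actual contents):

```yaml
services:
  shimmy:
    build: .                 # build from the repository's Dockerfile
    ports:
      - "11434:11434"        # expose Shimmy's default port
    environment:
      - RUST_LOG=info
    restart: unless-stopped
```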
## Platform-Specific Instructions
### Railway.app
- Click the "Deploy on Railway" button above
- Connect your GitHub account
- Fork this repository
- Railway will automatically build and deploy
- Your Shimmy instance will be available at `https://your-app.railway.app`
### Render.com
- Click the "Deploy to Render" button above
- Connect your GitHub repository
- Render will use the `render.yaml` configuration
- Your service will be available with automatic HTTPS
### Fly.io
- Install the Fly CLI: `curl -L https://fly.io/install.sh | sh`
- Clone this repository: `git clone https://github.com/Michael-A-Kuykendall/shimmy.git`
- Navigate to the project: `cd shimmy`
- Create and deploy: `fly deploy`
- Access your app: `fly open`
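`fly deploy` reads a `fly.toml` in the project root. A minimal sketch of one (the app name and port settings below are assumptions, not the repository's actual file):

```toml
app = "shimmy"

[http_service]
  internal_port = 11434
  force_https = true
```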
### Google Cloud Run
```bash
# Build and deploy to Cloud Run
gcloud builds submit --tag gcr.io/PROJECT-ID/shimmy
gcloud run deploy --image gcr.io/PROJECT-ID/shimmy --platform managed
```
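To pin the service name, region, port, and environment variables up front instead of answering interactive prompts, the deploy command can be extended. The service name and region here are placeholders:

```bash
gcloud run deploy shimmy \
  --image gcr.io/PROJECT-ID/shimmy \
  --platform managed \
  --region us-central1 \
  --port 11434 \
  --set-env-vars RUST_LOG=info \
  --allow-unauthenticated
```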
### AWS App Runner
- Create `apprunner.yaml` in your repository root:

```yaml
version: 1.0
runtime: docker
build:
  commands:
    build:
      - echo "Building Shimmy with Docker"
run:
  runtime-version: latest
  command: shimmy serve --bind 0.0.0.0:8080
  network:
    port: 8080
```
### DigitalOcean App Platform
- Create app via DigitalOcean control panel
- Connect your GitHub repository
- DigitalOcean will detect the Dockerfile automatically
- Set environment variables as needed
## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `PORT` | `11434` | Port to bind the server |
| `RUST_LOG` | `info` | Log level (`error`, `warn`, `info`, `debug`, `trace`) |
| `SHIMMY_BIND` | `0.0.0.0:11434` | Full bind address |
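A container entrypoint can apply these defaults before launching the server. This is a sketch using standard POSIX parameter expansion, not Shimmy's actual entrypoint; the launch line is shown only for illustration:

```shell
#!/bin/sh
# Fall back to the documented defaults when a variable is unset.
PORT="${PORT:-11434}"
RUST_LOG="${RUST_LOG:-info}"
SHIMMY_BIND="${SHIMMY_BIND:-0.0.0.0:${PORT}}"

echo "bind=${SHIMMY_BIND} log=${RUST_LOG}"   # e.g. bind=0.0.0.0:11434 log=info
# exec shimmy serve --bind "$SHIMMY_BIND"    # illustrative launch line
```

Because `SHIMMY_BIND` is derived from `PORT`, overriding either variable keeps the two consistent.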
## Resource Requirements
### Minimal
- CPU: 0.5 vCPU
- Memory: 512MB RAM
- Storage: 100MB (binary only)
### Recommended
- CPU: 1 vCPU
- Memory: 1GB RAM
- Storage: 1GB+ (for model caching)
### High Performance
- CPU: 2+ vCPU
- Memory: 4GB+ RAM
- Storage: 10GB+ SSD
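With Docker Compose, these tiers can be enforced as resource limits. A sketch for the recommended tier (the service name is an assumption):

```yaml
services:
  shimmy:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
```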
## Security Considerations
- **Authentication**: Shimmy doesn't include built-in authentication. Use a reverse proxy (Nginx, Cloudflare) for auth.
- **Rate Limiting**: The included Nginx configuration has basic rate limiting. Adjust as needed.
- **HTTPS**: Most cloud platforms provide automatic HTTPS. For self-hosted deployments, configure SSL certificates.
- **Firewall**: Only expose port 11434 (or your configured port) to the public internet.
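A reverse-proxy layer covering the first two points might look like the following Nginx sketch. The zone name, rate, and hostname are illustrative, not the repository's included configuration:

```nginx
# Rate limiting: 10 requests/second per client IP, with a small burst allowance.
limit_req_zone $binary_remote_addr zone=shimmy:10m rate=10r/s;

server {
    listen 443 ssl;
    server_name shimmy.example.com;

    # Basic auth in front of Shimmy (htpasswd file managed separately).
    auth_basic "Shimmy";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        limit_req zone=shimmy burst=20;
        proxy_pass http://127.0.0.1:11434;
    }
}
```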
## Monitoring
### Health Checks
All configurations include health checks at the `/health` endpoint.
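For a custom container image, the same endpoint can back a Docker health check. A sketch, assuming the default port and that `curl` is available in the image:

```dockerfile
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -fsS http://localhost:11434/health || exit 1
```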
### Logs
Set `RUST_LOG=debug` for detailed logging. Most platforms provide log aggregation.
### Metrics
For production deployments, consider adding:
- Prometheus metrics
- Jaeger tracing
- Custom monitoring dashboards
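If you add a Prometheus exporter in front of or inside Shimmy, a scrape job could be configured like this. The target and metrics path are hypothetical; the text above suggests metrics as an addition, so Shimmy does not necessarily expose `/metrics` out of the box:

```yaml
scrape_configs:
  - job_name: "shimmy"
    metrics_path: /metrics        # hypothetical endpoint
    static_configs:
      - targets: ["shimmy:11434"]
```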
## Scaling
### Horizontal Scaling
Shimmy is stateless and can be horizontally scaled. Use a load balancer to distribute requests.
### Vertical Scaling
For better performance with large models:
- Increase memory for model caching
- Add more CPU cores for parallel processing
- Use SSD storage for faster model loading
## Troubleshooting
### Common Issues
- **Out of Memory**: Increase memory allocation or use memory-mapped loading
- **Slow Startup**: Enable model caching and use persistent storage
- **Connection Timeout**: Increase proxy timeout settings for large model inference
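For the timeout issue, raising the proxy read/send timeouts in Nginx usually helps with long-running inference; the values here are illustrative:

```nginx
location / {
    proxy_pass http://127.0.0.1:11434;
    # Allow long-running generations before the proxy gives up.
    proxy_read_timeout 300s;
    proxy_send_timeout 300s;
}
```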
### Debug Mode
```bash
# Enable debug logging
RUST_LOG=debug shimmy serve
```
### Container Debugging
```bash
# Access running container
docker exec -it shimmy-container /bin/bash

# Check logs
docker logs shimmy-container
```