Open-source AI platform for building and deploying machine learning models
Together AI is a cloud platform for training, fine-tuning, and deploying open-source AI models.
AI Editor Approved: approved and published by our AI Editor-in-Chief after full panel analysis.
Together AI provides infrastructure and tools for developers to work with open-source AI models. The platform offers model training, fine-tuning capabilities, and API access for deployment.
Fine-tunes open-source models for production workloads using the latest research techniques to improve accuracy, reduce hallucinations, and control behavior without managing training infrastructure.
Provides a collection of GPU kernels (the Together Kernel Collection) that enables up to 90% faster pre-training and optimized performance across compute workloads.
Scales from self-serve instant clusters to thousands of GPUs, optimized for better performance using the Together Kernel Collection.
Processes massive workloads asynchronously at scale up to 30 billion tokens per model with any serverless model or private deployment.
Provides GPU infrastructure purpose-built for generative media workloads, supporting video, audio, and image model deployment with performance acceleration.
Deploys models on dedicated infrastructure purpose-built for teams that need speed, control, and optimized economics.
Offers high-performance managed object storage and parallel filesystems optimized for AI-native workloads with zero egress fees.
Provides fast, secure code sandboxes at scale for setting up full-scale development environments for AI apps and agents.
Runs open-source models on demand with no infrastructure to manage and no long-term commitments, powered by cutting-edge inference research.
Applies workload-specific optimizations to reduce infrastructure costs by up to 60% compared to standard deployments.
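As a hedged illustration of the pay-per-token serverless flow described above: Together's serverless inference is exposed over an OpenAI-compatible HTTP API. The endpoint URL and model name in this sketch are assumptions for illustration; consult the provider's API documentation for current values. The snippet only constructs the request, so it runs without credentials or a network call.

```python
import json
import os

# Assumed OpenAI-compatible chat-completions endpoint; verify against the docs.
API_URL = "https://api.together.xyz/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for a pay-per-token chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Model identifier is a placeholder; any hosted serverless model works here.
payload = build_chat_request("meta-llama/Llama-3-8b-chat-hf", "Hello!")
headers = {
    "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
    "Content-Type": "application/json",
}
body = json.dumps(payload)
```

Sending `body` to `API_URL` with `headers` (e.g. via `requests.post`) would complete the call; billing is per token processed, so `max_tokens` caps the output-side cost of a single request.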
Pay-per-token API access to hosted models. Most teams start here.
Single-tenant GPU instances for teams needing guaranteed performance and custom models.
Pay-as-you-go GPU cluster capacity billed hourly.
Reserved GPU cluster capacity for 6+ days with discounted rates.
VM sandboxes and secure code interpreter for LLM-generated code execution.
Train open-source models up to 100B parameters using LoRA or full fine-tuning.
Fine-tuning for large specialized models like DeepSeek, Llama 4, Qwen3, Kimi K2, and others.
Common questions answered by our AI research team
For models up to 16B parameters, Supervised Fine-Tuning (SFT) costs $0.48 per 1M tokens with LoRA versus $0.54 per 1M tokens with full fine-tuning, and Direct Preference Optimization (DPO) costs $1.20 per 1M tokens with LoRA versus $1.35 per 1M tokens with full fine-tuning. The standard pricing table for models up to 16B does not list a minimum charge; minimum charges appear only in the Specialized pricing section for specific models.
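The per-token rates above make cost estimation a simple multiplication. A minimal sketch, using only the quoted rates for models up to 16B parameters (rates may change; check the current pricing page before budgeting):

```python
# USD per 1M training tokens, per the quoted pricing for models up to 16B.
RATES_PER_M_TOKENS = {
    ("sft", "lora"): 0.48,
    ("sft", "full"): 0.54,
    ("dpo", "lora"): 1.20,
    ("dpo", "full"): 1.35,
}

def fine_tune_cost(method: str, variant: str, tokens: int) -> float:
    """Estimated cost in USD for a training run over `tokens` tokens."""
    rate = RATES_PER_M_TOKENS[(method, variant)]
    return rate * tokens / 1_000_000

# Example: 50M tokens of supervised fine-tuning.
lora_cost = fine_tune_cost("sft", "lora", 50_000_000)  # 50 * $0.48 = $24.00
full_cost = fine_tune_cost("sft", "full", 50_000_000)  # 50 * $0.54 = $27.00
```

At this scale the LoRA/full gap for SFT is small in absolute terms ($3 on a 50M-token run), while the DPO rates are roughly 2.5x the SFT rates per token.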
Yes, Dedicated Container Inference is described as 'GPU infrastructure purpose-built for generative media workloads' that supports deploying 'video, audio, and image models with performance acceleration powered by Together Research.' However, the pricing page only lists hourly hardware options under Dedicated Inference (not Dedicated Container Inference specifically): 1x H100 80GB at $3.99/hr, 1x H200 141GB at $5.49/hr, and 1x B200 180GB at $9.95/hr.
The content describes Code Sandboxes as 'fast, secure code sandboxes' but does not specify single-tenant isolation. Pricing is structured as $0.0446 per vCPU/hour and $0.0149 per GiB RAM/hour for compute costs, plus a Code Interpreter option priced at $0.03 per 60-minute session.
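The sandbox pricing above is also easy to model: compute cost is linear in vCPUs, RAM, and hours, with interpreter sessions billed per 60-minute session. A sketch using only the quoted rates (verify current rates before relying on them):

```python
# Quoted sandbox rates (USD): per vCPU-hour, per GiB-RAM-hour, per session.
VCPU_HOUR = 0.0446
GIB_HOUR = 0.0149
INTERPRETER_SESSION = 0.03

def sandbox_cost(vcpus: int, gib_ram: int, hours: float, sessions: int = 0) -> float:
    """Estimated sandbox cost in USD for the given shape and runtime."""
    compute = (vcpus * VCPU_HOUR + gib_ram * GIB_HOUR) * hours
    return compute + sessions * INTERPRETER_SESSION

# Example: a 2 vCPU / 4 GiB sandbox for 3 hours plus one interpreter session.
cost = sandbox_cost(2, 4, 3.0, sessions=1)  # (0.0892 + 0.0596) * 3 + 0.03
```

That example works out to roughly $0.48, which suggests short-lived sandboxes for agent code execution cost cents rather than dollars per run.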
Serverless Inference is described as 'the fastest way to run open-source models on demand' with 'no infrastructure to manage, no long-term commitments.' You can get started immediately through the platform without any setup or commitment requirements.
Yes, Batch Inference explicitly supports 'any serverless model or private deployment' and can 'scale to 30 billion tokens per model.'
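Batch inference at that scale is typically driven by an uploaded file of one request per line. The JSONL schema in this sketch (`custom_id` plus a `body`) is an assumption modeled on OpenAI-style batch files, not a confirmed Together format; the provider's batch API documentation defines the actual schema. The snippet only serializes the file contents locally.

```python
import json

def build_batch_lines(model: str, prompts: list[str]) -> list[str]:
    """Serialize one JSONL line per prompt for asynchronous batch processing."""
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            # custom_id lets results be matched back to inputs after the
            # asynchronous run completes (assumed field name).
            "custom_id": f"req-{i}",
            "body": {
                "model": model,  # placeholder serverless model identifier
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(record))
    return lines

lines = build_batch_lines("meta-llama/Llama-3-8b-chat-hf", ["hi", "bye"])
```

Writing `"\n".join(lines)` to a file produces the batch input; the 30-billion-token-per-model ceiling quoted above applies to the aggregate tokens processed, not to any single request.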
Company: Together AI
Founded: 2022
Pricing: Usage-based from $0.03
Free Trial: Available
Free Plan: Available
Build what's next on the AI Native Cloud. Full-stack AI platform for inference, fine-tuning, and GPU clusters, powered by cutting-edge research.