Deep Dive: Making Open Models 10x faster and better for Modern Application Innovation
← Head back to all of our AI Engineer World's Fair recaps
Lin Qiao @lqiao / Fireworks
Dmytro (Dima) Dzhulgakov @dzhulgakov / Fireworks
Watch it on YouTube | AI.Engineer Talk Details
Dima from Fireworks AI, introduced their focus on productionizing and customizing open-source models for inference. The founding team comes from PyTorch at Meta and Google AI, with extensive experience in AI productionization.
Open Source Models vs Proprietary Models
The speaker discussed the trade-offs between large proprietary models and smaller open-source models:
- Proprietary models: Good for many tasks but often overkill for specific use cases
- Open-source models: More customizable, potentially faster, and more cost-effective for narrow domains
Challenges with Open Source Models
- Complicated setup and maintenance
- Optimization for specific use cases
- Production readiness (scalability, telemetry, observability)
Fireworks AI's Approach
- Custom serving stack optimized for efficiency
- Support for various modalities (LLMs, image generation)
- Focus on customization for specific use cases
- Fine-tuning capabilities, including LoRA deployment
Compound AI Systems
The speaker introduced the concept of "compound AI systems," combining:
- Language models
- RAG (Retrieval-Augmented Generation)
- Function calling
- External tools and APIs
FireFunction & Function Calling Demo
This was one of the most interesting parts of the talk. The speaker demonstrated a model fine-tuned specifically for function calling:
- Combines free-form chat with API connections
- Can perform multi-step reasoning
- Maintains contextual awareness
Demo
- Asked to generate a bar chart of top cloud providers' stock prices
- Model identified providers, queried stock prices, and generated the chart
- Demonstrated ability to modify the chart based on follow-up questions
The demo app is open-source and available on GitHub, using their FireFunction model (available on HuggingFace).
Summary
Overall, this talk provided a good overview of Fireworks AI's approach to open-source model deployment and customization.
Their focus on efficiency and function calling capabilities seems particularly noteworthy.