AI Engineer World's Fair (2024)

Deep Dive: Making Open Models 10x faster and better for Modern Application Innovation

Lin Qiao @lqiao / Fireworks
Dmytro (Dima) Dzhulgakov @dzhulgakov / Fireworks

Dima from Fireworks AI introduced the company's focus on productionizing and customizing open-source models for inference. The founding team comes from PyTorch at Meta and from Google AI, and has extensive experience putting AI into production.

Open Source Models vs Proprietary Models

The speaker discussed the trade-offs between large proprietary models and smaller open-source models:

  • Proprietary models: Good for many tasks but often overkill for specific use cases
  • Open-source models: More customizable, potentially faster, and more cost-effective for narrow domains

Challenges with Open Source Models

  • Complicated setup and maintenance
  • Optimization for specific use cases
  • Production readiness (scalability, telemetry, observability)

Fireworks AI's Approach

  • Custom serving stack optimized for efficiency
  • Support for various modalities (LLMs, image generation)
  • Focus on customization for specific use cases
  • Fine-tuning capabilities, including LoRA deployment

Compound AI Systems

The speaker introduced the concept of "compound AI systems," combining:

  • Language models
  • RAG (Retrieval-Augmented Generation)
  • Function calling
  • External tools and APIs
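
To make the idea concrete, here is a minimal sketch of the retrieval-plus-model part of such a system. Everything in it is an assumption made for illustration: the document list, the word-overlap `retrieve` function, and `fake_model` (a stand-in for a real language model call) are hypothetical and do not come from the talk or from any Fireworks API.

```python
from typing import List

# Toy document store standing in for a real retrieval index (hypothetical content).
DOCUMENTS: List[str] = [
    "LoRA adapters are small fine-tuned weight deltas.",
    "Function calling lets a model request an external API call.",
    "RAG adds retrieved passages to the prompt.",
]


def retrieve(query: str, k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the query (toy retrieval)."""
    words = set(query.lower().split())
    ranked = sorted(
        DOCUMENTS,
        key=lambda d: len(words & set(d.lower().rstrip(".").split())),
        reverse=True,
    )
    return ranked[:k]


def fake_model(prompt: str) -> str:
    """Stand-in for a language model call; it only reports how much context it saw."""
    context_lines = [line for line in prompt.splitlines() if line.startswith("CONTEXT: ")]
    return f"answered using {len(context_lines)} retrieved passage(s)"


def compound_answer(question: str, k: int = 1) -> str:
    """Compose the pieces: retrieve passages, build a prompt, call the model."""
    passages = retrieve(question, k)
    prompt = "\n".join([f"CONTEXT: {p}" for p in passages] + [f"QUESTION: {question}"])
    return fake_model(prompt)


if __name__ == "__main__":
    print(retrieve("what is rag"))
    print(compound_answer("what is rag", k=2))
```

The point of the sketch is only the composition: the model is one component among several, and the surrounding code decides what context it receives.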

FireFunction & Function Calling Demo

This was one of the most interesting parts of the talk. The speaker demonstrated a model fine-tuned specifically for function calling:

  • Combines free-form chat with API connections
  • Can perform multi-step reasoning
  • Maintains contextual awareness
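
The mix of free-form chat and API calls can be sketched as a small dispatcher. This is an illustrative assumption, not FireFunction's actual output format: `fake_model` is a hypothetical stand-in for the model, the JSON shape (`tool`, `arguments`) is invented for the example, and `get_stock_price` returns placeholder numbers rather than real market data.

```python
import json
from typing import Any, Dict


def get_stock_price(ticker: str) -> float:
    """Hypothetical tool: returns a placeholder price, not real market data."""
    placeholder_prices = {"AAA": 100.0, "BBB": 50.0}
    return placeholder_prices[ticker]


# Registry mapping tool names to callables plus the argument names they expect.
TOOLS: Dict[str, Dict[str, Any]] = {
    "get_stock_price": {"fn": get_stock_price, "params": ["ticker"]},
}


def fake_model(user_message: str) -> str:
    """Stand-in for a function-calling model.

    It returns either a JSON tool call or plain chat text, which are the
    two cases the dispatcher below has to handle.
    """
    if "price" in user_message.lower():
        return json.dumps({"tool": "get_stock_price", "arguments": {"ticker": "AAA"}})
    return "Hello! Ask me about a stock price."


def dispatch(model_output: str) -> str:
    """Run a tool if the model asked for one; otherwise pass the chat text through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # free-form chat, no tool requested
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return model_output
    spec = TOOLS[call["tool"]]
    args = call.get("arguments", {})
    missing = [p for p in spec["params"] if p not in args]
    if missing:
        return f"error: missing arguments {missing}"
    return str(spec["fn"](**args))


if __name__ == "__main__":
    print(dispatch(fake_model("What is the price of AAA?")))
    print(dispatch(fake_model("hi there")))
```

Checking the arguments before calling the tool is a design choice of this sketch: a model can emit a malformed call, and it is cheaper to reject it here than inside the tool.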

Demo

  • The model was asked to generate a bar chart of the top cloud providers' stock prices
  • It identified the providers, queried their stock prices, and generated the chart
  • It then modified the chart in response to follow-up questions
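
The shape of that flow (several tool steps, then a follow-up that reuses earlier results) can be sketched as below. This is not the demo app's code. The provider names and prices are invented placeholders, all three tools are hypothetical, and the plan that a real model would produce is hard-coded here to keep the example self-contained.

```python
from typing import Dict, List

# Placeholder data: provider names and prices are invented for illustration.
PLACEHOLDER_PRICES: Dict[str, float] = {"CloudA": 40.0, "CloudB": 20.0, "CloudC": 30.0}


def list_providers() -> List[str]:
    """Hypothetical tool, step 1: identify the providers."""
    return list(PLACEHOLDER_PRICES)


def get_price(provider: str) -> float:
    """Hypothetical tool, step 2: look up one provider's price."""
    return PLACEHOLDER_PRICES[provider]


def render_bar_chart(prices: Dict[str, float], scale: float = 10.0) -> str:
    """Hypothetical tool, step 3: draw a text bar chart, one '#' per `scale` units."""
    lines = [
        f"{name:<7}{'#' * int(price / scale)} {price:.0f}"
        for name, price in prices.items()
    ]
    return "\n".join(lines)


class ChartSession:
    """Keeps state between turns so a follow-up can modify the previous chart."""

    def __init__(self) -> None:
        self.prices: Dict[str, float] = {}

    def first_request(self) -> str:
        # Multi-step plan: identify providers, query each price, then chart.
        for provider in list_providers():
            self.prices[provider] = get_price(provider)
        return render_bar_chart(self.prices)

    def follow_up_sort_descending(self) -> str:
        # Reuses the prices gathered earlier instead of querying them again.
        ordered = dict(sorted(self.prices.items(), key=lambda kv: kv[1], reverse=True))
        return render_bar_chart(ordered)


if __name__ == "__main__":
    session = ChartSession()
    print(session.first_request())
    print(session.follow_up_sort_descending())
```

The session object is what stands in for contextual awareness here: the follow-up works only because the first turn's results were kept.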

The demo app is open source and available on GitHub. It uses Fireworks' FireFunction model, which is available on Hugging Face.

Summary

Overall, the talk gave a good overview of how Fireworks AI deploys and customizes open-source models.
Two points stood out: the efficiency focus of its custom serving stack and the function calling capabilities shown in the FireFunction demo.