Skip to main content
AI Engineer World's Fair (2024)

Deep Dive: Making Open Models 10x faster and better for Modern Application Innovation


Head back to all of our AI Engineer World's Fair recaps

Lin Qiao @lqiao / Fireworks
Dmytro (Dima) Dzhulgakov @dzhulgakov / Fireworks
Watch it on YouTube | AI.Engineer Talk Details

Dima from Fireworks AI, introduced their focus on productionizing and customizing open-source models for inference. The founding team comes from PyTorch at Meta and Google AI, with extensive experience in AI productionization.

Open Source Models vs Proprietary Models


The speaker discussed the trade-offs between large proprietary models and smaller open-source models:

  • Proprietary models: Good for many tasks but often overkill for specific use cases
  • Open-source models: More customizable, potentially faster, and more cost-effective for narrow domains

Challenges with Open Source Models

  • Complicated setup and maintenance
  • Optimization for specific use cases
  • Production readiness (scalability, telemetry, observability)

Fireworks AI's Approach


  • Custom serving stack optimized for efficiency
  • Support for various modalities (LLMs, image generation)
  • Focus on customization for specific use cases
  • Fine-tuning capabilities, including LoRA deployment


Compound AI Systems


The speaker introduced the concept of "compound AI systems," combining:

  • Language models
  • RAG (Retrieval-Augmented Generation)
  • Function calling
  • External tools and APIs

FireFunction & Function Calling Demo


This was one of the most interesting parts of the talk. The speaker demonstrated a model fine-tuned specifically for function calling:

  • Combines free-form chat with API connections
  • Can perform multi-step reasoning
  • Maintains contextual awareness



  • Asked to generate a bar chart of top cloud providers' stock prices
  • Model identified providers, queried stock prices, and generated the chart
  • Demonstrated ability to modify the chart based on follow-up questions

The demo app is open-source and available on GitHub, using their FireFunction model (available on HuggingFace).


Overall, this talk provided a good overview of Fireworks AI's approach to open-source model deployment and customization.
Their focus on efficiency and function calling capabilities seems particularly noteworthy.