Deep Dive: Building SOTA Open Weights Tool Use: The Command R Family
← Head back to all of our AI Engineer World's Fair recaps
Sandra Kublik @itsSandraKublik / Cohere
Watch it on YouTube | AI.Engineer Talk Details
Sandra from Cohere gave a talk about their recent work on language models, focusing on their Command R family of models. She highlighted their progress in developing models optimized for retrieval-augmented generation (RAG) and tool use.
Key Points
- Cohere released Command R and Command R Plus models in March 2024
- These models excel at structured reasoning and are competitive with larger models like GPT-4 Turbo
- Command R Plus quickly gained popularity in the open-source community
- Cohere focused on optimizing for citations and reducing hallucinations
- They open-sourced their UI toolkit for RAG applications in April 2024
- Recent work has centered on improving tool use capabilities, especially for enterprise contexts
Technical Details
Model Design Decisions
- Optimized for retrieval-augmented generation (RAG) and tool use
- Addressed challenges like prompt sensitivity and overcoming model bias towards focusing on the beginning of documents
- Improved ability to balance pre-trained knowledge with new information from prompts
- Enhanced citation capabilities for better transparency and reduced hallucinations
Tool Use Capabilities
- Developed both single-step and multi-step tool use functionality
- Multi-step capability allows for sequential reasoning and error correction
- Released a multi-step API that allows users to describe available tools and parameters
- Model creates plans and adapts them based on tool outputs
Performance and Efficiency
- Command R Plus is competitive with GPT-4 Turbo on complex reasoning benchmarks
3-5 times cheaper to run than comparable models, improving scalability for production use
Demos
Sandra demonstrated two applications built with Cohere's models:
Complexity AI (cplxai): A generative search engine that provides answers grounded in multiple sources with clickable citations.
Internal RAG demo: Showed multi-step reasoning capabilities by asking the model to find the three largest companies by market cap, get employee counts, create a graph, and draft a tweet. The demo displayed each step of the model's thought process.
As well as usage with their Discord bot.
Resources
- Toolkit repo: Open-source UI components for building RAG applications
- Command R and Command R Plus models