Using OpenAI chat completion with Gemini and Together
Over the past few years, OpenAI has transformed how industries leverage generative AI, starting with the groundbreaking release of ChatGPT. This tool marked a turning point, sparking widespread adoption of AI technologies. Following ChatGPT, OpenAI introduced a range of APIs accessible in popular programming languages like Python and JavaScript. One of the most prominent ways to engage with OpenAI models, such as GPT-4o mini, GPT-4 Turbo, and GPT-4o, is through Chat Completion.
Recently, this ecosystem has expanded: providers such as Together.AI and Google now expose OpenAI-compatible endpoints, so developers can reach open-source models and Google’s Gemini models through the same Chat Completion API. This approach offers cost-effective alternatives to OpenAI’s proprietary models while maintaining flexibility and performance.
Let’s dive deeper into how this new integration can drive innovation using Together and Gemini.
Getting Started
Table of contents
- What is OpenAI Chat Completion?
- What is Together.AI?
- Accessing Meta Llama models via OpenAI’s Chat Completion
- Experimenting with Together
- 1. Using together library
- 2. Using OpenAI chat completion
- Accessing Google Gemini via OpenAI’s Chat Completion
- Using Gemini with OpenAI
- Summary
What is OpenAI Chat Completion?
OpenAI Chat Completion is a feature of OpenAI’s API, particularly associated with the GPT models like GPT-3.5 and GPT-4. It allows developers to interact with powerful language models to generate human-like text based on user prompts. This feature is used to build conversational AI applications, such as chatbots, virtual assistants, customer support systems, and more.
Key Features of Chat Completion:
- Natural Language Understanding: The model can interpret and respond to prompts in natural language, making interactions feel human-like.
- Context-Aware Responses: It can maintain context over a conversation, making it suitable for multi-turn dialogues.
- Customizability: Users can set system messages or modify prompts to guide the tone, personality, or specific instructions for the model.
- Versatility: It can handle a variety of tasks, such as answering questions, providing recommendations, generating content, and more.
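To make these features concrete, here is a minimal chat completion call against OpenAI’s own endpoint. This is a sketch, assuming the `openai` Python library is installed and an `OPENAI_API_KEY` environment variable is set; the rest of this article changes only the endpoint, the key, and the model name:

```python
import os

from openai import OpenAI

# Standard OpenAI client; reads the key from the environment
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat-capable OpenAI model works here
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are some fun things to do in New York?"},
    ],
)

print(response.choices[0].message.content)
```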
What is Together.AI?
Together.AI is an open platform for developing, deploying, and managing AI models. It focuses on democratizing access to large language models and AI infrastructure, aiming to make advanced AI capabilities available to everyone.
Key Features of Together.AI:
- Collaborative AI Development: It provides a platform for researchers, developers, and organizations to collaborate on building AI models and applications.
- Decentralized AI Infrastructure: Together.AI uses a decentralized infrastructure, which allows for distributing AI workloads across multiple nodes. This can reduce costs and increase accessibility to powerful AI resources.
- Open-Source Models: The platform supports open-source models, encouraging transparency, collaboration, and innovation in AI development.
- Interoperability: Together.AI emphasizes interoperability, meaning it can work seamlessly with other AI platforms and frameworks.
Why is Together.AI Important?
Together.AI aims to democratize access to advanced AI by providing a decentralized, open platform that reduces reliance on large AI providers. This decentralized infrastructure not only cuts down the high costs associated with training and deploying large models, making AI more affordable for smaller entities, but also encourages collaboration among researchers and developers. By promoting open-source models and transparency, Together.AI aligns with ethical AI development principles, fostering responsible and inclusive innovation within the AI community.
Accessing Meta Llama models via OpenAI’s Chat Completion
Using OpenAI’s chat completion API offers a simple way to access proprietary models like GPT-4. However, running models such as Llama 3.2 with vision capabilities locally requires high-end hardware, which can be costly. If you’ve developed solutions using OpenAI’s ecosystem, transitioning to open-source models like Llama to reduce costs often involves significant codebase modifications. Together.AI addresses this challenge by allowing you to use open-source models through OpenAI’s chat completion interface, enabling a smoother migration with minimal changes while maintaining cost efficiency.
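Conceptually, the migration is tiny. Here is a minimal sketch of the only lines that change when pointing an existing OpenAI client at Together (the placeholder key is not a real credential; full working code follows later in this article):

```python
from openai import OpenAI

# Before: the default OpenAI endpoint
# client = OpenAI(api_key="<your OpenAI key>")

# After: the same client, pointed at Together's OpenAI-compatible endpoint
client = OpenAI(
    api_key="<your Together key>",
    base_url="https://api.together.xyz/v1",
)
```

Everything downstream of the client, prompts, message handling, and response parsing, stays the same; only the model name needs to change to one hosted by Together. The steps below walk through this in practice.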
- Visit Together.AI and sign in to your account or create a new one if needed.
- Go to the Models section to explore different open-source models.
- Navigate to the Dashboard to collect your API key.
Installing dependencies
- Create and activate a virtual environment by executing the following commands.

```bash
python -m venv venv
source venv/bin/activate   # for Linux/macOS
venv\Scripts\activate      # for Windows
```
- Install the `together`, `openai`, and `python-dotenv` libraries using pip.

```bash
pip install together openai python-dotenv
```
Experimenting with Together
1. Using together library
Together’s Python library provides access to the open-source models hosted on their platform.
- Create a `.env` file and add your Together API key to it as follows:

```
TOGETHER_API_KEY=05e1a66f6f0fad...
```
- Use the code below to access the Meta Vision model.
```python
# Load environment variables from the .env file
import os

from dotenv import load_dotenv
from together import Together

load_dotenv()

# Initialise the Together client with the key read from .env
client = Together(api_key=os.getenv('TOGETHER_API_KEY'))

response = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?"
        }
    ]
)

print(response.choices[0].message.content)
```
Running the script prints the model’s response, a list of suggested activities in New York.
2. Using OpenAI chat completion
The OpenAI chat completion client also works with Together-hosted models. By pointing the client at Together’s endpoint, users can work with a range of open-source models, not just GPT models, through the same OpenAI chat completion interface.
- Use the following code to access the Meta Vision model through OpenAI chat completion.
```python
# Load environment variables from the .env file
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

# Point the OpenAI client at Together's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.getenv('TOGETHER_API_KEY'),
    base_url="https://api.together.xyz/v1"
)

response = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "What are some fun things to do in New York?"
        }
    ]
)

print(response.choices[0].message.content)
```
Running this script prints a similar response from the same model.
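Because the client sends the full `messages` list on every call, the multi-turn, context-aware behaviour described earlier works the same way against Together’s endpoint. A short sketch, reusing the client from the previous snippet (the follow-up question is a hypothetical example):

```python
# Build up the conversation turn by turn
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What are some fun things to do in New York?"},
]

first = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",
    messages=messages,
)

# Append the assistant's reply, then ask a follow-up that depends on it
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "Which of those are free to do?"})

second = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",
    messages=messages,
)

print(second.choices[0].message.content)
```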
Accessing Google Gemini via OpenAI’s Chat Completion
As of November 2024, Google’s Gemini models are now accessible using the OpenAI Chat Completion API. This integration allows developers to utilize Gemini’s capabilities directly through OpenAI’s library, simplifying access to both chat and embeddings functionalities. Initial support includes models like Gemini 1.5 Flash. Users can start experimenting with it via Python, JavaScript, or REST API calls.
Why is Access to Gemini via OpenAI Chat Completion Important?
The integration of Google Gemini models with OpenAI’s Chat Completion API is significant for several reasons:
- Unified Access: Developers can now access both OpenAI’s GPT models and Google’s Gemini models through a single API, streamlining workflows.
- Enhanced Capabilities: Gemini models offer unique strengths in areas like multilingual support and contextual understanding, complementing OpenAI’s offerings.
- Flexibility: With support for Python, JavaScript, and REST APIs, this integration broadens the flexibility for developers to build advanced AI applications.
To get started, check out the full documentation on the Google Developers Blog.
Using Gemini with OpenAI
Similar to Together, Gemini models can be used with OpenAI chat completion by specifying the `base_url` and the `api_key`.
- Visit Google AI Studio.
- Click on Get API Key → Create API key to generate a Gemini API key, and add it to your `.env` file as `GEMINI_API_KEY`.
- Use the following code to access the Gemini Flash model through OpenAI chat completion.
```python
# Load environment variables from the .env file
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

# Point the OpenAI client at Google's OpenAI-compatible Gemini endpoint
client = OpenAI(
    api_key=os.getenv('GEMINI_API_KEY'),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-1.5-flash",
    n=1,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Explain to me how AI works"
        }
    ]
)

print(response.choices[0].message.content)
```
Running the script prints Gemini’s explanation of how AI works.
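As noted above, the integration covers embeddings as well as chat. A minimal sketch, reusing the Gemini-backed client from the previous snippet (the model name `text-embedding-004` is the one Google’s compatibility documentation lists at the time of writing; check the current docs if it has changed):

```python
# Generate an embedding through the same OpenAI-compatible endpoint
embedding_response = client.embeddings.create(
    model="text-embedding-004",
    input="What are some fun things to do in New York?",
)

# One vector per input string
print(len(embedding_response.data[0].embedding))
```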
Summary
Thanks for reading this article!!
Thanks Gowri M Bhatt for reviewing the content.
If you enjoyed this article, please click on the clap button 👏 and share to help others find it!
The full source code for this tutorial can be found here,