
Deploy Mistral Large to Azure and create a conversation with Python and LangChain

Step-by-step guide to deploying Mistral Large to Azure


We’re Neon, and we’re redefining the database experience with our cloud-native serverless Postgres solution. If you’ve been looking for a database for your RAG apps that adapts to your application loads, you’re in the right place: Neon scales your AI apps to millions of users with pgvector. Learn more about Neon, give it a try, and let us know what you think. In this post, Raouf will tell you what you need to know about Mistral Large, the most advanced LLM by Mistral AI.

Mistral AI has recently unveiled its most advanced large language model (LLM) yet, Mistral Large, alongside its ChatGPT competitor, Le Chat (beta). Le Chat also gives access to other models, such as Mistral Next and Mistral Small, so you can explore Mistral AI’s capabilities.


For those waiting to get their hands on Le Chat but stuck in the queue, this guide will show you how to deploy Mistral Large on Azure and start using it immediately with LangChain.

Before we dive into the deployment process, let’s briefly explore Mistral Large.

Mistral Large

Mistral Large is Mistral AI’s most advanced model, with strong reasoning capabilities across multiple languages, including French, Spanish, German, and Italian. It has a generous 32k-token context window, making it interesting for Retrieval-Augmented Generation (RAG) applications.
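To put that 32k window in perspective, here is a rough back-of-the-envelope budget for a RAG prompt. The chunk size, prompt overhead, and answer budget below are illustrative assumptions, not Mistral AI recommendations:

```python
# Rough RAG context budgeting for a 32k-token window. All numbers below
# are illustrative assumptions, not Mistral AI recommendations.
CONTEXT_WINDOW = 32_000    # Mistral Large context window, in tokens
SYSTEM_AND_QUESTION = 500  # budget for system prompt + user question
ANSWER_BUDGET = 1_500      # tokens reserved for the model's answer
CHUNK_SIZE = 512           # tokens per retrieved document chunk

available = CONTEXT_WINDOW - SYSTEM_AND_QUESTION - ANSWER_BUDGET
max_chunks = available // CHUNK_SIZE
print(f"Room for {max_chunks} chunks of {CHUNK_SIZE} tokens")  # → Room for 58 chunks of 512 tokens
```

Even with generous prompt and answer budgets, that leaves room for dozens of retrieved chunks per request.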

Comparison measuring massive multitask language understanding (MMLU)

And most importantly, Mistral Large is pretty good at coding and math. The model ranks highest on the Mostly Basic Programming Problems (MBPP) benchmark, which covers a wide range of difficulty levels and programming concepts and is designed to evaluate models on several fronts, including accuracy and efficiency.

Mistral Large also ranks highest on GSM8K, a benchmark of grade-school math word problems that measures multi-step mathematical reasoning.


But don’t just take the benchmarks at face value. Next, we’ll deploy the Mistral Large model to Azure and try it for ourselves.

Deploy your own Mistral Large model to Azure

As part of the launch, Mistral AI announced its partnership with Microsoft, making the Mistral Large model available on Azure. Below are the steps to deploy the model:

  1. Access Azure AI Studio: Sign in to your Azure account and navigate to AI Studio.
  2. Deploy Mistral Large: Look for the “Deploy” option and select Mistral Large for deployment.
  3. Create a Project: If you haven’t already, set up a new project, opting for the Pay-As-You-Go plan and choosing France Central as your region.
  4. Review and Create: Double-check your resource information before finalizing your AI project.
  5. Finalize Deployment: After creating your AI project, deploy Mistral Large and choose a deployment name; this name will identify your inference endpoint.

Congratulations 🎉 You’ve successfully deployed Mistral Large on Azure!
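Before wiring anything into an application, you can sanity-check the deployment with a raw HTTP call. The sketch below only builds the request; the `/v1/chat/completions` path and `Bearer` auth header follow the OpenAI-style convention Azure serverless deployments commonly expose, so check your deployment page if yours differs, and substitute your real endpoint host and key:

```python
import json

# Placeholders: substitute the endpoint host and API key shown on your
# Azure deployment page.
ENDPOINT = "https://<endpoint>.francecentral.inference.ai.azure.com"
API_KEY = "<api-key>"

# Assumed path: an OpenAI-style chat-completions route; verify against
# your deployment's documentation.
url = f"{ENDPOINT}/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "messages": [{"role": "user", "content": "Say hello in French"}],
    "max_tokens": 64,
}

body = json.dumps(payload)
print(url)
# To actually send it: requests.post(url, headers=headers, data=body)
```

If the endpoint responds with a chat completion, the deployment is live and you can move on to LangChain.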

How to use Mistral Large with LangChain

After deployment, you’ll receive an API endpoint and a security key for making inferences; we’ll use both below.


To use Mistral Large with LangChain, follow these steps:

  1. Create the project directory:
mkdir mistral-large-example
cd mistral-large-example
  2. Create and activate a Python virtual environment:
python -m venv myenv
source myenv/bin/activate
  3. Install the project dependencies:
pip install langchain langchain_mistralai
  4. Create the main file for the LangChain conversation:
touch main.py
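Rather than hard-coding the endpoint and key into main.py (as the example below does for brevity), a common pattern is to read them from environment variables. The variable names `MISTRAL_ENDPOINT` and `MISTRAL_API_KEY` here are just names we picked for this sketch:

```python
import os

# Hypothetical variable names for this sketch; export them in your shell:
#   export MISTRAL_ENDPOINT="https://<endpoint>.francecentral.inference.ai.azure.com"
#   export MISTRAL_API_KEY="<api-key>"
endpoint = os.environ.get(
    "MISTRAL_ENDPOINT",
    "https://<endpoint>.francecentral.inference.ai.azure.com",
)
api_key = os.environ.get("MISTRAL_API_KEY")

if api_key is None:
    # Fail early with a clear message instead of a confusing auth error.
    print("Set MISTRAL_API_KEY before running this script")
else:
    print(f"Using endpoint: {endpoint}")
```

This keeps the key out of source control and lets you switch deployments without editing code.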

Here’s an example of how to create a LangChain conversation chain with Mistral Large:

from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder
from langchain.schema import SystemMessage
from langchain_mistralai.chat_models import ChatMistralAI

# Configuration for prompting
prompt = ChatPromptTemplate.from_messages([
    SystemMessage(content="You are a chatbot engaging in a conversation with a human, often incorporating French cultural references."),
    MessagesPlaceholder(variable_name="chat_history"),
    HumanMessagePromptTemplate.from_template("{human_input}"),
])

# Memory configuration
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Configuring the Mistral model endpoint and API key
chat_model = ChatMistralAI(
    endpoint="https://<endpoint>.francecentral.inference.ai.azure.com",
    mistral_api_key="<api-key>",
)

# Setting up the conversation chain
chat_llm_chain = LLMChain(
    llm=chat_model,
    prompt=prompt,
    memory=memory,
    verbose=True,
)

# Example usage
result = chat_llm_chain.predict(human_input="Hi there, my friend")
print(result)

Copy and paste the code above into the main.py file, then run:

python main.py

Here’s how the output should look:

> Entering new LLMChain chain...

Prompt after formatting:

System: You are a chatbot engaging in a conversation with a human, often incorporating French cultural references.

Human: Hi there, my friend

> Finished chain.

 Hello! It's a pleasure to chat with you. As you've noticed, I enjoy incorporating French cultural references into our conversations. Did you know that the Eiffel Tower, one of France's most iconic landmarks, was initially criticized by some of France's leading artists and intellectuals for its design when it was first built? How can I assist you today?
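The `verbose=True` flag is what prints the formatted prompt on each call. Behind the scenes, `ConversationBufferMemory` simply appends each exchange to a message list and replays it into the `chat_history` placeholder on the next turn. Here is a minimal pure-Python sketch of that idea (not LangChain’s actual implementation):

```python
# Minimal sketch of what a conversation buffer does: accumulate
# (role, content) pairs and replay them as history on each new turn.
class BufferMemory:
    def __init__(self):
        self.messages = []  # chronological (role, content) pairs

    def save_context(self, human_input, ai_output):
        self.messages.append(("human", human_input))
        self.messages.append(("ai", ai_output))

    def load_history(self):
        return list(self.messages)

memory = BufferMemory()
memory.save_context("Hi there, my friend", "Hello! A pleasure to chat with you.")
memory.save_context("Tell me about the Eiffel Tower", "It was built for the 1889 World's Fair.")

# On the third turn, the chain would prepend all four prior messages:
print(len(memory.load_history()))  # → 4
```

Because the full history is resent on every call, long conversations eat into the context window; that is the trade-off buffer memory makes for simplicity.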

Conclusion

There has never been a better time to develop AI-powered applications. With rapid deployments to robust and scalable infrastructures such as Azure’s, developers can create applications that are more intelligent, interactive, and impactful.

If you are building a RAG application, or simply need a Postgres database that scales, Neon, with its autoscaling capabilities, offers elastic vector search and fast index builds with pgvector, making your AI apps fast and scalable to millions of users.

Start building with Neon for free today, join us on Discord and let us know what you’re working on and how we can help you build better apps. 
