Introduction
Fine-tuning lets large language models align more closely with specific tasks, incorporate new information, and handle specialized workflows. Compared to prompting, fine-tuning typically yields better task performance, and a fine-tuned smaller model can match or surpass a larger one while being faster and cheaper to run. This article walks through the process of fine-tuning a large language model on the Mistral AI platform.
Learning Objectives
- Understand the process and benefits of fine-tuning large language models for specific tasks and advanced workflows.
- Master the preparation of datasets in JSON Lines format for fine-tuning, including instruction-based and function-calling logic formats.
- Learn to execute fine-tuning on the Mistral AI platform, configure jobs, monitor training, and perform inference using fine-tuned models.

Dataset Preparation
For dataset preparation, data must be stored in JSON Lines (.jsonl) files, where each line is a separate JSON object. Datasets should follow an instruction-following format that represents a user-assistant conversation: each JSON sample either consists only of user and assistant messages ("Default Instruct") or also includes function-calling logic ("Function-calling Instruct").
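For reference, a single Default Instruct sample stored on one line of the .jsonl file could look like the following (the content here is purely illustrative):
{"messages": [{"role": "user", "content": "What is fine-tuning?"}, {"role": "assistant", "content": "Fine-tuning is the process of further training a pretrained model on task-specific data."}]}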
Let us look at a few use cases for constructing a dataset.
Specific Format
To illustrate, let’s consider extracting medical information from notes. The medical_knowledge_from_extracts dataset provides the desired output format, including conditions and interventions.
Interventions can be categorized into behavioral, drug, and other interventions.
An example of the output format is provided:
{
    "conditions": "Proteinuria",
    "interventions": [
        "Drug: Losartan Potassium",
        "Other: Comparator: Placebo (Losartan)",
        "Drug: Comparator: amlodipine besylate",
        "Other: Comparator: Placebo (amlodipine besylate)",
        "Other: Placebo (Losartan)",
        "Drug: Enalapril Maleate"
    ]
}
The following code demonstrates how to load, format, and save this data as a .jsonl file. You can also randomize the order and split the data into training and validation files, as sketched after the snippet.
import pandas as pd
import json

# Load the source CSV from the Hugging Face dataset
df = pd.read_csv(
    "https://huggingface.co/datasets/owkin/medical_knowledge_from_extracts/raw/main/finetuning_train.csv"
)

# Convert each row into the instruction-following "messages" format
df_formatted = [
    {
        "messages": [
            {"role": "user", "content": row["Question"]},
            {"role": "assistant", "content": row["Answer"]},
        ]
    }
    for index, row in df.iterrows()
]

# Write one JSON object per line (.jsonl)
with open("data.jsonl", "w") as f:
    for line in df_formatted:
        json.dump(line, f)
        f.write("\n")
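To randomize the order and split the formatted data into training and validation files, a minimal sketch could look like this (the 95/5 split ratio is an arbitrary choice; the file names match those used in the upload step later):
import random

# Shuffle the samples and split them into training and validation sets
random.shuffle(df_formatted)
split_index = int(0.95 * len(df_formatted))

with open("training_file.jsonl", "w") as f:
    for line in df_formatted[:split_index]:
        json.dump(line, f)
        f.write("\n")

with open("validation_file.jsonl", "w") as f:
    for line in df_formatted[split_index:]:
        json.dump(line, f)
        f.write("\n")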
Coding
To generate SQL from text, we can train the model on data that pairs natural-language questions with the context of the relevant SQL tables, so that it learns to output correct SQL syntax.
The code below demonstrates formatting data for text-to-SQL generation:
import pandas as pd
import json

# Load the text-to-SQL dataset (question, context, answer) from Hugging Face
df = pd.read_json(
    "https://huggingface.co/datasets/b-mc2/sql-create-context/resolve/main/sql_create_context_v4.json"
)

# Wrap each row in a prompt that provides the question and the table context
df_formatted = [
    {
        "messages": [
            {
                "role": "user",
                "content": f"""
You are a powerful text-to-SQL model. Your job is to answer questions about a database.
You are given a question and context regarding one or more tables.
You must output the SQL query that answers the question.
### Input: {row['question']}
### Context: {row['context']}
### Response:
""",
            },
            {
                "role": "assistant",
                "content": row["answer"],
            },
        ]
    }
    for index, row in df.iterrows()
]

# Write one JSON object per line (.jsonl)
with open("data.jsonl", "w") as f:
    for line in df_formatted:
        json.dump(line, f)
        f.write("\n")
Adapt for RAG
Another application is fine-tuning an LLM to enhance its performance for RAG (Retrieval Augmented Generation). RAFT (Retrieval Augmented Fine-Tuning) fine-tunes an LLM to answer questions based on relevant documents, resulting in significant performance improvements across specialized domains.
To create a fine-tuning dataset for RAG, start with chunks of the document’s original text as context. From each chunk, generate questions and answers to form question-context-answer triplets. Here are two prompt templates for generating these questions and answers:
Prompt template for generating questions:
Context information is provided below:
———————
{context_str}
———————
Given the context information and without prior knowledge, create {num_questions_per_chunk} diverse questions based on the context. Ensure that the questions are relevant to the information provided in the context.
Prompt template for generating answers:
Context information is provided below:
———————
{context_str}
———————
Given the context information and without prior knowledge, answer the following query: {generated_query_str}
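A minimal sketch of how these templates might be wired up with the Mistral chat API to build the triplets follows. The model name, the chunking of the document, and the helper structure are assumptions for illustration, not a prescribed pipeline:
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=os.environ.get("MISTRAL_API_KEY"))

question_template = """Context information is provided below:
---------------------
{context_str}
---------------------
Given the context information and without prior knowledge, create {num_questions_per_chunk} diverse questions based on the context. Ensure that the questions are relevant to the information provided in the context."""

answer_template = """Context information is provided below:
---------------------
{context_str}
---------------------
Given the context information and without prior knowledge, answer the following query: {generated_query_str}"""

def ask(prompt: str) -> str:
    # Single-turn call to a Mistral chat model (model choice is illustrative)
    response = client.chat(
        model="mistral-small-latest",
        messages=[ChatMessage(role="user", content=prompt)],
    )
    return response.choices[0].message.content

def build_triplets(chunks, num_questions_per_chunk=2):
    # For each context chunk, generate questions, then answer each question
    triplets = []
    for context_str in chunks:
        questions = ask(
            question_template.format(
                context_str=context_str,
                num_questions_per_chunk=num_questions_per_chunk,
            )
        ).splitlines()
        for generated_query_str in filter(None, questions):
            answer = ask(
                answer_template.format(
                    context_str=context_str,
                    generated_query_str=generated_query_str,
                )
            )
            triplets.append(
                {"question": generated_query_str, "context": context_str, "answer": answer}
            )
    return triplets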
Function Calling
You can also enhance Mistral’s function-calling capabilities by fine-tuning on function-calling data. When the native function-calling features are insufficient, especially for specific tools and domains, fine-tune on your own agent data for function calling. This can greatly improve the agent’s performance and accuracy, enabling it to select appropriate tools and actions effectively.
A simple example is training the model to call a generate_anagram() function; the sample below demonstrates the interaction between the user, the assistant, and the tool when generating an anagram.
Store the conversational data in a structured format in which each message is categorized by role ("user", "assistant", "system", or "tool") and follows the sequence of the conversation, including the assistant’s tool calls and the corresponding tool results. Each tool call needs an identifier; to generate these automatically, you can use `"".join(random.choices(string.ascii_letters + string.digits, k=9))`.
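A sketch of what a single Function-calling Instruct sample for generate_anagram() could look like is shown below. The tool-call id, message wording, and function schema are illustrative values, not fixed requirements:
import json

# One Function-calling Instruct training sample (values are illustrative)
sample = {
    "messages": [
        {"role": "user", "content": "Can you generate an anagram of the word 'listen'?"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "VvvODy9mT",  # nine-character id, e.g. generated as shown above
                    "type": "function",
                    "function": {"name": "generate_anagram", "arguments": "{\"word\": \"listen\"}"},
                }
            ],
        },
        {"role": "tool", "content": "{\"anagram\": \"silent\"}", "tool_call_id": "VvvODy9mT"},
        {"role": "assistant", "content": "The anagram of the word 'listen' is 'silent'."},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "generate_anagram",
                "description": "Generate an anagram of a given word",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "word": {"type": "string", "description": "The word to generate an anagram of"}
                    },
                    "required": ["word"],
                },
            },
        }
    ],
}

# Append the sample as one line of a .jsonl file
with open("data.jsonl", "a") as f:
    json.dump(sample, f)
    f.write("\n")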
You can check the dataset format with Mistral’s validation script and, if needed, fix formatting issues with the reformat script:
# Download the validation script
wget https://raw.githubusercontent.com/mistralai/mistral-finetune/main/utils/validate_data.py
# Download the reformat script
wget https://raw.githubusercontent.com/mistralai/mistral-finetune/main/utils/reformat_data.py
# Reformat data
python reformat_data.py data.jsonl
# Validate data
python validate_data.py data.jsonl
Training
Once you have the data file in the correct format, you can upload it to the Mistral Client for use in fine-tuning jobs.
import os
from mistralai.client import MistralClient

api_key = os.environ.get("MISTRAL_API_KEY")
client = MistralClient(api_key=api_key)

# Upload the training file
with open("training_file.jsonl", "rb") as f:
    training_data = client.files.create(file=("training_file.jsonl", f))

# Upload the validation file (referenced by the fine-tuning job below)
with open("validation_file.jsonl", "rb") as f:
    validation_data = client.files.create(file=("validation_file.jsonl", f))
Please note that fine-tuning the Mistral 7B model costs $2 per 1M tokens, with a minimum charge of $4 per fine-tuning job.
After loading the dataset, you can create a fine-tuning job.
from mistralai.models.jobs import TrainingParameters

created_jobs = client.jobs.create(
    model="open-mistral-7b",
    training_files=[training_data.id],
    validation_files=[validation_data.id],
    hyperparameters=TrainingParameters(
        training_steps=10,
        learning_rate=0.0001,
    )
)
created_jobs
Running this returns the created fine-tuning job object; its id is used later to monitor, retrieve, or cancel the job.
The parameters include:
- model: the model you want to fine-tune, such as open-mistral-7b or mistral-small-latest.
- training_files: a collection of training file IDs, which can include one or more files.
- validation_files: a collection of validation file IDs, which can include one or more files.
- hyperparameters: adjustable hyperparameters such as training_steps and learning_rate that users can modify.
For LoRA fine-tuning, the recommended learning rate is 1e-4 (default) or 1e-5.
The specified learning rate is the peak rate: during warmup it increases linearly from a small initial value to this peak over a number of steps, and then it decays following a cosine schedule.
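As a rough illustration of this schedule (the warmup fraction and minimum rate below are arbitrary values for the sketch, not Mistral’s internal settings):
import math

def lr_at_step(step, total_steps, peak_lr=1e-4, warmup_frac=0.1, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))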
You can also use Weights and Biases to monitor and track metrics.
from mistralai.models.jobs import WandbIntegrationIn, TrainingParameters
import os

wandb_api_key = os.environ.get("WANDB_API_KEY")

created_jobs = client.jobs.create(
    model="open-mistral-7b",
    training_files=[training_data.id],
    validation_files=[validation_data.id],
    hyperparameters=TrainingParameters(
        training_steps=10,
        learning_rate=0.0001,
    ),
    integrations=[
        WandbIntegrationIn(
            project="test_api",
            run_name="test",
            api_key=wandb_api_key,
        ).dict()
    ]
)
created_jobs
You can use the dry_run=True argument to determine the number of tokens the model is being trained on.
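For example, a sketch of a dry run, assuming dry_run is passed to jobs.create just like in the job-creation call above; no job is started, and the response summarizes the planned run, including token counts:
# Dry run: reports how many tokens the model would be trained on
dry_run_job = client.jobs.create(
    model="open-mistral-7b",
    training_files=[training_data.id],
    validation_files=[validation_data.id],
    hyperparameters=TrainingParameters(
        training_steps=10,
        learning_rate=0.0001,
    ),
    dry_run=True,
)
print(dry_run_job)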
Inference
You can list jobs, retrieve a job, or cancel a job.
# List jobs
jobs = client.jobs.list()
print(jobs)
# Retrieve a job
retrieved_jobs = client.jobs.retrieve(created_jobs.id)
print(retrieved_jobs)
# Cancel a job
canceled_jobs = client.jobs.cancel(created_jobs.id)
print(canceled_jobs)
Once the fine-tuning job is complete, you can get the fine-tuned model’s name from retrieved_jobs.fine_tuned_model and use it for chat completions.
from mistralai.models.chat_completion import ChatMessage

chat_response = client.chat(
    model=retrieved_jobs.fine_tuned_model,
    messages=[
        ChatMessage(role="user", content="What is the best French cheese?")
    ]
)
print(chat_response.choices[0].message.content)
Local Fine-Tuning and Inference
You can also use Mistral AI’s open-source libraries for fine-tuning and inference on Large Language Models (LLMs) locally. Check out the following repositories for these tasks:
- Fine-Tuning: https://github.com/mistralai/mistral-finetune
- Inference: https://github.com/mistralai/mistral-inference
Conclusion
In conclusion, fine-tuning large language models on the Mistral platform enhances their performance on specific tasks, integrates new information, and supports more complex workflows. Proper dataset preparation and use of Mistral’s tooling are key to achieving task alignment and efficiency. Fine-tuning is crucial for getting the most out of a model in applications such as medical information extraction, text-to-SQL generation, and retrieval-augmented generation. The Mistral platform offers the tools and flexibility needed to meet these AI development goals effectively.
Key Takeaways
- Fine-tuning large language models significantly improves task alignment, efficiency, and the ability to integrate new and complex information compared to traditional prompting methods.
- Properly preparing datasets in JSON Lines format and following instruction-based formats, including function-calling logic, is crucial for fine-tuning.
- The Mistral AI platform offers powerful tools and flexibility for fine-tuning open-source and optimized models, allowing for superior performance in various specialized tasks and applications.
- Mistral also offers open-source libraries for fine-tuning and inference, which users can utilize locally or on any other platform.
Frequently Asked Questions
Q. Why fine-tune a large language model instead of relying on prompting?
A. Fine-tuning significantly improves a model’s alignment with specific tasks, making its outputs more accurate and consistent. It also allows the model to incorporate new facts and handle complex workflows more effectively than traditional prompting methods.
Q. What format does the training data need to be in?
A. Datasets must be stored in JSON Lines (.jsonl) format, with each line containing a JSON object. The data should follow an instruction-following format that represents user-assistant conversations, and each message’s "role" must be "user," "assistant," "system," or "tool."
Q. What does the Mistral platform provide for fine-tuning?
A. The Mistral platform offers tools for uploading and preparing datasets, configuring fine-tuning jobs with specific models and hyperparameters, and monitoring training with integrations like Weights and Biases. It also supports performing inference using fine-tuned models, providing a comprehensive environment for AI development.