Build the generate() API from source

Pre-requisites
Clone the onnxruntime-genai repo
Download ONNX Runtime binaries
Build the generate() API
Install the library into your application

Pre-requisites

cmake
.NET 6 (if building C#)

Clone the onnxruntime-genai repo

git clone https://github.com/microsoft/onnxruntime-genai
cd onnxruntime-genai

Download ONNX Runtime binaries

By default, the onnxruntime-genai build expects to find the ONNX Runtime headers and binaries in a folder called ort in the root directory of onnxruntime-genai. To keep the ONNX Runtime files somewhere else, pass that location to the onnxruntime-genai build with the --ort_home command line argument.

These instructions assume you are in the onnxruntime-genai folder.

Windows

These instructions use win-x64. Replace this if you are using a different architecture.

curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-win-x64-1.19.2.zip -o onnxruntime-win-x64-1.19.2.zip
tar xvf onnxruntime-win-x64-1.19.2.zip
move onnxruntime-win-x64-1.19.2 ort 

Linux and Mac

These instructions use linux-x64-gpu. Replace this if you are using a different architecture.

curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-linux-x64-gpu-1.19.2.tgz -o onnxruntime-linux-x64-gpu-1.19.2.tgz
tar xvzf onnxruntime-linux-x64-gpu-1.19.2.tgz
mv onnxruntime-linux-x64-gpu-1.19.2 ort 
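
Only the version and platform strings change between these downloads, so the URL can be assembled from variables. The following is a minimal sketch: the function name `ort_release_url` is hypothetical, and any platform identifier you pass must match an archive that is actually published on the ONNX Runtime releases page.

```shell
#!/bin/sh
# Hypothetical helper: build the release archive URL from its parts.
# Usage: ort_release_url <version> <platform> <extension>
ort_release_url() {
    version="$1"
    platform="$2"
    ext="$3"
    printf 'https://github.com/microsoft/onnxruntime/releases/download/v%s/onnxruntime-%s-%s.%s\n' \
        "$version" "$platform" "$version" "$ext"
}

# Example: reproduce the Linux download URL used in the commands above.
ort_release_url 1.19.2 linux-x64-gpu tgz
```

The printed URL can then be passed to curl -L in place of the hard-coded one.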

Android

If you do not already have an ort folder, create one.

mkdir ort

curl -L https://repo1.maven.org/maven2/com/microsoft/onnxruntime/onnxruntime-android/1.19.2/onnxruntime-android-1.19.2.aar -o ort/onnxruntime-android-1.19.2.aar
cd ort
tar xvf onnxruntime-android-1.19.2.aar
cd ..

Build the generate() API

This step assumes that you are in the root of the onnxruntime-genai repo, and that you have followed the previous steps to copy the ONNX Runtime headers and binaries into the folder specified by --ort_home, which defaults to `onnxruntime-genai/ort`.
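
Before starting a build, it can help to confirm that the ONNX Runtime folder is in place. The sketch below is illustrative only: `check_ort_home` is a hypothetical helper, and the include/ and lib/ subfolder names are an assumption about the layout of the extracted desktop release archives (the Android AAR extracts to a different layout).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the folder looks like an extracted
# ONNX Runtime release. Assumed layout: include/ and lib/ subfolders.
# Usage: check_ort_home [folder]   (folder defaults to "ort")
check_ort_home() {
    ort_home="${1:-ort}"
    for sub in include lib; do
        if [ ! -d "$ort_home/$sub" ]; then
            echo "missing: $ort_home/$sub" >&2
            return 1
        fi
    done
    echo "ok: $ort_home"
}
```

For example, running check_ort_home ort from the repo root checks the default location, and the same path can be passed to the build with --ort_home if the files are stored elsewhere.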

All of the build commands below have a --config argument, which takes the following options:

Release builds release binaries
Debug builds binaries with debug symbols
RelWithDebInfo builds release binaries with debug info
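
When the build is scripted, the configuration can be chosen once and reused across commands. This is a minimal sketch: `is_valid_config` and the `CONFIG` variable are hypothetical names, and the check covers only the three values listed above. The script prints the build command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: accept only the three --config values listed above.
is_valid_config() {
    case "$1" in
        Release|Debug|RelWithDebInfo) return 0 ;;
        *) return 1 ;;
    esac
}

# Pick the configuration once; CONFIG is an illustrative variable name.
CONFIG="${CONFIG:-Release}"
if is_valid_config "$CONFIG"; then
    # Print the command instead of running it, so the sketch has no side effects.
    echo "python build.py --config $CONFIG"
else
    echo "not one of the documented --config values: $CONFIG" >&2
fi
```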

Build Python API

Windows CPU build

python build.py --config Release

Windows DirectML build

python build.py --use_dml --config Release

Linux build

python build.py --config Release

Linux CUDA build

python build.py --use_cuda --config Release

Mac build

python build.py --config Release

Build Java API

python build.py --build_java --config Release

Build for Android

If building on Windows, install ninja.

pip install ninja

Run the build script.

python build.py --build_java --android --android_home <path to your Android SDK> --android_ndk_path <path to your NDK installation> --android_abi [armeabi-v7a|arm64-v8a|x86|x86_64] --config Release

Install the library into your application

Install Python wheel

cd build/wheel
pip install *.whl

Install NuGet

Coming soon

Install JAR

Copy build/Windows/Release/src/java/build/libs/*.jar into your application.

Install AAR

Copy build/Android/Release/src/java/build/android/outputs/aar/onnxruntime-genai-release.aar into your application.

Install C/C++ header file and library

Windows

Use the header in src\ort_genai.h and the libraries in build\Windows\Release

Linux

Use the header in src/ort_genai.h and the libraries in build/Linux/Release