Providers
Understand how providers connect your models to devices and clouds.
Every embedl-hub command follows the structure:
```
embedl-hub <command> <toolchain> <provider>
```

The provider determines where and how the command is executed. Some providers run locally, some dispatch to a managed device cloud, and some connect to your own hardware over SSH.
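To make the three-part structure concrete, the small helper below splits an invocation into its command, toolchain, and provider. It is a hypothetical illustration, not part of the embedl-hub package:

```python
# Hypothetical helper, for illustration only: split an embedl-hub
# invocation into (command, toolchain, provider). Not part of embedl-hub.
def parse_invocation(argv: list[str]) -> tuple[str, str, str]:
    """Return (command, toolchain, provider) from an argument list."""
    if len(argv) < 4 or argv[0] != "embedl-hub":
        raise ValueError("expected: embedl-hub <command> <toolchain> <provider> ...")
    command, toolchain, provider = argv[1:4]
    return command, toolchain, provider

# Example: the local compile command shown later on this page.
print(parse_invocation(["embedl-hub", "compile", "tflite", "local", "-m", "model.onnx"]))
# → ('compile', 'tflite', 'local')
```

Everything after the provider (such as `-m model.onnx`) is provider- and toolchain-specific options.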
Available providers
local
Runs the operation on your local machine. No remote device or cloud account is required.
Currently supported for TFLite compilation only, where it uses onnx2tf to convert an ONNX model
to the TFLite format. This is useful for getting a .tflite file quickly
without needing cloud access.
CLI:
```
embedl-hub compile tflite local -m model.onnx
```

The local provider does not require a device, so there is no corresponding DeviceManager method.
qai-hub
Dispatches jobs to Qualcomm AI Hub, a managed cloud service from Qualcomm. Supports compilation (with quantization) and profiling on Snapdragon-powered devices (e.g., Samsung Galaxy S25, Google Pixel 9 Pro) as well as some automotive SoCs.
Requires a Qualcomm AI Hub account and API token. To set it up:
1. Create an account on Qualcomm AI Hub.
2. Log in and click Settings in the top-right corner to find your API token.
3. Configure the token locally (the qai-hub package is installed automatically with embedl-hub):

```
qai-hub configure --api_token YOUR_API_TOKEN
```

Your Qualcomm AI Hub API token is not shared with Embedl Hub and stays private in your local environment.
CLI:
```
embedl-hub compile tflite qai-hub \
  -m model.onnx \
  -d "Samsung Galaxy S24" \
  -s 1,3,224,224
```

Python API:

```
DeviceManager.get_qai_hub_device("Samsung Galaxy S24", name="galaxy-s24")
```

aws
Dispatches profiling jobs to the Embedl device cloud, backed by AWS Device Farm. This is the default cloud for profiling TFLite models and does not require any additional setup beyond your Embedl Hub account.
CLI:
```
embedl-hub profile tflite aws \
  -m model.tflite \
  -d "Samsung Galaxy S25"
```

Python API:

```
DeviceManager.get_aws_device("Samsung Galaxy S25", name="galaxy-s25")
```

embedl-onnxruntime
Connects to a remote device over SSH and runs the embedl-onnxruntime backend on it. Use this provider to compile, profile, and invoke ONNX models on your own hardware — for example, a Raspberry Pi, Jetson board, or any Linux device you have SSH access to.
The target device must have the embedl-onnxruntime wheel installed. See the embedl-onnxruntime repository for installation instructions.
Together with trtexec, this is one of the fastest backends available.
CLI:
```
embedl-hub compile onnxruntime embedl-onnxruntime \
  -m model.onnx \
  --host 192.168.1.42 \
  --user pi
```

Python API:

```
from embedl_hub.core.device.ssh import SSHConfig

DeviceManager.get_embedl_onnxruntime_device(
    SSHConfig(host="192.168.1.42", username="pi"),
    name="rpi",
)
```

trtexec
Connects to a remote NVIDIA device over SSH and runs NVIDIA’s trtexec tool on it. Use this provider to compile, profile, and invoke
TensorRT models on your own GPU-equipped hardware.
Together with embedl-onnxruntime, this is one of the fastest backends available.
CLI:
```
embedl-hub compile tensorrt trtexec \
  -m model.onnx \
  --host 192.168.1.10 \
  --user nvidia
```

Python API:

```
from embedl_hub.core.device.ssh import SSHConfig

DeviceManager.get_tensorrt_device(
    SSHConfig(host="192.168.1.10", username="nvidia"),
    name="jetson",
)
```

You can browse all supported devices on the Supported devices page.
Supported combinations
Not every provider is available for every toolchain and command. The tables below show which combinations are supported.
Compile
| | local | qai-hub | embedl-onnxruntime | trtexec |
|---|---|---|---|---|
| tflite | ✓ | ✓ | — | — |
| onnxruntime | — | ✓ | ✓ | — |
| tensorrt | — | — | — | ✓ |
Profile
| | qai-hub | aws | embedl-onnxruntime | trtexec |
|---|---|---|---|---|
| tflite | ✓ | ✓ | — | — |
| onnxruntime | ✓ | — | ✓ | — |
| tensorrt | — | — | — | ✓ |
Invoke
| | qai-hub | embedl-onnxruntime | trtexec |
|---|---|---|---|
| tflite | ✓ | — | — |
| onnxruntime | ✓ | ✓ | — |
| tensorrt | — | — | ✓ |
Note: The local provider does not require a device name. All other
providers require a --device flag or device configuration.
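The tables above can also be read programmatically. The sketch below transcribes the same matrix into a lookup; the SUPPORTED mapping mirrors the tables exactly, but the helper function itself is illustrative and not part of embedl-hub:

```python
# Supported (command -> toolchain -> providers) combinations, transcribed
# from the tables above. The helper is illustrative, not an embedl-hub API.
SUPPORTED = {
    "compile": {
        "tflite": {"local", "qai-hub"},
        "onnxruntime": {"qai-hub", "embedl-onnxruntime"},
        "tensorrt": {"trtexec"},
    },
    "profile": {
        "tflite": {"qai-hub", "aws"},
        "onnxruntime": {"qai-hub", "embedl-onnxruntime"},
        "tensorrt": {"trtexec"},
    },
    "invoke": {
        "tflite": {"qai-hub"},
        "onnxruntime": {"qai-hub", "embedl-onnxruntime"},
        "tensorrt": {"trtexec"},
    },
}

def is_supported(command: str, toolchain: str, provider: str) -> bool:
    """Check whether a command/toolchain/provider triple appears in the tables."""
    return provider in SUPPORTED.get(command, {}).get(toolchain, set())

print(is_supported("compile", "tflite", "local"))  # → True
print(is_supported("profile", "tflite", "local"))  # → False
```

This kind of check can save a round trip: an unsupported combination is rejected before any job is dispatched.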
Choosing a provider
- Just need a TFLite file? Use compile tflite local for quick local conversion with FP16 quantization — no device or cloud account needed.
- Want INT8-quantized, device-optimized models? Use qai-hub for Qualcomm devices, embedl-onnxruntime for ONNX Runtime on your own hardware, or trtexec for TensorRT on NVIDIA GPUs.
- Profiling on real devices? Use aws for the Embedl device cloud, or qai-hub for Qualcomm AI Hub devices.
- Have your own hardware? Use embedl-onnxruntime or trtexec to compile, profile, and invoke models on any device you can reach over SSH.