Providers
Understand how providers connect your models to devices and clouds.
Every embedl-hub command follows the structure:
```shell
embedl-hub <command> <toolchain> <provider>
```

The provider determines where and how the command is executed. Providers fall into three categories:
- Local — runs on your machine, no device needed.
- Cloud — dispatches jobs to a managed device cloud (Run in Cloud).
- Your hardware — connects to your own devices over SSH (Run on Your Hardware).
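To make the three-part structure concrete, here is a small illustrative sketch in Python. The `build_command` helper is hypothetical (not part of the embedl-hub package); the example invocations mirror commands shown later on this page:

```python
import shlex

# Sketch: how an embedl-hub invocation is assembled from its three parts.
# This helper is illustrative only -- embedl-hub is invoked directly in practice.
def build_command(command: str, toolchain: str, provider: str, *flags: str) -> str:
    # shlex.join quotes arguments that contain spaces (e.g. device names)
    return shlex.join(["embedl-hub", command, toolchain, provider, *flags])

# A local TFLite conversion (no device needed):
print(build_command("compile", "tflite", "local", "-m", "model.onnx"))
# -> embedl-hub compile tflite local -m model.onnx

# A cloud profiling job on the Embedl device cloud:
print(build_command("profile", "tflite", "aws", "-m", "model.tflite", "-d", "Samsung Galaxy S25"))
# -> embedl-hub profile tflite aws -m model.tflite -d 'Samsung Galaxy S25'
```

Note how the device name is quoted when it contains spaces; the same applies when typing these commands into a shell directly.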
Local provider
local
Runs the operation on your local machine. No remote device or cloud account is required.
Currently supported for TFLite compilation only, where it uses onnx2tf to convert an ONNX model
to the TFLite format. This is useful for getting a .tflite file quickly
without needing cloud access.
```shell
embedl-hub compile tflite local -m model.onnx
```

The local provider does not require a device, so there is no corresponding DeviceManager method.
Cloud providers
These providers dispatch jobs to managed cloud services. See the Run in Cloud section for step-by-step guides.
qai-hub
Dispatches jobs to Qualcomm AI Hub, a managed cloud service from Qualcomm. Supports compilation (with quantization) and profiling on Snapdragon-powered devices (e.g., Samsung Galaxy S25, Google Pixel 9 Pro) as well as some automotive SoCs.
Requires a Qualcomm AI Hub account and API token. To set it up:

1. Create an account on Qualcomm AI Hub.
2. Log in and click Settings in the top-right corner to find your API token.
3. Configure the token locally (the `qai-hub` package is installed automatically with `embedl-hub`):

   ```shell
   qai-hub configure --api_token YOUR_API_TOKEN
   ```

Your Qualcomm AI Hub API token is not shared with Embedl Hub and stays private in your local environment.
```shell
embedl-hub compile tflite qai-hub \
  -m model.onnx \
  -d "Samsung Galaxy S24" \
  -s 1,3,224,224
```

aws
Dispatches profiling jobs to the Embedl device cloud, backed by AWS Device Farm. This is the default cloud for profiling TFLite models and does not require any additional setup beyond your Embedl Hub account.
```shell
embedl-hub profile tflite aws \
  -m model.tflite \
  -d "Samsung Galaxy S25"
```

Your hardware providers
These providers connect to your own devices over SSH. See the Run on Your Hardware section for step-by-step guides.
embedl-onnxruntime
Connects to a remote device over SSH and runs the embedl-onnxruntime backend on it. Use this provider to compile, profile, and invoke ONNX models on your own hardware — for example, a Raspberry Pi, Jetson board, or any Linux device you have SSH access to.
The target device must have the embedl-onnxruntime wheel installed. See the embedl-onnxruntime repository for installation instructions.
Together with trtexec, this is one of the fastest backends available —
a MobileNetV2 compiles in ~7 s and profiles in ~12 s.
```shell
embedl-hub compile onnxruntime embedl-onnxruntime \
  -m model.onnx \
  --host 192.168.1.42 \
  --user pi
```

trtexec
Connects to a remote NVIDIA device over SSH and runs NVIDIA’s trtexec tool on it. Use this provider to compile, profile, and invoke
TensorRT models on your own GPU-equipped hardware.
Together with embedl-onnxruntime, this is one of the fastest backends available — a MobileNetV2 engine builds in ~70 s and profiles in ~14 s.
```shell
embedl-hub compile tensorrt trtexec \
  -m model.onnx \
  --host 192.168.1.10 \
  --user nvidia
```

You can browse all supported devices on the Supported devices page.
Supported combinations
Not every provider is available for every toolchain and command. The tables below show which combinations are supported.
Compile
| | local | qai-hub | embedl-onnxruntime | trtexec |
|---|---|---|---|---|
| tflite | ✓ | ✓ | — | — |
| onnxruntime | — | ✓ | ✓ | — |
| tensorrt | — | — | — | ✓ |
Profile
| | qai-hub | aws | embedl-onnxruntime | trtexec |
|---|---|---|---|---|
| tflite | ✓ | ✓ | — | — |
| onnxruntime | ✓ | — | ✓ | — |
| tensorrt | — | — | — | ✓ |
Invoke
| | qai-hub | embedl-onnxruntime | trtexec |
|---|---|---|---|
| tflite | ✓ | — | — |
| onnxruntime | ✓ | ✓ | — |
| tensorrt | — | — | ✓ |
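The three tables can also be read as a single lookup. A minimal sketch in Python — the `SUPPORTED` mapping and `is_supported` helper are hypothetical and simply transcribe the tables above; embedl-hub itself remains the source of truth:

```python
# Supported (command, toolchain) -> providers, transcribed from the tables above.
# Illustrative only -- not an embedl-hub API.
SUPPORTED = {
    ("compile", "tflite"): {"local", "qai-hub"},
    ("compile", "onnxruntime"): {"qai-hub", "embedl-onnxruntime"},
    ("compile", "tensorrt"): {"trtexec"},
    ("profile", "tflite"): {"qai-hub", "aws"},
    ("profile", "onnxruntime"): {"qai-hub", "embedl-onnxruntime"},
    ("profile", "tensorrt"): {"trtexec"},
    ("invoke", "tflite"): {"qai-hub"},
    ("invoke", "onnxruntime"): {"qai-hub", "embedl-onnxruntime"},
    ("invoke", "tensorrt"): {"trtexec"},
}

def is_supported(command: str, toolchain: str, provider: str) -> bool:
    return provider in SUPPORTED.get((command, toolchain), set())

print(is_supported("compile", "tflite", "local"))    # True
print(is_supported("profile", "tflite", "trtexec"))  # False
```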
Note: The local provider does not require a device name. All other providers require a `--device` flag or device configuration.
Choosing a provider
- Just need a TFLite file? Use `compile tflite local` for quick local conversion — no device or cloud account needed. Pass `--fp16` for half-precision.
- Want INT8-quantized, device-optimized models? Use `qai-hub` for Qualcomm devices, `embedl-onnxruntime` for ONNX Runtime on your own hardware, or `trtexec` for TensorRT on NVIDIA GPUs.
- Profiling on real devices? Use `aws` for the Embedl device cloud, or `qai-hub` for Qualcomm AI Hub devices.
- Have your own hardware? Use `embedl-onnxruntime` or `trtexec` to compile, profile, and invoke models on any device you can reach over SSH.