Providers - Embedl Hub

Every embedl-hub command follows the structure:

embedl-hub <command> <toolchain> <provider>

The provider determines where and how the command is executed. Providers fall into three categories:

Local — runs on your machine, no device needed.
Cloud — dispatches jobs to a managed device cloud (Run in Cloud).
Your hardware — connects to your own devices over SSH (Run on Your Hardware).

Local provider

local

Runs the operation on your local machine. No remote device or cloud account is required.

Currently supported for TFLite compilation only, where it uses onnx2tf to convert an ONNX model to the TFLite format. This is useful for getting a .tflite file quickly without needing cloud access.

embedl-hub compile tflite local -m model.onnx

The local provider does not require a device, so there is no corresponding DeviceManager method.

Cloud providers

These providers dispatch jobs to managed cloud services. See the Run in Cloud section for step-by-step guides.

aws

Dispatches profiling jobs to the Embedl device cloud, backed by AWS Device Farm. This is the default cloud for profiling TFLite models and does not require any additional setup beyond your Embedl Hub account.

embedl-hub profile tflite aws \

    -m model.tflite \

    -d "Samsung Galaxy S25"

qai-hub

Dispatches jobs to Qualcomm AI Hub, a managed cloud service from Qualcomm. Supports compilation (with quantization) and profiling on Snapdragon-powered devices (e.g., Samsung Galaxy S25, Google Pixel 9 Pro) as well as some automotive SoCs.

Requires a Qualcomm AI Hub account and API token. To set it up:

Create an account on Qualcomm AI Hub.
Log in and click Settings in the top-right corner to find your API token.
Configure the token locally (the qai-hub package is installed automatically with embedl-hub):
```
qai-hub configure --api_token YOUR_API_TOKEN
```
Your Qualcomm AI Hub API token is not shared with Embedl Hub and stays privately in your local environment.

embedl-hub compile tflite qai-hub \

    -m model.onnx \

    -d "Samsung Galaxy S24" \

    -s 1,3,224,224

Your hardware providers

These providers connect to your own devices over SSH. See the Run on Your Hardware section for step-by-step guides.

embedl-onnxruntime

Connects to a remote device over SSH and runs the embedl-onnxruntime backend on it. Use this provider to compile, profile, and invoke ONNX models on your own hardware — for example, a Raspberry Pi, Jetson board, or any Linux device you have SSH access to.

The target device must have the embedl-onnxruntime wheel installed. See the embedl-onnxruntime repository for installation instructions.

Together with trtexec, this is one of the fastest backends available — a MobileNetV2 compiles in ~7 s and profiles in ~12 s.

embedl-hub compile onnxruntime embedl-onnxruntime \

    -m model.onnx \

    --host 192.168.1.42 \

    --user pi

trtexec

Connects to a remote NVIDIA device over SSH and runs NVIDIA’s trtexec tool on it. Use this provider to compile, profile, and invoke TensorRT models on your own GPU-equipped hardware.

Together with embedl-onnxruntime, this is one of the fastest backends available — a MobileNetV2 engine builds in ~70 s and profiles in ~14 s.

embedl-hub compile tensorrt trtexec \

    -m model.onnx \

    --host 192.168.1.10 \

    --user nvidia

You can browse all supported devices on the Supported devices page.

Supported combinations

Not every provider is available for every toolchain and command. The tables below show which combinations are supported.

Compile

	local	qai-hub	embedl-onnxruntime	trtexec
tflite	✓	✓	—	—
onnxruntime	—	✓	✓	—
tensorrt	—	—	—	✓

Profile

	aws	qai-hub	embedl-onnxruntime	trtexec
tflite	✓	✓	—	—
onnxruntime	—	✓	✓	—
tensorrt	—	—	—	✓

Invoke

	qai-hub	embedl-onnxruntime	trtexec
tflite	✓	—	—
onnxruntime	✓	✓	—
tensorrt	—	—	✓

Note: The local provider does not require a device name. All other providers require a --device flag or device configuration.

Choosing a provider

Just need a TFLite file? Use compile tflite local for quick local conversion — no device or cloud account needed. Pass --fp16 for half-precision.
Want INT8-quantized, device-optimized models? Use qai-hub for Qualcomm devices, embedl-onnxruntime for ONNX Runtime on your own hardware, or trtexec for TensorRT on NVIDIA GPUs.
Profiling on real devices? Use aws for the Embedl device cloud, or qai-hub for Qualcomm AI Hub devices.
Have your own hardware? Use embedl-onnxruntime or trtexec to compile, profile, and invoke models on any device you can reach over SSH.