I Benchmarked 6 LLMs on Jetson Thor — Here’s What Surprised Me

JetsonHacks Sep 12, 2025 12:43

12,250 views · 273 likes Watch on YouTube ↗

Join this channel to get access to perks:
https://www.youtube.com/channel/UCQs0lwV6E4p7LQaGJ6fgy5Q/join
Jetson AGX Thor Developer Kit: https://amzn.to/49jqnes
NVIDIA AGX Thor Product Page: https://nvda.ws/4mW7wcW
Let's benchmark 6 LLMs, including Qwen3 Coder and OpenAI gpt-oss-120b on the NVIDIA Jetson AGX Thor Developer Kit using llama.cpp. Then we will run a couple of the LLMs on a server and test them out in a web browser. One of them is gpt-oss-120b with a 128K context, all on an embedded device!

LLM comparison from: https://artificialanalysis.ai

00:00 Intro
00:33 Qwen3 Coder
01:58 OpenAI gpt-oss-20b
02:12 NVIDIA Llama3 Nemotron 49B
02:21 Qwen3 32B
03:30 gpt-oss-120b
04:35 Qwen3 Coder 30B
04:52 Qwen3 Coder first prompt
05:26 Qwen3 Coder second prompt
06:38 Qwen3 Coder server result
06:48 OpenAI gpt-oss-120b
07:02 gpt-oss first prompt
09:02 gpt-oss second prompt
11:28 gpt-oss server result
12:05 Takeaways

As an Amazon Associate I earn from qualifying purchases.
Visit the JetsonHacks storefront on Amazon: https://www.amazon.com/shop/jetsonhacks

Visit the website at https://jetsonhacks.com
Sign up for the newsletter! https://newsletter.jetsonhacks.com
Github accounts: https://github.com/jetsonhacks
https://github.com/jetsonhacksnano
Twitter: http://twitter.com/jetsonhacks

Some of these links here are affiliate links. As an Amazon Associate I earn from qualifying purchases at no extra cost to you.

Category (YouTube): Science & Technology

Playback is via YouTube's official embedded player. Data from YouTube; Exumo is not affiliated with YouTube.