Run Gemma 4 Locally on AGX Orin — Private NemoClaw for Itself and Other Orins
JetsonHacks 13:19
13,329 views · 302 likes
Join this channel to get access to perks:
https://www.youtube.com/channel/UCQs0lwV6E4p7LQaGJ6fgy5Q/join
NVIDIA Jetson Orin Nano Developer Kit: https://amzn.to/4shzDFS
NVIDIA Jetson AGX Orin Developer Kit: https://amzn.to/4ceNXJq
In this video, we set up a Jetson AGX Orin to do more than just run a model locally. We turn it into a local LLM server for NemoClaw, then use it to serve other Jetsons over the local network.
We start by installing Ollama on the AGX Orin, pulling Google Gemma 4 models, and benchmarking them to see how they perform on the machine. From there, we install NemoClaw and point it at that same local Ollama server as the inference provider. That gives us a practical local AI workflow, and it also helps keep prompts and working data on our own hardware instead of sending them to a third-party provider.
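The benchmarking step can be approximated with a few lines of Python against Ollama's HTTP API. This is a minimal sketch, not the method used in the video: it assumes Ollama is listening on its default port 11434, and the model tag `gemma4` is a hypothetical placeholder (use whatever tag `ollama list` reports on your machine). It relies on the `eval_count` and `eval_duration` fields that Ollama returns from `/api/generate`, with the duration given in nanoseconds.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address
MODEL_TAG = "gemma4"  # hypothetical tag; substitute the one `ollama list` shows


def build_generate_request(base_url, model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_per_second(response):
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    duration_ns = response.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return response.get("eval_count", 0) / (duration_ns / 1e9)


def benchmark(prompt, base_url=OLLAMA_URL, model=MODEL_TAG):
    """Send one prompt to the server and return generated tokens per second."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=600) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `benchmark("Explain what a Jetson is in two sentences.")` on the AGX Orin returns a single tokens-per-second figure. The first call also includes model load time in the wall-clock wait, so run it more than once before comparing models.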
In the final part of the video, we bring in a Jetson Orin Nano and point it back to the AGX Orin over the LAN. The AGX Orin becomes a local model server for itself and for other network-attached Orins. That is a useful pattern on Jetson, because smaller machines do not always need to host their most capable models locally in order to use them.
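Before pointing a client on the Orin Nano at the AGX Orin, it helps to confirm that the server is reachable over the LAN. The sketch below queries Ollama's `/api/tags` endpoint, which lists the models the server has pulled. The IP address is a hypothetical placeholder for your AGX Orin. Ollama binds to 127.0.0.1 by default, so this assumes the server was started with `OLLAMA_HOST` set to an address the LAN can reach (for example `0.0.0.0`). NemoClaw's own provider configuration is not shown here; see the NemoClaw-Orin repository for that.

```python
import json
import urllib.request

AGX_ORIN_HOST = "192.168.1.50"  # hypothetical LAN address of the AGX Orin
OLLAMA_PORT = 11434  # Ollama's default port


def ollama_base_url(host, port=OLLAMA_PORT):
    """Base URL of an Ollama server reachable at host:port."""
    return f"http://{host}:{port}"


def model_names(tags_response):
    """Extract model names from the JSON body of GET /api/tags."""
    return [model["name"] for model in tags_response.get("models", [])]


def list_remote_models(host=AGX_ORIN_HOST, port=OLLAMA_PORT):
    """Ask the remote Ollama server which models it can serve."""
    url = ollama_base_url(host, port) + "/api/tags"
    with urllib.request.urlopen(url, timeout=10) as reply:
        return model_names(json.load(reply))
```

If `list_remote_models()` run on the Orin Nano returns the Gemma tags pulled on the AGX Orin, the same base URL can be used as the inference endpoint for clients on the smaller machine.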
This is a practical edge AI setup: local models, private inference, and shared LLM service across Jetsons on your own network.
JetsonHacks NemoClaw-Orin GitHub: https://github.com/jetsonhacks/NemoClaw-Orin
00:00 Introduction
02:21 Google Gemma 4
03:24 Gemma 4 Benchmarks
04:25 Installing NemoClaw
07:37 Running OpenClaw
10:17 Gemma Model Response
10:45 Network Inference Provider
12:29 Outro
As an Amazon Associate I earn from qualifying purchases.
Visit the JetsonHacks storefront on Amazon: https://www.amazon.com/shop/jetsonhacks
Visit the website at https://jetsonhacks.com
Sign up for the newsletter! https://newsletter.jetsonhacks.com
GitHub accounts: https://github.com/jetsonhacks
https://github.com/jetsonhacksnano
Twitter: http://twitter.com/jetsonhacks
Some of the links here are affiliate links. As an Amazon Associate I earn from qualifying purchases at no extra cost to you.