How I add Live Private Web Search to Gemma 4
JetsonHacks 13:30
4,659 views · 178 likes
Join this channel to get access to perks:
https://www.youtube.com/channel/UCQs0lwV6E4p7LQaGJ6fgy5Q/join
Stop AI Guessing: Private Web Search for Local LLMs:
Give your local Gemma 4 model live, private web search using the Model Context Protocol (MCP) and SearXNG. Learn how to work around LLM training cutoffs and build an agent-facing search layer for real-time data.
Demonstration on NVIDIA Jetson Orin Nano:
https://amzn.to/4uNHQU1
Code and project files:
https://github.com/jetsonhacks/searxng-search
We use SearXNG as a private, self-hosted metasearch broker, then turn web search into a tool that a local AI agent can use. Along the way, we look at how agents, tools, MCP, and skills fit together in a practical local AI workflow.
What you'll learn:
* Self-Hosted Private Search: How to set up SearXNG in Docker as a private search broker.
* MCP Server Integration: Building and launching an MCP (Model Context Protocol) server to bridge your local LLM and the web.
* Gemma 4 Tool Calling: Configuring llama.cpp to enable tool use and reasoning for agentic workflows.
* Agents vs. Skills: A deep dive into the difference between predictive model outputs and deterministic programming.
The Problem
Why local LLMs struggle with current information, and why “training cutoff” matters in real use.
SearXNG as a Search Broker
How SearXNG provides private, self-hosted metasearch without turning your Jetson into a web index.
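To make the "broker, not index" idea concrete, here is a toy sketch of what a metasearch layer does with the answers it gets back from upstream engines: it merges them and drops duplicates rather than crawling anything itself. This is an illustration only, not SearXNG's actual merging or scoring logic; the function name `merge_results` and the engine names in the usage note are hypothetical.

```python
def merge_results(per_engine):
    """Merge result lists from several engines, dropping duplicate URLs.

    per_engine maps an engine name to its ordered list of
    {"title": ..., "url": ...} dicts.
    """
    merged = {}
    for engine, hits in per_engine.items():
        for hit in hits:
            # Treat "https://x.org/" and "https://x.org" as the same page.
            url = hit["url"].rstrip("/")
            entry = merged.setdefault(
                url, {"title": hit["title"], "url": url, "engines": []}
            )
            entry["engines"].append(engine)
    # Results reported by more engines rank first; sorted() is stable,
    # so ties keep their first-seen order.
    return sorted(merged.values(), key=lambda e: -len(e["engines"]))
```

Calling `merge_results({"engine_a": [...], "engine_b": [...]})` returns one list in which each URL appears once, tagged with the engines that reported it.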
HTML vs JSON
How the same search service can return a normal web page for humans or structured JSON results for programs and agents.
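A minimal sketch of the JSON side, assuming a SearXNG instance at http://localhost:8080 (the host and port are assumptions, not taken from the video) with the `json` format enabled under `search.formats` in its settings.yml. The helper names `build_search_url` and `parse_results` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed address of the local SearXNG container; adjust to your setup.
SEARXNG_URL = "http://localhost:8080"


def build_search_url(query, fmt="json", base_url=SEARXNG_URL):
    """Return the /search URL for a query; fmt="html" gives the human-facing page."""
    return f"{base_url}/search?{urlencode({'q': query, 'format': fmt})}"


def parse_results(payload, limit=5):
    """Reduce a SearXNG JSON response to title, url, and snippet per hit."""
    data = json.loads(payload)
    return [
        {
            "title": hit.get("title", ""),
            "url": hit.get("url", ""),
            "snippet": hit.get("content", ""),
        }
        for hit in data.get("results", [])[:limit]
    ]
```

A program would fetch the URL (for example with `urllib.request.urlopen`) and pass the response body to `parse_results`. Trimming each hit to three short fields keeps the text handed to a small local model compact.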
Agents, Tools, MCP, and Skills
How a local agent can expose search as a tool, use MCP as the connection layer, and rely on skill instructions to decide when and how search should be used.
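The split between what the model predicts and what ordinary code executes can be sketched as follows. This example leaves MCP out and uses the OpenAI-style function-calling format only to show the shape of a tool description and a dispatcher. The tool name `web_search`, its parameters, and the placeholder search function are assumptions for illustration, not the project's actual code.

```python
import json

# Tool description in the OpenAI-style function-calling format.
# The name "web_search" and its parameters are illustrative assumptions.
WEB_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms"},
            },
            "required": ["query"],
        },
    },
}


def web_search(query):
    """Placeholder: a real tool would query SearXNG and return its results."""
    return [{"title": f"Result for {query}", "url": "https://example.com"}]


# Deterministic code: maps a tool name chosen by the model to a function.
TOOLS = {"web_search": web_search}


def dispatch(tool_call):
    """Run the function named in a model tool call and return a JSON string."""
    fn = tool_call["function"]
    name = fn["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    return json.dumps(TOOLS[name](**args))
```

The model only decides whether to call the tool and with which arguments. Everything inside `dispatch` is conventional programming and behaves the same way every time.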
The Demo
A local model running on an NVIDIA Jetson Orin Nano uses a search tool to look up current information instead of guessing from its training data.
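One turn of such a demo can be sketched like this, assuming the model is served through an OpenAI-compatible chat endpoint (llama.cpp's server provides one) where an assistant message may carry `tool_calls`. The function `apply_tool_calls` and the `run_tool` callback are hypothetical names, and the video's own agent may structure this differently.

```python
import json


def apply_tool_calls(messages, assistant_message, run_tool):
    """Append the assistant turn and one tool-result message per tool call.

    run_tool(name, arguments_dict) -> str is supplied by the caller.
    Returns True if any tool was called, meaning the model should be
    queried again with the extended message list.
    """
    messages.append(assistant_message)
    calls = assistant_message.get("tool_calls") or []
    for call in calls:
        fn = call["function"]
        # The arguments field is a JSON-encoded string, not a dict.
        result = run_tool(fn["name"], json.loads(fn["arguments"]))
        messages.append(
            {"role": "tool", "tool_call_id": call["id"], "content": result}
        )
    return bool(calls)
```

The caller repeats the request while this returns True. Once the model answers without a tool call, its reply is grounded in the search results that were appended to the conversation.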
Key Technologies
SearXNG — private, self-hosted metasearch
MCP — Model Context Protocol for exposing and calling tools
llama.cpp — local model serving
Gemma 4 — local LLM used in the demo
NVIDIA Jetson Orin Nano — edge AI hardware for the local stack
Resources
Code, scripts, and the MCP search tool used in this video:
https://github.com/jetsonhacks/searxng-search
00:00 Intro
00:22 Install SearXNG
02:05 What is Metasearch?
03:09 Python tool call
03:44 MCP Server Demo
06:13 Agents and Skills
06:43 Agents, Tools and Skills explained
12:24 Outro
As an Amazon Associate I earn from qualifying purchases.
Visit the JetsonHacks storefront on Amazon: https://www.amazon.com/shop/jetsonhacks
Visit the website at https://jetsonhacks.com
Sign up for the newsletter! https://newsletter.jetsonhacks.com
GitHub accounts: https://github.com/jetsonhacks
https://github.com/jetsonhacksnano
Twitter: http://twitter.com/jetsonhacks
Some of the links here are affiliate links. As an Amazon Associate I earn from qualifying purchases at no extra cost to you.