🐾 PETBLIP WALL
Phase 1 – Functional AI Deployment Summary
Status: Live + GPU Accelerated + Session-Aware
🧠 Core Brain
AI Server (ai1)
- Intel i7 system
- 64GB RAM
- RTX 3090 Ti (24GB VRAM)
- CUDA 12.2 active
- Ollama running locally
- Model: qwen2.5:32b (~20GB VRAM utilized during inference)
- Fully GPU accelerated
Result:
Retail-grade 32B local inference with strong multi-turn reasoning.
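Since Ollama exposes its standard local HTTP API, the wall server can reach the model with a plain JSON request. A minimal sketch of the payload builder, assuming the documented `/api/chat` endpoint; the helper name and options here are illustrative, not the production code:

```javascript
// Sketch: build the JSON payload for Ollama's /api/chat endpoint.
// The endpoint and payload shape follow Ollama's documented API;
// the helper name and non-streaming choice are assumptions.
function buildOllamaChatRequest(question, history = []) {
  return {
    model: "qwen2.5:32b",
    stream: false,                 // wall waits for the full answer
    messages: [
      ...history,                  // prior turns: { role, content }
      { role: "user", content: question },
    ],
  };
}

// The wall server would POST this to the AI box on the internal VLAN, e.g.:
// fetch("http://10.0.1.x:11434/api/chat", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildOllamaChatRequest(question, history)),
// });
```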
🔐 Network Architecture
- AI server isolated on internal VLAN (10.0.1.x)
- Wall server (ser2) communicates over LAN only
- AI machine NOT exposed directly to public internet
- Token-based Authorization header required
- Static IP fiber available for DNS when needed
Result:
Secure internal AI architecture with controlled access.
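The token check on the wall side can be a small pure function, which keeps it easy to test. A sketch assuming a `Bearer` scheme; the source only states that a token-based Authorization header is required, so the scheme and variable names are assumptions:

```javascript
// Sketch: validate the token-based Authorization header before forwarding
// a request to the AI server. "Bearer" scheme and names are assumptions.
const AI_API_TOKEN = process.env.AI_API_TOKEN || "change-me";

function isAuthorized(authHeader) {
  // Expect exactly "Bearer <token>"; reject anything else.
  if (typeof authHeader !== "string") return false;
  const [scheme, token] = authHeader.split(" ");
  return scheme === "Bearer" && token === AI_API_TOKEN;
}
```

Dropped into Express, this becomes a one-line middleware guard in front of the AI proxy route.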
🖥 PetBlip Wall Server (ser2)
- Node.js (v20)
- Express
- Socket.io (real-time WebSocket communication)
- MySQL2 (async MariaDB logging)
- Port 3000 wall interface
Live Capabilities:
- Customer submits question
- Wall sends request to AI server
- AI response returned in real-time
- Response displayed immediately
- Interaction logged to database
🗂 Database Layer (MariaDB 11.4)
Database: blip_analytics
Table: wall_conversations
Stored fields:
- id
- session_id
- question
- response
- source
- created_at
Result:
Every store interaction captured for later insight.
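For reference, a schema sketch matching the field list above; the column types and sizes are assumptions inferred from the field names, not the actual DDL:

```sql
-- Sketch of blip_analytics.wall_conversations; types are assumptions.
CREATE TABLE wall_conversations (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  session_id  VARCHAR(64) NOT NULL,   -- Socket.io socket.id
  question    TEXT        NOT NULL,
  response    TEXT        NOT NULL,
  source      VARCHAR(32) NOT NULL,   -- e.g. 'wall'
  created_at  TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP
);
```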
🧠 Session Memory (Phase 1 Complete)
- Memory tied to socket.id
- Last 3 customer messages injected into prompt
- No cross-customer contamination
- No identity tracking
- Stateless after refresh
Result:
Multi-turn contextual continuity without privacy complexity.
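The session-memory rules above amount to a small per-socket ring buffer. A sketch, assuming memory keyed by `socket.id` and cleared on disconnect (names are illustrative):

```javascript
// Sketch: rolling memory of the last 3 customer messages per socket.
// Keyed by socket.id, cleared on disconnect, so nothing survives a
// refresh and sessions never mix.
const MEMORY_TURNS = 3;
const sessionMemory = new Map(); // socket.id -> [recent messages]

function remember(socketId, message) {
  const history = sessionMemory.get(socketId) || [];
  history.push(message);
  // Keep only the most recent turns; older ones fall off.
  sessionMemory.set(socketId, history.slice(-MEMORY_TURNS));
}

function recall(socketId) {
  // Injected into the prompt ahead of the new question.
  return sessionMemory.get(socketId) || [];
}

function forget(socketId) {
  // Called from the Socket.io "disconnect" handler.
  sessionMemory.delete(socketId);
}
```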
🎭 Blip Persona Layer
Blip operates as:
- In-store AI assistant
- Skilled tradesman tone
- Practical guidance
- Short structured responses
- Vet suggestion when appropriate
- Subtle in-store product mention
- Ends with follow-up question
With the 32B model, the persona holds up across multi-turn context.
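The persona rules above would typically live in a system prompt prepended to every request. A sketch of what that prompt might look like; the exact wording is illustrative, not the production prompt:

```javascript
// Sketch: a system prompt encoding the Blip persona rules.
// Wording is illustrative only.
const BLIP_SYSTEM_PROMPT = `
You are Blip, the in-store AI assistant at PetBlip.
Speak like a skilled tradesman: plain, practical, and confident.
Keep answers short and structured.
Suggest seeing a vet when a question sounds medical.
Mention a relevant in-store product subtly, at most once.
Always end with one short follow-up question.
`.trim();
```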
⚡ Performance
- 32B model loads fully into VRAM (~20GB usage)
- Responses fast (GPU confirmed active)
- Wall latency acceptable for retail interaction
- No CPU fallback detected
🧱 What Is Working Right Now
You have:
- Functional in-store AI
- GPU-powered reasoning
- Secure local inference
- Session-aware multi-turn conversation
- Persistent logging
- VLAN-isolated architecture
- Real-time wall interaction
- Production-capable hardware stack
This is no longer a prototype.
🚀 What This System Is Now Capable Of
Without adding anything new:
- Handle real customer Q&A
- Maintain conversational continuity
- Collect store question data
- Refine persona over time
- Expand to other LocalAd properties
- Be called from forum, FixItUs, VolusiaMarket
- Serve as central AI brain for ecosystem
🔒 What You Intentionally Did NOT Add
- No customer identity tracking
- No RAG complexity
- No vector DB
- No external cloud dependency
- No direct public AI exposure
- No over-engineering
Safe.
Contained.
Focused.
📍 Current State Classification
Infrastructure Tier: Advanced Local AI Deployment
Retail Integration Tier: Early Production
Hardware Tier: High-end Prosumer AI Node
Security Tier: Properly Isolated LAN Deployment