AI Surveillance
Execution Plan V4.1

U.S. Convenience Store
AI Surveillance System

An integrated hardware solution that combines edge AI computing, NVR recording with a built-in HDD, and PoE power supply. It runs five core algorithms (shoplifting detection, people counting, VIP recognition, hand-raise detection, and firearm detection) to provide end-to-end loss prevention, security, and operational support for U.S. convenience stores. Seeed Studio is the primary hardware choice, and the plan includes a comparison of Jetson platforms and SDK ecosystems.

100 TOPS: Edge AI Computing
4+ Ch: PoE Cameras
5 Types: Core Algorithms
99.3%: Detection Accuracy
SYSTEM ARCHITECTURE

Solution Overview

The system adopts a three-tier "Cloud-Edge-Device" collaborative architecture. The Seeed Studio reServer is the primary hardware, with built-in NVR, 4-channel PoE, and 100 TOPS of AI computing. Brands that cannot be imported into the U.S. have been excluded, and the SDK ecosystems of the remaining manufacturers are compared.

Perception Layer

4K PoE IP Cameras

Deploy 4 HD cameras, powered directly via built-in PoE ports without external switches.

Edge Computing Layer

AI NVR All-in-One

A Jetson Orin NX with 100 TOPS of computing runs all 5 algorithms (including firearm detection), stores video on the built-in HDD, and provides NVR functions through NVIDIA VST with Web UI and remote access.

Cloud Service Layer

API & Data Platform

Aggregate analytics data, manage devices, distribute alerts, provide RESTful API interfaces.

Application Layer

Mobile APP

Real-time alert push, video playback, traffic dashboard, VIP notifications, firearm emergency alerts.

HARDWARE COMPARISON

Hardware Comparison

This section compares 6 Jetson-based edge AI hardware solutions across several dimensions, including built-in HDD, NVR system, PoE, and SDK ecosystem. Brands that cannot be imported into the U.S. (Hikvision, Dahua, etc.) have been excluded. All products support NVIDIA JetPack and are FCC/CE certified.

Recommended

Seeed Studio reServer J4012

AI-enabled NVR Server — 4-ch PoE + NVR + 100 TOPS AI

$1,399 buys 100 TOPS, 4-channel PoE, and dual HDD bays. NVIDIA VST provides the NVR with Web UI and remote access. This is the best cost-performance ratio among the options compared and suits a 4-camera convenience store.

Computing Core

NVIDIA Jetson Orin NX 16GB

8-core ARM Cortex-A78AE, 100 TOPS AI

Network Interfaces

4x PoE (802.3af) + 1x GbE

Direct connect 4 IP cameras, perfect for convenience stores

Storage Expansion

2x 2.5" SATA + M.2 NVMe

Dual HDD bays + high-speed SSD, 30+ days recording

Operating Temp

-20°C ~ 60°C Fanless

Passive cooling, zero noise for store environment

Full Hardware Comparison (6 Products)

| Product | Chip | AI Computing | PoE Ports | Built-in Storage | NVR System / Web UI | Remote Access | Price |
|---|---|---|---|---|---|---|---|
| Seeed Studio reServer J4012 (Recommended) | Jetson Orin NX 16GB | 100 TOPS | 4ch | 2x 2.5" SATA + M.2 NVMe | VST Web UI (browser-based) | STUN/TURN or Tailscale VPN | ~$1,399 |
| Seeed Studio reServer J4011 | Jetson Orin NX 8GB | 70 TOPS | 4ch | 2x 2.5" SATA + M.2 NVMe | VST Web UI | STUN/TURN or Tailscale | ~$1,099 |
| Advantech MIC-717-OX | Jetson Orin NX 16GB | 100 TOPS | 8ch | 1x 3.5" SATA (max 8TB) | Metropolis Web UI (pre-installed) | Requires manual config | ~$2,000 |
| AAEON BOXER-8658AI | Jetson Orin NX 8/16GB | 70-100 TOPS | 8ch | 1x 2.5" SATA + M.2 NVMe | N/A | N/A | ~$1,800-2,500 |
| EverFocus eNVP-JNX-IV | Jetson Xavier NX | 21 TOPS (INT8) | 8ch | 1x 2.5" SATA SSD | EF-Viewer (built-in) | P2P remote access (built-in) | ~$1,500-2,500 |
| Lanner EAI-I134 | Jetson Orin NX 16GB | 100-157 TOPS | 2ch | M.2 NVMe (128GB) | N/A | N/A | RFQ |

Built-in NVR Solution Details

The Seeed reServer J4012 provides turnkey NVR functionality through NVIDIA VST (Video Storage Toolkit). The system is deployed via Docker Compose and provides a complete Web UI, REST API, and remote access.

Built-in HDD Storage

2x 2.5" SATA bays support up to 2x 4TB HDD/SSD. Recommended: 1x 4TB surveillance HDD for ~30 days continuous recording of 4 cameras.
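The ~30-day figure can be sanity-checked with simple arithmetic. The sketch below assumes continuous recording, decimal terabytes, and an average per-camera bitrate; the plan does not state a bitrate, so the values in the loop are hypothetical.

```python
def retention_days(capacity_tb: float, cameras: int, mbps_per_camera: float) -> float:
    """Days of continuous recording that fit on a disk.

    capacity_tb uses decimal terabytes (1 TB = 1e12 bytes).
    mbps_per_camera is the average encoded bitrate in megabits per second.
    """
    capacity_bits = capacity_tb * 1e12 * 8
    bits_per_day = cameras * mbps_per_camera * 1e6 * 86_400
    return capacity_bits / bits_per_day


def max_bitrate_mbps(capacity_tb: float, cameras: int, days: float) -> float:
    """Highest average per-camera bitrate that still reaches the target retention."""
    return capacity_tb * 1e12 * 8 / (cameras * days * 86_400 * 1e6)


if __name__ == "__main__":
    # 1x 4 TB disk, 4 cameras, 30-day target (figures from the plan).
    print(f"{max_bitrate_mbps(4, 4, 30):.2f} Mbps per camera")
    # Hypothetical bitrates, for comparison only.
    for mbps in (2.0, 4.0, 8.0):
        print(f"{mbps} Mbps -> {retention_days(4, 4, mbps):.1f} days")
```

Under these assumptions, 30 days on a single 4TB disk corresponds to roughly 3 Mbps average per camera. If the cameras stream at a higher average bitrate, the second drive bay or event-triggered recording would be needed to reach the same retention.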

NVIDIA VST Web UI

Browser-based management interface: camera auto-discovery, live view, playback, video wall, event-triggered recording, storage policy configuration.

Remote Access

Option A: Configure STUN/TURN server for WebRTC remote streaming. Option B: Deploy Tailscale/ZeroTier for zero-config P2P VPN tunnel. Both support mobile APP access.

REST API & SDK

VST provides complete RESTful API: camera CRUD, stream control, recording management, event subscription via Redis. DeepStream SDK for AI pipeline integration.

One-Click Deployment

A pre-built system image with Docker Compose: on power-up, VST, DeepStream, and the AI algorithms start automatically, giving a turnkey experience.

Selection Recommendation (Convenience Store Scenario)

Primary Choice

Seeed reServer J4012

Best cost-performance ratio for 4-camera convenience stores: $1,399 for 100 TOPS, 4-channel PoE, and dual HDD bays, with NVIDIA VST providing the NVR, Web UI, and remote access.

Alternative (Turnkey)

Advantech MIC-717-OX

$2,000 with Metropolis NVR pre-installed and usable out of the box, 8-channel PoE, and a 3.5" HDD bay (up to 8TB). Ideal for teams that do not want to configure NVR software.

Turnkey NVR

EverFocus eNVP-JNX-IV

A true turnkey NVR with built-in Web UI and P2P remote access. However, the Xavier NX offers only 21 TOPS and the SDK is closed-source. Suitable for pure NVR needs without heavy AI workloads.

Budget Option

Seeed reServer J4011

$1,099 entry-level option with 70 TOPS and 8GB memory. Suitable for 3 or fewer cameras and a limited algorithm load.

CAMERA PLACEMENT

Camera Placement Plan

Interactive floor plan showing recommended camera positions for typical convenience store layouts. Select store size to view optimal placement for entrance monitoring, cashier coverage, and shelf surveillance.

Floor Plan (10m × 10m = 100㎡): Entrance Zone, Cashier Zone, Shelf Area, and Storage / Office, with camera positions 1-4 marked.
Deployment Summary: 4 cameras, 100㎡ store area
Camera List
1. Entrance Camera: 360° Fisheye · Counting · FOV 100°
2. Cashier Camera: Dome · AI Analytics · FOV 70°
3. Shelf Camera #1: Dome · AI Analytics · FOV 80°
4. Shelf Camera #2: Dome · AI Analytics · FOV 80°
Installation Tips
  • Mount entrance camera at 2.8-3m height for optimal counting accuracy with fisheye lens.
  • Cashier camera should cover the entire counter area and customer queue zone.
  • Shelf cameras should be positioned to minimize blind spots between aisles.
  • All cameras connect to reServer via built-in PoE — no external switch needed.
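As a rough check on mounting height, the floor area seen by a ceiling camera can be estimated from its height and field of view. This is a minimal sketch that assumes the camera points straight down and follows simple pinhole geometry; fisheye distortion and the height of the people being counted are ignored, so real coverage will differ.

```python
import math


def floor_coverage_diameter(mount_height_m: float, fov_deg: float) -> float:
    """Diameter of the floor circle seen by a camera pointing straight down.

    Pinhole approximation: diameter = 2 * h * tan(FOV / 2).
    """
    if not 0 < fov_deg < 180:
        raise ValueError("fov_deg must be between 0 and 180 degrees")
    return 2 * mount_height_m * math.tan(math.radians(fov_deg) / 2)


if __name__ == "__main__":
    # Mounting heights and FOV taken from the entrance camera entry above.
    for height in (2.8, 3.0):
        width = floor_coverage_diameter(height, 100)
        print(f"height {height} m, FOV 100 deg -> about {width:.1f} m across")
```

With these inputs the estimate is roughly 7 m across at floor level, which is in the same range as the 10 m store width in the example floor plan.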
AI ALGORITHMS

Core Algorithm Implementation

Five AI algorithms cover loss prevention, security, operations, and customer experience. Pre-trained models and rule engines are prioritized to minimize data annotation requirements.

AI Vision Analysis

Real-time pose estimation + Object detection + Face feature matching — Multi-model parallel inference pipeline

Implementation Flow

1. Pose Extraction: YOLOv8-pose extracts 17 skeleton keypoints per person (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles) in real time.
2. Skeleton Sequence: BoT-SORT tracks each person ID and maintains a T-frame skeleton sequence (T = 30 frames, about 1 second).
3. Concealment Detection: Calculate the rate of change of the wrist→hip distance to detect a hand moving rapidly from the shelf area toward the torso or pocket area.
4. Bending Detection: Monitor the angle of the shoulder-hip line; a torso tilt angle < 45° lasting > 2 seconds triggers a suspicious flag.
5. Lookout Detection: Track the horizontal displacement of the nose keypoint; more than 3 left-right turns within 5 seconds counts as lookout behavior.
6. Loitering Detection: The same Track ID stays in a specific area (shelf or high-value zone) beyond a set threshold.
7. Multi-metric Scoring: A weighted combination of the above metrics generates a risk score (0-100); an alert is triggered when the score exceeds the threshold.
8. Real-time Push: High-risk events are pushed via MQTT to an APP notification with video clip playback.
Models Used: YOLOv8-pose (pose estimation), BoT-SORT (multi-object tracking), Rule Engine / LSTM Classifier
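The geometric rules in the flow above can be sketched in plain Python. This is an illustration, not the project's implementation: it assumes COCO keypoint order with (x, y, confidence) per keypoint, reads "torso tilt angle" as the angle between the shoulder-hip line and the horizontal (90° when upright), and normalizes wrist-hip distance by torso length. The 15-pixel shift threshold is hypothetical.

```python
import math
from typing import Sequence, Tuple

Keypoint = Tuple[float, float, float]   # assumed layout: (x, y, confidence)
Skeleton = Sequence[Keypoint]           # 17 keypoints in COCO order

# COCO keypoint indices used below.
NOSE, L_SHOULDER, R_SHOULDER = 0, 5, 6
L_WRIST, R_WRIST, L_HIP, R_HIP = 9, 10, 11, 12


def _mid(a: Keypoint, b: Keypoint) -> Tuple[float, float]:
    """Midpoint of two keypoints (x, y only)."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def torso_angle_deg(s: Skeleton) -> float:
    """Angle between the shoulder-hip line and the horizontal (90 = upright)."""
    sx, sy = _mid(s[L_SHOULDER], s[R_SHOULDER])
    hx, hy = _mid(s[L_HIP], s[R_HIP])
    return math.degrees(math.atan2(abs(hy - sy), abs(hx - sx)))


def is_bending(seq: Sequence[Skeleton], fps: float = 30.0,
               max_angle: float = 45.0, min_seconds: float = 2.0) -> bool:
    """True if the torso stays below max_angle for longer than min_seconds."""
    best = run = 0
    for s in seq:
        run = run + 1 if torso_angle_deg(s) < max_angle else 0
        best = max(best, run)
    return best / fps > min_seconds


def wrist_hip_closing_rate(seq: Sequence[Skeleton], fps: float = 30.0) -> float:
    """Rate at which the nearer wrist approaches its hip, in torso lengths/second.

    Positive values mean the hand is moving toward the hip/pocket area.
    """
    def norm_dist(s: Skeleton) -> float:
        torso = math.dist(_mid(s[L_SHOULDER], s[R_SHOULDER]),
                          _mid(s[L_HIP], s[R_HIP]))
        d = min(math.dist(s[L_WRIST][:2], s[L_HIP][:2]),
                math.dist(s[R_WRIST][:2], s[R_HIP][:2]))
        return d / torso if torso > 0 else 0.0

    if len(seq) < 2:
        return 0.0
    duration = (len(seq) - 1) / fps
    return (norm_dist(seq[0]) - norm_dist(seq[-1])) / duration


def count_head_turns(seq: Sequence[Skeleton], min_shift_px: float = 15.0) -> int:
    """Count left-right reversals of the nose keypoint's horizontal motion."""
    if not seq:
        return 0
    turns, direction, anchor = 0, 0, seq[0][NOSE][0]
    for s in seq[1:]:
        shift = s[NOSE][0] - anchor
        if abs(shift) < min_shift_px:
            continue  # ignore jitter below the threshold
        new_dir = 1 if shift > 0 else -1
        if direction and new_dir != direction:
            turns += 1
        direction, anchor = new_dir, s[NOSE][0]
    return turns
```

Each function takes the per-track skeleton sequence maintained in step 2. A caller would compare `count_head_turns` over a 5-second window against the "more than 3" rule and feed the resulting flags into the scoring step.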

Data Annotation

Annotation Tool

CVAT (Open Source) / Roboflow

Annotation Type
  • Phase 1: No annotation — use skeleton geometry rule engine
  • Phase 2: Annotate behavior clips (normal/suspicious/theft)
  • Each clip: 30-frame skeleton sequence + behavior label
Dataset Scale

Phase 1: 0 annotation | Phase 2: 500-1000 clips

Data Source

PoseLift dataset + UCF-Crime + in-store collection

Min. Annotation

Phase 1 requires zero annotation; Phase 2 needs only 500 video clips, each labeled by 3-person voting.

Technical Challenges & Solutions

Rule Engine vs Deep Learning

Phase 1 uses skeleton geometry rules (no annotation needed) for a quick launch; Phase 2 collects data to train a 2-layer LSTM classifier for improved accuracy.

False Positive Control

Multi-level thresholds: low risk is logged only, medium risk sends an APP alert, and high risk triggers a real-time popup and marks the recording.
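A minimal sketch of the weighted scoring and tiered response follows. The weights, the two cut-off values, and the action names are hypothetical; the plan specifies only a 0-100 score and three risk levels.

```python
from typing import Dict

# Hypothetical weights and tier cut-offs; the plan does not specify values.
WEIGHTS: Dict[str, float] = {
    "concealment": 0.40,
    "bending": 0.20,
    "lookout": 0.20,
    "loitering": 0.20,
}
MEDIUM_THRESHOLD = 40.0
HIGH_THRESHOLD = 70.0


def risk_score(metrics: Dict[str, float]) -> float:
    """Weighted sum of per-rule scores (each clamped to 0..1), scaled to 0..100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(metrics.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return 100.0 * total / sum(WEIGHTS.values())


def action_for(score: float) -> str:
    """Map a risk score to one of the three response tiers."""
    if score >= HIGH_THRESHOLD:
        return "popup_and_mark_recording"
    if score >= MEDIUM_THRESHOLD:
        return "app_alert"
    return "log_only"


if __name__ == "__main__":
    score = risk_score({"concealment": 1.0, "lookout": 0.5})
    print(round(score, 1), action_for(score))
```

Keeping the thresholds as named constants makes them easy to expose as per-store settings during the threshold tuning planned for the testing phase.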

Privacy Protection

Only skeleton coordinates (17 keypoints × 3 values) are transmitted for behavior analysis, not original video frames, to support CCPA/BIPA compliance.
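To illustrate how small a skeleton-only message is, the sketch below serializes one tracked person as 51 numbers plus identifiers. The field names and the (x, y, confidence) layout are assumptions made for illustration; the plan does not define a message schema.

```python
import json
import time
from typing import Optional, Sequence, Tuple

Keypoint = Tuple[float, float, float]  # assumed layout: (x, y, confidence)


def skeleton_payload(store_id: str, camera_id: str, track_id: int,
                     keypoints: Sequence[Keypoint],
                     ts: Optional[float] = None) -> str:
    """Build a compact JSON message containing keypoints only, no pixels."""
    if len(keypoints) != 17 or any(len(k) != 3 for k in keypoints):
        raise ValueError("expected 17 keypoints with 3 values each")
    # Flatten to 17 x 3 = 51 numbers, rounded to keep the message small.
    flat = [round(float(v), 2) for kp in keypoints for v in kp]
    message = {
        "store": store_id,      # hypothetical field names
        "camera": camera_id,
        "track": track_id,
        "ts": time.time() if ts is None else ts,
        "kps": flat,
    }
    return json.dumps(message, separators=(",", ":"))


if __name__ == "__main__":
    demo = [(320.0, 240.0, 0.9)] * 17
    payload = skeleton_payload("store-001", "cam-2", 7, demo, ts=0.0)
    print(len(payload), "bytes")
```

The resulting string could be published as an MQTT message body; the track ID is a per-session tracker index rather than an identity.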

Data Annotation Quality Standards & Reduction Strategies

Annotation Quality Standards
| Annotation Type | Quality Metric | Consistency Requirement | Review Rate |
|---|---|---|---|
| Object BBox | IoU ≥ 0.75 | Different annotators IoU ≥ 0.8 | 20% |
| Body Keypoints | [email protected] ≥ 0.9 | Keypoint deviation < 5px | 30% |
| Behavior Labels | N/A | 3-person voting consensus | 100% |
| Weapon BBox | IoU ≥ 0.8 | Different annotators IoU ≥ 0.85 | 50% |
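The IoU thresholds in the table can be checked with a few lines of code. This sketch assumes boxes are given as (x_min, y_min, x_max, y_max) in pixels; the function names are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def annotators_agree(a: Box, b: Box, min_iou: float = 0.8) -> bool:
    """Consistency check between two annotators' boxes for the same object."""
    return iou(a, b) >= min_iou


if __name__ == "__main__":
    first, second = (0, 0, 10, 10), (0, 0, 10, 8.2)
    print(annotators_agree(first, second, 0.8))    # object box requirement
    print(annotators_agree(first, second, 0.85))   # weapon box requirement
```

The same pair of boxes can pass the object threshold and fail the stricter weapon threshold, which is the intent of the higher consistency requirement for weapon annotations.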
Six Strategies to Reduce Annotation
01. Use Pre-trained Models Directly: Person detection, pose estimation, and face recognition can use COCO/MS1MV3 pre-trained model weights directly.
02. Public Dataset Transfer: UCF-Crime, PoseLift, deepcam-cn, and other public datasets reduce the need to annotate from scratch.
03. Rule Engine First: Shoplifting and hand-raise detection rely first on skeleton geometry rules, which need zero annotation.
04. Active Learning: The model prioritizes uncertain samples for annotation, reducing total annotation volume by 30-50%.
05. Pseudo-Labels + Human Review: A pre-trained model generates pseudo-labels, and human reviewers only correct the errors.
06. Synthetic Data Augmentation: Random crop, rotation, and color jitter expand the data, reducing real-world collection needs.
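The active learning strategy (04) is often implemented as uncertainty sampling. The sketch below ranks clips by the entropy of the model's class probabilities and returns the most uncertain ones for annotation. Entropy is one common uncertainty measure and is used here as an assumption; the three classes follow the normal/suspicious/theft labels mentioned earlier, and the clip IDs are made up.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions: Dict[str, Sequence[float]],
                          budget: int) -> List[str]:
    """Return the clip IDs with the most uncertain predictions first."""
    ranked = sorted(predictions,
                    key=lambda clip_id: entropy(predictions[clip_id]),
                    reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Class order: normal, suspicious, theft (hypothetical model outputs).
    preds = {
        "clip_a": [0.98, 0.01, 0.01],
        "clip_b": [0.34, 0.33, 0.33],
        "clip_c": [0.60, 0.30, 0.10],
    }
    print(select_for_annotation(preds, budget=2))
```

Confidently classified clips such as `clip_a` are skipped, so annotators spend their time on the clips the model finds hardest.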

SOFTWARE ARCHITECTURE

Software Architecture

A cloud-edge separated architecture: a Flutter cross-platform APP, FastAPI cloud services, and a DeepStream edge inference engine.

Mobile APP

Flutter
Real-time Alert Push
Shoplifting, VIP, hand-raise, and firearm events are pushed by priority, with firearm events as the highest-priority emergency alert.
Video Playback
View real-time footage and historical alert-related video clips over WebRTC/HLS.
Data Dashboard
Visualize traffic, peak-hour analysis, and anomaly event statistics.
Device Management
Remotely configure camera parameters, algorithm thresholds, and alert settings.
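Priority-based push can be sketched with a heap-backed queue. Only "firearm first" comes from the plan; the relative order of the other event types and the class name `AlertQueue` are assumptions made for illustration.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

# Lower number = higher priority. Firearm first per the plan; the order of
# the remaining event types is an assumption.
PRIORITY: Dict[str, int] = {
    "firearm": 0,
    "shoplifting": 1,
    "hand_raise": 2,
    "vip": 3,
}


class AlertQueue:
    """Delivers alerts by priority, then in arrival order within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, dict]] = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, event_type: str, payload: dict) -> None:
        # Unknown event types are placed after all known ones.
        prio = PRIORITY.get(event_type, len(PRIORITY))
        heapq.heappush(self._heap, (prio, next(self._seq), event_type, payload))

    def pop(self) -> Tuple[str, dict]:
        _, _, event_type, payload = heapq.heappop(self._heap)
        return event_type, payload

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    queue = AlertQueue()
    queue.push("vip", {"camera": 1})
    queue.push("shoplifting", {"camera": 3})
    queue.push("firearm", {"camera": 2})
    while queue:
        print(queue.pop())
```

The sequence counter guarantees that two tuples never tie on priority alone, so the payload dictionaries are never compared.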

Edge Software Stack

reServer J4012
JetPack 6.0: NVIDIA Jetson OS
DeepStream SDK: Video analytics pipeline engine
TensorRT: Model inference acceleration
MQTT Broker: Lightweight message delivery protocol
NVIDIA VST: Built-in NVR with Web UI, REST API, remote access
DeepStream Pipeline: RTSP Input → HW Decode → Preprocess → TensorRT Infer → Post-process → Metadata Output → MQTT Send

Cloud Backend Services

FastAPI / NestJS: RESTful API Gateway
PostgreSQL + Redis: Structured data + high-speed cache
EMQX MQTT: Edge message relay middleware
Firebase FCM: Mobile push notification service
COST ANALYSIS

Cost Analysis

Total per-store deployment cost is approximately $4,119, with payback estimated at 6-9 months through reduced losses.

Per-Store Cost Breakdown

Total: $4,119. Bar chart (axis $0 to $1,400) of line items: Seeed reServer J4012, 4x 4K PoE Cameras, Firearm Model Fine-tune, 4TB Surveillance HDD, Cables & Accessories, Software License (Annual), 200 Supplementary Annotations, Installation & Commissioning.

Cost Composition

Hardware 61%, Software 27%, Service 12%

ROI Forecast (Monthly Loss vs Savings)

Line chart of Monthly Loss vs Monthly Savings from M1 to M12 (axis $0 to $6,000).
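The headline figures can be turned into dollar amounts and a simple payback estimate. This sketch uses the $4,119 total and the 61/27/12 split stated above; it assumes constant monthly savings and ignores recurring costs such as the annual software license, so it is a rough check rather than a full ROI model.

```python
from typing import Dict

TOTAL_COST = 4119.0
SHARES: Dict[str, float] = {"Hardware": 0.61, "Software": 0.27, "Service": 0.12}


def cost_by_category(total: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Dollar amount per category from percentage shares."""
    return {name: round(total * share, 2) for name, share in shares.items()}


def payback_months(total_cost: float, monthly_savings: float) -> float:
    """Months until cumulative savings equal the up-front cost."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return total_cost / monthly_savings


def required_monthly_savings(total_cost: float, months: float) -> float:
    """Monthly savings needed to pay back within the given number of months."""
    return total_cost / months


if __name__ == "__main__":
    for name, amount in cost_by_category(TOTAL_COST, SHARES).items():
        print(f"{name}: ${amount:,.2f}")
    for months in (6, 9):
        needed = required_monthly_savings(TOTAL_COST, months)
        print(f"payback in {months} months needs ${needed:,.2f}/month")
```

Under these assumptions, a 6-9 month payback corresponds to roughly $460 to $690 in reduced losses per store per month.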
IMPLEMENTATION PLAN

Implementation Plan

From requirements confirmation to go-live, full deployment is estimated at 10 weeks across four phases.

Phase 1

Requirements & Procurement

Week 1-2
  • Confirm store layout, camera placement plan
  • Procure Seeed reServer J4012 & PoE cameras
  • Procure surveillance HDD & cables
  • Confirm network environment & cloud server
Phase 2

Algorithm Dev & Model Training

Week 3-6
  • Configure skeleton rule engine (shoplifting/hand-raise, zero annotation)
  • Fine-tune firearm detection model using public datasets
  • Build VIP face feature database
  • TensorRT FP16 model optimization & quantization
Phase 3

System Integration & Deployment

Week 7-8
  • Flash pre-built system image to reServer
  • Deploy NVIDIA VST NVR + DeepStream pipeline
  • Configure remote access (Tailscale/STUN)
  • Flutter APP development & cloud API integration
Phase 4

Testing & Delivery

Week 9-10
  • Full scenario functional & stress testing
  • Algorithm threshold tuning & false positive optimization
  • User training & operation manual delivery
  • Go-live & continuous monitoring
COMPLIANCE

Compliance Requirements

Deploying AI surveillance systems in the U.S. requires strict compliance with federal and state privacy regulations, especially biometric information protection laws involving facial recognition.

CCPA

California Consumer Privacy Act

Required

Requires businesses to disclose data collection practices, grants consumers the right to delete personal data. Must clearly state video surveillance data usage in privacy policy.

BIPA

Biometric Privacy (Illinois)

High Risk

Strictly restricts biometric data such as facial recognition. Must obtain written consent before collection, prohibits selling biometric data, requires data retention and destruction policies.

Data Encryption

Transmission & Storage Security

Security Baseline

All video streams and metadata must use TLS/SSL encryption. Locally stored recordings and face feature data must be AES-256 encrypted.
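For the transport side of this baseline, a client can enforce certificate verification and a minimum TLS version with Python's standard `ssl` module. This is a minimal sketch: the TLS 1.2 floor and the optional private CA file are assumptions, and AES-256 encryption of stored recordings is not covered here.

```python
import ssl
from typing import Optional


def make_client_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """TLS context for edge-to-cloud connections (for example MQTT or HTTPS).

    ca_file is an optional path to a private CA bundle; when omitted, the
    system's default trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables hostname checking and requires
    # a valid server certificate; additionally refuse anything below TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


if __name__ == "__main__":
    context = make_client_tls_context()
    print(context.minimum_version, context.verify_mode, context.check_hostname)
```

The returned context can be passed to any client library that accepts an `ssl.SSLContext`.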

Notice Obligation

Store Signage & Transparency

Required

Must post clear video surveillance notices at store entrance. If using facial recognition, must provide external notice and opt-out mechanism.

FCC/NDAA

Hardware Import Compliance

Required

Hardware must pass FCC certification. NDAA Section 889 prohibits U.S. federal agencies from purchasing Hikvision/Dahua equipment. This solution uses Seeed Studio as the primary vendor with Advantech/AAEON as alternatives, all of which are compliant hardware.

Compliance Recommendations

We recommend engaging a U.S. privacy law attorney for a compliance review before project launch. VIP facial recognition should be designed as opt-in. Firearm detection serves as a safety alert tool only and does not replace human judgment; alerts should be confirmed by staff before action is taken. All selected hardware has passed FCC certification and is not restricted by the NDAA.