
Trust & Safety

Every file on FirstHandAPI is captured by a verified human on a real device, scored by a multi-model AI ensemble, and checked against multiple fraud detection layers before it reaches your folder. This guide explains how.

Human Verification

FirstHandAPI uses defense-in-depth to verify that every submission comes from a real human on a real device — not a bot, script, or spoofed app.

Device Integrity

Every worker device is checked for compromise before submissions are accepted:

| Check | What It Does |
| --- | --- |
| Jailbreak detection | 6-signal check: suspicious file paths, Cydia/Sileo URL schemes, sandbox integrity, dylib injection (Substrate, Frida, Cycript), DYLD_INSERT_LIBRARIES, symlink anomalies |
| App Attest (Apple) | Cryptographic proof that the request originates from an unmodified copy of the FirstHand app on the device that completed attestation. Uses Apple’s DCAppAttestService with counter-based replay prevention |
| Biometric verification | Face ID / Touch ID required at three gate points: app launch, before every file upload, and before payout withdrawal. Proves the device owner is physically present |

Identity & Onboarding

  • Workers sign in via Clerk OAuth (production) with email verification
  • New workers complete a practice onboarding flow (5 sample uploads) to calibrate baseline quality
  • Stripe Connect Express handles KYC, bank verification, and tax documentation for payouts
  • Duplicate account detection via device fingerprint and email prevents re-registration after bans

GPS Geotag Verification

Every submission captures the worker’s GPS coordinates at upload time:

| Field | Description |
| --- | --- |
| submission_latitude | Worker’s latitude at upload time (numeric, 6 decimal places) |
| submission_longitude | Worker’s longitude at upload time |
| location_verified | true if GPS was collected, false if the worker denied location permission |
  • If a job requires a specific location (e.g., NYC) and the worker submits from a different city, a location mismatch warning is logged (>50km distance)
  • Location mismatches are advisory signals — submissions are not automatically rejected
  • Workers who deny location permission can still submit, but their submissions are flagged as location_verified: false
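As a sketch, the >50 km mismatch check amounts to a great-circle distance comparison against the job's target coordinates. The haversine formula below is standard; the job-side field names (required_latitude, required_longitude) are illustrative assumptions, not the documented schema:

```python
import math

def distance_km(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance in kilometres (Earth radius ~6371 km)
    R = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def check_location(submission, job):
    # Advisory only: submissions are never auto-rejected on location
    if not submission.get("location_verified"):
        return "location_not_verified"
    d = distance_km(
        submission["submission_latitude"], submission["submission_longitude"],
        job["required_latitude"], job["required_longitude"],
    )
    return "location_mismatch" if d > 50 else None
```

Note that both outcomes are logged as signals rather than rejections, matching the advisory policy above.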

Behavioral Signals

  • Submission timing analysis: Uploads less than 8 seconds after job acceptance are flagged — legitimate workers need at least 8 seconds to open the camera, compose a shot, and upload
  • Bot score enforcement: Telemetry-based bot score above 0.7 triggers automatic rejection
  • Concurrent submission limits: Maximum 10 pending submissions per worker to prevent queue flooding
  • Velocity caps: 50 submissions per worker per day
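The four behavioral thresholds above can be combined into a single gate. This is a minimal sketch using the documented numbers (8 s, 0.7, 10 pending, 50/day); the function shape and flag names are assumptions:

```python
def behavioral_flags(seconds_since_accept, bot_score, pending_count, submissions_today):
    # Thresholds from the guide: 8 s minimum, bot score 0.7, 10 pending, 50/day
    flags = []
    if seconds_since_accept < 8:
        flags.append("too_fast")           # flagged, not auto-rejected
    if bot_score > 0.7:
        flags.append("reject:bot_score")   # automatic rejection
    if pending_count >= 10:
        flags.append("reject:queue_limit") # queue-flooding cap reached
    if submissions_today >= 50:
        flags.append("reject:velocity_cap")
    return flags
```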

Content Authenticity

Every submission is checked for authenticity before AI quality scoring — catching the most common fraud vectors first.

Server-Side Pre-Checks (Before AI Scoring)

These deterministic checks run instantly on upload, before expensive AI scoring:

| Check | Description |
| --- | --- |
| Cross-job content hash dedup | Same file cannot be submitted to multiple jobs (SHA-256 hash matching) |
| Cross-worker content hash dedup | Different workers cannot submit the same downloaded image |
| Minimum file size floor | Images: 100 KB, audio: 10 KB, video: 50 KB — phone photos are 2–8 MB; tiny files indicate downloaded thumbnails |
| EXIF freshness check | Photos with capture timestamps older than 24 hours are flagged as potential camera roll recycling |
| Audio/video duration validation | Files shorter than the job’s min_duration_seconds are rejected at upload time |
| Image dimensions persistence | Width and height are stored for downstream AI generator detection |
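The hash dedup and size-floor checks are cheap enough to run inline on every upload. A minimal sketch, assuming decimal kilobytes for the floors and an in-memory hash set standing in for the real dedup store:

```python
import hashlib

# Size floors from the guide, assuming decimal KB (100 KB = 100,000 bytes)
SIZE_FLOORS = {"image": 100_000, "audio": 10_000, "video": 50_000}

def pre_check(file_bytes, content_type, seen_hashes):
    # seen_hashes: SHA-256 digests of every prior submission, across all
    # jobs and all workers, so both dedup checks fall out of one lookup
    if len(file_bytes) < SIZE_FLOORS[content_type]:
        return "reject:file_too_small"
    digest = hashlib.sha256(file_bytes).hexdigest()
    if digest in seen_hashes:
        return "reject:duplicate_content"
    seen_hashes.add(digest)
    return "pass"
```

Because the digest covers the raw bytes, even a re-uploaded copy of a previously approved file from a different worker hits the same hash.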

AI Authenticity Detection

After pre-checks pass, the multi-model AI ensemble evaluates content authenticity using a tiered priority system:

Quick-Kill Checks (Most Common Fraud)

  1. Watermarks & attribution — © marks, photo credits, social media handles, Instagram/TikTok overlays
  2. Screen photos — moiré patterns, screen bezels, desktop UI, color banding, pixel grid
  3. Photos of printed images — paper texture, fiber patterns, print dots, creases, glossy reflections
  4. Duplicate content — same composition with minor cropping or filter changes

AI-Generated Content Detection (4 Tiers)

  • Tier 1 (Highly Reliable): Missing EXIF metadata + garbled text/signage, extra fingers, teeth anomalies, mismatched reflections
  • Tier 2 (Reliable): Shadow inconsistencies, background dissolution, unnatural symmetry, hair/fur boundary artifacts
  • Tier 3 (Moderate): Depth of field too perfect, HDR exceeds phone capability, repeating texture patterns
  • Tier 4 (Subtle): Absent lens distortion, uniform noise patterns, too-perfect composition

Decision rule: 2+ Tier 1 signals OR 3+ signals across any tiers = automatic 1-star rejection.
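The decision rule translates directly into code. Representing each detected signal as a (tier, description) pair is an assumption for illustration:

```python
def ai_generated_verdict(signals):
    # signals: list of (tier, description) pairs, tier in {1, 2, 3, 4}
    tier1_count = sum(1 for tier, _ in signals if tier == 1)
    # 2+ Tier 1 signals OR 3+ signals across any tiers => 1-star rejection
    if tier1_count >= 2 or len(signals) >= 3:
        return "reject_1_star"
    return "pass"
```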

Audio Authenticity

  • Speaker playback detection (room reverb layered on content, frequency degradation)
  • TTS/AI voice detection (robotic cadence, flat intonation, missing human markers like breathing and filler words)
  • Spliced audio detection (discontinuities in background noise, inconsistent room acoustics)

Whisper Hallucination Suppression

OpenAI Whisper is known to hallucinate plausible transcripts on silent or ambient audio — often generating text in Japanese, Chinese, or Korean about unrelated topics. FirstHandAPI detects and suppresses these hallucinations by checking three signals from Whisper’s per-segment metadata:

| Signal | Threshold | What it detects |
| --- | --- | --- |
| no_speech_prob | > 0.7 | Segment is probably silence |
| avg_logprob | < -1.0 | Whisper has low confidence in its own output |
| compression_ratio | > 2.4 | Repetitive/anomalous hallucinated text |

When any threshold is exceeded, the transcript is suppressed and replaced with an ambient audio indicator. This prevents workers from being penalized for submitting legitimate ambient recordings that Whisper misinterprets.

  • Language mismatch flagging: transcripts that come back in a language inconsistent with the recording are flagged as a likely hallucination
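The three thresholds above apply per segment, using fields Whisper returns in its per-segment metadata. A minimal sketch; the placeholder string is illustrative, not the actual ambient-audio indicator:

```python
AMBIENT_PLACEHOLDER = "[ambient audio - no speech detected]"  # illustrative text

def filter_segments(segments):
    # segments: Whisper per-segment dicts with no_speech_prob,
    # avg_logprob, compression_ratio, and text fields
    kept = []
    for seg in segments:
        if (seg["no_speech_prob"] > 0.7
                or seg["avg_logprob"] < -1.0
                or seg["compression_ratio"] > 2.4):
            continue  # suppress likely hallucination
        kept.append(seg["text"])
    return " ".join(kept) if kept else AMBIENT_PLACEHOLDER
```

Suppressing at the segment level rather than the whole transcript lets a recording with both real speech and long silent stretches keep its genuine content.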

Video Authenticity

  • Screen recording detection (player UI, platform logos, letterboxing, cursor)
  • Video-of-screen detection (refresh rate flicker, moiré shimmer)
  • Temporal inconsistencies (objects appearing/disappearing, flickering)
  • Physics violations (hair defying gravity, liquids flowing wrong)
  • Audio-visual desync (lip movements not matching speech)

Submissions are also checked against known stock photo libraries via the TinEye API (when configured); matches against stock libraries result in automatic rejection.

AI Quality Scoring

Every submission that passes authenticity checks is scored by a multi-model AI ensemble:

| Content Type | Models Used |
| --- | --- |
| Images | Claude Vision (scoring + annotations) |
| Audio | OpenAI Whisper (transcription) + Claude (analysis + scoring) |
| Video | Claude Vision (frame sampling) + Whisper (audio track) + Claude (scoring) |

Each file receives:

  • Star rating (1–5) — overall quality score
  • Dimension scores — relevance, quality, completeness (each 1–5)
  • Reasoning — explanation of the rating
  • Feedback — actionable suggestions for improvement
  • Annotations — structured metadata (object labels, OCR, scene classification, color palettes, speaker counts, transcripts)
  • Policy violation flag — set for content safety or authenticity failures

Files scoring 3+ stars are automatically approved. Files scoring 2 stars get one retry with specific feedback. Files scoring 1 star are permanently rejected and count toward the worker’s strike record.
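The star-rating outcomes reduce to a small dispatch. One assumption here: what happens when the single retry also scores 2 stars is not specified above, so this sketch treats it as a rejection:

```python
def scoring_outcome(stars, retried=False):
    # stars: 1-5 rating from the AI ensemble
    if stars >= 3:
        return "approved"
    if stars == 2 and not retried:
        return "retry_with_feedback"   # one retry with specific feedback
    return "rejected_strike"           # 1 star, or 2 stars after the retry (assumed)
```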

Strike System & Bans

Workers who submit low-quality or fraudulent content face escalating consequences:

| Event | Consequence |
| --- | --- |
| 1-star submission | Strike recorded. Worker sees warning with specific feedback. |
| 3 strikes (1-star) in 30 days | Permanent ban. Account suspended, uncollected earnings forfeited. |
| Ban evasion attempt | Device fingerprint prevents re-registration on the same device. |

The 3-strike policy with earnings forfeiture creates strong economic disincentives — a worker with $50 in pending payouts will not risk a fraudulent submission.
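The 30-day window reads as a rolling window, which is how the sketch below treats it (the rolling interpretation and the function shape are assumptions):

```python
from datetime import datetime, timedelta

def should_ban(strike_timestamps, now=None):
    # Permanent ban after 3 one-star strikes within a rolling 30-day window
    now = now or datetime.utcnow()
    window_start = now - timedelta(days=30)
    recent = [t for t in strike_timestamps if t >= window_start]
    return len(recent) >= 3
```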

Worker Reputation & Tiers

Workers build a reputation based on approval rate and total submissions. The system assigns tiers that affect job sorting and display badges:

| Tier | Criteria | Benefit |
| --- | --- | --- |
| New | 0–2 submissions | Treated as Silver (benefit of the doubt) |
| Bronze | Under 70% approval (3+ submissions) | Standard job sorting |
| Silver | 70–85% approval | “Reliable” badge |
| Gold | 85%+ approval | Highest-paying jobs sorted first + “Top Worker” badge |
| Diamond | 95%+ approval, 50+ submissions | Same as Gold + “Diamond” badge |

All workers see all jobs — tiers affect sort order, not visibility. Gold and Diamond workers automatically see the highest-paying jobs first, incentivizing consistent quality.

The worker_tier field is returned in the job browse response for iOS badge display.
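The tier thresholds can be expressed as a single function. Checking Diamond before Gold matters, since every Diamond worker also meets the Gold criteria; the exact boundary handling (e.g. which tier exactly 85% falls into) is an assumption:

```python
def worker_tier(total_submissions, approval_rate):
    # approval_rate in [0, 1]; thresholds from the tier table
    if total_submissions <= 2:
        return "new"        # treated as Silver for job sorting
    if approval_rate >= 0.95 and total_submissions >= 50:
        return "diamond"
    if approval_rate >= 0.85:
        return "gold"
    if approval_rate >= 0.70:
        return "silver"
    return "bronze"
```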

Data Security

| Layer | Implementation |
| --- | --- |
| Encryption in transit | TLS 1.3 on all API endpoints |
| Encryption at rest | AES-256 (SSE-KMS) on all S3 objects |
| File isolation | Each job’s files are stored in an isolated S3 prefix, accessible only to the buyer’s organization |
| Access control | Pre-signed download URLs expire after 7 days |
| Org-scoped queries | Every database query includes WHERE organization_id = ? — no cross-tenant data access |
| Webhook signatures | HMAC-SHA256 signed payloads for tamper detection |
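On the receiving side, an HMAC-SHA256 webhook signature is verified by recomputing it over the raw payload and comparing in constant time. A sketch under assumptions: the hex encoding and the way the signature is delivered are not specified above, so check the actual webhook documentation for the header format:

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    # Recompute HMAC-SHA256 over the raw request body (before any JSON
    # parsing, which could reorder keys) and compare in constant time
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

hmac.compare_digest avoids the timing side channel that a plain == comparison would leak.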

Content Safety

Before quality scoring, every submission is checked against content safety policies:

  • Child exploitation material (CSAM) — immediate 1-star, account ban
  • Nudity involving minors — immediate 1-star, account ban
  • Extreme violence or gore — 1-star rejection
  • Hate symbols or content — 1-star rejection
  • Explicit content (unless specifically requested by the job description) — 1-star rejection

Content safety violations are flagged with policyViolation: true and trigger immediate strike recording.

Summary

Every submission passes through 5 layers before reaching your folder:

  1. Device verification — jailbreak detection, App Attest, biometric auth
  2. Server-side pre-checks — hash dedup, file size, EXIF freshness, timing analysis
  3. Content authenticity — AI-powered fraud detection (watermarks, screen photos, AI-generated content)
  4. Content safety — policy violation screening
  5. Quality scoring — multi-model AI ensemble with 1–5 star ratings and structured annotations

Only files that pass all 5 layers and score 3+ stars are approved and delivered to your folder. You are never charged for rejected submissions.