
Trust & Safety

Every file on FirstHandAPI is captured by a verified human on a real device, scored by a multi-model AI ensemble, and checked against multiple fraud detection layers before it reaches your folder. This guide explains how.

Human Verification

FirstHandAPI uses defense-in-depth to verify that every submission comes from a real human on a real device — not a bot, script, or spoofed app.

Device Integrity

Every worker device is checked for compromise before submissions are accepted:

| Check | What It Does |
| --- | --- |
| Jailbreak detection | 6-signal check: suspicious file paths, Cydia/Sileo URL schemes, sandbox integrity, dylib injection (Substrate, Frida, Cycript), DYLD_INSERT_LIBRARIES, symlink anomalies |
| App Attest (Apple) | Cryptographic proof that the request originates from an unmodified copy of the FirstHand app on the device that completed attestation. Uses Apple’s DCAppAttestService with counter-based replay prevention |
| Biometric verification | Face ID / Touch ID required at three gate points: app launch, before every file upload, and before payout withdrawal. Proves the device owner is physically present |

Identity & Onboarding

  • Workers sign in via Clerk OAuth (production) with email verification
  • New workers complete a practice onboarding flow (5 sample uploads) to calibrate baseline quality
  • Stripe Connect Express handles KYC, bank verification, and tax documentation for payouts
  • Duplicate account detection via device fingerprint and email prevents re-registration after bans

GPS Geotag Verification

Every submission captures the worker’s GPS coordinates at upload time:

| Field | Description |
| --- | --- |
| submission_latitude | Worker’s latitude at upload time (numeric, 6 decimal places) |
| submission_longitude | Worker’s longitude at upload time |
| location_verified | true if GPS was collected, false if the worker denied location permission |
  • If a job requires a specific location (e.g., NYC) and the worker submits from a different city, a location mismatch warning is logged (>50km distance)
  • Location mismatches are advisory signals — submissions are not automatically rejected
  • Workers who deny location permission can still submit, but their submissions are flagged as location_verified: false
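As a sketch, the >50 km mismatch check amounts to a great-circle distance comparison against the job's target coordinates. The haversine formula below is standard; the job-side field names (required_latitude, required_longitude) are illustrative assumptions, not the documented schema:

```python
import math

def distance_km(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance in kilometres (Earth radius ~6371 km)
    R = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def check_location(submission, job):
    # Advisory only: submissions are never auto-rejected on location
    if not submission.get("location_verified"):
        return "location_not_verified"
    d = distance_km(
        submission["submission_latitude"], submission["submission_longitude"],
        job["required_latitude"], job["required_longitude"],
    )
    return "location_mismatch" if d > 50 else None
```

Note that both outcomes are logged as signals rather than rejections, matching the advisory policy above.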

Behavioral Signals

  • Submission timing analysis: Uploads less than 8 seconds after job acceptance are flagged — legitimate workers need at least 8 seconds to open the camera, compose a shot, and upload
  • Bot score enforcement: Telemetry-based bot score above 0.7 triggers automatic rejection
  • Concurrent submission limits: Maximum 10 pending submissions per worker to prevent queue flooding
  • Velocity caps: 50 submissions per worker per day
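The four behavioral thresholds above can be combined into a single gate. This is a minimal sketch using the documented numbers (8 s, 0.7, 10 pending, 50/day); the function shape and flag names are assumptions:

```python
def behavioral_flags(seconds_since_accept, bot_score, pending_count, submissions_today):
    # Thresholds from the guide: 8 s minimum, bot score 0.7, 10 pending, 50/day
    flags = []
    if seconds_since_accept < 8:
        flags.append("too_fast")           # flagged, not auto-rejected
    if bot_score > 0.7:
        flags.append("reject:bot_score")   # automatic rejection
    if pending_count >= 10:
        flags.append("reject:queue_limit") # queue-flooding cap reached
    if submissions_today >= 50:
        flags.append("reject:velocity_cap")
    return flags
```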

Content Authenticity

Every submission is checked for authenticity before AI quality scoring — catching the most common fraud vectors first.

Server-Side Pre-Checks (Before AI Scoring)

These deterministic checks run instantly on upload, before expensive AI scoring:

| Check | Description |
| --- | --- |
| Cross-job content hash dedup | Same file cannot be submitted to multiple jobs (SHA-256 hash matching) |
| Cross-worker content hash dedup | Different workers cannot submit the same downloaded image |
| Minimum file size floor | Images: 100 KB, audio: 10 KB, video: 50 KB — phone photos are 2–8 MB; tiny files indicate downloaded thumbnails |
| EXIF freshness check | Photos with capture timestamps older than 24 hours are flagged as potential camera roll recycling |
| Audio/video duration validation | Files shorter than the job’s min_duration_seconds are rejected at upload time |
| Image dimensions persistence | Width and height are stored for downstream AI generator detection |
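The hash dedup and size-floor checks are cheap enough to run inline on every upload. A minimal sketch, assuming decimal kilobytes for the floors and an in-memory hash set standing in for the real dedup store:

```python
import hashlib

# Size floors from the guide, assuming decimal KB (100 KB = 100,000 bytes)
SIZE_FLOORS = {"image": 100_000, "audio": 10_000, "video": 50_000}

def pre_check(file_bytes, content_type, seen_hashes):
    # seen_hashes: SHA-256 digests of every prior submission, across all
    # jobs and all workers, so both dedup checks fall out of one lookup
    if len(file_bytes) < SIZE_FLOORS[content_type]:
        return "reject:file_too_small"
    digest = hashlib.sha256(file_bytes).hexdigest()
    if digest in seen_hashes:
        return "reject:duplicate_content"
    seen_hashes.add(digest)
    return "pass"
```

Because the digest covers the raw bytes, even a re-uploaded copy of a previously approved file from a different worker hits the same hash.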

AI Authenticity Detection

After pre-checks pass, the multi-model AI ensemble evaluates content authenticity using a tiered priority system:

Quick-Kill Checks (Most Common Fraud)

  1. Watermarks & attribution — © marks, photo credits, social media handles, Instagram/TikTok overlays
  2. Screen photos — moiré patterns, screen bezels, desktop UI, color banding, pixel grid
  3. Photos of printed images — paper texture, fiber patterns, print dots, creases, glossy reflections
  4. Duplicate content — same composition with minor cropping or filter changes

AI-Generated Content Detection (4 Tiers)

  • Tier 1 (Highly Reliable): Missing EXIF metadata + garbled text/signage, extra fingers, teeth anomalies, mismatched reflections
  • Tier 2 (Reliable): Shadow inconsistencies, background dissolution, unnatural symmetry, hair/fur boundary artifacts
  • Tier 3 (Moderate): Depth of field too perfect, HDR exceeds phone capability, repeating texture patterns
  • Tier 4 (Subtle): Absent lens distortion, uniform noise patterns, too-perfect composition

Decision rule: 2+ Tier 1 signals OR 3+ signals across any tiers = automatic 1-star rejection.
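The decision rule translates directly into code. Representing each detected signal as a (tier, description) pair is an assumption for illustration:

```python
def ai_generated_verdict(signals):
    # signals: list of (tier, description) pairs, tier in {1, 2, 3, 4}
    tier1_count = sum(1 for tier, _ in signals if tier == 1)
    # 2+ Tier 1 signals OR 3+ signals across any tiers => 1-star rejection
    if tier1_count >= 2 or len(signals) >= 3:
        return "reject_1_star"
    return "pass"
```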

Audio Authenticity

  • Speaker playback detection (room reverb layered on content, frequency degradation)
  • TTS/AI voice detection (robotic cadence, flat intonation, missing human markers like breathing and filler words)
  • Spliced audio detection (discontinuities in background noise, inconsistent room acoustics)

Whisper Hallucination Suppression

OpenAI Whisper is known to hallucinate plausible transcripts on silent or ambient audio — often generating text in Japanese, Chinese, or Korean about unrelated topics. FirstHandAPI detects and suppresses these hallucinations by checking three signals from Whisper’s per-segment metadata:

| Signal | Threshold | What it detects |
| --- | --- | --- |
| no_speech_prob | > 0.7 | Segment is probably silence |
| avg_logprob | < -1.0 | Whisper has low confidence in its own output |
| compression_ratio | > 2.4 | Repetitive/anomalous hallucinated text |

When any threshold is exceeded, the transcript is suppressed and replaced with an ambient audio indicator. This prevents workers from being penalized for submitting legitimate ambient recordings that Whisper misinterprets.

  • Language mismatch flagging: transcripts that come back in a language inconsistent with the recording are flagged as a likely hallucination
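The three thresholds above apply per segment, using fields Whisper returns in its per-segment metadata. A minimal sketch; the placeholder string is illustrative, not the actual ambient-audio indicator:

```python
AMBIENT_PLACEHOLDER = "[ambient audio - no speech detected]"  # illustrative text

def filter_segments(segments):
    # segments: Whisper per-segment dicts with no_speech_prob,
    # avg_logprob, compression_ratio, and text fields
    kept = []
    for seg in segments:
        if (seg["no_speech_prob"] > 0.7
                or seg["avg_logprob"] < -1.0
                or seg["compression_ratio"] > 2.4):
            continue  # suppress likely hallucination
        kept.append(seg["text"])
    return " ".join(kept) if kept else AMBIENT_PLACEHOLDER
```

Suppressing at the segment level rather than the whole transcript lets a recording with both real speech and long silent stretches keep its genuine content.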

Video Authenticity

  • Screen recording detection (player UI, platform logos, letterboxing, cursor)
  • Video-of-screen detection (refresh rate flicker, moiré shimmer)
  • Temporal inconsistencies (objects appearing/disappearing, flickering)
  • Physics violations (hair defying gravity, liquids flowing wrong)
  • Audio-visual desync (lip movements not matching speech)

Submissions are also checked against known stock photo libraries via the TinEye API (when configured); matches against stock libraries result in automatic rejection.

AI Quality Scoring

Every submission that passes authenticity checks is scored by a multi-model AI ensemble:

| Content Type | Models Used |
| --- | --- |
| Images | Claude Vision (scoring + annotations) |
| Audio | OpenAI Whisper (transcription) + Claude (analysis + scoring) |
| Video | Claude Vision (frame sampling) + Whisper (audio track) + Claude (scoring) |

Each file receives:

  • Star rating (1–5) — overall quality score
  • Dimension scores — relevance, quality, completeness (each 1–5)
  • Reasoning — explanation of the rating
  • Feedback — actionable suggestions for improvement
  • Annotations — structured metadata (object labels, OCR, scene classification, color palettes, speaker counts, transcripts)
  • Policy violation flag — set for content safety or authenticity failures

Files scoring 3+ stars are automatically approved. Files scoring 2 stars get one retry with specific feedback. Files scoring 1 star are permanently rejected and count toward the worker’s strike record.
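The star-rating outcomes reduce to a small dispatch. One assumption here: what happens when the single retry also scores 2 stars is not specified above, so this sketch treats it as a rejection:

```python
def scoring_outcome(stars, retried=False):
    # stars: 1-5 rating from the AI ensemble
    if stars >= 3:
        return "approved"
    if stars == 2 and not retried:
        return "retry_with_feedback"   # one retry with specific feedback
    return "rejected_strike"           # 1 star, or 2 stars after the retry (assumed)
```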

Strike System & Bans

Workers who submit low-quality or fraudulent content face escalating consequences:

| Event | Consequence |
| --- | --- |
| 1-star submission | Strike recorded. Worker sees warning with specific feedback. |
| 3 strikes (1-star) in 30 days | Permanent ban. Account suspended, uncollected earnings forfeited. |
| Ban evasion attempt | Device fingerprint prevents re-registration on the same device. |

The 3-strike policy with earnings forfeiture creates strong economic disincentives — a worker with $50 in pending payouts will not risk a fraudulent submission.
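The 30-day window reads as a rolling window, which is how the sketch below treats it (the rolling interpretation and the function shape are assumptions):

```python
from datetime import datetime, timedelta

def should_ban(strike_timestamps, now=None):
    # Permanent ban after 3 one-star strikes within a rolling 30-day window
    now = now or datetime.utcnow()
    window_start = now - timedelta(days=30)
    recent = [t for t in strike_timestamps if t >= window_start]
    return len(recent) >= 3
```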

Worker Reputation & Tiers

Workers build a reputation based on approval rate and total submissions. The system assigns tiers that affect job sorting and display badges:

| Tier | Criteria | Benefit |
| --- | --- | --- |
| New | 0–2 submissions | Treated as Silver (benefit of the doubt) |
| Bronze | Under 70% approval (3+ submissions) | Standard job sorting |
| Silver | 70–85% approval | “Reliable” badge |
| Gold | 85%+ approval | Highest-paying jobs sorted first + “Top Worker” badge |
| Diamond | 95%+ approval, 50+ submissions | Same as Gold + “Diamond” badge |

All workers see all jobs — tiers affect sort order, not visibility. Gold and Diamond workers automatically see the highest-paying jobs first, incentivizing consistent quality.

The worker_tier field is returned in the job browse response for iOS badge display.
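The tier thresholds can be expressed as a single function. Checking Diamond before Gold matters, since every Diamond worker also meets the Gold criteria; the exact boundary handling (e.g. which tier exactly 85% falls into) is an assumption:

```python
def worker_tier(total_submissions, approval_rate):
    # approval_rate in [0, 1]; thresholds from the tier table
    if total_submissions <= 2:
        return "new"        # treated as Silver for job sorting
    if approval_rate >= 0.95 and total_submissions >= 50:
        return "diamond"
    if approval_rate >= 0.85:
        return "gold"
    if approval_rate >= 0.70:
        return "silver"
    return "bronze"
```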

Data Security

| Layer | Implementation |
| --- | --- |
| Encryption in transit | TLS 1.3 on all API endpoints |
| Encryption at rest | AES-256 (SSE-KMS) on all S3 objects |
| File isolation | Each job’s files are stored in an isolated S3 prefix, accessible only to the buyer’s organization |
| Access control | Pre-signed download URLs expire after 7 days |
| Org-scoped queries | Every database query includes WHERE organization_id = ? — no cross-tenant data access |
| Webhook signatures | HMAC-SHA256 signed payloads for tamper detection |
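On the receiving side, an HMAC-SHA256 webhook signature is verified by recomputing it over the raw payload and comparing in constant time. A sketch under assumptions: the hex encoding and the way the signature is delivered are not specified above, so check the actual webhook documentation for the header format:

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    # Recompute HMAC-SHA256 over the raw request body (before any JSON
    # parsing, which could reorder keys) and compare in constant time
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

hmac.compare_digest avoids the timing side channel that a plain == comparison would leak.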

Content Safety

Before quality scoring, every submission is checked against content safety policies:

  • Child exploitation material (CSAM) — immediate 1-star, account ban
  • Nudity involving minors — immediate 1-star, account ban
  • Extreme violence or gore — 1-star rejection
  • Hate symbols or content — 1-star rejection
  • Explicit content (unless specifically requested by the job description) — 1-star rejection

Content safety violations are flagged with policyViolation: true and trigger immediate strike recording.

Summary

Every submission passes through 5 layers before reaching your folder:

  1. Device verification — jailbreak detection, App Attest, biometric auth
  2. Server-side pre-checks — hash dedup, file size, EXIF freshness, timing analysis
  3. Content authenticity — AI-powered fraud detection (watermarks, screen photos, AI-generated content)
  4. Content safety — policy violation screening
  5. Quality scoring — multi-model AI ensemble with 1–5 star ratings and structured annotations

Only files that pass all 5 layers and score 3+ stars are approved and delivered to your folder. You are never charged for rejected submissions.