Fashion & apparel
Match outfit screenshots from Pinterest, street style, or influencer posts to in-catalog SKUs.
"That beige one with the chunky sole, square-ish toe, not the suede though." Your shoppers can't type that — they can snap it. Trooply's AI vision model reads the photo, finds the match in your catalog, and ranks it like a salesperson would.
Badges below cite uplift ranges observed during onboarding pilots — not universal guarantees. Your numbers will vary with catalog size, image quality, and baseline search conversion.
Shoppers snap a room and get compatible pieces ranked by color, aspect ratio, and material.
Upload a lipstick shade or eye-look — we find the closest SKU across your brand portfolio.
Fine-grained similarity — cut, metal, stone, setting — so "like this ring" actually returns like rings.
75 real products across fashion, electronics, and home — indexed in Qdrant, ranked by CLIP. This widget hits our production API live. Go ahead — break it.
Every search runs through a stack of vision and language models trained on hundreds of millions of image-text pairs. You ship the API call; we ship the inference, the GPUs, and the boring math.
OpenAI's 768-dimensional vision-language encoder reads a product photo the way a human shopper does — shape, texture, palette, vibe — not just metadata or filenames.
// 142 ms on GPU
embedding: float[768]
U²-Net cuts the subject out of cluttered photos before the encode step. A bag against a Pinterest moodboard scores like a bag on a clean studio backdrop.
rembg → mask → focus crop → cleaner match score
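As a rough illustration of the "mask → focus crop" step, the sketch below takes a binary subject mask and computes a padded bounding box around the subject. It does not call U²-Net or rembg; the mask format (a grid of 0/1 values), the `pad` parameter, and the function names are assumptions made for this example, not Trooply's actual pipeline.

```python
from typing import List, Optional, Tuple

# Assumed mask format: a grid where 1 marks a subject pixel and 0 marks background.
Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def focus_crop_box(mask: Mask, pad: int = 1) -> Optional[Box]:
    """Return a padded bounding box around the subject, or None for an empty mask."""
    if not mask or not mask[0]:
        return None
    height, width = len(mask), len(mask[0])
    # Rows and columns that contain at least one subject pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    # Pad the box, clamped to the image bounds.
    left = max(cols[0] - pad, 0)
    top = max(rows[0] - pad, 0)
    right = min(cols[-1] + 1 + pad, width)
    bottom = min(rows[-1] + 1 + pad, height)
    return left, top, right, bottom


def crop(pixels: List[list], box: Box) -> List[list]:
    """Slice a row-major pixel grid down to the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

Cropping to the subject before encoding is what lets a small bag in a busy moodboard fill the frame the way it would in a studio shot.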
Gemma 4 reads "the beige one with the chunky sole" the way a salesperson would — parses intent, normalises colour, and routes to the right product type before the vector lookup.
"chunky beige sole" → shoes · beige · platform
Vector similarity is just the first pass. We re-score every candidate against six signals — visual fit, product type, popularity, aspect, colour histogram, category — then return the top hits.
visual · type · popularity · aspect · colour · category
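One way to picture the two-pass ranking is a weighted sum over the six signals, with cosine similarity supplying the visual score. The sketch below is an illustration only: the 57% and 23% weights for visual and product type are the figures quoted in the FAQ on this page, while the even 5% split across the remaining four signals, the candidate dictionary layout, and the function names are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

# visual and type weights come from this page; the 5% values are assumed.
WEIGHTS: Dict[str, float] = {
    "visual": 0.57,
    "type": 0.23,
    "popularity": 0.05,
    "aspect": 0.05,
    "colour": 0.05,
    "category": 0.05,
}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query_vec: Sequence[float], candidates: List[dict], limit: int = 24) -> List[Tuple[str, float]]:
    """Re-score candidates with the weighted signals and return the top (sku, score) pairs."""
    scored = []
    for cand in candidates:
        # Non-visual signals are assumed to be precomputed and scaled to [0, 1].
        signals = dict(cand.get("signals", {}))
        signals["visual"] = cosine(query_vec, cand["embedding"])
        score = sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())
        scored.append((cand["sku"], score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

With these weights, a slightly weaker visual match can still outrank a stronger one when the product type agrees with the query, which is the point of re-scoring after the vector lookup.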
Shoppers upload a photo — a screenshot, a street-style pic, a moodboard — and our CLIP vision model lands them on the closest SKUs in your catalog.
POST /v1/search
Content-Type: multipart/form-data
image=@customer-photo.jpg
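For readers who want to see the image search call in code, here is a minimal sketch that builds the multipart request with Python's standard library. The `/v1/search` path and the `image` field name come from the snippet above; the host in `BASE_URL`, the bearer-token header on this endpoint, and the function name are assumptions for illustration. The request is only constructed here, not sent.

```python
import urllib.request
import uuid

BASE_URL = "https://api.example.com"  # placeholder host; the real one is not given on this page


def build_image_search_request(image_bytes: bytes, filename: str, token: str,
                               content_type: str = "image/jpeg") -> urllib.request.Request:
    """Build a multipart/form-data POST with the photo in the "image" field."""
    boundary = uuid.uuid4().hex
    # A single-part multipart body; filename is assumed to contain no quotes or newlines.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{BASE_URL}/v1/search",
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Authorization": f"Bearer {token}",  # assumed to be required here
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response schema is not documented on this page, so parsing is left out.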
"Red leather crossbody" matches leather crossbody bags in red — even when the SKU title doesn't say so. CLIP reads vibe, not just tokens; Gemma 4 reads intent.
POST /v1/search
{ "query": "red leather crossbody bag",
"limit": 24 }
Each client gets its own Qdrant collection. No bleed, no shared namespace. Ship visual search as a feature of your platform, not a side project.
X-Tenant-ID: store_9f3c1
Authorization: Bearer …
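Putting the text query and the tenant headers together, a request might be assembled as in the sketch below. The `query` and `limit` fields and the `X-Tenant-ID` and `Authorization` headers are taken from the snippets on this page; the host, the JSON content type, and the function name are assumptions. Nothing is sent over the network.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host; not specified on this page


def build_text_search_request(query: str, token: str, tenant_id: str,
                              limit: int = 24) -> urllib.request.Request:
    """Build a JSON POST to /v1/search scoped to one tenant."""
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/search",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",  # assumed for the JSON body
            "Authorization": f"Bearer {token}",
            "X-Tenant-ID": tenant_id,
        },
    )
```

A platform serving several stores would call this once per storefront with that store's tenant ID, so each query only ever touches that tenant's collection.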
OAuth 2.0 client credentials, scoped keys, rate limits, SSRF protection, IP allow-lists, HMAC-signed webhooks. The boring-good kind.
POST /oauth/token
grant_type=client_credentials
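The client-credentials exchange could look roughly like the sketch below. It follows the generic OAuth 2.0 pattern of sending the client ID and secret as HTTP Basic credentials with a form-encoded `grant_type`; whether this API expects the credentials in the header or in the form body is an assumption, as are the host and the function name. The request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host; not specified on this page


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build a client_credentials token request with HTTP Basic client authentication."""
    # Assumption: credentials go in the Authorization header rather than the form body.
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        f"{BASE_URL}/oauth/token",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {credentials}",
        },
    )
```

In standard OAuth 2.0 the JSON response carries the bearer token in an `access_token` field, which then goes into the `Authorization: Bearer` header of later calls.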
Automatic failover between CPU and GPU model pools. AI background removal and dominant-colour extraction ship in the same call — no extra round-trip.
GET /v1/products/{id}
→ { palette, bbox, is_clean_bg }
Push a CSV of 50k SKUs; we chunk, embed, and write in the background. Poll the job, subscribe to a webhook, or watch it in the dashboard.
POST /v1/products/bulk
→ { job_id, status: "queued" }
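Polling the bulk job might be sketched as a loop with exponential backoff, as below. Only the `job_id` field and the `"queued"` status appear on this page; the terminal status names, the `fetch_status` callable (which would wrap whatever job-status endpoint the API exposes), and the timing defaults are all hypothetical.

```python
import time
from typing import Callable, Dict

# Hypothetical terminal states; only "queued" is shown on this page.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status: Callable[[str], Dict], job_id: str,
                 interval: float = 2.0, max_interval: float = 30.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until the job reaches a terminal status, doubling the wait each round."""
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Give up rather than sleep past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_interval)
```

For large catalogs a webhook subscription avoids the polling traffic entirely; the loop is mainly useful in scripts and one-off imports.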
Sign up, create a client in the portal, copy the client ID and secret. No credit card. Free tier is forever.
POST product images once. We generate the embeddings, extract colors, detect subjects, and write them to your collection.
Point your storefront search bar here. Every result comes back with a similarity score and the re-rank signals that got it there.
Quotes below are illustrative composites drawn from onboarding interviews and early-access feedback — they're representative of what operators tell us, not individually sourced. Named case studies ship when customers agree to be quoted publicly.
"We went from a four-week backlog of 'make search better' tickets to one engineer, one afternoon, one cURL. Conversions on search sessions are up 34%."
"Our shoppers upload outfit screenshots from Pinterest. We didn't have to teach them — they just started doing it. Now it actually works."
"Zero-result searches dropped to 1.8% the week we shipped. The ROI math was obvious by day four."
Aggregated from internal A/B tests across three mid-market retail catalogs (fashion, beauty, home), 20k–80k SKUs each, 90-day measurement window (Jan–Mar 2026). Replay conditions available on request.
| | Trooply | Build in-house | Algolia Visual | Vue.ai |
|---|---|---|---|---|
| Time to first match | < 1 day | 3–6 months | 2–4 weeks | 4–8 weeks |
| Monthly cost (50k SKUs) | $99 | $8–14k | $2.5k | $4k+ |
| p95 latency | 142 ms | varies | 210 ms | 340 ms |
| Multi-tenant ready | ✓ | ✕ | ✕ | ✓ |
| Background removal | ✓ | ✕ | ✓ | ✕ |
| Re-rank signals | 6 | DIY | 2 | 3 |
| Free tier | ✓ | ✕ | ✕ | ✕ |
Breaking-change flags in the docs. RSS + Slack bot available.
Re-written embedding pipeline with batched inference. Ingestion throughput went from ~850 SKU/min to 3,400 SKU/min on the default pool.
Install the app, connect your store, and we back-fill every product + variant image within minutes. Re-sync on webhook.
768-dim embeddings instead of 512. ~20% lift on recall@12 benchmarks. Existing tenants can migrate in one call.
Webhook payloads now include X-Trooply-Signature. Tenant-scoped IP allow-lists moved to general availability.
Pin your tenant to eu-central-1. Data at rest + in transit stays in-region. SOC 2 Type II coverage extended.
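For the `X-Trooply-Signature` header mentioned in the changelog above, a receiver would typically recompute an HMAC over the raw request body and compare it in constant time. The sketch below assumes HMAC-SHA256 with a hex-encoded digest and a shared webhook secret; the page does not state the algorithm or encoding, so treat both as assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by this page)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    expected = sign_payload(secret, body).encode("ascii")
    received = signature_header.strip().encode("utf-8")
    return hmac.compare_digest(expected, received)
```

The check has to run on the exact bytes received, before any JSON parsing or re-serialization, because even a whitespace change alters the digest.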
Visual matching runs on OpenAI's CLIP ViT-L/14 — a 768-dimensional vision-language encoder. Subject segmentation runs on U²-Net (rembg) so cluttered backgrounds don't dilute matches. Natural-language queries are parsed by Gemma 4. A six-signal AI re-ranker (visual, type, popularity, aspect, colour histogram, category) scores final results.
JPEG, PNG, WebP, GIF, and BMP, up to 10 MB per image. We compress and normalize on ingest — you don't have to pre-process.
We run OpenAI's CLIP ViT-L/14, a 768-dimensional vision-language encoder. On our recall@12 benchmarks it scores roughly 20% higher than the 512-dimensional ViT-B/32 variant most competitors ship. Results are ranked by cosine similarity, then re-scored with six signals: visual (57%), product type (23%), popularity, aspect ratio, color histogram, and category.
Yes — the API is platform-agnostic. Anything that can speak HTTP can speak to us. We also ship first-party Python and JavaScript SDKs, plus an OpenAPI spec.
Yes. Every client gets its own Qdrant collection. Zero data crossover between tenants, and all tokens are client-scoped via OAuth 2.0.
You get a 429 with a Retry-After header. We email you at 80% of monthly quota so you can upgrade before it bites.
Yes. 50 products, 1,000 API calls per month, 10 requests per minute. No credit card, no time limit.
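The 429 and `Retry-After` behaviour described in the rate-limit answer above can be handled client-side along these lines. The sketch parses `Retry-After` in both forms HTTP allows (a number of seconds or an HTTP date) and retries a caller-supplied `send` function; the `send` signature, the attempt limit, and the one-second fallback are assumptions for illustration.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional, Tuple


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After value (delta-seconds or HTTP date) into seconds to wait."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat zone-less dates as UTC
    now = now or datetime.now(timezone.utc)
    return max((when - now).total_seconds(), 0.0)


def call_with_retry(send: Callable[[], Tuple[int, Mapping[str, str], bytes]],
                    max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, bytes]:
    """Call send(); on a 429, wait as instructed and try again up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        # Fall back to one second if the header is missing (assumed default).
        sleep(retry_after_seconds(headers.get("Retry-After", "1")))
    raise AssertionError("unreachable")
```

Honouring the server's wait time instead of retrying immediately keeps a burst of shopper traffic from burning through the remaining quota.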
1,000 API calls per month, free forever. Upgrade when conversions justify it.