When to use
"Embeddings", "Vector search", "Semantic search", "Vector DB", "Pinecone", "Similarity search".
Working instructions
1. What are Embeddings
Numeric representations of text (or images, audio). Similar meanings → similar vectors.
Example
- "dog" → [0.12, -0.45, 0.78, ...]
- "puppy" → [0.14, -0.43, 0.81, ...] (similar to "dog")
- "car" → [0.92, 0.03, -0.31, ...] (different)
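A minimal sketch of the idea, assuming we treat only the three components shown above as complete toy vectors (real embeddings have hundreds or thousands of dimensions). The helper name `cosine_similarity` is illustrative, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-D vectors: just the first three components from the example above.
dog = [0.12, -0.45, 0.78]
puppy = [0.14, -0.43, 0.81]
car = [0.92, 0.03, -0.31]

print(f"dog vs puppy: {cosine_similarity(dog, puppy):.3f}")  # close to 1
print(f"dog vs car:   {cosine_similarity(dog, car):.3f}")    # much lower
```

With these toy values, "dog" and "puppy" point in almost the same direction, while "dog" and "car" do not.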
2. Embedding Models 2026
| Model | Dimensions | Strengths | Cost |
|---|---|---|---|
| OpenAI text-embedding-3-small | 1536 | Cheap, good quality | $0.02 / 1M tokens |
| OpenAI text-embedding-3-large | 3072 | Best quality | $0.13 / 1M tokens |
| Cohere embed-multilingual-v3 | 1024 | Multilingual (Hebrew!) | $0.10 / 1M |
| Voyage AI | Various | Domain-specific | varies |
| Nomic | Various | Open source | Free (self-hosted) |
| BGE | Various | Open source | Free (self-hosted) |
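To compare the hosted options, the per-token prices in the table can be turned into a rough one-off embedding cost. This is a sketch only: the prices are copied from the table above, and the corpus size (50,000 documents averaging 400 tokens) is a hypothetical example.

```python
# USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "embed-multilingual-v3": 0.10,
}


def embedding_cost_usd(model, total_tokens):
    """Estimated cost of embedding `total_tokens` tokens once."""
    return PRICE_PER_MILLION_TOKENS[model] * total_tokens / 1_000_000


# Hypothetical corpus: 50,000 documents averaging 400 tokens each.
total_tokens = 50_000 * 400

for model in PRICE_PER_MILLION_TOKENS:
    print(f"{model}: ${embedding_cost_usd(model, total_tokens):.2f}")
```

Query-time embeddings add a smaller recurring cost on top of this one-off figure.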
3. Choosing a Model
Use OpenAI text-embedding-3-small if
- English-primary.
- Cost-sensitive.
- General use cases.
Use Cohere multilingual if
- Hebrew / multilingual content.
- Need consistent quality across languages.
Use Voyage if
- Specific domain (legal, medical, code).
Use open source if
- Privacy critical (self-host).
- High volume + cost matters.
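The rules above can be sketched as a small decision helper. The function name, its parameters, and especially the order in which the rules are checked (privacy first, then domain, then language) are assumptions made for illustration; the guidance above does not rank them.

```python
SPECIALIZED_DOMAINS = {"legal", "medical", "code"}


def choose_embedding_model(*, needs_self_host=False, domain=None, multilingual=False):
    """Map the selection rules above to a recommendation string.

    The precedence used here is an assumption, not part of the guidance.
    """
    if needs_self_host:
        return "Open source (self-hosted)"
    if domain in SPECIALIZED_DOMAINS:
        return "Voyage AI"
    if multilingual:
        return "Cohere embed-multilingual-v3"
    return "OpenAI text-embedding-3-small"


print(choose_embedding_model(multilingual=True))
print(choose_embedding_model(domain="legal"))
print(choose_embedding_model())
```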
4. Dimensions Trade-off
- More dimensions = usually better quality, but slower search and more storage.
- 1536 dim = sweet spot for most.
- 3072 dim = top quality, more cost.
- 256-512 dim = fast, cheap, less quality.
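The storage side of this trade-off is simple arithmetic. The sketch below assumes vectors are stored as 32-bit floats (4 bytes per value) and counts raw vector data only; index structures and metadata add overhead that varies by database.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, assuming float32 values."""
    return num_vectors * dimensions * bytes_per_value


for dims in (256, 512, 1536, 3072):
    gib = raw_vector_bytes(1_000_000, dims) / 2**30
    print(f"{dims:>4} dims: {gib:.2f} GiB per 1M vectors")
```

Storage grows linearly with dimension count, so 3072-dimensional vectors take twice the space of 1536-dimensional ones.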
5. Vector Databases
| DB | Type | Strengths |
|---|---|---|
| Pinecone | Managed SaaS | Easy, scalable, popular |
| Weaviate | Open source | Hybrid search built-in |
| Qdrant | Open source | Fast, Rust-based |
| Chroma | Lightweight | Local, dev-friendly |
| pgvector | Postgres extension | If already on Postgres |
| Milvus | Open source | Enterprise scale |
| Vespa | Open source | Production scale |
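Whichever product is chosen, the core operations are the same: store vectors under an ID and return the nearest ones for a query. The sketch below is a brute-force, in-memory stand-in to illustrate that contract; `TinyVectorStore` and its methods are hypothetical names, not the API of any database in the table. Real vector databases add approximate nearest-neighbor indexes, persistence, and filtering.

```python
import math


def _unit(vector):
    """Scale a vector to length 1 so that dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("zero vector cannot be normalized")
    return [x / norm for x in vector]


class TinyVectorStore:
    """Illustrative brute-force vector store (not production code)."""

    def __init__(self, dimensions):
        self.dimensions = dimensions
        self._items = {}  # item_id -> (unit vector, metadata)

    def upsert(self, item_id, vector, metadata=None):
        """Insert or overwrite one vector; rejects the wrong dimensionality."""
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")
        self._items[item_id] = (_unit(vector), metadata or {})

    def query(self, vector, top_k=5):
        """Return up to top_k (item_id, cosine_score, metadata), best first."""
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")
        q = _unit(vector)
        scored = [
            (item_id, sum(a * b for a, b in zip(q, vec)), meta)
            for item_id, (vec, meta) in self._items.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


store = TinyVectorStore(dimensions=3)
store.upsert("dog", [0.12, -0.45, 0.78], {"kind": "animal"})
store.upsert("car", [0.92, 0.03, -0.31], {"kind": "vehicle"})
print(store.query([0.14, -0.43, 0.81], top_k=1))
```

The dimension check in `upsert` is also a cheap guard against accidentally mixing vectors from different embedding models in one index.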
6. Pricing — Pinecone Example
- Starter: $0 (limited).
- Standard: $70/month (10M vectors).
- Enterprise: Custom.
7. Similarity Metrics
Cosine Similarity (most common)
- 1 = same direction (identical meaning), 0 = orthogonal (unrelated), -1 = opposite.
- Good for text.
Dot Product
- Faster computation (no normalization step); gives the same ranking as cosine when vectors are unit-length.
- Some embeddings designed for it.
Euclidean Distance
- Less common for text.
- Good for numerical data.
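The three metrics are closely related once vectors are normalized to length 1. The sketch below uses arbitrary example vectors to show that, for unit vectors, dot product equals cosine similarity and squared Euclidean distance equals 2 - 2·cosine, so all three produce the same ranking.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Arbitrary example vectors, normalized to unit length.
a = normalize([3.0, 4.0, 0.0])
b = normalize([1.0, 2.0, 2.0])

print(f"cosine:      {cosine(a, b):.4f}")
print(f"dot product: {dot(a, b):.4f}")          # same value as cosine for unit vectors
print(f"euclidean^2: {euclidean(a, b) ** 2:.4f}")
print(f"2 - 2*cos:   {2 - 2 * cosine(a, b):.4f}")  # same value as euclidean^2
```

For vectors that are not normalized, dot product also rewards magnitude, so the metric should match what the embedding model was trained for.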
8. Use Cases Beyond RAG
Semantic Search
- "Find similar products" — not just keyword match.
Recommendation Systems
- "Customers who liked X also liked Y".
Classification
- Find nearest example in labeled dataset.
Clustering
- Group similar items.
Deduplication
- Find near-duplicate content.
Anomaly Detection
- Outlier vectors = anomalies.
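Two of these use cases, classification and deduplication, reduce to a few lines once vectors exist. The sketch below uses toy 3-D vectors; the function names and the 0.95 duplicate threshold are assumptions for illustration, and a real threshold should be tuned on your own data.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def classify_nearest(vector, labeled_examples):
    """Return the label of the most similar (vector, label) example."""
    _, best_label = max(labeled_examples, key=lambda ex: cosine(vector, ex[0]))
    return best_label


def near_duplicates(vectors, threshold=0.95):
    """Return index pairs (i, j), i < j, whose cosine similarity meets the threshold."""
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Toy data: hypothetical labels attached to illustrative vectors.
labeled = [([0.12, -0.45, 0.78], "animal"), ([0.92, 0.03, -0.31], "vehicle")]
print(classify_nearest([0.14, -0.43, 0.81], labeled))

corpus = [[0.12, -0.45, 0.78], [0.14, -0.43, 0.81], [0.92, 0.03, -0.31]]
print(near_duplicates(corpus))
```

The pairwise loop is quadratic in the number of vectors, so larger corpora would typically run the same check through a vector index instead.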
9. Sample Code — Semantic Search
```python
from openai import OpenAI
import numpy as np

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def embed(text):
    """Return the embedding vector for one string."""
    response = client.embeddings.create(
        input=text, model="text-embedding-3-small"
    )
    return np.array(response.data[0].embedding)


# Embed the corpus once and reuse the vectors
documents = ["doc1 text", "doc2 text"]  # ...
doc_embeddings = [embed(d) for d in documents]

# Embed the query
query = "user's question"
query_embedding = embed(query)

# Cosine similarity between the query and every document
similarities = [
    np.dot(query_embedding, d) / (np.linalg.norm(query_embedding) * np.linalg.norm(d))
    for d in doc_embeddings
]

# Indices of the five most similar documents, best first
top_idx = np.argsort(similarities)[::-1][:5]
```
10. Hybrid Search (Best Quality)
Combine:
- Semantic (vector similarity)
- Keyword (BM25 / Elasticsearch)
- Weighted scoring
Result: better recall + precision.
Tools
- Weaviate (built-in hybrid).
- Elasticsearch + dense_vector.
- Custom orchestration.
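For the custom-orchestration route, weighted scoring can be sketched as follows. The code assumes the vector and keyword scores have already been computed elsewhere and arrive as `{doc_id: score}` dictionaries; the function names, the min-max normalization step, and the default `alpha=0.5` are illustrative choices, not a prescribed method.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two systems are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(vector_scores, keyword_scores, alpha=0.5):
    """Blend normalized scores: alpha weights semantic, (1 - alpha) weights keyword.

    Documents missing from one result set contribute 0 for that component.
    """
    v = min_max_normalize(vector_scores)
    k = min_max_normalize(keyword_scores)
    combined = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores: cosine similarities and BM25 scores for the same query.
vector_scores = {"a": 0.9, "b": 0.8, "c": 0.1}
keyword_scores = {"b": 12.0, "c": 3.0, "d": 7.0}

for doc_id, score in hybrid_rank(vector_scores, keyword_scores):
    print(doc_id, round(score, 3))
```

Normalization matters because BM25 scores are unbounded while cosine similarities are not; without it, the keyword component would dominate the blend.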
11. Re-embedding Strategy
When to Re-embed
- New embedding model release.
- Major content updates.
- Quality improvements needed.
Don't Re-embed Daily
- Costly.
- Old embeddings still work.
- Schedule monthly/quarterly.
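One way to avoid blanket re-embedding is to store, next to each vector, the model that produced it and a hash of the source text, then re-embed only records where either has changed. The sketch below assumes such bookkeeping exists; `EmbeddingRecord` and `needs_reembedding` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class EmbeddingRecord:
    """Bookkeeping stored alongside each vector (hypothetical schema)."""
    doc_id: str
    model: str
    content_hash: str


def content_hash(text):
    """Stable fingerprint of the text that was embedded."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reembedding(record, current_text, current_model):
    """True if the model changed or the document text changed."""
    return (
        record.model != current_model
        or record.content_hash != content_hash(current_text)
    )


record = EmbeddingRecord("doc-1", "text-embedding-3-small", content_hash("original text"))
print(needs_reembedding(record, "original text", "text-embedding-3-small"))  # unchanged
print(needs_reembedding(record, "edited text", "text-embedding-3-small"))    # content changed
```

A scheduled job can then run this check over the corpus and send only the stale documents to the embedding API.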
12. Israel Specifics
- Hebrew embeddings: Cohere multilingual best.
- Mixed Hebrew/English: still works, slight quality drop.
- Self-host options for privacy-sensitive Israeli companies.
13. Common Pitfalls
- ❌ Wrong embedding model for the content type.
- ❌ No reranking: quality plateaus.
- ❌ Chunks that are too small: context is lost.
- ❌ Mixing embedding models in the same DB: vectors from different models are not comparable.
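On the chunk-size pitfall, one common mitigation is to split text into overlapping windows so each chunk carries some of its neighbor's context. The sketch below splits on whitespace for simplicity; the function name and the default sizes (200 words with 40 words of overlap) are assumptions, and production pipelines usually count tokens rather than words.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word windows of chunk_size that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
```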
14. I will close with a recommendation.
Sample prompts
Embedding model for Hebrew + English mixed corpus?
Pinecone vs pgvector — when each?
Build product recommendation with embeddings.
© 2026 AI Expert Pro | Version 1.0.0