How to Scrape Yelp — Complete Guide (2026)
Learn how to scrape Yelp business listings, reviews, and ratings. Compare Python scrapers with the SimpleCrawl API for reliable Yelp data extraction.
Yelp is the leading platform for local business reviews, with data on millions of businesses across restaurants, services, retail, and more. Scraping Yelp enables competitive analysis, reputation monitoring, location intelligence, and lead generation for local businesses. This guide covers practical methods for extracting Yelp business data, reviews, and ratings at scale.
What Data Can You Extract from Yelp?
Yelp business and review pages contain rich structured data:
- Business details — name, address, phone number, website, hours, price range ($ to $$$$), categories, amenities, services offered
- Ratings and reviews — overall star rating, review count, individual review text, reviewer name, review date, star rating per review, photos attached to reviews
- Photos — business photos, food photos, interior/exterior shots, user-uploaded images
- Search results — businesses matching query and location, with ratings, review counts, and category tags
- Menu data — menu items, prices, popular dishes (for restaurants)
- Service quotes — request-a-quote data, response time, hiring rate
- Competitor data — "People also viewed" businesses, similar businesses nearby
This data powers local SEO tools, reputation management platforms, price monitoring for local services, and market research for franchises and multi-location businesses.
Challenges When Scraping Yelp
Yelp has mature anti-scraping defenses:
JavaScript Rendering Requirements
Yelp's frontend is built with React, and much of the review content, business detail, and search-result markup is filled in client-side. A plain HTTP fetch of the HTML often returns skeleton markup with little usable data, so a rendering browser is usually needed.
Review Pagination and Filtering
Yelp paginates reviews (10 per page) and moves some reviews into a separate "not recommended" section. Capturing every review therefore takes one request per page of recommended reviews, plus a separate pass over the "not recommended" section, which is not part of the main paginated list.
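As a rough illustration of the pagination arithmetic, the sketch below builds one URL per review page. It assumes the same `start` offset parameter that this guide's Playwright example uses; `review_page_urls` is a hypothetical helper name, and Yelp may change its URL scheme at any time.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def review_page_urls(business_url: str, total_reviews: int, page_size: int = 10) -> list:
    """Return one URL per review page, using a `start` offset (assumed parameter)."""
    parts = urlsplit(business_url)
    # Keep existing query parameters, but drop any stale `start` value.
    base_query = [(k, v) for k, v in parse_qsl(parts.query) if k != "start"]
    urls = []
    for offset in range(0, max(total_reviews, 0), page_size):
        query = urlencode(base_query + [("start", str(offset))])
        urls.append(urlunsplit((parts.scheme, parts.netloc, parts.path, query, "")))
    return urls


pages = review_page_urls("https://www.yelp.com/biz/tartine-bakery-san-francisco", 25)
for url in pages:
    print(url)
```

With 25 reviews and 10 per page, this yields three URLs with offsets 0, 10, and 20. Reviews in the "not recommended" section are not covered by these offsets.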
Anti-Bot Detection
Yelp uses device fingerprinting, behavioral analysis, and rate limiting. Automated requests are detected through TLS fingerprinting, cookie analysis, and request timing patterns.
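Because request timing is one of the signals mentioned above, a scraper that retries immediately after a failure tends to make things worse. The following is a minimal sketch of exponential backoff with full jitter for spacing out retries. `backoff_delays` and its defaults are illustrative assumptions, not values recommended by Yelp, and pacing alone does not address fingerprinting.

```python
import random
from typing import List, Optional


def backoff_delays(
    retries: int,
    base: float = 2.0,
    cap: float = 60.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one wait time in seconds per retry: exponential growth with full jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # The ceiling doubles each attempt (2s, 4s, 8s, ...) up to `cap`.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick a random point between 0 and the ceiling.
        delays.append(rng.uniform(0, ceiling))
    return delays


# A seeded generator makes the schedule reproducible while experimenting.
schedule = backoff_delays(5, rng=random.Random(0))
print([round(d, 2) for d in schedule])
```

In a real loop, each value would be passed to `time.sleep()` before the next attempt.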
Content Obfuscation
Yelp occasionally obfuscates phone numbers and addresses using CSS tricks or JavaScript-rendered text, making simple HTML parsing insufficient.
API Limitations
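Yelp's Fusion API exists, but it has strict daily rate limits (5,000 requests/day is the commonly cited figure; your plan's quota may differ) and does not expose full review text: the reviews endpoint returns only a few excerpts per business, each truncated to about 160 characters. That makes the official API inadequate for review analysis.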
Method 1: Using SimpleCrawl API (Easiest)
SimpleCrawl renders Yelp pages fully and returns structured business data:
```bash
curl -X POST https://api.simplecrawl.com/v1/scrape \
  -H "Authorization: Bearer sc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.yelp.com/biz/tartine-bakery-san-francisco",
    "format": "extract",
    "schema": {
      "name": "string",
      "rating": "number",
      "review_count": "number",
      "price_range": "string",
      "categories": ["string"],
      "address": "string",
      "phone": "string",
      "hours": "string",
      "reviews": [{
        "author": "string",
        "rating": "number",
        "date": "string",
        "text": "string"
      }]
    }
  }'
```
For search results:
```json
{
  "url": "https://www.yelp.com/search?find_desc=pizza&find_loc=Chicago",
  "format": "extract",
  "schema": {
    "businesses": [{
      "name": "string",
      "rating": "number",
      "review_count": "number",
      "price_range": "string",
      "address": "string",
      "categories": ["string"]
    }]
  }
}
```
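For readers who would rather call the endpoint from Python than from curl, here is a minimal sketch that builds the same POST request using only the standard library. The endpoint, headers, and body fields are copied from the curl example; `build_extract_request` is a hypothetical helper, and because the response format is not described in this guide, the send step is left commented out.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this guide.
SIMPLECRAWL_ENDPOINT = "https://api.simplecrawl.com/v1/scrape"


def build_extract_request(api_key: str, page_url: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": page_url, "format": "extract", "schema": schema})
    return urllib.request.Request(
        SIMPLECRAWL_ENDPOINT,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


search_schema = {
    "businesses": [{
        "name": "string",
        "rating": "number",
        "review_count": "number",
    }]
}
request = build_extract_request(
    "sc_your_api_key",
    "https://www.yelp.com/search?find_desc=pizza&find_loc=Chicago",
    search_schema,
)

# Sending is left commented out. The line parsing the body assumes a JSON response.
# with urllib.request.urlopen(request, timeout=60) as resp:
#     result = json.loads(resp.read())
```

Separating request construction from sending keeps the payload easy to inspect and log before any credits are spent.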
Method 2: DIY with Python (Manual)
Using Yelp's Fusion API (Limited)
```python
import requests

API_KEY = "YOUR_YELP_API_KEY"


def search_yelp(term: str, location: str, limit: int = 20) -> list:
    """Search businesses through the Fusion API and return flat dicts."""
    url = "https://api.yelp.com/v3/businesses/search"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    params = {"term": term, "location": location, "limit": limit}

    response = requests.get(url, headers=headers, params=params, timeout=30)
    # Fail loudly on auth or rate-limit errors instead of parsing an error body.
    response.raise_for_status()
    data = response.json()

    businesses = []
    for biz in data.get("businesses", []):
        businesses.append({
            "name": biz["name"],
            "rating": biz["rating"],
            "review_count": biz["review_count"],
            "price": biz.get("price", "N/A"),  # not every business has a price tier
            "address": ", ".join(biz["location"]["display_address"]),
            "phone": biz.get("display_phone", ""),
            "url": biz["url"],
        })
    return businesses


results = search_yelp("restaurants", "San Francisco")
for biz in results:
    print(f"{biz['name']} - {biz['rating']}★ ({biz['review_count']} reviews)")
```
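The dictionaries returned by `search_yelp` are flat, so they can be written straight to CSV for use in a spreadsheet or a later analysis step. The sketch below shows one way to do that with the standard library. `businesses_to_csv` is a hypothetical helper, and the sample row is made-up data, not a real Yelp result.

```python
import csv
import io

# Column order mirrors the keys produced by search_yelp.
FIELDS = ["name", "rating", "review_count", "price", "address", "phone", "url"]


def businesses_to_csv(businesses: list) -> str:
    """Serialize a list of business dicts to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops unexpected keys; missing keys become empty cells.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(businesses)
    return buffer.getvalue()


# Made-up sample row for illustration only.
sample = [{
    "name": "Example Pizza",
    "rating": 4.5,
    "review_count": 120,
    "price": "$$",
    "address": "1 Example St, Chicago, IL 60601",
    "phone": "",
    "url": "https://www.yelp.com/biz/example-pizza-chicago",
}]
csv_text = businesses_to_csv(sample)
print(csv_text)
```

To save the output to disk, write `csv_text` to a file opened with `newline=""` so the row endings are not doubled on Windows.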
Web Scraping with Playwright (Full Reviews)
The Fusion API only returns review snippets. For full review text, you need to scrape:
```python
import time

from playwright.sync_api import sync_playwright


def scrape_yelp_reviews(business_url: str, max_pages: int = 3) -> list:
    """Collect reviews from the first `max_pages` review pages of a business."""
    reviews = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        for page_num in range(max_pages):
            # Yelp shows 10 reviews per page; `start` is the review offset.
            url = f"{business_url}?start={page_num * 10}"
            page.goto(url, wait_until="networkidle")
            time.sleep(2)  # brief pause so late-loading reviews can render

            review_els = page.query_selector_all("[data-review-id]")
            for el in review_els:
                author = el.query_selector("a[href*='/user_details']")
                # Hashed class names like this one change between deploys,
                # so expect to update this selector.
                text = el.query_selector("span.raw__09f24__T4Ezm")
                rating = el.query_selector("[aria-label*='star rating']")
                reviews.append({
                    "author": author.text_content().strip() if author else None,
                    "text": text.text_content().strip() if text else None,
                    "rating": rating.get_attribute("aria-label") if rating else None,
                })
        browser.close()
    return reviews


reviews = scrape_yelp_reviews("https://www.yelp.com/biz/tartine-bakery-san-francisco")
for r in reviews:
    # `text` is None when the selector no longer matches, so guard the slice.
    print(f"{r['rating']} - {(r['text'] or '')[:100]}...")
```
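The scraper above stores each rating as the raw `aria-label` string rather than a number. A small parser can convert it, as in this sketch. It assumes labels of the form "4.5 star rating", inferred from the selector used above; `parse_star_rating` is a hypothetical helper, and the real label wording may differ.

```python
import re
from typing import Optional

# Assumed label shape: a number followed by "star rating", e.g. "5 star rating".
_RATING_RE = re.compile(r"(\d+(?:\.\d+)?)\s*star rating", re.IGNORECASE)


def parse_star_rating(label: Optional[str]) -> Optional[float]:
    """Extract the numeric rating from an aria-label, or None if it does not match."""
    if not label:
        return None
    match = _RATING_RE.search(label)
    return float(match.group(1)) if match else None


for raw in ["5 star rating", "4.5 star rating", None, "unrated"]:
    print(repr(raw), "->", parse_star_rating(raw))
```

Returning `None` for unmatched labels keeps a markup change visible in the data instead of silently producing a wrong number.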
For more Python scraping techniques, see our web scraping with Python guide.
Why SimpleCrawl Is Better for Yelp
| Feature | Yelp Fusion API | DIY Scraping | SimpleCrawl |
|---|---|---|---|
| Full review text | No (snippets only) | Yes (fragile) | Yes (stable) |
| Rate limits | 5,000/day | IP-based | High throughput |
| Auth required | API key | No | API key |
| Review photos | URLs only | Complex | Included |
| Business hours | Yes | Parsing required | Structured |
| Maintenance | Low (official API) | High | Zero |
SimpleCrawl bridges the gap between Yelp's limited API and unreliable DIY scraping. Full review text, structured data, no maintenance. See pricing for details.
Legal Considerations
- Yelp's ToS prohibit scraping — Yelp explicitly prohibits automated access in their Terms of Service.
- Yelp v. scrapers — Yelp has pursued legal action against review scraping operations, particularly those that republish or manipulate review data.
- Review content ownership — Yelp reviews are copyrighted by their authors and licensed to Yelp. Republishing full reviews without permission raises copyright issues.
- Business data — business names, addresses, and phone numbers are generally considered public facts and are less legally protected than review content.
- Use the Fusion API when possible — for basic business search data, Yelp's official API is the safest option. Use scraping for data the API doesn't provide (full reviews, menus).
Check Yelp's crawling permissions with our robots.txt checker.
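The same kind of check can be scripted with Python's built-in `urllib.robotparser`. The sketch below evaluates URLs against a robots.txt body supplied as text. The sample rules are invented for illustration and are not Yelp's actual robots.txt, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Invented rules for illustration; fetch the site's real robots.txt before relying on this.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /biz/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for candidate in [
    "https://www.yelp.com/biz/tartine-bakery-san-francisco",
    "https://www.yelp.com/search?find_desc=pizza&find_loc=Chicago",
]:
    print(candidate, is_allowed(SAMPLE_ROBOTS_TXT, "my-crawler", candidate))
```

Note that robots.txt expresses crawling preferences only; it does not replace the Terms of Service points listed above.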
FAQ
Can I get full Yelp reviews through the API?
No. Yelp's Fusion API returns only the first 160 characters of each review. For full review text, you need to scrape the web page or use SimpleCrawl's extract mode.
How do I scrape "not recommended" Yelp reviews?
Yelp hides reviews it deems unreliable behind a separate page. These can be accessed by navigating to the "not recommended reviews" link on the business page. SimpleCrawl can extract these with the right URL.
Is Yelp scraping useful for competitive analysis?
Absolutely. Monitoring competitor reviews, ratings, and response patterns provides actionable intelligence for local businesses. SimpleCrawl makes this data accessible in structured JSON.
How many Yelp pages can I scrape per day?
DIY scrapers typically get blocked after 200-500 pages. SimpleCrawl supports thousands of pages daily through its distributed proxy infrastructure. Check pricing for credit details.
Can I scrape Yelp restaurant menus?
Yes. Yelp menu pages are accessible through scraping. SimpleCrawl extracts menu items, prices, and descriptions when available on the business page.