Fix TikTok extraction with proxies by karilaa-dev · Pull Request #60 · karilaa-dev/tt-bot

karilaa-dev · 2026-01-16T05:30:37Z

User description

Summary

Fix TikTok video extraction failing when proxies are configured
Use direct connection for metadata extraction (with browser impersonation)
Use proxies for media downloads to hide server IP

Test plan

Tested video extraction with proxies enabled
Verified proxy is used for downloads
Confirmed extraction bypasses proxy to use impersonate feature

PR Type

Bug fix, Documentation

Description

Fix TikTok extraction with proxies using direct connection for metadata
Simplify proxy handling by temporarily disabling it for extraction
Update CODEBASE_MAP.md with new configuration details
Document hardcoded performance values and removed env vars

Diagram Walkthrough

flowchart LR
  A[Proxy Configured] --> B[Disable Proxy Temporarily]
  B --> C[Extract Metadata with Impersonate]
  C --> D[Restore Proxy for Downloads]
  D --> E[Complete Extraction]
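
The flow in the diagram can be sketched as a small context manager. This is a minimal sketch under stated assumptions, not the code in tiktok_api/client.py: `proxy_disabled`, `extract_then_download`, and the `extract`/`download` callables are hypothetical names, and a plain dict with a `proxy` key stands in for the real options object.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, Optional


@contextmanager
def proxy_disabled(opts: Dict[str, Any]) -> Iterator[Optional[str]]:
    """Temporarily remove the 'proxy' key from opts and restore it on exit."""
    had_proxy = "proxy" in opts
    saved = opts.pop("proxy", None)
    try:
        yield saved
    finally:
        # Restore the proxy even if extraction raised.
        if had_proxy:
            opts["proxy"] = saved


def extract_then_download(
    opts: Dict[str, Any],
    extract: Callable[[Dict[str, Any]], Any],
    download: Callable[[Any, Dict[str, Any]], Any],
) -> Any:
    """Run metadata extraction without a proxy, then download with it."""
    with proxy_disabled(opts):
        # Hypothetical extraction step: sees options with no proxy set.
        info = extract(dict(opts))
    # Proxy is back in place, so the download step sees it again.
    return download(info, dict(opts))
```

Using `try`/`finally` means a failed extraction cannot leave the options stuck in the proxy-less state.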

File Walkthrough

Relevant files

Bug fix

client.py: `Fix TikTok extraction with proxies` (tiktok_api/client.py, +25/-74)
- Simplify proxy handling by temporarily disabling it for metadata extraction
- Use `ie._extract_web_data_and_status()` for both proxy and non-proxy cases
- Remove complex manual extraction logic for proxy scenario
- Update download context to use saved proxy for downloads

Documentation

CODEBASE_MAP.md: `Update codebase documentation` (docs/CODEBASE_MAP.md, +20/-11)
- Update last_mapped date and token counts
- Add Telegram API credentials documentation
- Update performance configuration section
- Add notes about removed fields and hardcoded values

Simplify configuration by removing most performance-related env vars and hardcoding values optimized for maximum resource usage:
- ThreadPoolExecutor: 500 workers (vs default 32)
- aiohttp connections: unlimited (limit=0)
- curl_cffi pool: 10000 max_clients
- Image downloads: no concurrency limit (removed semaphore)

Keep only 3 user-configurable limits via env vars:
- MAX_USER_QUEUE_SIZE (default 0 = no limit)
- STREAMING_DURATION_THRESHOLD (default 300s)
- MAX_VIDEO_DURATION (default 0 = no limit)
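
As an illustration of the split between hardcoded values and the three remaining env vars, here is a minimal sketch. The constant names, the `Limits` dataclass, and `load_limits` are hypothetical and are not taken from the repository; only the env var names and defaults come from the commit message above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hardcoded performance values (names are illustrative assumptions).
THREAD_POOL_WORKERS = 500       # ThreadPoolExecutor workers
AIOHTTP_CONNECTION_LIMIT = 0    # 0 means unlimited connections
CURL_CFFI_MAX_CLIENTS = 10_000  # curl_cffi pool size


@dataclass(frozen=True)
class Limits:
    """The three user-configurable limits, with the documented defaults."""
    max_user_queue_size: int = 0              # 0 = no limit
    streaming_duration_threshold: int = 300   # seconds
    max_video_duration: int = 0               # 0 = no limit


def load_limits(env: Optional[Mapping[str, str]] = None) -> Limits:
    """Read the limits from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return Limits(
        max_user_queue_size=int(env.get("MAX_USER_QUEUE_SIZE", 0)),
        streaming_duration_threshold=int(
            env.get("STREAMING_DURATION_THRESHOLD", 300)
        ),
        max_video_duration=int(env.get("MAX_VIDEO_DURATION", 0)),
    )
```

Passing the mapping explicitly keeps the loader easy to test without touching the process environment.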

…adata

TikTok's browser impersonation (impersonate=True) doesn't work through HTTP proxies, causing extraction to fail with "Unable to extract webpage video data".

Changed approach:
- Use direct connection (no proxy) for video info extraction with impersonate
- Use proxy for media downloads to hide server IP

This fixes the issue where all proxy attempts would fail due to TikTok's JavaScript challenge blocking non-browser requests through proxies.

zam-review · 2026-01-16T05:37:26Z

PR Description updated to latest commit (8745d5a)

karilaa-dev · 2026-01-16T05:43:25Z

/review

zam-review · 2026-01-16T05:43:40Z

PR Reviewer Guide 🔍

(Review updated until commit `14c1abe`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Resource Leak Potential
If `yt_dlp.YoutubeDL(ydl_opts)` fails on line 864, the `old_ydl` instance will never be closed, potentially leaking resources. Consider wrapping this in a try-finally block or using a context manager pattern to ensure cleanup even if initialization fails.

    old_ydl = ydl
    ydl = yt_dlp.YoutubeDL(ydl_opts)
    old_ydl.close()  # Close old instance after new one is ready

Error Handling Change
The code now returns `None, "extraction", None` instead of raising `TikTokExtractionError` when video data extraction fails (lines 888, 906). This changes the error handling contract and may confuse callers expecting exceptions. Verify that all calling code handles this return value correctly.

    # Validate that we got video data
    if not video_data:
        logger.error(f"No video data returned for {video_id} (status={status})")
        return None, "extraction", None

Hardcoded Configuration Values
The documentation indicates that thread pool (500 workers) and curl_cffi connections (10,000) are now hardcoded for maximum throughput, removing previous environment variables (`THREAD_POOL_SIZE`, `MAX_USER_QUEUE_SIZE`, `MAX_CONCURRENT_IMAGES`). This reduces deployment flexibility and should be validated against different deployment scenarios.

    | `MAX_USER_QUEUE_SIZE` | 0 | Max concurrent per user (0=unlimited) |
    | `MAX_VIDEO_DURATION` | 0 | Max video duration (seconds, 0=unlimited) |
    | `STREAMING_DURATION_THRESHOLD` | 300 | Stream videos longer than this (seconds) |
    | `LOG_LEVEL` | INFO | Logging level |

    Note: Thread pool (500 workers) and curl_cffi connections (10,000) are hardcoded for maximum throughput.

karilaa-dev · 2026-01-16T05:51:46Z

/review

zam-review · 2026-01-16T05:52:01Z

Persistent review updated to latest commit 248c050

karilaa-dev · 2026-01-16T05:57:55Z

/review

zam-review · 2026-01-16T05:58:10Z

Persistent review updated to latest commit 14c1abe

karilaa-dev added 5 commits January 15, 2026 21:33

Change positions of values in .env.example

7295e49

Add Telegram API credentials to .env.example

8d109a8

Update CODEBASE_MAP.md

acff6be

karilaa-dev force-pushed the dev branch from 8aad275 to 8745d5a on January 16, 2026 05:33

zam-review bot added the Review effort 3/5 label Jan 16, 2026

Improve error handling in ydl recreation for proxy extraction

248c050

Create the new YoutubeDL instance before closing the old one to ensure we have a valid ydl even if initialization fails.
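
The ordering described in this commit message (build the replacement first, close the old instance only afterwards) can be sketched as follows. This is an illustrative sketch, not the project's code: `FakeClient` is a hypothetical stand-in for a closable object such as the `YoutubeDL` instance, and `replace_client` is a hypothetical helper name.

```python
from typing import Any, Callable, Dict


class FakeClient:
    """Hypothetical stand-in for a closable client object."""

    def __init__(self, opts: Dict[str, Any]) -> None:
        if opts.get("fail"):
            raise ValueError("bad options")  # simulate a failing constructor
        self.opts = opts
        self.closed = False

    def close(self) -> None:
        self.closed = True


def replace_client(
    current: FakeClient,
    new_opts: Dict[str, Any],
    factory: Callable[[Dict[str, Any]], FakeClient] = FakeClient,
) -> FakeClient:
    """Create the replacement first; close the old one only on success."""
    new_client = factory(new_opts)  # may raise; `current` is left untouched
    current.close()                 # reached only if construction succeeded
    return new_client
```

If the constructor raises, the caller still holds a valid, open `current` instance and can keep using it or close it in its own cleanup path.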

Add video_data validation after TikTok extraction

14c1abe

Return extraction error if video_data is None despite a non-error status code, preventing downstream issues from invalid data.
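
A minimal sketch of this validation step, assuming the three-slot return shape `None, "extraction", None` quoted in the review above. The function name `validate_video_data` and the meaning assigned to each tuple slot (data, error kind, extra) are assumptions for illustration, not the project's documented contract.

```python
import logging
from typing import Any, Optional, Tuple

logger = logging.getLogger(__name__)

# Assumed shape: (video_data, error_kind, extra). Slot meanings are a guess.
Result = Tuple[Optional[dict], Optional[str], Optional[Any]]


def validate_video_data(
    video_id: str, video_data: Optional[dict], status: Any
) -> Result:
    """Return an 'extraction' error tuple when no video data came back."""
    if not video_data:
        # Covers both None and an empty dict, even if status looked fine.
        logger.error(
            "No video data returned for %s (status=%s)", video_id, status
        )
        return None, "extraction", None
    return video_data, None, None
```

Callers that previously relied on an exception would need to check the second slot of the tuple instead.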

karilaa-dev merged commit 9fd94b7 into main Jan 16, 2026
1 check passed

karilaa-dev merged 7 commits into main from dev
