RetroBrowser is a fully functional web browser designed to run on Windows 98, combining classic architecture (Win32 API, GDI, Winsock 2.2) with modern web accessibility (HTTPS, HTTP/2) through an intelligent Python-based proxy server.

Dung2005qk/RetroBrowser_Project

Operating Systems Major Assignment


🌐 RetroBrowser Project

Windows 98 Retro Web Browser with Modern Connectivity

A Comprehensive Operating Systems Course Project

Win98 Compatible • C++98 • Python 3.x • Test Coverage

Overview • Architecture • Features • Installation • Usage • Testing



🎯 Overview

Project Introduction

RetroBrowser is a fully functional web browser designed to run on Windows 98. It combines a classic architecture (Win32 API, GDI, Winsock 2.2) with modern web access (HTTPS, HTTP/2) through an intelligent Python proxy server.

This project demonstrates how Operating Systems concepts are applied in practice:

  • Process Management: Multi-threaded image loading
  • Memory Management: RAII pattern, resource lifecycle
  • Inter-Process Communication: Browser ↔ Proxy over TCP/IP
  • File I/O: Cache management, log handling
  • Concurrency: Async image downloads, event-driven architecture
  • API Programming: Win32 API, Winsock, GDI mastery

Academic Objectives

1. Win32 API Mastery

  • Window management (CreateWindowEx, message loop)
  • Event handling (WM_PAINT, WM_SIZE, WM_COMMAND)
  • GDI rendering (TextOut, BitBlt, double-buffering)
  • Controls (Edit, Button, Static) and layout

2. Network Programming (Winsock 2.2)

  • Socket initialization (WSAStartup)
  • TCP connection management
  • HTTP/1.0 protocol implementation
  • Blocking I/O model
  • Error handling (timeout, connection refused)

3. Concurrency & Threading

  • Worker threads for async image loading
  • Thread synchronization (PostMessage)
  • Resource sharing (image cache)
  • Deadlock prevention

4. Memory Management

  • RAII pattern (Resource Acquisition Is Initialization)
  • Manual memory management (new/delete)
  • GDI object lifecycle (HFONT, HBITMAP, HDC)
  • Memory leak prevention

5. Parser Design

  • Finite State Machine (FSM)
  • Single-pass parsing
  • Error recovery
  • String processing

Scope & Limitations

✅ In Scope (Implemented)

  • HTTP GET requests through the proxy
  • Basic HTML parsing (H1-H6, P, A, IMG)
  • Text rendering with multiple fonts
  • Hyperlink navigation (clickable links)
  • Asynchronous image loading (BMP format)
  • Progressive rendering
  • Scrolling support
  • Error handling (404, timeout, network errors)

⚠️ Limitations (Out of Scope)

  • No full CSS support: Basic inline styles only
  • No JavaScript: Security and compatibility reasons
  • No POST/Forms: GET requests only
  • No HTTPS direct: Proxy handles TLS
  • Limited image formats: BMP only (Win98 native)
  • No tabs: Single window design
  • No bookmarks/history UI: A history stack exists, but it has no UI

🎯 Design Rationale

  1. Time constraint: 8 weeks of development
  2. Win98 limitations: 200MHz CPU, 64MB RAM
  3. Educational focus: OS concepts over feature completeness
  4. Stability priority: Simple = Stable on legacy hardware

🏗️ System Architecture

Overview Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                         WIN98 VIRTUAL MACHINE                           │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                    RetroBrowser.exe (C++)                         │  │
│  │  ┌─────────────┐  ┌──────────────┐  ┌────────────┐  ┌────────┐    │  │
│  │  │   UI Module │──│ Main/Core    │──│   Network  │──│ Parser │    │  │
│  │  │  (Win32 GUI)│  │(Orchestrator)│  │  (Winsock) │  │ (FSM)  │    │  │
│  │  └──────┬──────┘  └──────┬───────┘  └─────┬──────┘  └────┬───┘    │  │
│  │         │                │                │              │        │  │
│  │         │                └─────────┬──────┘              │        │  │
│  │         │                          │                     │        │  │
│  │         │                          ▼                     ▼        │  │
│  │         │                    ┌─────────────┐        ┌──────────┐  │  │
│  │         └───────────────────►│  Renderer   │◄───────│  Blocks  │  │  │
│  │                              │  (GDI 2D)   │        │  (Data)  │  │  │
│  │                              └─────────────┘        └──────────┘  │  │  
│  └───────────────────────────────────┬───────────────────────────────┘  │
│                                      │ HTTP/1.0 over TCP                │
└──────────────────────────────────────┼──────────────────────────────────┘
                                       │
                                       │ Host-Only Network
                                       │
┌──────────────────────────────────────▼─────────────────────────────────┐
│                         HOST MACHINE (Modern OS)                       │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                    Python Proxy Server                           │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────────────────┐ │  │
│  │  │   Socket     │──│  Sanitizer   │──│   HTTP/HTTPS Client     │ │  │
│  │  │  Listener    │  │  (BS4/Regex) │  │   (Requests Library)    │ │  │
│  │  │  (8080)      │  │              │  │                         │ │  │
│  │  └──────────────┘  └──────────────┘  └────────────┬────────────┘ │  │
│  └───────────────────────────────────────────────────┼──────────────┘  │
│                                                      │ HTTPS/HTTP2     │
└──────────────────────────────────────────────────────┼─────────────────┘
                                                       │
                                                       ▼
                                                ╔═══════════════╗
                                                ║  Modern Web   ║
                                                ║  (Internet)   ║
                                                ╚═══════════════╝

Processing Flow

1. User Input Flow

User types URL → Address Bar (Edit Control)
               ↓
User clicks "Go" → WM_COMMAND message
               ↓
UI Module → OnNavigate callback
               ↓
Main/Core validates URL → NetworkManager

2. Network Flow

NetworkManager::FetchUrl(url)
               ↓
Create TCP socket (Winsock)
               ↓
Connect to proxy (127.0.0.1:8080)
               ↓
Send: GET /http://example.com HTTP/1.0
               ↓
Proxy fetches via HTTPS
               ↓
Proxy sanitizes HTML (remove scripts/styles)
               ↓
NetworkManager receives response
               ↓
Return HttpResponse struct
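
The request line in the flow above can be sketched as follows. This is a minimal illustration, not the project's NetworkManager code: BuildProxyRequest is a hypothetical helper, and the sketch assumes the proxy accepts a bare request line with no extra headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the HTTP/1.0 request sent to the proxy.
// The target URL is embedded in the path, as in "GET /http://example.com HTTP/1.0".
std::string BuildProxyRequest(const std::string& targetUrl)
{
    assert(!targetUrl.empty());
    std::string request = "GET /";
    request += targetUrl;
    request += " HTTP/1.0\r\n";
    request += "\r\n";   // blank line terminates the (empty) header section
    return request;
}
```

The resulting string is what would be passed to send() on the connected socket. Keeping it in a separate function makes the wire format easy to check without a network.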

3. Parsing Flow

HtmlParser::Parse(html_vector)
               ↓
FSM single-pass through HTML
               ↓
Build flat block list:
  [{type: H1, content: "Title"},
   {type: P, content: "Text"},
   {type: IMG, src: "image.bmp"}]
               ↓
Return ParseResult

4. Rendering Flow

Renderer::SetContent(blocks)
               ↓
Phase 1 - Calculate Layout:
  Measure text, compute positions
               ↓
Phase 2 - Paint on WM_PAINT:
  Draw to offscreen DC
  BitBlt to screen (flicker-free)
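
The two phases above can be sketched in portable C++. The names LayoutBox, CalculateLayout and VisibleBlocks are assumptions made for this illustration. Block heights are taken as already measured, and the actual GDI drawing is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout record: one entry per block, filled in by phase 1.
struct LayoutBox {
    int top;     // absolute y position in document coordinates
    int height;  // measured height in pixels
};

// Phase 1: stack blocks vertically with a fixed spacing between them.
std::vector<LayoutBox> CalculateLayout(const std::vector<int>& heights, int spacing)
{
    assert(spacing >= 0);
    std::vector<LayoutBox> boxes;
    int y = 0;
    for (std::size_t i = 0; i < heights.size(); ++i) {
        LayoutBox box;
        box.top = y;
        box.height = heights[i];
        boxes.push_back(box);
        y += heights[i] + spacing;
    }
    return boxes;
}

// Phase 2 helper: indices of the boxes that intersect the viewport
// [scrollY, scrollY + viewHeight). Only these need to be painted.
std::vector<std::size_t> VisibleBlocks(const std::vector<LayoutBox>& boxes,
                                       int scrollY, int viewHeight)
{
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const int bottom = boxes[i].top + boxes[i].height;
        if (bottom > scrollY && boxes[i].top < scrollY + viewHeight) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

Scrolling only changes scrollY, so phase 1 does not have to run again until the content or the window width changes.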

Main Modules

📦 1. Core/Main Module (main.cpp 585 lines, stdafx.h 222 lines, stdafx.cpp 1 line)

Responsibility: Orchestrator that coordinates all modules

Functions:

  • WinMain entry point
  • Callback routing (OnNavigate, OnLinkClick)
  • State management (current URL, history)
  • Image loading orchestration
  • Error handling dialogs

🖥️ 2. UI Module (ui.h 296 lines, ui.cpp 606 lines)

Responsibility: Win32 GUI management

Functions:

  • Window creation (CreateWindowEx)
  • Controls: Address bar, Go button, Status bar
  • Message loop (GetMessage/DispatchMessage)
  • Event handling (WM_PAINT, WM_SIZE, WM_COMMAND)
  • Custom messages (UIM_IMAGE_LOADED, etc.)

🌐 3. Network Module (network.h 411 lines, network.cpp 748 lines)

Responsibility: HTTP communication over Winsock

Functions:

  • Winsock lifecycle (WSAStartup/WSACleanup)
  • HTTP/1.0 client implementation
  • Blocking socket I/O
  • Response parsing
  • Binary-safe data handling
  • Error code mapping

📝 4. Parser Module (parser.h 900 lines, parser.cpp 1,513 lines)

Responsibility: HTML → Structured blocks

Functions:

  • Finite State Machine (12 states)
  • Single-pass O(n) parsing
  • Forgiving error recovery
  • Flat block list output (no DOM tree)
  • Entity decoding (&amp; → &)
  • Tag/attribute extraction

🎨 5. Renderer Module (renderer.h 276 lines, renderer.cpp 1,130 lines)

Responsibility: GDI rendering engine

Functions:

  • Two-phase (Layout + Paint) architecture
  • Double-buffering (flicker-free)
  • Font management (H1/H2/P/A caching)
  • Image cache (LRU eviction)
  • Scrolling support
  • Click detection for links

✨ Key Features

📝 HTML Support

RetroBrowser implements comprehensive HTML parsing capabilities optimized for Windows 98:

Comprehensive Tag Support - H1-H3, P, A, IMG, BR, UL, LI, DIV, SPAN
Semantic HTML5 Tags - header, footer, nav, article, section, aside, main (mapped to generic containers)
Table Content Preservation - TABLE, TR, TD, TH, TBODY, THEAD, TFOOT mapped to DIV for content flow
HTML Entity Decoding - &amp;, &lt;, &gt;, &quot;, numeric entities (&#..;, &#x..;)
Forgiving Parser - FSM with 12 states, error recovery, handles malformed HTML gracefully
Single-Pass O(n) Parsing - Efficient parsing without backtracking or recursion
Warning System - Non-fatal warnings for skipped content (scripts, unsupported tags)

🎨 CSS Styling

Modern inline CSS styling support brings color and visual richness to Win98:

Inline Style Attributes - color, background-color, font-weight, font-style, font-size
Color Format Support - Named colors (red, green, blue), hex (#RGB, #RRGGBB), rgb(r,g,b)
Legacy HTML 3.2 Attributes - BGCOLOR, TEXT, COLOR on BODY/FONT tags
CSS Property Extraction - Parser extracts CSS into HtmlBlock structures for fast rendering
Per-Block Styling - Custom text and background colors for each content block
Page-Level Styling - BODY tag attributes set global page colors
GDI Integration - CSS properties map directly to Win32 GDI functions

Limitations: No external stylesheets, no CSS selectors, no cascading - inline styles only

🖼️ Image Handling

Sophisticated asynchronous image loading system optimized for Win98's limited resources:

Async Multi-Threaded Loading - Worker threads download images without blocking UI
BMP Format Support - Native Win98 GDI format via LoadImageA() for zero decoding overhead
LRU Cache with Eviction - Bounded memory usage (50 images max) with automatic cleanup
GDI BitBlt Rendering - Hardware-accelerated image drawing with StretchBlt scaling
Placeholder Rendering - Empty rectangles (150x100px) reserve space during loading
Alt Text Fallback - Display alt attribute text when images fail to load
Progressive Display - Images appear as they load, page remains usable immediately
Memory Management - RAII pattern ensures no HBITMAP leaks, respects Win98 GDI heap limits

Constraints: Max 4096x4096 pixels, 10MB file size recommended, BMP only (JPEG/PNG future)

🎨 Rendering Engine

High-performance GDI-based rendering engine achieving 60fps scrolling on 200MHz CPUs:

Two-Phase Architecture - Separate layout calculation (expensive, rare) and paint (fast, frequent)
Double-Buffering - Offscreen DC rendering eliminates flicker, atomic BitBlt to screen
Font Caching - Pre-created fonts for H1/H2/H3/P/A avoid expensive CreateFont() calls
DPI-Aware Sizing - Font sizes adapt to display DPI using MulDiv formula
Clipping Optimization - Only render visible blocks (10x speedup on large pages)
Vertical Scrolling - Smooth pixel-perfect scrolling with Win32 scrollbar integration
Hit-Testing - Precise click detection for hyperlinks using pre-computed clickable areas
Custom Colors - Per-block text and background colors from CSS properties
Link Styling - Blue underlined text for hyperlinks with hover support

Performance: 15-20ms paint time for 20 visible blocks, 100-150ms layout for 200-block page
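
The DPI-aware sizing mentioned above usually follows the standard Win32 idiom of computing a font height as -MulDiv(pointSize, dpi, 72). A minimal portable sketch of that arithmetic follows. MulDivRounded is a stand-in written for this example so that it compiles without windows.h, and FontHeightFromPoints is a hypothetical name.

```cpp
#include <cassert>

// Portable stand-in for Win32 MulDiv(): (a * b) / c rounded to nearest.
// This sketch only handles non-negative a and b and a positive c.
int MulDivRounded(int a, int b, int c)
{
    assert(a >= 0 && b >= 0 && c > 0);
    return static_cast<int>((static_cast<long long>(a) * b + c / 2) / c);
}

// Converts a point size to a LOGFONT-style height for the given vertical DPI.
// A negative height asks the font mapper to match character height.
int FontHeightFromPoints(int pointSize, int dpiY)
{
    return -MulDivRounded(pointSize, dpiY, 72);
}
```

On Windows the dpiY argument would typically come from GetDeviceCaps(hdc, LOGPIXELSY), and the result would be passed as the height argument of CreateFont().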

🌐 Network Layer

Robust HTTP/1.0 client with intelligent proxy integration for modern web access:

HTTP/1.0 Protocol - Full GET request implementation via Winsock 2.2
Proxy Integration - Transparent HTTPS/HTTP2 support through Python proxy server
Intelligent Sanitization - XSS prevention, script/style removal, dangerous attribute filtering
Error Handling - Timeout (30s), connection refused, 404/500 status codes with clear messages
Binary-Safe - Handles binary image data and text content correctly
Blocking I/O - Simple socket model suitable for Win98 single-threaded architecture
Connection Management - Proper socket lifecycle (WSAStartup, connect, send, recv, closesocket)

Architecture: Browser (Win98) ↔ TCP/IP ↔ Proxy (Modern OS) ↔ HTTPS ↔ Internet


🎨 CSS Styling Support

RetroBrowser includes support for inline CSS styling, enabling modern web pages to render with proper colors, fonts, and visual styling on Windows 98. This feature allows the browser to display styled content from contemporary websites while maintaining compatibility with legacy hardware.

Inline Style Attributes

The parser extracts and applies the following CSS properties from style attributes:

Property          Description                 Example Values
color             Text color                  "red", "#00FF00", "rgb(0,255,0)"
background-color  Block background color      "black", "#FFF", "rgb(255,255,255)"
font-weight       Text weight (normal/bold)   "normal", "bold", "700"
font-style        Text style (normal/italic)  "normal", "italic"
font-size         Text size in pixels         "16px", "24px"

Usage Example:

<p style="color: green; background-color: black; font-weight: bold;">
  Styled text on Win98!
</p>

Color Format Support

The CSS color parser supports three standard color formats:

1. Named Colors

Standard HTML/CSS color names:

  • "red", "green", "blue", "black", "white"
  • "yellow", "cyan", "magenta", "gray"
  • And other standard CSS named colors

2. Hexadecimal Colors

  • Short format: #RGB (e.g., #F00 for red, #0F0 for green)
  • Full format: #RRGGBB (e.g., #FF0000 for red, #00FF00 for green)

3. RGB Function

  • Format: rgb(r, g, b) where r, g, b are 0-255
  • Example: rgb(255, 0, 0) for red, rgb(0, 255, 0) for green
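
A sketch of how the three formats could be parsed into a GDI-style color value. This is an illustration under assumptions, not the project's parser: ParseCssColor and MakeColor are hypothetical names, the named-color table is cut down to four entries, and -1 is used as the "not recognized" sentinel. MakeColor packs channels in the same 0x00BBGGRR layout as the Win32 RGB() macro.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Packs channels the way the Win32 RGB() macro does: 0x00BBGGRR.
int MakeColor(int r, int g, int b)
{
    return (r & 0xFF) | ((g & 0xFF) << 8) | ((b & 0xFF) << 16);
}

// Returns 0..15 for a hex digit, or -1 for any other character.
static int HexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Returns the packed color, or -1 if the value is not recognized.
int ParseCssColor(const std::string& value)
{
    // Hex: #RGB or #RRGGBB
    if (!value.empty() && value[0] == '#') {
        int d[6];
        const std::size_t n = value.size() - 1;
        if (n != 3 && n != 6) return -1;
        for (std::size_t i = 0; i < n; ++i) {
            d[i] = HexDigit(value[i + 1]);
            if (d[i] < 0) return -1;
        }
        if (n == 3) return MakeColor(d[0] * 17, d[1] * 17, d[2] * 17); // #F00 -> #FF0000
        return MakeColor(d[0] * 16 + d[1], d[2] * 16 + d[3], d[4] * 16 + d[5]);
    }
    // Function form: rgb(r, g, b) with each channel in 0..255
    int r, g, b;
    if (std::sscanf(value.c_str(), "rgb( %d , %d , %d )", &r, &g, &b) == 3) {
        if (r < 0 || r > 255 || g < 0 || g > 255 || b < 0 || b > 255) return -1;
        return MakeColor(r, g, b);
    }
    // Named colors: a tiny sample only, a real table would be longer.
    if (value == "red")   return MakeColor(255, 0, 0);
    if (value == "blue")  return MakeColor(0, 0, 255);
    if (value == "black") return MakeColor(0, 0, 0);
    if (value == "white") return MakeColor(255, 255, 255);
    return -1;
}
```

Returning a plain integer keeps the result directly usable where a COLORREF is expected, while -1 cannot collide with any valid 24-bit color.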

Legacy HTML 3.2 Attributes

For compatibility with older HTML, the parser also supports legacy color attributes:

Tag     Attribute  Purpose                Example
<body>  bgcolor    Page background color  <body bgcolor="#FFFFFF">
<body>  text       Default text color     <body text="#000000">
<font>  color      Text color override    <font color="red">Text</font>

These attributes are automatically converted to the internal CSS property representation during parsing.

CSS Limitations

RetroBrowser implements a minimal CSS subset optimized for Win98 GDI rendering. The following features are NOT supported:

External Stylesheets - No <link rel="stylesheet"> support
Style Blocks - No <style> tag parsing
CSS Selectors - No class/ID/element selectors
Cascading - No inheritance or specificity rules
Advanced Properties - No margins, padding, borders, positioning

Supported: Only inline style attributes on individual HTML elements

Rationale: This design prioritizes simplicity and performance on 200MHz CPUs with 64MB RAM, while still enabling modern websites to display with basic visual styling.

Architecture: CSS Property Extraction

The parser implements a two-stage CSS processing pipeline:

Stage 1: Attribute Parsing

When the parser encounters a tag with a style attribute:

<p style="color: green; font-weight: bold;">Text</p>

The parser extracts the style string: "color: green; font-weight: bold;"

Stage 2: Property Extraction into HtmlBlock

The style string is parsed into individual CSS properties and stored directly in the HtmlBlock structure:

struct HtmlBlock {
    BlockType type;           // BLOCK_P
    std::string content;      // "Text"
    
    // CSS properties (extracted from style attribute)
    int textColor;            // RGB(0, 255, 0) - green
    int backgroundColor;      // -1 (default)
    int fontWeight;           // FW_BOLD (700)
    BOOL fontItalic;          // FALSE
    int fontSize;             // 0 (default)
};

Design Benefits:

  • Performance: CSS properties are pre-parsed once during HTML parsing, not repeatedly during rendering
  • Simplicity: Renderer receives ready-to-use GDI COLORREF values, no string parsing needed
  • Memory Efficiency: Properties stored as integers (4 bytes each) instead of strings
  • Type Safety: Invalid CSS values fall back to defaults (-1 sentinel) at parse time
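
The first half of this pipeline, splitting a style string into name/value pairs, can be sketched as below. ParseStyleAttribute and Trim are hypothetical names for this example, and the sketch stops at strings. Converting each value into the integer fields of HtmlBlock would be a separate step.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Trims ASCII whitespace from both ends of a string.
static std::string Trim(const std::string& s)
{
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Splits "name: value; name: value" into a lowercase-name -> value map.
// Declarations without a colon or with an empty name are skipped.
std::map<std::string, std::string> ParseStyleAttribute(const std::string& style)
{
    std::map<std::string, std::string> props;
    std::size_t pos = 0;
    while (pos < style.size()) {
        std::size_t end = style.find(';', pos);
        if (end == std::string::npos) end = style.size();
        const std::string decl = style.substr(pos, end - pos);
        const std::size_t colon = decl.find(':');
        if (colon != std::string::npos) {
            std::string name = Trim(decl.substr(0, colon));
            for (std::size_t i = 0; i < name.size(); ++i) {
                name[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(name[i])));
            }
            if (!name.empty()) props[name] = Trim(decl.substr(colon + 1));
        }
        pos = end + 1;   // continue after the ';'
    }
    return props;
}
```

For the example string "color: green; font-weight: bold;" this yields two entries, keyed "color" and "font-weight".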

GDI Integration: The extracted properties map directly to Win32 GDI functions:

  • textColor → SetTextColor(hdc, textColor)
  • backgroundColor → FillRect(hdc, &rect, CreateSolidBrush(backgroundColor))
  • fontWeight → CreateFont(..., fontWeight, ...)
  • fontItalic → CreateFont(..., fontItalic, ...)
  • fontSize → CreateFont(fontSize, ...)

This architecture enables RetroBrowser to render styled modern web content efficiently on Windows 98 hardware.


HTML Parsing Capabilities

RetroBrowser implements a robust, forgiving HTML parser based on a Finite State Machine (FSM) architecture. The parser is designed to handle real-world HTML (including malformed, legacy, and modern markup) while maintaining excellent performance on Windows 98 hardware.

Supported HTML Tags

The parser recognizes and processes a comprehensive set of HTML elements:

Basic Structure Tags

  • Headings: <h1>, <h2>, <h3> - Hierarchical section headers
  • Paragraphs: <p> - Standard body text blocks
  • Line Breaks: <br> - Force new line in text flow
  • Generic Containers: <div>, <span> - Block and inline containers

Hyperlinks and Media

  • Links: <a href="..."> - Clickable hyperlinks with navigation
  • Images: <img src="..." alt="..."> - Embedded images with fallback text

Lists

  • Unordered Lists: <ul> - List container
  • List Items: <li> - Individual bullet points

Semantic HTML5 Tags

RetroBrowser supports modern semantic HTML5 elements, treating them as generic containers while preserving their content:

  • <header> - Page or section header
  • <footer> - Page or section footer
  • <nav> - Navigation section
  • <article> - Self-contained content
  • <section> - Thematic grouping
  • <aside> - Sidebar or tangential content
  • <main> - Primary page content

Implementation Note: Semantic HTML5 tags are mapped to BLOCK_DIV internally, allowing modern websites to render correctly while maintaining a simple rendering model.

Table Support

The parser recognizes HTML table elements and maps them to DIV containers to preserve content flow:

Table Tag  Purpose               Rendering Behavior
<table>    Table container       Mapped to DIV (block container)
<tr>       Table row             Mapped to DIV (preserves row content)
<td>       Table data cell       Mapped to DIV (displays cell content)
<th>       Table header cell     Mapped to DIV (displays header content)
<tbody>    Table body section    Mapped to DIV (groups body rows)
<thead>    Table header section  Mapped to DIV (groups header rows)
<tfoot>    Table footer section  Mapped to DIV (groups footer rows)

Rationale: Full table layout (rows, columns, borders) requires complex CSS box model calculations that would be prohibitively slow on 200MHz CPUs. By mapping tables to DIVs, the parser ensures table content remains accessible and readable, even if the visual table structure is lost. This design prioritizes content accessibility over visual fidelity.
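
The tag mapping described in the last two sections amounts to a small lookup. A minimal sketch, assuming a reduced BlockType enum: BLOCK_DIV and BLOCK_P appear in this README, while BLOCK_UNKNOWN and the function name MapTagToBlockType are made up for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Reduced enum for illustration; BLOCK_UNKNOWN is a hypothetical value.
enum BlockType { BLOCK_UNKNOWN, BLOCK_P, BLOCK_DIV };

// Maps a tag name (any letter case) to the block type used for rendering.
BlockType MapTagToBlockType(std::string tag)
{
    for (std::size_t i = 0; i < tag.size(); ++i) {
        tag[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(tag[i])));
    }
    // Generic, semantic HTML5 and table tags all collapse to a DIV block.
    static const char* const kDivLike[] = {
        "div", "header", "footer", "nav", "article", "section", "aside", "main",
        "table", "tr", "td", "th", "tbody", "thead", "tfoot"
    };
    for (std::size_t i = 0; i < sizeof(kDivLike) / sizeof(kDivLike[0]); ++i) {
        if (tag == kDivLike[i]) return BLOCK_DIV;
    }
    if (tag == "p") return BLOCK_P;
    return BLOCK_UNKNOWN;   // caller may skip the tag and log a warning
}
```

A linear scan over fifteen short strings is cheap enough here; a sorted table or a switch on the first character would be an easy refinement if profiling called for it.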

Parser Architecture: Finite State Machine (FSM)

The parser uses a single-pass FSM with 12 states for optimal performance:

FSM States

  1. STATE_DATA - Reading text content between tags
  2. STATE_TAG_OPEN - Just encountered < character
  3. STATE_TAG_NAME - Reading tag name (e.g., "p", "div", "img")
  4. STATE_BEFORE_ATTR_NAME - Whitespace before attribute name
  5. STATE_ATTR_NAME - Reading attribute name (e.g., "href", "src")
  6. STATE_AFTER_ATTR_NAME - After attribute name, before =
  7. STATE_BEFORE_ATTR_VALUE - After =, before attribute value
  8. STATE_ATTR_VALUE_DOUBLE_QUOTED - Inside "value"
  9. STATE_ATTR_VALUE_SINGLE_QUOTED - Inside 'value'
  10. STATE_ATTR_VALUE_UNQUOTED - Reading unquoted attribute value
  11. STATE_COMMENT - Inside <!-- comment -->
  12. STATE_SKIP_CONTENT - Skipping <script> or <style> content
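
To show the state-machine idea in miniature, here is a sketch that uses only four of the states listed above and discards attributes entirely. Token, Emit and Tokenize are hypothetical names. Quoted attribute values, comments and script skipping are exactly what the remaining states would add.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum State { STATE_DATA, STATE_TAG_OPEN, STATE_TAG_NAME, STATE_BEFORE_ATTR_NAME };

struct Token {
    bool isTag;        // true for a tag, false for text
    std::string text;  // tag name ("/p" for a closing tag) or text content
};

// Appends the buffered characters as one token and clears the buffer.
static void Emit(std::vector<Token>& out, bool isTag, std::string& buffer)
{
    if (buffer.empty()) return;
    Token t;
    t.isTag = isTag;
    t.text = buffer;
    out.push_back(t);
    buffer.clear();
}

// Single forward pass over the input: each character is examined once.
std::vector<Token> Tokenize(const std::string& html)
{
    std::vector<Token> tokens;
    std::string buffer;
    State state = STATE_DATA;
    for (std::size_t i = 0; i < html.size(); ++i) {
        const char c = html[i];
        switch (state) {
        case STATE_DATA:
            if (c == '<') { Emit(tokens, false, buffer); state = STATE_TAG_OPEN; }
            else          { buffer += c; }
            break;
        case STATE_TAG_OPEN:
            if (c == '>') { state = STATE_DATA; }               // "<>" is ignored
            else          { buffer += c; state = STATE_TAG_NAME; }
            break;
        case STATE_TAG_NAME:
            if (c == '>')      { Emit(tokens, true, buffer); state = STATE_DATA; }
            else if (c == ' ') { Emit(tokens, true, buffer); state = STATE_BEFORE_ATTR_NAME; }
            else               { buffer += c; }
            break;
        case STATE_BEFORE_ATTR_NAME:
            if (c == '>') { state = STATE_DATA; }               // attributes are discarded in this sketch
            break;
        }
    }
    if (state == STATE_DATA) Emit(tokens, false, buffer);       // trailing text
    return tokens;
}
```

Because the loop index only moves forward and there is no recursion, the cost stays linear in the input size and the stack depth is constant.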

Performance Characteristics

  • Time Complexity: O(n) single-pass parsing, where n = HTML byte count
  • Space Complexity: O(m) output blocks, where m = number of content blocks (typically m << n)
  • No Recursion: Stack-safe for Win98's 1MB default stack limit
  • No Backtracking: Cursor advances forward only, never revisits input
  • Target Performance: Parse 100KB HTML in <100ms on 200MHz Pentium MMX

Design Benefits:

  • Predictable Performance: Linear time complexity ensures consistent behavior
  • Memory Efficient: Single-pass eliminates need for intermediate parse trees
  • Win98 Safe: No deep recursion that could overflow limited stack space
  • Maintainable: Clear state transitions make debugging straightforward

HTML Entity Decoding

The parser automatically decodes common HTML entities to their character equivalents:

Supported Entities

Entity    Character  Description
&amp;     &          Ampersand
&lt;      <          Less than
&gt;      >          Greater than
&quot;    "          Double quote
&#39;     '          Single quote (apostrophe)
&#...;    (varies)   Numeric character reference (decimal)
&#x...;   (varies)   Numeric character reference (hexadecimal)

Implementation: Entity decoding is performed during the text extraction phase, ensuring that the renderer receives clean, ready-to-display text without needing to re-process entities.

Unknown Entities: If the parser encounters an unrecognized entity (e.g., &rarr;, &copy;), it:

  1. Logs a warning to the debug output
  2. Leaves the entity as-is in the content (e.g., displays "&rarr;" literally)
  3. Continues parsing without errors
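
The decoding rules above can be sketched as a single function. DecodeEntities is a hypothetical name, and the sketch makes one simplifying assumption that is not from the project: numeric references are only decoded for code points 1 to 127, since mapping higher code points to a Win98 code page is a separate problem. Warning logging is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Decodes the named and numeric entities from the table above.
// Anything unrecognized is copied through unchanged.
std::string DecodeEntities(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != '&') { out += in[i++]; continue; }
        const std::size_t semi = in.find(';', i);
        if (semi == std::string::npos || semi - i > 10) { out += in[i++]; continue; }
        const std::string name = in.substr(i + 1, semi - i - 1);
        char decoded = 0;
        if      (name == "amp")  decoded = '&';
        else if (name == "lt")   decoded = '<';
        else if (name == "gt")   decoded = '>';
        else if (name == "quot") decoded = '"';
        else if (name.size() > 1 && name[0] == '#') {
            const bool hex = (name[1] == 'x' || name[1] == 'X');
            const char* digits = name.c_str() + (hex ? 2 : 1);
            char* end = 0;
            const long code = std::strtol(digits, &end, hex ? 16 : 10);
            const bool startsWithDigit = std::isxdigit(static_cast<unsigned char>(*digits)) != 0;
            if (startsWithDigit && *end == '\0' && code > 0 && code < 128) {
                decoded = static_cast<char>(code);
            }
        }
        if (decoded != 0) { out += decoded; i = semi + 1; }
        else              { out += in[i++]; }   // unknown entity: keep the text as-is
    }
    return out;
}
```

A lone "&" with no nearby ";" (as in "R&D") is treated as ordinary text, which matches the forgiving behavior described above.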

Forgiving Parser with Error Recovery

RetroBrowser's parser adopts a "best-effort" philosophy to handle the chaotic reality of web HTML:

Error Handling Strategies

1. Unclosed Tags

  • Parser auto-closes tags when encountering a new block-level tag, extracting maximum displayable content.

2. Missing Quotes

  • Parser accepts unquoted attribute values (stops at whitespace or >), following HTML5 lenient parsing rules.

3. Malformed Attributes

  • Best-effort extraction; logs warning but continues parsing. Invalid attributes may be skipped or partially extracted.

4. Unknown/Unsupported Tags

  • Silently skips unsupported tags, extracts any text content inside, logs warning for debugging.

5. Embedded Scripts and Styles

  • Entire <script> and <style> blocks (including content) are skipped and logged as warnings. This prevents script execution and simplifies parsing.

Warning System

The parser maintains a non-fatal warning log that records issues encountered during parsing:

Warning Categories:

  • Skipped tags: "Skipped <script> content"
  • Unknown entities: "Unknown HTML entity: &rarr; (kept as-is)"
  • Malformed attributes: "Malformed attribute: href (missing value)"
  • Unsupported elements: "Skipped unsupported tag: <marquee>"

Usage: Warnings are returned in ParseResult.warnings vector and can be logged to debug output (OutputDebugString) for troubleshooting. They do not affect parsing success status.
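
A sketch of how skipped content and non-fatal warnings can travel together. StripResult and SkipRawContent are hypothetical names (the README only names ParseResult.warnings), and the matching here is deliberately naive: it is case-sensitive and works on substrings, unlike the FSM-based parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: remaining markup plus non-fatal warnings.
struct StripResult {
    std::string html;
    std::vector<std::string> warnings;
};

// Removes every <tag ...>...</tag> region for the given lowercase tag name
// and records one warning per removed region.
StripResult SkipRawContent(const std::string& html, const std::string& tag)
{
    StripResult result;
    const std::string open = "<" + tag;
    const std::string close = "</" + tag + ">";
    std::size_t pos = 0;
    while (true) {
        const std::size_t start = html.find(open, pos);
        if (start == std::string::npos) {
            result.html += html.substr(pos);       // no more regions: keep the rest
            break;
        }
        result.html += html.substr(pos, start - pos);
        result.warnings.push_back("Skipped <" + tag + "> content");
        const std::size_t end = html.find(close, start);
        if (end == std::string::npos) break;       // unclosed region: drop the remainder
        pos = end + close.size();
    }
    return result;
}
```

The caller still gets usable markup in every case, and can decide whether to send the warnings to debug output or ignore them.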

Philosophy: "Show something rather than nothing". The parser prioritizes stability and partial content display over strict correctness. A page with 90% content displayed is better than a blank screen due to a single malformed tag.

Flat Block List Output (No DOM Tree)

Unlike traditional browsers, RetroBrowser does not build a hierarchical Document Object Model (DOM):

Design Rationale:

  • Memory Efficiency: No pointers, no tree nodes, minimal allocation
  • Rendering Simplicity: Renderer uses single for-loop, no recursion
  • Win98 Constraints: Avoids deep recursion (stack overflow risk)
  • Performance: Rendering is a linear O(n) pass over a flat list, with no tree-traversal or recursion overhead

Trade-off: Container tags (<div>, <section>) are discarded; only content blocks remain. This is acceptable for a text-focused browser where layout complexity is intentionally limited.


Image Handling

RetroBrowser implements a sophisticated asynchronous image loading system optimized for Windows 98's limited resources. Images are downloaded and rendered progressively without blocking the UI thread, providing a smooth browsing experience even on 200MHz CPUs.

Format Support

Primary Format: BMP (Bitmap)

  • Native Win98 Support: BMP is the native image format for Windows 98 GDI
  • Zero Decoding Overhead: Direct loading via LoadImageA() Win32 API
  • Maximum Compatibility: Works on all Win98 systems without additional codecs
  • File Size Trade-off: Larger files but instant rendering on legacy hardware

Why BMP Only?

  • GDI Integration: Win98 GDI has built-in BMP support via LoadImageA() and BitBlt()
  • Memory Constraints: JPEG/PNG decoders require 500KB+ of additional code and runtime memory
  • CPU Limitations: Software JPEG decoding takes 2-5 seconds per image on 200MHz Pentium
  • Stability Priority: Native format support eliminates third-party library dependencies

Future Formats (via libjpeg integration):

  • JPEG support planned but requires careful memory management on Win98
  • PNG support possible via libpng but adds significant binary size

Async Multi-Threaded Loading

RetroBrowser uses worker threads (up to 5 running concurrently) to download images without blocking the UI:

Architecture Overview

Main Thread (UI)              Worker Threads (1-5)           Network Layer
     │                              │                              │
     ├─ Parse HTML                  │                              │
     ├─ Render page (no images)     │                              │
     ├─ Spawn worker threads ───────┤                              │
     │                              │                              │
     │                              ├─ Download image URL ─────────┤
     │                              │                              │
     │                              │◄─ Receive image data ────────┤
     │                              │                              │
     │                              ├─ Save to temp file           │
     │                              ├─ LoadImageA() (decode BMP)   │
     │                              ├─ PostMessage(WM_USER_IMAGE)  │
     │                              │                              │
     │◄─ WM_USER_IMAGE_LOADED ──────┤                              │
     ├─ NotifyImageLoaded()         │                              │
     ├─ InvalidateRect() (repaint)  │                              │
     └─ Render with new image       │                              │

Implementation Details

Thread Creation (main.cpp:LoadImagesForPage()):

  • Scans parsed HTML blocks for <img> tags
  • Spawns one worker thread per image (up to 5 concurrent)
  • Each thread receives ImageLoadRequest struct with URL and window handle

Worker Thread (main.cpp:ImageLoadThreadProc()):

  1. Download: Fetch image data via NetworkManager::FetchUrl()
  2. Temporary Storage: Write binary data to C:\TEMP\retro_img_XXXXX.bmp
  3. Decode: Call LoadImageA() to create HBITMAP from file
  4. Notify: PostMessage(hwnd, WM_USER_IMAGE_LOADED, hBitmap, url)
  5. Cleanup: Delete temporary file, free request struct

Main Thread Handler (main.cpp:WndProc()):

  • Receives WM_USER_IMAGE_LOADED message
  • Calls renderer.NotifyImageLoaded(url, hBitmap) to cache bitmap
  • Calls InvalidateRect() to trigger repaint with new image

Benefits:

  • Non-Blocking: Page displays immediately, images appear progressively
  • Responsive UI: User can scroll/click while images load
  • Parallel Downloads: Multiple images load simultaneously (network permitting)
  • Graceful Degradation: Page remains usable if images fail to load
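
The scanning step of thread creation can be sketched without any Win32 calls. Everything here is an assumption made for illustration: the reduced Block type, the fields of ImageLoadRequest (the window handle is left out to keep the code portable), the function name CollectImageRequests, and the choice to skip duplicate URLs. Spawning the threads and posting WM_USER_IMAGE_LOADED are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum BlockType { BLOCK_P, BLOCK_IMG };

// Reduced stand-in for a parsed block: only the fields this sketch needs.
struct Block {
    BlockType type;
    std::string src;   // image URL, used when type == BLOCK_IMG
};

// Hypothetical request record handed to one worker thread.
struct ImageLoadRequest {
    std::string url;
    std::size_t blockIndex;   // which block the image belongs to
};

// Collects at most maxConcurrent distinct image URLs, in document order.
std::vector<ImageLoadRequest> CollectImageRequests(const std::vector<Block>& blocks,
                                                   std::size_t maxConcurrent)
{
    std::vector<ImageLoadRequest> requests;
    std::set<std::string> seen;
    for (std::size_t i = 0; i < blocks.size() && requests.size() < maxConcurrent; ++i) {
        if (blocks[i].type != BLOCK_IMG || blocks[i].src.empty()) continue;
        if (!seen.insert(blocks[i].src).second) continue;   // URL already requested
        ImageLoadRequest req;
        req.url = blocks[i].src;
        req.blockIndex = i;
        requests.push_back(req);
    }
    return requests;
}
```

Each returned request would then be heap-allocated and passed to a worker, which frees it after posting its result, as in the cleanup step above.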

GDI Rendering and Caching

Rendering Pipeline

BitBlt-Based Rendering (renderer.cpp:RenderBlock()):

  1. Lookup: Check image cache for HBITMAP by URL
  2. Create Memory DC: CreateCompatibleDC() for offscreen bitmap
  3. Select Bitmap: SelectObject(hdcMem, hBitmap)
  4. Scale and Draw: StretchBlt() to fit layout bounds (respects width/height attributes)
  5. Cleanup: Delete memory DC (bitmap remains in cache)

Why BitBlt?

  • Hardware Acceleration: Win98 GDI accelerates BitBlt on most video cards
  • Fast: 10-20ms for 800x600 image on 200MHz CPU (vs 100ms+ for software scaling)
  • Flicker-Free: Combined with double-buffering for smooth rendering

LRU Cache Eviction

Cache Strategy (renderer.h:MAX_IMAGE_CACHE_SIZE = 50):

  • Bounded Memory: Limit cache to 50 images (~5-10MB typical)
  • LRU Tracking: m_imageCacheLRU vector maintains access order
  • Eviction: When cache exceeds limit, delete least recently used HBITMAP
  • Automatic Cleanup: Unreferenced images evicted on page navigation

Cache Operations:

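// Assumes m_imageCache is a std::map<std::string, HBITMAP> and m_imageCacheLRU
// is a std::vector<std::string>; std::find requires <algorithm>.

// Cache hit (renderer.cpp:LoadImage())
std::map<std::string, HBITMAP>::iterator hit = m_imageCache.find(url);
if (hit != m_imageCache.end()) {
    // Move url to the front of the LRU list (front = most recently used)
    std::vector<std::string>::iterator pos =
        std::find(m_imageCacheLRU.begin(), m_imageCacheLRU.end(), url);
    if (pos != m_imageCacheLRU.end()) m_imageCacheLRU.erase(pos);
    m_imageCacheLRU.insert(m_imageCacheLRU.begin(), url);
    return hit->second;
}

// Cache miss
return NULL; // Triggers async load via PostMessage

// Cache insert (renderer.cpp:NotifyImageLoaded())
m_imageCache[url] = hBitmap;
m_imageCacheLRU.insert(m_imageCacheLRU.begin(), url);

// LRU eviction: drop entries from the back (least recently used)
while (m_imageCache.size() > MAX_IMAGE_CACHE_SIZE) {
    std::string evict_url = m_imageCacheLRU.back();
    DeleteObject(m_imageCache[evict_url]); // release the GDI bitmap
    m_imageCache.erase(evict_url);
    m_imageCacheLRU.pop_back();
}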

Memory Management:

  • RAII Pattern: Bitmaps deleted in destructor or eviction
  • No Leaks: All HBITMAP handles tracked and released
  • Win98 Safe: Respects 64KB GDI heap limit (50 bitmaps ≈ 5KB GDI heap)
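
For experimenting with the eviction policy outside Windows, a self-contained variant can help. This sketch is not the renderer's code: it swaps the vector for a std::list so that move-to-front is O(1), uses an int Handle in place of HBITMAP, and records evicted handles in a released list where real code would call DeleteObject(). The class name LruImageCache is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

class LruImageCache {
public:
    typedef int Handle;   // stand-in for HBITMAP; 0 means "no bitmap"

private:
    typedef std::list<std::string> Order;
    typedef std::map<std::string, std::pair<Handle, Order::iterator> > Map;

public:
    explicit LruImageCache(std::size_t capacity) : m_capacity(capacity) {}

    // Returns the cached handle and marks it most recently used, or 0 on a miss.
    Handle Get(const std::string& url)
    {
        Map::iterator it = m_entries.find(url);
        if (it == m_entries.end()) return 0;
        m_order.splice(m_order.begin(), m_order, it->second.second);
        return it->second.first;
    }

    // Inserts or replaces an entry, then evicts from the back while over capacity.
    void Put(const std::string& url, Handle handle)
    {
        Map::iterator it = m_entries.find(url);
        if (it != m_entries.end()) {
            released.push_back(it->second.first);   // old bitmap is no longer needed
            m_order.erase(it->second.second);
            m_entries.erase(it);
        }
        m_order.push_front(url);
        m_entries[url] = std::make_pair(handle, m_order.begin());
        while (m_entries.size() > m_capacity) {
            const std::string victim = m_order.back();
            released.push_back(m_entries[victim].first);
            m_entries.erase(victim);
            m_order.pop_back();
        }
    }

    std::list<Handle> released;   // handles that real code would pass to DeleteObject()

private:
    std::size_t m_capacity;
    Order m_order;      // front = most recently used
    Map m_entries;
};
```

Storing the list iterator next to each handle avoids the linear search that a vector-based LRU list needs on every cache hit.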

Placeholder Rendering and Alt Text

Fallback Behavior (when image unavailable):

Placeholder Rendering

  • Dimensions: 150x100 pixels (configurable via IMAGE_PLACEHOLDER_WIDTH/HEIGHT)
  • Visual: Empty rectangle with border (future: gray background with "?" icon)
  • Layout Preservation: Placeholder reserves space to prevent layout shift when image loads

Alt Text Fallback

  • Source: <img alt="description"> attribute
  • Rendering: Plain text in placeholder area using default font
  • Color: Black text on white background (readable on all displays)
  • Word Wrap: DT_WORDBREAK flag ensures long alt text fits in placeholder

Implementation (renderer.cpp:RenderBlock()):

case Parser::BLOCK_IMG: {
    HBITMAP hBitmap = LoadImage(src);
    if (hBitmap) {
        // Draw image with StretchBlt
        StretchBlt(hdc, x, y, width, height, ...);
    } else {
        // Fallback: Draw alt text
        SetTextColor(hdc, TEXT_COLOR);
        DrawText(hdc, item.content.c_str(), -1, &rect, DT_WORDBREAK);
    }
    break;
}

User Experience:

  • Progressive Enhancement: Page usable before images load
  • Accessibility: Alt text provides context for failed/slow images
  • Bandwidth Awareness: User sees content immediately on slow dial-up connections

Image Size Limits and Win98 Constraints

Hard Limits

Maximum Image Dimensions:

  • Width: 4096 pixels (GDI coordinate limit)
  • Height: 4096 pixels (GDI coordinate limit)
  • File Size: 10 MB recommended (network timeout at 30 seconds)
  • Memory: width × height × 3 bytes per decoded 24bpp image (800x600 ≈ 1.4 MB, 1600x1200 ≈ 5.5 MB)

Win98 System Constraints:

  • Total RAM: 64-128 MB typical (OS uses 32 MB, leaving 32-96 MB for apps)
  • GDI Heap: 64 KB limit (~10,000 handles total, shared across all apps)
  • Virtual Memory: 2 GB per process (but swap to HDD is very slow)
  • Network Buffer: 64 KB per socket (limits concurrent large downloads)

Practical Recommendations

Optimal Image Sizes for Win98:

  • Thumbnails: 100x100 pixels, <50 KB
  • Content Images: 400x300 pixels, <200 KB
  • Full Width: 800x600 pixels, <500 KB (matches typical Win98 screen resolution)
  • Avoid: Images >1 MB (slow download, high memory usage)

Performance Impact:

| Image Size | Download (56K modem) | Decode Time | Memory Usage |
|---|---|---|---|
| 50 KB (100x100) | 7 seconds | <100 ms | ~30 KB |
| 200 KB (400x300) | 28 seconds | ~300 ms | ~350 KB |
| 500 KB (800x600) | 71 seconds | ~800 ms | ~1.4 MB |
| 2 MB (1600x1200) | 284 seconds | ~3 seconds | ~5.5 MB |
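
The download column can be approximated from the nominal line rate. The sketch below is only an estimate under stated assumptions: it ignores latency, protocol overhead, and modem compression, and both function names are hypothetical.

```cpp
#include <cassert>

// Nominal transfer time in seconds at a given line rate.
double EstimateDownloadSeconds(unsigned long bytes, unsigned long bitsPerSecond) {
    return (static_cast<double>(bytes) * 8.0) / static_cast<double>(bitsPerSecond);
}

// Largest payload (in bytes) that can finish before a timeout at that rate.
unsigned long MaxBytesWithinTimeout(unsigned long bitsPerSecond,
                                    unsigned long timeoutSeconds) {
    return (bitsPerSecond / 8UL) * timeoutSeconds;
}
```

`EstimateDownloadSeconds(50000, 56000)` is roughly 7.1 seconds, in line with the first row. `MaxBytesWithinTimeout(56000, 30)` gives 210,000 bytes, so if the 30-second timeout covers the whole transfer, images much larger than about 200 KB may not complete over a 56K modem.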

Error Handling:

  • Timeout: Network requests abort after 30 seconds
  • Out of Memory: LoadImageA() returns NULL, falls back to alt text
  • Corrupt Data: GDI silently fails, renderer displays alt text
  • Disk Full: Temp file creation fails, image skipped (logged to debug output)

Best Practices for Win98 Compatibility:

  1. Optimize Images: Use BMP with 256 colors (8-bit) instead of 24-bit for smaller files
  2. Lazy Loading: Only load images in viewport (future enhancement)
  3. Compression: Serve images via proxy with HTTP compression (gzip)
  4. Fallback: Always provide meaningful alt text for accessibility
  5. Testing: Test on real Win98 hardware with 64 MB RAM to verify performance

🎨 Rendering Engine

RetroBrowser's rendering engine is a GDI-based system optimized for Windows 98's limited hardware (200MHz Pentium MMX, 64MB RAM). It uses a two-phase architecture that separates expensive layout calculations from fast painting operations, so scrolling stays smooth (targeting 60fps) even on legacy hardware.

Two-Phase Rendering Architecture

The renderer strictly separates layout computation from painting to achieve optimal performance on Win98:

Phase 1: Layout Calculation (Expensive, Infrequent)

When Triggered:

  • New page loaded (SetContent())
  • Window resized (WM_SIZE)
  • Content changes (rare after initial load)

Operations:

  1. Text Measurement: Call DrawText() with DT_CALCRECT flag to measure each block's dimensions
  2. Position Calculation: Compute absolute (x, y) coordinates for each block based on margins, spacing, and word-wrapping
  3. Display List Creation: Build m_displayList vector containing pre-computed RenderItem structs with RECT bounds
  4. Height Computation: Calculate total content height for scrollbar range

Performance Characteristics:

  • Complexity: O(n) where n = number of HTML blocks
  • Cost: ~100-150ms for 200-block page on 200MHz CPU
  • Frequency: Rare (only on content/window changes)
  • Output: Cached display list with pre-computed positions

Why Expensive?

  • DrawText() with DT_CALCRECT performs full text layout including word-wrapping, font metrics, and line breaking
  • Each block requires separate GDI call (~5ms per block on Win98)
  • Font selection overhead (SelectObject()) adds 1-2ms per block

Code Example (renderer.cpp:CalculateLayout()):

// Measure text with word-wrapping
RECT measureRect;
measureRect.left = MARGIN_LEFT;
measureRect.top = currentY;
measureRect.right = clientRect.right - MARGIN_RIGHT;
measureRect.bottom = currentY + 10000; // Large value for DT_CALCRECT

int height = DrawText(hdc, block.content.c_str(), -1, &measureRect, 
                     DT_WORDBREAK | DT_CALCRECT);

// Store pre-computed bounds in display list
RenderItem item;
item.bounds = measureRect;
m_displayList.push_back(item);
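
To show how the measured heights accumulate into a display list, here is a portable sketch of the layout pass. It is not the project's `CalculateLayout()`: the fixed-width measurement function stands in for `DrawText()` with `DT_CALCRECT`, and `kCharWidthPx`, `kLineHeightPx`, `LayoutItem`, and `CalculateLayoutSketch` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Rect { int left, top, right, bottom; };
struct LayoutItem { Rect bounds; std::string text; };

const int kCharWidthPx = 8;    // hypothetical fixed-width font metrics
const int kLineHeightPx = 16;

// Stand-in for DrawText(..., DT_WORDBREAK | DT_CALCRECT): wraps by character count.
int MeasureWrappedHeight(const std::string& text, int widthPx) {
    int charsPerLine = widthPx / kCharWidthPx;
    if (charsPerLine < 1) charsPerLine = 1;
    int length = static_cast<int>(text.size());
    int lines = (length + charsPerLine - 1) / charsPerLine;  // round up
    if (lines < 1) lines = 1;                                // empty block still takes a line
    return lines * kLineHeightPx;
}

// Phase 1: visit every block once and stack the blocks vertically.
// Returns the total content height, which feeds the scrollbar range.
int CalculateLayoutSketch(const std::vector<std::string>& blocks,
                          int clientWidthPx, int marginPx, int spacingPx,
                          std::vector<LayoutItem>& displayList) {
    displayList.clear();
    int currentY = marginPx;
    const int contentWidth = clientWidthPx - 2 * marginPx;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        LayoutItem item;
        item.text = blocks[i];
        item.bounds.left = marginPx;
        item.bounds.top = currentY;
        item.bounds.right = clientWidthPx - marginPx;
        item.bounds.bottom = currentY + MeasureWrappedHeight(blocks[i], contentWidth);
        displayList.push_back(item);
        currentY = item.bounds.bottom + spacingPx;  // next block starts below this one
    }
    return currentY;
}
```

Because each block's `top` depends on the heights of all blocks before it, the whole list has to be rebuilt when the window width changes, which is why this phase runs only on content or size changes.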

Phase 2: Paint (Fast, Frequent)

When Triggered:

  • Every WM_PAINT message
  • Scrolling (every pixel moved)
  • Partial invalidation (e.g., after image loads)

Operations:

  1. Clipping: Calculate visible region [scrollY, scrollY + clientHeight]
  2. Iteration: Loop through display list, skip blocks outside viewport
  3. Drawing: Use pre-computed RECT bounds from Phase 1, no recalculation
  4. BitBlt: Copy offscreen buffer to screen (flicker-free)

Performance Characteristics:

  • Complexity: O(v) where v = visible blocks (typically v << n)
  • Cost: ~15-20ms for 20 visible blocks on 200MHz CPU
  • Frequency: Very high (every scroll, every repaint)
  • Input: Pre-computed bounds from Phase 1 (no layout recalculation)

Why Fast?

  • No text measurement (uses cached RECT from layout phase)
  • Only draws visible blocks (clipping optimization)
  • Simple TextOut() calls instead of DrawText() with sizing
  • Font already selected (minimal SelectObject() overhead)

Code Example (renderer.cpp:Render()):

// Calculate visible range for clipping
int visibleTop = m_scrollY;
int visibleBottom = m_scrollY + clientRect.bottom;

// Render only visible blocks
for (size_t i = 0; i < m_displayList.size(); ++i) {
    const RenderItem& item = m_displayList[i];
    
    // Quick reject: skip blocks outside visible area
    if (item.bounds.bottom < visibleTop) continue; // Above viewport
    if (item.bounds.top > visibleBottom) break;    // Below viewport
    
    // Draw using pre-computed bounds (no measurement)
    RenderBlock(hdc, item);
}
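
The quick-reject loop can be exercised without GDI. The following sketch mirrors its logic on plain integers; `Span` and `VisibleItems` are hypothetical names, and the early `break` relies on the assumption that the display list is sorted by `top`, as it would be for blocks stacked in document order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Span { int top; int bottom; };  // vertical extent in document coordinates

// Returns the indices of items intersecting [scrollY, scrollY + clientHeight].
std::vector<std::size_t> VisibleItems(const std::vector<Span>& items,
                                      int scrollY, int clientHeight) {
    std::vector<std::size_t> visible;
    const int visibleTop = scrollY;
    const int visibleBottom = scrollY + clientHeight;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i].bottom < visibleTop) continue;  // entirely above the viewport
        if (items[i].top > visibleBottom) break;     // this and all later items are below
        visible.push_back(i);
    }
    return visible;
}
```

With ten 100-pixel blocks, a 200-pixel viewport scrolled to 250 touches only blocks 2 through 4, so paint cost tracks the viewport size rather than the page length.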

Performance Impact Comparison

Without Two-Phase Separation (naive approach):

Scroll 100px → WM_PAINT → Layout 200 blocks → 150ms lag → Janky scrolling

With Two-Phase Separation (RetroBrowser):

Scroll 100px → WM_PAINT → Draw 20 visible blocks → 15ms smooth → 60fps scrolling

Rationale:

  • Win98 CPU limitation: 200MHz Pentium MMX cannot afford layout in paint loop
  • Smooth scrolling: 60fps requires <16ms per frame; layout would take 100ms+
  • GDI performance: TextOut() is fast (~1ms), DrawText() with sizing is slow (~5ms)
  • Battery efficiency: Minimize CPU cycles on every WM_PAINT message

Double-Buffering for Flicker-Free Rendering

RetroBrowser uses offscreen rendering to eliminate visual flicker during repaints:

Implementation

Offscreen Buffer Creation (renderer.cpp:Render()):

// Create memory DC and compatible bitmap
HDC hdcScreen = GetDC(hwnd);
m_memDC = CreateCompatibleDC(hdcScreen);
m_offscreenBmp = CreateCompatibleBitmap(hdcScreen, 
                                        clientRect.right, 
                                        clientRect.bottom);
m_hOldBitmap = (HBITMAP)SelectObject(m_memDC, m_offscreenBmp);
ReleaseDC(hwnd, hdcScreen);  // every GetDC() needs a matching ReleaseDC(); the memory DC stays valid

Rendering Pipeline:

  1. Clear Background: Fill offscreen DC with page background color
  2. Draw Content: Render all visible blocks to offscreen DC
  3. Atomic Copy: Single BitBlt() from offscreen DC to screen DC
  4. Result: Entire frame appears instantly, no partial updates visible

Code Example:

// Draw to offscreen buffer
HDC drawDC = m_memDC;
FillRect(drawDC, &clientRect, m_hBackgroundBrush);
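for (size_t i = 0; i < m_displayList.size(); ++i) {
    const RenderItem& item = m_displayList[i];
    if (item.bounds.bottom < m_scrollY) continue;                // above viewport
    if (item.bounds.top > m_scrollY + clientRect.bottom) break;  // below viewport
    RenderBlock(drawDC, item);
}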

// Copy to screen in one atomic operation
BitBlt(hdc, 0, 0, clientRect.right, clientRect.bottom, 
       m_memDC, 0, 0, SRCCOPY);

Benefits

Without Double-Buffering:

  • User sees partial redraws (background → text → images)
  • Flicker during scrolling (screen clears then redraws)
  • Tearing artifacts on fast scrolls

With Double-Buffering:

  • ✅ Flicker-free scrolling (smooth 60fps)
  • ✅ Atomic updates (entire frame appears instantly)
  • ✅ Professional appearance (no visual artifacts)

Memory Cost:

  • Offscreen bitmap: 800x600x24bpp ≈ 1.4 MB RAM
  • Acceptable on Win98 with 64-128 MB RAM
  • Recreated on window resize (old bitmap deleted)

GDI Optimization:

  • BitBlt() is hardware-accelerated on most Win98 video cards
  • The screen is updated once per frame: a single large blit replaces ~100 separate draws to the window DC
  • Text and image calls still run, but against the memory DC, so intermediate states never reach the screen

Font Caching and DPI-Aware Sizing

RetroBrowser pre-creates and caches fonts to avoid expensive CreateFont() calls during rendering:

Font Cache Architecture

Pre-Created Fonts (renderer.cpp:Initialize()):

// Calculate DPI-adjusted font sizes
int logPixelsY = GetDeviceCaps(hdc, LOGPIXELSY);
int h1Height = -MulDiv(FONT_SIZE_H1, logPixelsY, 72); // 24pt
int h2Height = -MulDiv(FONT_SIZE_H2, logPixelsY, 72); // 18pt
int h3Height = -MulDiv(FONT_SIZE_H3, logPixelsY, 72); // 14pt
int defHeight = -MulDiv(FONT_SIZE_DEFAULT, logPixelsY, 72); // 12pt

// Create cached fonts (once at initialization)
m_hFontH1 = CreateFont(h1Height, 0, 0, 0, FW_BOLD, ...);
m_hFontH2 = CreateFont(h2Height, 0, 0, 0, FW_BOLD, ...);
m_hFontH3 = CreateFont(h3Height, 0, 0, 0, FW_BOLD, ...);
m_hFontDefault = CreateFont(defHeight, 0, 0, 0, FW_NORMAL, ...);
m_hFontLink = CreateFont(defHeight, 0, 0, 0, FW_NORMAL, FALSE, TRUE, ...); // italic = FALSE, underline = TRUE

Font Selection During Rendering (renderer.cpp:GetFontForBlockType()):

HFONT GetFontForBlockType(Parser::BlockType blockType) {
    switch (blockType) {
        case BLOCK_H1: return m_hFontH1;
        case BLOCK_H2: return m_hFontH2;
        case BLOCK_H3: return m_hFontH3;
        case BLOCK_A:  return m_hFontLink;
        default:       return m_hFontDefault;
    }
}

Performance Benefits

Without Font Caching (naive approach):

  • CreateFont() called for every block during rendering
  • ~10-20ms per CreateFont() on Win98
  • 200 blocks × 15ms = 3000ms (3 seconds!) per page render
  • GDI heap exhaustion (64KB limit, ~10,000 handles total)

With Font Caching (RetroBrowser):

  • ✅ Fonts created once at initialization (~100ms total)
  • SelectObject() during rendering (~1ms per block)
  • ✅ 200 blocks × 1ms = 200ms per page render (15x faster)
  • ✅ Only 5 font handles in GDI heap (minimal resource usage)

DPI-Aware Sizing

Formula: fontHeight = -MulDiv(pointSize, logPixelsY, 72)

Explanation:

  • pointSize: Font size in points (e.g., 12pt, 24pt)
  • logPixelsY: Vertical DPI of display (typically 96 on Win98)
  • 72: Points per inch (typography standard)
  • Negative height: Asks the font mapper to match character height (cell height minus internal leading) rather than cell height, so a given point size yields a consistent glyph size

Example Calculation (96 DPI display):

H1 (24pt): -MulDiv(24, 96, 72) = -32 pixels
H2 (18pt): -MulDiv(18, 96, 72) = -24 pixels
H3 (14pt): -MulDiv(14, 96, 72) = -19 pixels
Default (12pt): -MulDiv(12, 96, 72) = -16 pixels
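
These values can be checked off-Windows with a small stand-in for `MulDiv()`. The sketch assumes non-negative inputs small enough not to overflow a `long`; `MulDivRounded` and `PointsToLogicalHeight` are illustrative names, not project functions.

```cpp
#include <cassert>

// Portable stand-in for Win32 MulDiv() for non-negative inputs:
// computes (a * b) / c rounded to the nearest integer.
int MulDivRounded(int a, int b, int c) {
    long product = static_cast<long>(a) * static_cast<long>(b);
    return static_cast<int>((product + c / 2) / c);
}

// Logical font height for CreateFont(); the negative sign selects
// character height rather than cell height.
int PointsToLogicalHeight(int pointSize, int logPixelsY) {
    return -MulDivRounded(pointSize, logPixelsY, 72);
}
```

The rounding matters for sizes that do not divide evenly: 14pt at 96 DPI is 18.67 pixels, which rounds to 19 instead of truncating to 18.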

Benefits:

  • ✅ Consistent appearance across different DPI settings
  • ✅ Proper scaling on high-DPI displays (rare on Win98 but future-proof)
  • ✅ Matches Windows font sizing conventions

Resource Management (RAII Pattern)

Lifecycle:

  1. Constructor: Initialize font handles to NULL
  2. Initialize(): Create all fonts, store handles
  3. Rendering: Reuse cached fonts via SelectObject()
  4. Destructor: Delete all font handles via DeleteObject()

Safety Guarantees:

  • No font leaks (all handles deleted in destructor)
  • Fallback to system font if CreateFont() fails
  • Exception-safe (destructor always called)
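
The lifecycle above can be expressed as a small owning wrapper. This is a sketch of the pattern only, not the renderer's actual class: `FakeHandle` and `FakeDeleteObject` are hypothetical stand-ins for `HFONT` and `DeleteObject()` so the example compiles without the Windows headers.

```cpp
#include <cassert>

// Hypothetical stand-ins for HFONT and DeleteObject().
typedef int FakeHandle;
static int g_deletedCount = 0;
void FakeDeleteObject(FakeHandle) { ++g_deletedCount; }

class ScopedHandle {
public:
    ScopedHandle() : m_handle(0) {}                      // step 1: start as NULL
    explicit ScopedHandle(FakeHandle h) : m_handle(h) {}
    ~ScopedHandle() { Reset(0); }                        // step 4: always release

    // Releases the current handle (if any) and takes ownership of a new one.
    void Reset(FakeHandle h) {
        if (m_handle != 0) FakeDeleteObject(m_handle);
        m_handle = h;
    }
    FakeHandle Get() const { return m_handle; }          // step 3: reuse, no ownership transfer

private:
    // Non-copyable (C++98 idiom): a copy would delete the same handle twice.
    ScopedHandle(const ScopedHandle&);
    ScopedHandle& operator=(const ScopedHandle&);
    FakeHandle m_handle;
};
```

Declaring the copy operations private without defining them is the usual C++98 way to prevent accidental double deletion. A NULL handle is never passed to the delete function, so a wrapper whose creation step failed is still safe to destroy.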

Vertical Scrolling and Scrollbar Integration

RetroBrowser implements smooth vertical scrolling with Win32 scrollbar integration:

Scrolling Architecture

Scroll State Management:

class HtmlRenderer {
    int m_scrollY;              // Current scroll position (document coordinates)
    int m_totalContentHeight;   // Total height of all content
    RECT m_cachedClientRect;    // Visible viewport dimensions
};

Coordinate Transformation:

// Document coordinates → Screen coordinates
int screenY = documentY - m_scrollY;

// Screen coordinates → Document coordinates
int documentY = screenY + m_scrollY;

Scrollbar Integration

Scrollbar Setup (renderer.cpp:OnResize()):

SCROLLINFO si;
si.cbSize = sizeof(SCROLLINFO);
si.fMask = SIF_RANGE | SIF_PAGE | SIF_POS;
si.nMin = 0;
si.nMax = m_totalContentHeight;           // Total content height
si.nPage = clientRect.bottom;             // Visible viewport height
si.nPos = m_scrollY;                      // Current scroll position
SetScrollInfo(hwnd, SB_VERT, &si, TRUE);

Scroll Event Handling (renderer.cpp:OnScroll()):

void OnScroll(HWND hwnd, int scrollType, int scrollPos) {
    int clientHeight = m_cachedClientRect.bottom;  // viewport height cached at resize
    int maxScroll = m_totalContentHeight - clientHeight;
    if (maxScroll < 0) maxScroll = 0;
    
    switch (scrollType) {
        case SB_LINEUP:   m_scrollY -= SCROLL_LINE_SIZE; break;  // 20px
        case SB_LINEDOWN: m_scrollY += SCROLL_LINE_SIZE; break;
        case SB_PAGEUP:   m_scrollY -= SCROLL_PAGE_SIZE; break;  // 200px
        case SB_PAGEDOWN: m_scrollY += SCROLL_PAGE_SIZE; break;
        case SB_THUMBTRACK: m_scrollY = scrollPos; break;
    }
    
    // Clamp to valid range
    if (m_scrollY < 0) m_scrollY = 0;
    if (m_scrollY > maxScroll) m_scrollY = maxScroll;
    
    // Update scrollbar thumb position (only the position changes here)
    SCROLLINFO si;
    si.cbSize = sizeof(SCROLLINFO);
    si.fMask = SIF_POS;
    si.nPos = m_scrollY;
    SetScrollInfo(hwnd, SB_VERT, &si, TRUE);
}
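
The clamping and coordinate rules are easy to verify in isolation. Below is a portable sketch under the assumption that scroll state is just three integers; `ScrollState`, `MaxScroll`, and `ScrollBy` are hypothetical helpers, not members of `HtmlRenderer`.

```cpp
#include <cassert>

struct ScrollState {
    int scrollY;             // current offset in document coordinates
    int totalContentHeight;  // produced by the layout phase
    int clientHeight;        // visible viewport height
};

// Largest valid scroll offset; zero when the content fits in the viewport.
int MaxScroll(const ScrollState& s) {
    int maxScroll = s.totalContentHeight - s.clientHeight;
    return maxScroll < 0 ? 0 : maxScroll;
}

// Applies a pixel delta, clamps to [0, MaxScroll], and reports whether
// the position actually changed.
bool ScrollBy(ScrollState& s, int deltaPx) {
    int target = s.scrollY + deltaPx;
    if (target < 0) target = 0;
    if (target > MaxScroll(s)) target = MaxScroll(s);
    bool changed = (target != s.scrollY);
    s.scrollY = target;
    return changed;
}

int DocumentToScreenY(const ScrollState& s, int documentY) { return documentY - s.scrollY; }
int ScreenToDocumentY(const ScrollState& s, int screenY) { return screenY + s.scrollY; }
```

Returning whether the offset changed is one possible design choice: a caller could skip the repaint when the user keeps scrolling at the top or bottom of the page.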

Clipping Optimization

Visible Region Calculation:

int visibleTop = m_scrollY;
int visibleBottom = m_scrollY + clientRect.bottom;

for (each block in display list) {
    if (block.bounds.bottom < visibleTop) continue; // Above viewport
    if (block.bounds.top > visibleBottom) break;    // Below viewport
    
    // Block is visible, render it
    RenderBlock(hdc, block);
}

Performance Impact:

  • Without Clipping: Render all 200 blocks = 150ms
  • With Clipping: Render 20 visible blocks = 15ms (10x speedup)
  • Smooth Scrolling: 60fps (16ms per frame) achievable on 200MHz CPU

Mouse Wheel Support

Implementation (ui.cpp:WndProc()):

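case WM_MOUSEWHEEL: {
    int delta = GET_WHEEL_DELTA_WPARAM(wParam);  // signed; one notch = WHEEL_DELTA (120)
    // Multiply before dividing so deltas smaller than one notch are not truncated to zero
    int scrollAmount = -delta * SCROLL_LINE_SIZE / WHEEL_DELTA;
    renderer.OnScroll(scrollAmount);
    InvalidateRect(hwnd, NULL, FALSE);
    return 0;
}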
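
Wheel deltas need not be whole notches, so one common approach is to accumulate them and scroll only by complete notches. The sketch below illustrates that idea; it is not taken from `ui.cpp`, and `WheelAccumulator` and `kWheelDelta` are hypothetical names (120 is the value of `WHEEL_DELTA` in the Win32 headers).

```cpp
#include <cassert>

const int kWheelDelta = 120;  // same value as WHEEL_DELTA

// Accumulates wheel movement so partial notches are carried over
// instead of being lost to integer division.
struct WheelAccumulator {
    int remainder;
    WheelAccumulator() : remainder(0) {}

    // Returns pixels to scroll (positive = down). A positive delta means the
    // wheel was rotated away from the user, which scrolls the page up.
    int ToPixels(int delta, int linePx) {
        remainder += delta;
        int notches = remainder / kWheelDelta;
        remainder -= notches * kWheelDelta;  // keep the unused part for next time
        return -notches * linePx;
    }
};
```

Two half-notch messages of 60 each produce no movement on the first call and one full line on the second, where a per-message division would have dropped both.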

Benefits:

  • ✅ Smooth pixel-perfect scrolling
  • ✅ Keyboard support (Page Up/Down, Arrow keys)
  • ✅ Mouse wheel support (if available on Win98 hardware)
  • ✅ Scrollbar thumb dragging for quick navigation

Hit-Testing for Hyperlinks

RetroBrowser implements precise click detection for hyperlinks using pre-computed clickable areas:

Clickable Area Tracking

Data Structure:

struct ClickableArea {
    RECT bounds;        // Absolute document coordinates
    std::string href;   // Target URL
};

std::vector<ClickableArea> m_clickableAreas;

Area Registration (renderer.cpp:CalculateLayout()):

// During layout phase, register clickable areas for links
if (block.type == BLOCK_A) {
    std::map<std::string, std::string>::const_iterator hrefIt = 
        block.attributes.find("href");
    if (hrefIt != block.attributes.end() && !hrefIt->second.empty()) {
        ClickableArea area;
        area.bounds = measureRect;  // Pre-computed RECT from layout
        area.href = hrefIt->second;
        m_clickableAreas.push_back(area);
    }
}

Click Detection

Hit-Testing Algorithm (renderer.cpp:HandleClick()):

bool HandleClick(int x, int y, std::string& outHref) {
    // Transform client coordinates to document coordinates
    int documentY = y + m_scrollY;
    POINT pt = { x, documentY };
    
    // Reverse iteration for z-order (last drawn = topmost)
    for (size_t i = m_clickableAreas.size(); i > 0; --i) {
        const ClickableArea& area = m_clickableAreas[i - 1];
        if (PtInRect(&area.bounds, pt)) {
            outHref = area.href;
            return true;  // Hit detected
        }
    }
    
    return false;  // No link clicked
}
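
The same algorithm can be tested without the Windows headers. In this sketch `Rect`, `LinkArea`, `Contains`, and `HitTest` are illustrative stand-ins for `RECT`, `ClickableArea`, `PtInRect()`, and `HandleClick()`; the scroll offset is passed in as a parameter instead of being read from a member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Rect { int left, top, right, bottom; };
struct LinkArea { Rect bounds; std::string href; };

// Same convention as Win32 PtInRect(): left/top edges are inside,
// right/bottom edges are outside.
bool Contains(const Rect& r, int x, int y) {
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}

// Returns the href of the topmost link under the click, or "" if none.
std::string HitTest(const std::vector<LinkArea>& areas,
                    int clientX, int clientY, int scrollY) {
    const int documentY = clientY + scrollY;  // client -> document coordinates
    // Walk backwards so the most recently registered area wins on overlap.
    for (std::size_t i = areas.size(); i > 0; --i) {
        const LinkArea& area = areas[i - 1];
        if (Contains(area.bounds, clientX, documentY)) return area.href;
    }
    return std::string();
}
```

Counting down from `size()` and indexing with `i - 1` avoids the unsigned underflow that a `for (i = size() - 1; i >= 0; --i)` loop would hit on an empty list.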

Integration with UI (ui.cpp:WndProc()):

case WM_LBUTTONDOWN: {
    int x = (short)LOWORD(lParam);  // cast to signed: coordinates can be negative while the mouse is captured
    int y = (short)HIWORD(lParam);
    
    std::string href;
    if (renderer.HandleClick(x, y, href)) {
        // Navigate to clicked link
        OnLinkClick(href);
    }
    return 0;
}

Visual Feedback

Link Styling:

  • Color: Blue (RGB(0, 0, 255)) for unvisited links
  • Decoration: Underlined text via m_hFontLink (underline flag set)
  • Cursor: Hand cursor on hover (future enhancement)
  • Visited Links: Purple (RGB(85, 26, 139)) - planned feature

Rendering (renderer.cpp:RenderBlock()):

case BLOCK_A: {
    SetTextColor(hdc, LINK_COLOR);  // Blue
    DrawText(hdc, item.content.c_str(), -1, &screenRect, DT_WORDBREAK);
    
    // Draw underline
    HPEN hUnderlinePen = CreatePen(PS_SOLID, 1, LINK_COLOR);
    HPEN hOldPen = (HPEN)SelectObject(hdc, hUnderlinePen);
    MoveToEx(hdc, screenRect.left, screenRect.bottom - 1, NULL);
    LineTo(hdc, screenRect.right, screenRect.bottom - 1);
    SelectObject(hdc, hOldPen);     // restore first: a pen still selected into a DC must not be deleted
    DeleteObject(hUnderlinePen);
    break;
}

Performance Characteristics

Hit-Testing Complexity:

  • Time: O(n) where n = number of clickable areas (typically <50 per page)
  • Cost: <1ms on 200MHz CPU (simple point-in-rectangle tests)
  • Frequency: Only on mouse click (rare compared to rendering)

Memory Overhead:

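  • Each ClickableArea: at least 20 bytes (16-byte RECT plus the std::string object, whose size depends on the compiler), plus heap storage for the URL text
  • 50 links × ~20 bytes ≈ 1 KB, plus the URL text (negligible)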

Benefits:

  • ✅ Precise click detection (pixel-perfect)
  • ✅ Supports overlapping links (z-order handled)
  • ✅ Scroll-aware (coordinate transformation)
  • ✅ Fast response (<1ms click-to-navigation)

📂 Directory Structure

Core Source Structure

RetroBrowser_Project/
├── src/                        # Source code
│   ├── browser/                # C++ browser 
│   │   ├── core/               # Main + PCH
│   │   │   ├── main.cpp        # Orchestrator 
│   │   │   ├── stdafx.h        # Precompiled header
│   │   │   └── stdafx.cpp      # PCH implementation
│   │   ├── ui/                 # Win32 GUI 
│   │   │   ├── ui.h            # Public API 
│   │   │   └── ui.cpp          # Implementation 
│   │   ├── network/            # HTTP client 
│   │   │   ├── network.h       # API
│   │   │   └── network.cpp     # Winsock 
│   │   ├── parser/             # HTML parser 
│   │   │   ├── parser.h        # API 
│   │   │   └── parser.cpp      # FSM 
│   │   ├── renderer/           # GDI engine 
│   │   │   ├── renderer.h      # API 
│   │   │   └── renderer.cpp    # Rendering 
│   │   └── res/                # Resources
│   │       ├── app.ico         # Application icon
│   │       ├── app.rc          # Resource script
│   │       ├── hand.cur        # Hand cursor
│   │       └── resource.h      # Resource IDs
│   └── proxy/                  # Python proxy
│       ├── proxy.py            # Server logic
│       ├── config.py           # Configuration
│       └── requirements.txt    # Dependencies
├── libs/                       # External libraries
│   └── libjpeg/                # JPEG support 
├── deploy/                     # Build output
│   ├── RetroBrowser.exe        # Standard build
│   └── RetroBrowser_Win98.exe  # Win98-optimized build
└── demo/                       # Test HTML pages
    ├── *.html                  # Various test cases
    └── images/                 # Sample BMP images

Documentation & Tools

RetroBrowser_Project/
├── docs/                       # Documentation
│   ├── win98_testing_guide.md
│   ├── win98_compatibility_matrix.md
│   └── ...
└── tools/                      # Development tools
    └── test_scripts/           # Test suite

📊 Code Statistics

| Component | Files | Lines | Purpose |
|---|---|---|---|
| Core | 3 | 808 | Orchestration |
| UI | 2 | 902 | GUI management |
| Network | 2 | 1,159 | HTTP client |
| Parser | 2 | 2,413 | HTML → Blocks |
| Renderer | 2 | 1,406 | GDI rendering |
| Proxy | 2 | 1,142 | HTTPS bridge |
| Tests | 8 | 1,600+ | Quality assurance |
| TOTAL | 21 | ~9,430 | Production-ready |

🛠️ Installation & Deployment

Prerequisites

Host Machine

  • OS: Windows 10/11, Linux, or macOS
  • Python: 3.7+ with the requests and beautifulsoup4 packages
  • Network: Internet connection

Win98 VM

  • Hypervisor: VirtualBox or VMware
  • OS: Windows 98 SE
  • RAM: 128 MB recommended
  • Network: Host-Only Adapter

Quick Start

1. Clone Repository

git clone https://github.com/Dung2005qk/RetroBrowser_Project.git
cd RetroBrowser_Project

2. Install Python Dependencies

pip install -r src/proxy/requirements.txt

3. Start Proxy Server

python src/proxy/proxy.py

# Output:
# ============================================================
#   Win98 Retro Browser - Intelligent Proxy Starting Up
# ============================================================
# Listening on: 0.0.0.0:8080
# ...

4. Build Browser (if needed)

# Use VS Code task (Ctrl+Shift+B)
# Or compile manually with VC++ 6.0
# Binary available in deploy/RetroBrowser.exe

5. Run in Win98 VM

1. Copy deploy/RetroBrowser.exe to VM
2. Launch RetroBrowser.exe
3. Enter URL: http://example.com
4. Click "Go" button

Detailed Build Instructions

Method 1: VS Code Task (Recommended)

1. Open project in VS Code
2. Press Ctrl+Shift+B
3. Select "Build RetroBrowser"
4. Output: deploy/RetroBrowser.exe

Method 2: Command Line (Standard Build)

cl.exe /nologo /Fe:deploy\RetroBrowser.exe /Fo:obj\ ^
  /EHsc /MT /Zi /Od /W3 ^
  /D WIN32 /D _WINDOWS /D WINVER=0x0410 ^
  /I src\browser /I src\browser\core ^
  src\browser\core\*.cpp ^
  src\browser\ui\*.cpp ^
  src\browser\network\*.cpp ^
  src\browser\parser\*.cpp ^
  src\browser\renderer\*.cpp ^
  /link /SUBSYSTEM:WINDOWS /MACHINE:X86 ^
  kernel32.lib user32.lib gdi32.lib ws2_32.lib

Method 3: Windows 98 Optimized Build ⭐

For maximum Windows 98 compatibility, use the dedicated build script:

build_win98.bat

What it does:

  • Sets Win98-specific compiler flags (WINVER=0x0410, _WIN32_WINNT=0x0410)
  • Uses static CRT linking (/MT) - no MSVCRT.DLL dependency
  • Sets subsystem version to 4.10 (critical for Win98 loader)
  • Optimizes for speed (/O2) on 200MHz CPUs
  • Validates build environment before compilation
  • Outputs: deploy/RetroBrowser_Win98.exe

Key Compiler Flags:

  • /DWINVER=0x0410 - Target Windows 98 API level
  • /D_WIN32_WINNT=0x0410 - Caps NT-family API declarations at the same 4.10 level (NT 4.0 itself is 0x0400)
  • /D_WIN32_WINDOWS=0x0410 - Win9x family targeting
  • /MT - Static CRT (no external DLL dependencies)
  • /O2 - Speed optimization for legacy hardware
  • /SUBSYSTEM:WINDOWS,4.10 - Win98 PE subsystem version
  • /MACHINE:X86 - 32-bit x86 architecture

Build Output:

Building RetroBrowser for Windows 98...
Environment: Visual Studio detected
Compiling modules...
  [✓] stdafx.cpp
  [✓] main.cpp
  [✓] ui.cpp
  [✓] network.cpp
  [✓] parser.cpp
  [✓] renderer.cpp
Linking...
Build successful!
Output: deploy\RetroBrowser_Win98.exe (1.2 MB)

Verification: After building, verify Win98 compatibility:

dumpbin /headers deploy\RetroBrowser_Win98.exe | findstr "subsystem"
REM Should show: subsystem version 4.10

Windows 98 Compatibility Documentation

For comprehensive Win98 compatibility information, see:

📖 Windows 98 Testing Guide

  • VM setup (VirtualBox/VMware)
  • Network configuration
  • Installation steps
  • 10 comprehensive test cases
  • Performance benchmarks
  • Troubleshooting guide

📊 Windows 98 Compatibility Matrix

  • Supported OS versions (98 FE, 98 SE, ME, NT 4.0, 2000+)
  • Hardware requirements
  • Known limitations
  • Troubleshooting solutions
  • Performance benchmarks across OS versions

🔧 Quick Win98 Setup:

  1. Create Win98 SE VM (128 MB RAM, 2 GB HDD)
  2. Configure host-only network adapter
  3. Verify Winsock 2.2 is available (included in 98 SE)
  4. Copy RetroBrowser_Win98.exe to VM
  5. Configure proxy address (e.g., 192.168.56.1:8080, matching the proxy's listening port)
  6. Launch and browse!

Tested Configurations:

  • ✅ Windows 98 SE (4.10.2222 A) - Primary target
  • ✅ Windows 98 FE (4.10.1998) - Requires Winsock 2.2 update
  • ✅ Windows ME (4.90.3000) - Fully compatible
  • ✅ Windows 2000/XP/7/10/11 - Overqualified, works perfectly

Minimum Hardware (Win98):

  • CPU: Pentium 200 MHz
  • RAM: 64 MB (128 MB recommended)
  • HDD: 10 MB free space
  • Display: 800x600, 256 colors
  • Network: 10 Mbps Ethernet or dial-up

🚀 Usage

Basic Usage

  1. Start proxy on host: python src/proxy/proxy.py
  2. Launch browser in Win98 VM
  3. Enter URL in address bar
  4. Click "Go" or press Enter
  5. Browse: Click links, scroll pages

Example URLs

✅ Recommended (Lightweight)

⚠️ Complex (May be slow)

❌ Not Supported

Keyboard Shortcuts

| Key | Action |
|---|---|
| Enter | Navigate |
| Tab | Cycle controls |
| Mouse Wheel | Scroll |
| Ctrl+A | Select all (address bar) |
| Alt+F4 | Close |

Troubleshooting

"Network Error"

  • Check proxy is running
  • Verify VM can ping host IP
  • Check firewall allows port 8080

Page not loading

  • Try simpler URL (example.com)
  • Check proxy logs for errors
  • Verify internet connection

Images not showing

  • Only BMP format supported
  • Check image URL in logs
  • Verify size < 1MB

📚 Detailed Documentation

Module Documentation

Each module has detailed documentation in header files:

  • UI Module: src/browser/ui/ui.h - Win32 API usage, callbacks
  • Network: src/browser/network/network.h - Winsock, HTTP protocol
  • Parser: src/browser/parser/parser.h - FSM design, block types
  • Renderer: src/browser/renderer/renderer.h - GDI rendering, layout
  • Proxy: src/proxy/config.py - Configuration options

Windows 98 Compatibility Documentation

  • Testing Guide: docs/win98_testing_guide.md - Complete VM setup, network config, 10 test cases
  • Compatibility Matrix: docs/win98_compatibility_matrix.md - OS support, limitations, troubleshooting
  • Build Script: build_win98.bat - Optimized Win98 build with detailed comments
  • API Compatibility: docs/win98_compatibility_report.html - API usage analysis

Additional Resources

  • Project Report: docs/BaoCao_BaiTapLon.pdf (Vietnamese)
  • Test Documentation: tools/test_scripts/TEST_DOCUMENTATION.md
  • Test Summary: TESTING_SUMMARY.md (24/24 tests passing)
  • Demo README: READMEdemo.md (original overview)

🧪 Testing

Comprehensive Test Suite

File: tools/test_scripts/test_comprehensive.py
Coverage: 24 test cases, 100% passing

Test Categories

  1. Basic Functionality (3 tests)

    • Proxy connectivity
    • Simple HTTP requests
    • HTTP version handling
  2. Edge Cases (4 tests)

    • Empty/malformed requests
    • Very long URLs
    • Special characters
  3. Security (4 tests)

    • Script/style removal
    • Dangerous attribute filtering
    • Invalid URL schemes
  4. Content Validation (3 tests)

    • HTML structure
    • Headers
    • Content-Length
  5. Performance (5 tests)

    • Response time
    • Sequential requests
    • Concurrent handling
    • Large responses
    • Timeouts
  6. Boundary Conditions (3 tests)

    • Zero-length responses
    • Unicode URLs
    • Connection reuse
  7. End-to-End (2 tests)

    • Full cycle
    • Multiple URL formats

Run Tests

# Start proxy
python src/proxy/proxy.py

# Run tests (separate terminal)
python tools/test_scripts/test_comprehensive.py

# Output:
# ══════════════════════════════════════════════════════════════
# TEST SUMMARY
# ══════════════════════════════════════════════════════════════
# Total Tests: 24
# Passed: 24 ✅
# Failed: 0
# Success Rate: 100%
# 🎉 ALL TESTS PASSED! 🎉

Test Results

  • Response Time: 0.43s average
  • Throughput: 2.4 requests/second
  • Concurrency: 8-10 simultaneous connections
  • Success Rate: 100% (24/24 tests)

🐛 Known Issues & Future Work

Known Limitations

  1. Image Format: Only BMP, JPEG/PNG supported (Win98 native)
  2. CSS/JavaScript: Not supported (by design)
  3. POST Requests: Only GET implemented
  4. HTTPS Direct: Not supported in the browser itself; the proxy handles TLS
  5. Performance: Slow on complex pages (200MHz CPU)

Future Enhancements

  • Thread pool for image loading
  • Connection pooling (HTTP Keep-Alive)
  • Disk cache for images
  • History UI (back/forward buttons)
  • Bookmark management
  • Download manager
  • Print support

👥 Credits

Development Team

Project: Operating Systems Course Project (BTL Hệ Điều Hành)
Institution: [PTIT]
Term: [D23]

Technologies Used

  • C++98: Legacy compatibility
  • Win32 API: GUI, GDI, Winsock
  • Python 3.x: Proxy server
  • Requests: HTTP/HTTPS client
  • BeautifulSoup4: HTML sanitization
  • Visual C++ 6.0: Compiler
  • VirtualBox: Virtualization

References

  • Petzold, Charles. Programming Windows (5th Edition)
  • Microsoft Win32 API Documentation (MSDN)
  • RFC 7230-7235: HTTP/1.1 Specification
  • Winsock 2.2 Reference
  • GDI Programming Guide

📜 License

This is an academic project for educational purposes.

License: Academic Use Only
Copyright: © 2025 RetroBrowser Project Team
Disclaimer: Not for commercial use


⭐ Star this repo if you find it useful!

GitHub

Made with ❤️ for Operating Systems Course
