Court URL Paginator

The Court URL Paginator module (opal.court_url_paginator) provides utilities for handling pagination in the Alabama Appeals Court Public Portal. It includes functions for parsing, building, and generating paginated URLs.

Overview

The Alabama Appeals Court portal (publicportal.alappeals.gov) uses URL-encoded pagination parameters. This module handles:

Parsing page numbers from encoded URLs
Building URLs for specific pages
Extracting total page count from initial loads
Generating complete sets of paginated URLs
Validating Appeals Court portal URLs

Functions

`parse_court_url(url)`

Extracts the current page number and the total page count from a court URL.

from opal.court_url_paginator import parse_court_url

url = "https://publicportal.alappeals.gov/portal/search/case/results?..."
current_page, total_pages = parse_court_url(url)

print(f"Current page: {current_page}, Total pages: {total_pages}")
# Current page: 0, Total pages: 5

Parameters:
- url (str): The court URL to parse

Returns:
- Tuple[Optional[int], Optional[int]]: (current_page, total_pages), or (None, None) if parsing fails

URL Pattern Parsed: The function looks for these patterns in the decoded URL:
- page~\(.*?number~(\d+): extracts the current page number
- totalPages~(\d+): extracts the total page count
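The two patterns can be tried out in isolation. The following is a minimal sketch, not the module's actual source: `sketch_parse_court_url` is a hypothetical name, the sample URL is an abbreviated stand-in with the search parameters elided, and decoding with `urllib.parse.unquote` before matching is an assumption.

```python
import re
from typing import Optional, Tuple
from urllib.parse import unquote

# Patterns as documented above, applied to the percent-decoded URL.
PAGE_NUMBER_RE = re.compile(r"page~\(.*?number~(\d+)")
TOTAL_PAGES_RE = re.compile(r"totalPages~(\d+)")


def sketch_parse_court_url(url: str) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical re-implementation, for illustration only."""
    decoded = unquote(url)  # turns %28 into "(" and %29 into ")"
    page_match = PAGE_NUMBER_RE.search(decoded)
    total_match = TOTAL_PAGES_RE.search(decoded)
    current_page = int(page_match.group(1)) if page_match else None
    total_pages = int(total_match.group(1)) if total_match else None
    return current_page, total_pages


# Abbreviated sample URL (search parameters elided).
sample_url = (
    "https://publicportal.alappeals.gov/portal/search/case/results"
    "?searchParams=...page~%28size~25~number~0~totalElements~125~totalPages~5%29"
)
print(sketch_parse_court_url(sample_url))  # (0, 5)
```

Unlike the documented behavior, this sketch matches the two values independently, so it can return one value and None for the other.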

`build_court_url(base_url, page_number)`

Constructs a URL for a specific page number.

from opal.court_url_paginator import build_court_url

base_url = "https://publicportal.alappeals.gov/portal/search/case/results?..."
page_2_url = build_court_url(base_url, 2)

Parameters:
- base_url (str): The original court URL (any page)
- page_number (int): The desired page number (0-indexed)

Returns:
- str: URL for the specified page

Implementation: Uses a regular expression to replace X, the page number, in the encoded pattern page~%28.*?number~X
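The substitution can be sketched as follows. This is an illustration under assumptions, not the module's code: `sketch_build_court_url` is a hypothetical name, and the sketch assumes the opening parenthesis of the page block is still percent-encoded as `%28` in the URL being rewritten.

```python
import re

# Captures the encoded page block up to "number~"; the digits after it are replaced.
PAGE_NUMBER_ENCODED_RE = re.compile(r"(page~%28.*?number~)\d+")


def sketch_build_court_url(base_url: str, page_number: int) -> str:
    """Hypothetical helper: swap in a new 0-indexed page number."""
    # A function replacement avoids any ambiguity between a group
    # reference such as \1 and the digits appended right after it.
    return PAGE_NUMBER_ENCODED_RE.sub(
        lambda m: m.group(1) + str(page_number), base_url, count=1
    )


# Abbreviated sample URL (search parameters elided).
sample_url = (
    "https://publicportal.alappeals.gov/portal/search/case/results"
    "?searchParams=...page~%28size~25~number~0~totalElements~125~totalPages~5%29"
)
print(sketch_build_court_url(sample_url, 2))
```

When the pattern is absent, `re.sub` leaves the string untouched, so the sketch returns the original URL.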

`extract_total_pages_from_first_load(url, parser)`

Extracts the total number of pages by loading the first page and reading the URL after the portal's JavaScript has updated it.

from opal.court_url_paginator import extract_total_pages_from_first_load
from opal.parser_appeals_al import ParserAppealsAL

parser = ParserAppealsAL()
total_pages = extract_total_pages_from_first_load(court_url, parser)
print(f"Total pages: {total_pages}")

Parameters:
- url (str): Initial URL (typically page 0)
- parser: ParserAppealsAL instance used to make the request

Returns:
- int: Total number of pages (1 if extraction fails)

Process:
1. Makes the request using the parser
2. Waits for JavaScript to update the URL
3. Parses the updated URL for the total page count
4. Falls back to 1 if the count cannot be determined
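The four steps can be sketched without a real browser. In this hedged illustration, `load_final_url` is a hypothetical stand-in for the parser: a callable that loads the page and returns the URL after JavaScript has rewritten it. How `ParserAppealsAL` actually exposes that URL is not documented here, so the callable and the fake loader below are assumptions.

```python
import re
from typing import Callable
from urllib.parse import unquote


def sketch_extract_total_pages(url: str, load_final_url: Callable[[str], str]) -> int:
    """Illustrative only; `load_final_url` is a hypothetical parser stand-in."""
    try:
        updated_url = load_final_url(url)  # steps 1-2: load page, get rewritten URL
        match = re.search(r"totalPages~(\d+)", unquote(updated_url))  # step 3
        if match:
            return int(match.group(1))
    except Exception as exc:
        print(f"Error extracting total pages: {exc}")
    return 1  # step 4: fall back to a single page


def fake_loader(url: str) -> str:
    """Pretends the portal appended its page block to the URL."""
    return url + "page~%28size~25~number~0~totalElements~125~totalPages~5%29"


def failing_loader(url: str) -> str:
    raise RuntimeError("simulated network failure")


first_url = "https://publicportal.alappeals.gov/portal/search/case/results?searchParams=..."
print(sketch_extract_total_pages(first_url, fake_loader))
print(sketch_extract_total_pages(first_url, failing_loader))
```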

`paginate_court_urls(base_url, parser=None)`

Generates a list of URLs for all pages in the search results.

from opal.court_url_paginator import paginate_court_urls
from opal.parser_appeals_al import ParserAppealsAL

parser = ParserAppealsAL()

# With parser for dynamic total page detection
urls = paginate_court_urls(first_url, parser)

# Without parser (uses URL info only)
urls = paginate_court_urls(first_url)

for i, url in enumerate(urls):
    print(f"Page {i}: {url}")

Parameters:
- base_url (str): Initial court search URL
- parser (optional): ParserAppealsAL instance for dynamic page detection

Returns:
- List[str]: List of URLs for all pages (0-indexed)

Logic:
1. Try to parse the total page count from the URL
2. If it is not available and a parser is provided, load the first page to detect it
3. Generate URLs for all pages (0 to total_pages-1)
4. Return just the base URL if pagination cannot be determined
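The logic above can be sketched as a single function. This is a minimal illustration, not the module's implementation: `sketch_paginate` and the `detect_total` callable (standing in for parser-based detection) are hypothetical, and the sketch additionally treats a URL without an encoded page block as "pagination cannot be determined".

```python
import re
from typing import Callable, List, Optional
from urllib.parse import unquote

PAGE_BLOCK_RE = re.compile(r"(page~%28.*?number~)\d+")
TOTAL_PAGES_RE = re.compile(r"totalPages~(\d+)")


def sketch_paginate(
    base_url: str, detect_total: Optional[Callable[[str], int]] = None
) -> List[str]:
    """Hypothetical sketch of the four-step logic."""
    # Step 1: try the URL itself.
    match = TOTAL_PAGES_RE.search(unquote(base_url))
    total = int(match.group(1)) if match else None
    # Step 2: fall back to dynamic detection when a detector is supplied.
    if total is None and detect_total is not None:
        total = detect_total(base_url)
    # Step 4: give up gracefully and return only the base URL.
    if not total or total < 1 or not PAGE_BLOCK_RE.search(base_url):
        return [base_url]
    # Step 3: one URL per page, 0 to total - 1.
    return [
        PAGE_BLOCK_RE.sub(lambda m: m.group(1) + str(n), base_url, count=1)
        for n in range(total)
    ]


# Abbreviated sample URL (search parameters elided).
sample_url = (
    "https://publicportal.alappeals.gov/portal/search/case/results"
    "?searchParams=...page~%28size~25~number~0~totalElements~125~totalPages~5%29"
)
for i, page_url in enumerate(sketch_paginate(sample_url)):
    print(i, page_url[-55:])
```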

`is_court_url(url)`

Checks whether a URL belongs to the Alabama Appeals Court portal.

from opal.court_url_paginator import is_court_url

if is_court_url(url):
    print("Valid Appeals Court URL")
else:
    print("Not an Appeals Court URL")

Parameters:
- url (str): URL to validate

Returns:
- bool: True if the URL contains both publicportal.alappeals.gov and /portal/search/case/results
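A containment check like the one described is short enough to sketch directly. Both function names below are hypothetical. The first mirrors the documented rule; the second is a stricter alternative based on `urllib.parse.urlsplit`, which is not something the module is documented to do.

```python
from urllib.parse import urlsplit

PORTAL_HOST = "publicportal.alappeals.gov"
RESULTS_PATH = "/portal/search/case/results"


def sketch_is_court_url(url: str) -> bool:
    """Mirrors the documented rule: both substrings must be present."""
    return PORTAL_HOST in url and RESULTS_PATH in url


def strict_is_court_url(url: str) -> bool:
    """Stricter alternative (an assumption, not the module's behavior):
    compare the parsed hostname and path instead of raw substrings."""
    parts = urlsplit(url)
    return parts.hostname == PORTAL_HOST and parts.path.startswith(RESULTS_PATH)


good = "https://publicportal.alappeals.gov/portal/search/case/results?searchParams=..."
tricky = "https://other.example/?next=publicportal.alappeals.gov/portal/search/case/results"
print(sketch_is_court_url(good), strict_is_court_url(good))
print(sketch_is_court_url(tricky), strict_is_court_url(tricky))
```

The substring rule accepts any URL that merely mentions the host and path, for example inside a query string, which is why the stricter variant may be preferable when URLs come from untrusted input.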

URL Structure

Appeals Court URLs use encoded pagination parameters:

https://publicportal.alappeals.gov/portal/search/case/results?searchParams=...page~%28size~25~number~0~totalElements~125~totalPages~5%29

Key components:
- page~%28: Start of the page parameter block (%28 is a percent-encoded opening parenthesis; %29 closes the block)
- size~25: Results per page
- number~0: Current page (0-indexed)
- totalElements~125: Total result count
- totalPages~5: Total number of pages
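To make the structure concrete, the page block can be decoded into a dictionary. This sketch assumes the block is a flat sequence of `key~value` pairs with integer values and no nested parentheses, which holds for the example URL above but is not guaranteed by the portal. `sketch_decode_page_block` is a hypothetical helper, not part of the module.

```python
import re
from typing import Dict
from urllib.parse import unquote


def sketch_decode_page_block(url: str) -> Dict[str, int]:
    """Hypothetical helper: turn page~(k~v~k~v...) into {k: v}."""
    decoded = unquote(url)  # %28 and %29 become "(" and ")"
    match = re.search(r"page~\(([^)]*)\)", decoded)
    if not match:
        return {}
    tokens = match.group(1).split("~")
    # Tokens alternate between keys and values: size, 25, number, 0, ...
    return {
        key: int(value)
        for key, value in zip(tokens[0::2], tokens[1::2])
        if value.isdigit()
    }


# Abbreviated sample URL (search parameters elided).
sample_url = (
    "https://publicportal.alappeals.gov/portal/search/case/results"
    "?searchParams=...page~%28size~25~number~0~totalElements~125~totalPages~5%29"
)
block = sketch_decode_page_block(sample_url)
print(block)
```

In the example, 125 results at 25 per page give the 5 pages reported in totalPages.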

Integration Examples

With ParserAppealsAL

from opal.parser_appeals_al import ParserAppealsAL
from opal.court_url_paginator import paginate_court_urls, extract_total_pages_from_first_load

parser = ParserAppealsAL()

# Get total pages dynamically
total_pages = extract_total_pages_from_first_load(search_url, parser)
print(f"Found {total_pages} pages")

# Generate all page URLs
all_urls = paginate_court_urls(search_url, parser)

# Process each page
all_cases = []
for i, url in enumerate(all_urls):
    print(f"Processing page {i+1}/{len(all_urls)}")
    cases = parser.extract_page_data(url)
    all_cases.extend(cases)

With Configurable Court Extractor

The configurable court extractor uses these functions internally:

# Internal usage in configurable_court_extractor.py
def _process_paginated_results(self, first_page_url):
    # Generate URLs for all pages
    page_urls = paginate_court_urls(first_page_url, self.parser)

    # Process each page
    for url in page_urls:
        self._process_page(url)

Manual Pagination Handling

from opal.court_url_paginator import parse_court_url, build_court_url

# Parse current state
current_page, total_pages = parse_court_url(search_url)

# Guard against current_page being None before doing arithmetic on it
if current_page is not None and total_pages and total_pages > 1:
    # Process remaining pages
    for page_num in range(current_page + 1, total_pages):
        next_url = build_court_url(search_url, page_num)
        # Process next_url...

Error Handling

The paginator functions are designed to fail gracefully:

parse_court_url: Returns (None, None) if parsing fails
extract_total_pages_from_first_load: Returns 1 if extraction fails
build_court_url: Returns original URL if building fails
paginate_court_urls: Returns single-item list with base URL if pagination fails

# Safe usage pattern
from opal.court_url_paginator import paginate_court_urls

try:
    urls = paginate_court_urls(court_url, parser)
    if len(urls) == 1:
        print("Single page or pagination detection failed")
except Exception as e:
    print(f"Pagination error: {e}")
    urls = [court_url]  # Fallback to original URL

Performance Considerations

URL parsing is fast and doesn't require network requests
Dynamic page detection requires loading the first page
Consider caching total page counts for repeated searches
Use with rate limiting to avoid overwhelming the server
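The caching and rate-limiting suggestions can be sketched with the standard library alone. Everything named here is hypothetical: `TotalPagesCache` wraps any detection callable (for instance one built on `extract_total_pages_from_first_load`), and `throttled` spaces out iteration over page URLs. The interval used in the demo is arbitrary, not a recommendation from the portal.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional


class TotalPagesCache:
    """Hypothetical in-memory cache of total page counts, keyed by search URL."""

    def __init__(self, detect: Callable[[str], int]) -> None:
        self._detect = detect
        self._cache: Dict[str, int] = {}
        self.misses = 0  # number of times the detector was actually called

    def get(self, url: str) -> int:
        if url not in self._cache:
            self.misses += 1
            self._cache[url] = self._detect(url)
        return self._cache[url]


def throttled(urls: Iterable[str], min_interval: float) -> Iterator[str]:
    """Yield URLs no faster than one per `min_interval` seconds."""
    last: Optional[float] = None
    for url in urls:
        if last is not None:
            wait = min_interval - (time.monotonic() - last)
            if wait > 0:
                time.sleep(wait)
        last = time.monotonic()
        yield url


cache = TotalPagesCache(detect=lambda url: 5)  # stand-in detector
cache.get("https://publicportal.alappeals.gov/portal/search/case/results?...")
cache.get("https://publicportal.alappeals.gov/portal/search/case/results?...")
print(f"Detector calls: {cache.misses}")

for page_url in throttled(["page-0", "page-1", "page-2"], min_interval=0.01):
    print(f"Would fetch {page_url}")
```

Because search URLs may be session-based, a cached count can go stale once the session expires, so a long-lived cache would need some form of invalidation.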

Debugging

The functions print diagnostic messages to the console as they run, so debugging consists of watching that output:

from opal.court_url_paginator import extract_total_pages_from_first_load

# Function prints debug messages:
# "Detected X total pages from URL"
# "Error parsing URL: ..."
# "Error extracting total pages: ..."

total_pages = extract_total_pages_from_first_load(url, parser)

Limitations

Appeals Court Specific: Only works with publicportal.alappeals.gov URLs
JavaScript Dependency: Requires browser/parser for dynamic page detection
URL Structure Dependency: May break if portal changes URL encoding
0-Based Indexing: Page numbers are 0-indexed (page 0 is first page)
Session Dependency: URLs may be session-based and expire

Complete Example

from opal.court_url_paginator import (
    is_court_url, 
    parse_court_url,
    paginate_court_urls,
    extract_total_pages_from_first_load
)
from opal.parser_appeals_al import ParserAppealsAL

def process_all_appeals_court_pages(search_url):
    # Validate URL
    if not is_court_url(search_url):
        raise ValueError("Not a valid Appeals Court URL")

    # Parse initial URL
    current_page, total_pages = parse_court_url(search_url)
    print(f"Starting from page {current_page}, total: {total_pages}")

    # Setup parser
    parser = ParserAppealsAL()

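    # Get total pages if not in URL (used here only for the progress message;
    # paginate_court_urls below performs its own detection when needed)
    if total_pages is None:
        total_pages = extract_total_pages_from_first_load(search_url, parser)
        print(f"Detected {total_pages} total pages")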

    # Generate all URLs
    all_urls = paginate_court_urls(search_url, parser)

    # Process each page
    results = []
    for i, url in enumerate(all_urls):
        print(f"Processing page {i}/{len(all_urls)-1}")
        page_data = parser.extract_page_data(url)
        results.extend(page_data)

    return results

# Usage
search_url = "https://publicportal.alappeals.gov/portal/search/case/results?..."
all_cases = process_all_appeals_court_pages(search_url)
print(f"Extracted {len(all_cases)} total cases")

Key Differences from Other Court Systems

This module is specifically designed for the Alabama Appeals Court portal, which differs from other Alabama court systems:

URL Domain: publicportal.alappeals.gov (not alacourt.gov)
Pagination: URL-encoded parameters (not JavaScript/AJAX)
Page Indexing: 0-based (page 0 is first page)
Search Path: /portal/search/case/results (not /ajax/courts.aspx)

Make sure you're using the correct parser and URLs for the Appeals Court system.