Court Case Scraper Extension Requirements

To help you build this court case scraper extension while maintaining OPAL's modularity, I'll need the following information:

1. Website Details

The URL of the court case website. Example URL: https://publicportal.alappeals.gov/portal/search/case/results?criteria=~%28advanced~false~courtID~%2768f021c4-6a44-4735-9a76-5360b2e8af13~page~%28size~25~number~0~totalElements~0~totalPages~0%29~sort~%28sortBy~%27caseHeader.filedDate~sortDesc~true%29~case~%28caseCategoryID~1000000~caseNumberQueryTypeID~10463~caseTitleQueryTypeID~300054~filedDateChoice~%27-1y~filedDateStart~%2706%2a2f11%2a2f2024~filedDateEnd~%2706%2a2f11%2a2f2025~excludeClosed~false%29%29
Example URLs of pages containing the tables you want to scrape. Example URL after paginating to the next batch of table rows: https://publicportal.alappeals.gov/portal/search/case/results?criteria=~%28advanced~false~courtID~%2768f021c4-6a44-4735-9a76-5360b2e8af13~page~%28size~25~number~1~totalElements~317~totalPages~13%29~sort~%28sortBy~%27caseHeader.filedDate~sortDesc~true%29~case~%28caseCategoryID~1000000~caseNumberQueryTypeID~10463~caseTitleQueryTypeID~300054~filedDateChoice~%27-1y~filedDateStart~%2706%2a2f11%2a2f2024~filedDateEnd~%2706%2a2f11%2a2f2025~excludeClosed~false%29%29
Screenshots or HTML snippets of the table structure
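
The two example URLs differ only inside the page group of the criteria parameter (number~0 versus number~1, plus the total counts). A minimal sketch of how a scraper might step through pages and read the encoded filter values follows. The helper names page_url and decode_value are hypothetical, and the code assumes the server accepts a URL in which only the page number has changed and that `*2f` is an escaped `/`; both assumptions should be checked against the live site.

```python
import re
from urllib.parse import unquote


def page_url(url: str, page: int) -> str:
    """Return `url` with the zero-based page index in the criteria set to `page`."""
    new_url, count = re.subn(
        r"(~page~%28size~\d+~number~)\d+",  # the "page~(size~25~number~N" group
        rf"\g<1>{page}",
        url,
        count=1,
    )
    if count == 0:
        raise ValueError("no page number found in the criteria parameter")
    return new_url


def decode_value(raw: str) -> str:
    """Decode one criteria value, e.g. an encoded filedDateStart."""
    text = unquote(raw)  # %27 -> ' and %2a -> *
    # Assumption: "*XX" is a second-level escape for the character with hex code XX.
    text = re.sub(r"\*([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), text)
    return text.lstrip("'")  # a leading apostrophe appears to mark string values


if __name__ == "__main__":
    # Shortened placeholder URL with the same page group as the examples above.
    base = (
        "https://example.invalid/results?criteria=~%28advanced~false"
        "~page~%28size~25~number~0~totalElements~0~totalPages~0%29%29"
    )
    print(page_url(base, 1))
    print(decode_value("%2706%2a2f11%2a2f2024"))
```

Keeping the rest of the criteria string untouched avoids having to re-encode the search filters for every page.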

2. Data Requirements

What specific data fields do you need from the tables (case number, parties, dates, status, etc.)? I need the following columns from the HTML table:

Column 1 Title: Court

Column 1 Content follows this pattern: Alabama Supreme Court

Column 2 Title: Case Number

Column 2 Content follows this pattern: SC-2025-0424

Column 3 Title: Case Title

Column 3 Content follows this pattern: Frank Thomas Shumate, Jr. v. Berry Contracting L.P. d/b/a Bay Ltd.

Column 4 Title: Classification

Column 4 Content follows this pattern: Appeal - Civil - Injunction Other

Column 5 Title: Filed Date

Column 5 Content follows this pattern: 06/10/2025

Column 6 Title: Open / Closed

Column 6 Content follows this pattern: Open

Do you need data from multiple tables per page or one main table?

I only want one table containing the results from every page at that URL. Even though pagination limits how many rows appear at a time, I want all the results combined into a single table.
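
As a rough illustration of the single-table requirement, the sketch below concatenates the rows from each page and writes them out as one CSV. The field names in FIELDS and the helpers merge_pages and to_csv are hypothetical. Dropping repeated case numbers is an assumption: because results are sorted by filed date, a newly filed case could shift rows between page requests and make a row appear twice.

```python
import csv
import io
from typing import Iterable

# Hypothetical field names, one per column described above.
FIELDS = ["court", "case_number", "case_title", "classification", "filed_date", "status"]


def merge_pages(pages: Iterable[list[dict]]) -> list[dict]:
    """Concatenate per-page rows, keeping the first occurrence of each case number."""
    seen = set()
    merged = []
    for rows in pages:
        for row in rows:
            key = row["case_number"]
            if key in seen:
                continue  # same case already collected from an earlier page
            seen.add(key)
            merged.append(row)
    return merged


def to_csv(rows: list[dict]) -> str:
    """Serialize the merged rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    row = {
        "court": "Alabama Supreme Court",
        "case_number": "SC-2025-0424",
        "case_title": "Frank Thomas Shumate, Jr. v. Berry Contracting L.P. d/b/a Bay Ltd.",
        "classification": "Appeal - Civil - Injunction Other",
        "filed_date": "06/10/2025",
        "status": "Open",
    }
    # The second page repeats the same row to show the duplicate being dropped.
    print(to_csv(merge_pages([[row], [dict(row)]])))
```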

Any specific formatting requirements for the extracted data?

Does the site use pagination like the news sites? No, the pagination is different. The results are split into small pages (25 rows per page in the example URLs), and every page is served from the same base URL; only the page values inside the criteria parameter change.
Are there search/filter parameters in the URL? Yes, there are multiple search parameters in the URL. For example, the search terms for this URL are [case~%28caseCategoryID~1000000, caseNumberQueryTypeID~10463, caseTitleQueryTypeID~300054, filedDateChoice~%27-1y~filedDateStart~%2706%2a2f11%2a2f2024~filedDateEnd~%2706%2a2f11%2a2f2025, excludeClosed~false%29%29]
Do you need to follow links within tables to get additional details?

I do not want to follow the links within the table, but I do want to store the text and the reference embedded in the link.
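
One way to keep both the link text and its target without following the link is sketched below using only the standard library. The class name TableLinkParser and the sample href are hypothetical, and the real table markup has not been inspected, so the tr/td/a structure is an assumption. Because the content is loaded dynamically, the input would have to be the rendered HTML (for example, from a browser automation tool), not the raw page source.

```python
from html.parser import HTMLParser


class TableLinkParser(HTMLParser):
    """Collect each table row as a list of {"text", "href"} cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = {"text": "", "href": None}
        elif tag == "a" and self._cell is not None:
            # Record the link target; the link itself is never requested.
            self._cell["href"] = dict(attrs).get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"] += data

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._cell["text"] = " ".join(self._cell["text"].split())
            self._row.append(self._cell)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th> cells and are skipped
                self.rows.append(self._row)
            self._row = None


if __name__ == "__main__":
    sample = (
        "<table><tr><th>Court</th><th>Case Number</th></tr>"
        "<tr><td>Alabama Supreme Court</td>"
        '<td><a href="/hypothetical/case/path">SC-2025-0424</a></td></tr></table>'
    )
    parser = TableLinkParser()
    parser.feed(sample)
    print(parser.rows)
```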

4. Technical Considerations

Does the site require authentication? No
Is the content loaded dynamically (JavaScript) or static HTML? Dynamically
Any rate limiting concerns we should be aware of? Yes. Please keep the request rate low.
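
A low request rate could be enforced with a small throttle that guarantees a minimum gap between page loads. This is only a sketch: the Throttle class is hypothetical, and the two-second interval in the usage example is a placeholder, not a limit published by the site.

```python
import time


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay:
                self._sleep(delay)
        self._last = self._clock()
        return delay


if __name__ == "__main__":
    throttle = Throttle(min_interval=2.0)  # placeholder interval
    for page in range(3):
        throttle.wait()
        print(f"would load page {page} now")
```

Calling wait() immediately before each page load keeps the spacing independent of how long parsing takes.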

Proposed Extension Architecture

Based on OPAL's current architecture, here's how we'd extend it:

Create a new parser class (e.g., CourtCaseParser) extending NewsParser in parser_module.py
Adapt or create a new URL discovery function if the pagination pattern differs from the news sites
Modify the CLI in main.py to add the court parser option
Ensure the output format makes sense for tabular data (might need to adjust from the line-by-line article format)

Next Steps

Please provide:

1. The court website URL
2. Description of the table structure you need to parse
3. Any specific requirements or constraints

This will help me design the extension to fit seamlessly with your existing OPAL architecture.