Codebase
Explore the ScopeScrape source code. Clean architecture, well-documented modules.
Project Structure
scopescrape/
Project root
scopescrape/
Core package
__init__.py
Package init
cli.py
Click CLI entrypoint
config.py
YAML config loader
scoring.py
Pain point scoring engine
signals.py
Signal phrase detection
models.py
Data models (Post, PainPoint)
adapters/
Platform adapters
__init__.py
base.py
Abstract base adapter
reddit.py
Reddit (PRAW)
hackernews.py
Hacker News (Algolia)
github.py
GitHub Discussions (GraphQL)
producthunt.py
Product Hunt (GraphQL)
analysis/
AI analysis layer
__init__.py
pipeline.py
Two-stage LLM pipeline
providers.py
OpenAI / Anthropic abstraction
export/
Export handlers
__init__.py
json_export.py
csv_export.py
parquet_export.py
tests/
Test suite
test_scoring.py
test_signals.py
test_adapters/
dashboard/
Streamlit dashboard
app.py
pyproject.toml
Poetry config
scopescrape.yaml.example
Config template
README.md
Key Modules
cli.py
Click-based CLI with scan, analyze, export, and config commands. Supports YAML config and flag overrides.
adapters/base.py
Abstract base class defining the adapter interface. fetch(), normalize(), and rate_limit() methods.
signals.py
60+ signal phrases across 4 tiers. Pattern matching with regex, case-insensitive, context-aware window extraction.
scoring.py
Multi-dimensional scoring: frequency, intensity, specificity, recency. Configurable YAML weights. Hard filter support.
analysis/pipeline.py
Two-stage LLM pipeline. Stage 1: cheap batch pre-filter. Stage 2: expensive insight extraction. Budget caps.
export/
JSON, CSV, Parquet, SQLite. Consistent schema across all formats. Streaming export for large datasets.