In early 2023, after Elon Musk's acquisition, Twitter made its API paid-only. The Basic tier ($100/month) lets you read 10,000 tweets per month. The Pro tier ($5,000/month) raises that to 1 million. The Enterprise tier requires a sales conversation. Most indie developers cannot justify those costs for exploratory data collection. So I looked for an alternative.

I found Nitter, an open-source frontend for Twitter that returns clean, structured HTML. Unlike the official API, Nitter requires no authentication and no payment. ScopeScrape's Twitter adapter uses it. This post explains what Nitter is, how to use it safely, what data you can extract, and the reliability tradeoffs.

What is Nitter?

Nitter is a lightweight, privacy-focused frontend for Twitter maintained by the open-source community. It mimics Twitter's UI but strips out JavaScript, tracking, and ads. You can access any public tweet, thread, or user timeline through Nitter without an account.

The project runs on several community-hosted instances. You can visit nitter.net (the primary instance), or use any of a dozen mirror instances maintained by volunteers. The source code is on GitHub and free to self-host.

When you request a user's timeline or a search query through Nitter, it fetches the data from Twitter's servers and returns it as HTML. The HTML is structured and easy to parse. No login needed. No API key needed. No formally published rate limits, though instances do throttle heavy traffic in practice (more on that below).

How does the adapter work?

ScopeScrape's Twitter adapter makes HTTP requests to a Nitter instance, parses the HTML response, and extracts tweet data. The flow is simple:

  1. The user specifies a search query, hashtag, or profile to monitor.
  2. The adapter constructs a URL for the Nitter instance (e.g., https://nitter.net/search?q=pain%20point&type=tweets).
  3. It makes an HTTP GET request.
  4. It parses the HTML response using BeautifulSoup.
  5. It extracts the tweet text, author, timestamp, and engagement metrics (retweets, likes, replies).
  6. It builds structured tweet objects.
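Steps 1 through 3 can be sketched in a few lines. The function names here (`build_search_url`, `fetch_html`) are illustrative, not ScopeScrape's actual code, and the browser-like User-Agent is an assumption about what instances tolerate, not a documented requirement:

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

def build_search_url(instance: str, query: str) -> str:
    """Step 2: build a Nitter search URL for a query or hashtag."""
    return f"{instance}/search?q={quote(query)}&type=tweets"

def build_profile_url(instance: str, username: str) -> str:
    """Build a Nitter timeline URL for a public profile."""
    return f"{instance}/{quote(username)}"

def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Step 3: a plain HTTP GET. A browser-like User-Agent may help
    avoid trivial bot filtering on some instances (an assumption)."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

if __name__ == "__main__":
    url = build_search_url("https://nitter.net", "pain point")
    print(url)  # https://nitter.net/search?q=pain%20point&type=tweets
```

Because the URL construction is a pure function, you can swap in any mirror instance by changing the `instance` argument.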

The parsed data includes:

  • Tweet text (full, untruncated)
  • Author username and display name
  • Creation timestamp
  • Retweet, like, and reply counts
  • Links and media indicators

This is enough for pain point detection. Engagement metrics tell you which conversations matter to the community. Tweet text reveals the actual problems people face.

What can you NOT get from Nitter?

Nitter is read-only and public-data-only. You cannot get:

  • Direct messages (DMs).
  • Protected tweets or private accounts.
  • Real-time notifications or streaming.
  • Verified badge status (Nitter doesn't expose it consistently).
  • Quote tweets as a separate data structure (they appear as regular replies).
  • Write operations (liking, retweeting, tweeting).

For market research and pain point discovery, none of these gaps matter. You are looking at public conversations, not private data or interaction capabilities.

Reliability and rate limits

This is where Nitter has a real problem. Nitter instances are run by volunteers. They are not production infrastructure. The primary instance at nitter.net has gone down several times. Mirror instances come and go. Response times can be slow (2-10 seconds). Some instances block certain queries without warning.

Twitter actively works against Nitter. It changes the markup and internal endpoints Nitter depends on, blocks Nitter IP ranges, and enforces stricter rate limits on requests that look automated. If an instance gets too much traffic, Twitter's anti-bot systems kick in and the instance becomes unreliable.

There are no formal rate limits published by Nitter. In practice, you can make about 10-20 requests per minute per instance before getting blocked. If you hit a block, you either rotate to a mirror instance or wait an hour and try again.

For ScopeScrape, I handle this with instance rotation and exponential backoff. If a request fails, the adapter tries the next instance in a list. If all instances fail, it waits and retries later. This makes the tool slower but more reliable.
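That rotation-and-backoff logic can be sketched generically. Here `fetch` is any callable that GETs one URL and raises on failure; the function name, round count, and delay values are illustrative, not ScopeScrape's exact code. The injectable `sleep` parameter is a deliberate design choice so tests do not actually wait:

```python
import time
from typing import Callable

class AllInstancesFailed(Exception):
    """Raised when every instance failed in every retry round."""

def fetch_with_rotation(
    path: str,
    instances: list[str],
    fetch: Callable[[str], str],
    max_rounds: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each instance in turn; after a full round of failures,
    back off exponentially before the next round."""
    for round_no in range(max_rounds):
        for instance in instances:
            try:
                return fetch(instance + path)
            except Exception:
                continue  # this instance is down or blocking; rotate
        # every instance failed this round: exponential backoff
        sleep(base_delay * (2 ** round_no))
    raise AllInstancesFailed(f"no instance could serve {path}")
```

With `base_delay=2.0` and three rounds, full-failure waits are 2, 4, and 8 seconds before giving up, which keeps the tool polite without stalling for minutes.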

Comparison: Nitter vs official Twitter API vs browser scraping

Aspect                  | Nitter                      | Official Twitter API        | Browser scraping
Cost                    | Free                        | $100-$5,000+/month          | Free (but infrastructure-heavy)
Authentication required | No                          | Yes (OAuth 2.0)             | No
Public data access      | Yes                         | Yes                         | Yes
Rate limits (formal)    | None published              | 450 requests per 15 minutes | None
Practical throughput    | 10-20 req/min               | ~30 req/min                 | Depends on proxies
Real-time data          | No (30-60 sec delay)        | Yes                         | Yes
Write operations        | No                          | Yes                         | No
Reliability             | Medium (instance-dependent) | High (Twitter-operated)     | High (with good infrastructure)
Compliance risk         | Medium (violates ToS)       | Low (official)              | High (violates ToS)

When should you use the official API instead?

If you need real-time data or write operations, use the official API. If you are building a commercial product with high SLA requirements, pay for the API. If you need to monitor protected accounts or private conversations, only the official API works.

But if you are doing one-off research, market surveys, or building a free tool for yourself and friends, Nitter is a practical option. The cost savings are real.

Ethical considerations

It is worth saying plainly: using Nitter to scrape data probably violates Twitter's Terms of Service. Nitter exists because Twitter prices out smaller builders. That does not make it legal, just useful.

My approach is pragmatic. I use Nitter for data collection because the alternative is either no Twitter data at all or $100/month out of pocket for something I might abandon. But I also respect Twitter's infrastructure. I keep request rates low. I do not scrape protected accounts. I do not use the data commercially. I am not testing the limits of Nitter's reliability deliberately.

If Twitter decides to shut down Nitter completely, or if I build something that becomes commercially viable, I will revisit this decision and explore paying for official API access. Until then, Nitter is the practical option.