This is a Telegram bot that downloads web pages, so that you can read pages blocked with the HTTP 451 (Unavailable For Legal Reasons) status code.
Note: the architecture might look too complicated for such a small project, but this was done on purpose, to learn how to split components to make a project scalable.
The bot consists of two parts: the actual bot, which handles Telegram commands, and the REST backend, which is responsible for downloading the page and sending it back to the user. The bot part can also be run in the standalone mode, in which case it does not require the backend to handle requests.
Running the bot in the distributed mode with a separate backend helps to handle a large number of requests, because the backend can be run on a more powerful server and can be scaled horizontally.
Before starting the bot, follow the steps to create your bot in Telegram and get a token for it.
Set the token as the TELOXIDE_TOKEN environment variable.
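A missing or blank token is easy to overlook, so it can help to check the variable before starting. The following is only a minimal sketch of such a check and is not taken from the bot's code: `check_token` and `token_from_env` are hypothetical helpers, and the only name that comes from this README is `TELOXIDE_TOKEN`.

```rust
use std::env;

/// Name of the environment variable this README asks you to set.
const TOKEN_VAR: &str = "TELOXIDE_TOKEN";

/// Hypothetical helper: returns the trimmed token, or an error message
/// if the value is missing or contains only whitespace.
fn check_token(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(value) if !value.trim().is_empty() => Ok(value.trim().to_string()),
        Some(_) => Err(format!("{} is set but empty", TOKEN_VAR)),
        None => Err(format!("{} is not set", TOKEN_VAR)),
    }
}

/// Hypothetical helper: reads the token from the process environment.
#[allow(dead_code)]
fn token_from_env() -> Result<String, String> {
    check_token(env::var(TOKEN_VAR).ok())
}
```

Keeping the check separate from the environment lookup makes it possible to test the check without touching the real environment.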
There is a Docker image that provides all dependencies, but the bot can also be run from a binary.
When running in the standalone mode, you must provide the path to the singlefile binary.
Supported options:
- backend-url - the URL of the backend that serves the requests, required for the distributed mode
- work-dir - path to the folder needed to save the pages, required for the standalone mode
- throttling-timeout-seconds - throttling interval for requests from the same client
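To illustrate what a per-client throttling interval means, here is a minimal sketch that allows one request per interval for each client. It is an assumption about the behaviour, not the bot's actual implementation: the `Throttle` type is hypothetical, and the `i64` client identifier is chosen only because Telegram chat ids fit in that type.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-client throttle: at most one request per `interval`
/// is accepted from each client id.
struct Throttle {
    interval: Duration,
    last_seen: HashMap<i64, Instant>,
}

impl Throttle {
    /// `timeout_seconds` plays the role of throttling-timeout-seconds.
    fn new(timeout_seconds: u64) -> Self {
        Self {
            interval: Duration::from_secs(timeout_seconds),
            last_seen: HashMap::new(),
        }
    }

    /// Returns true and records the request if `client` may proceed at `now`;
    /// returns false if the previous accepted request is still inside the interval.
    fn allow(&mut self, client: i64, now: Instant) -> bool {
        let blocked = self
            .last_seen
            .get(&client)
            .map_or(false, |&previous| now.duration_since(previous) < self.interval);
        if blocked {
            return false;
        }
        self.last_seen.insert(client, now);
        true
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside `allow`, keeps the sketch deterministic and easy to test.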
Supported arguments:
- SINGLEFILE-CLI - path to the singlefile binary, required for the standalone mode
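The rules above (which option is required for which mode) can be sketched as a small resolution function. This is an illustration under assumptions rather than the bot's real code: the `Mode` enum and `resolve_mode` are hypothetical, and the sketch assumes that a backend URL, when given, takes precedence over the standalone options.

```rust
use std::path::PathBuf;

/// Hypothetical representation of the two modes described in this README.
#[derive(Debug, PartialEq)]
enum Mode {
    Standalone { work_dir: PathBuf, singlefile_cli: PathBuf },
    Distributed { backend_url: String },
}

/// Decides the mode from the parsed options and argument.
/// Assumption: if a backend URL is present, the distributed mode is used.
fn resolve_mode(
    backend_url: Option<String>,
    work_dir: Option<PathBuf>,
    singlefile_cli: Option<PathBuf>,
) -> Result<Mode, String> {
    match (backend_url, work_dir, singlefile_cli) {
        (Some(url), _, _) => Ok(Mode::Distributed { backend_url: url }),
        (None, Some(dir), Some(cli)) => Ok(Mode::Standalone {
            work_dir: dir,
            singlefile_cli: cli,
        }),
        (None, None, _) => Err("standalone mode requires the work dir".to_string()),
        (None, Some(_), None) => {
            Err("standalone mode requires the SINGLEFILE-CLI path".to_string())
        }
    }
}
```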
To start the bot in the standalone mode:
./bot --work_dir=<path> <path_to_binary>
To start the bot in the distributed mode:
./bot --backend_url=example.com
To print help:
./bot --help
For simplicity, there is a Docker image with all dependencies preinstalled.
docker build -f Dockerfile.bot -t bot .
docker container run bot <params>
The bot can be run with a separate backend to serve the requests.
The recommended way to run the backend is with Docker.
The backend requires a database to work. There are two options for the database: SQLite or Postgres.
The SQLite option is built into the Docker image and is used out of the box if no Postgres endpoint is configured. The backend also requires the singlefile binary, which comes preinstalled in the Docker image.
Supported options:
- pg_url - Postgres endpoint, the backend runs with SQLite if omitted
- work_dir - a directory to download files and store the SQLite database, required
- pg_user - database user, required when pg_url is set
- pg_password - password for the user, required when pg_url is set
- pg_database - database name in the Postgres deployment, required when pg_url is set
- singlefile_cli - path to the singlefile binary
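The dependency between these options (SQLite when pg_url is omitted, three more required options when it is set) can be sketched as follows. This is a hedged illustration, not the backend's actual code: `BackendOptions`, `Database`, and `select_database` are hypothetical, and the SQLite file name `backend.sqlite` inside work_dir is invented for the example.

```rust
use std::path::PathBuf;

/// Hypothetical container for the backend options listed above.
struct BackendOptions {
    pg_url: Option<String>,
    pg_user: Option<String>,
    pg_password: Option<String>,
    pg_database: Option<String>,
    work_dir: PathBuf,
}

/// Hypothetical result of choosing a database.
#[derive(Debug, PartialEq)]
enum Database {
    Sqlite { file: PathBuf },
    Postgres { url: String, user: String, password: String, database: String },
}

/// Falls back to SQLite when pg_url is omitted; otherwise requires
/// pg_user, pg_password, and pg_database to be present.
fn select_database(opts: &BackendOptions) -> Result<Database, String> {
    let url = match &opts.pg_url {
        Some(url) => url.clone(),
        // The file name is an assumption made for this sketch.
        None => return Ok(Database::Sqlite { file: opts.work_dir.join("backend.sqlite") }),
    };
    let require = |name: &str, value: &Option<String>| -> Result<String, String> {
        value
            .clone()
            .ok_or_else(|| format!("{} is required when pg_url is set", name))
    };
    Ok(Database::Postgres {
        url,
        user: require("pg_user", &opts.pg_user)?,
        password: require("pg_password", &opts.pg_password)?,
        database: require("pg_database", &opts.pg_database)?,
    })
}
```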
docker build -f Dockerfile.backend -t backend .
docker container run backend <params>
- Caching does not work properly with pages that have built-in ads. Every time a page loads, a new ad usually appears, which breaks the comparison check. A possible solution could be running an ad blocker on the host that loads the pages.
- The "accept cookies" popup is visible and can block content, with no option to close it.
- Implement a browser extension that opens the bot's chat in the Telegram app with the URL copied to the clipboard or the link inserted into the message box.