Headline
CVE-2023-45815: Viewing wget extractor output while logged in as an admin allows malicious archived JS to act on behalf of the admin
ArchiveBox is an open source self-hosted web archiving system. This issue affects any users who use the wget extractor and view the content it outputs. The impact is potentially severe if you are logged in to the ArchiveBox admin site in the same browser session and view an archived malicious page designed to target your ArchiveBox instance. Malicious JavaScript could potentially act using your logged-in admin credentials and add/remove/modify snapshots, add/remove/modify ArchiveBox users, and generally do anything an admin user could do. The impact is less severe for non-logged-in users, as malicious JavaScript cannot modify any archives, but it can still read all the other archived content by fetching the snapshot index and iterating through it. Because all of ArchiveBox’s archived content is served from the same host and port as the admin panel, when archived pages are viewed the JS executes in the same context as all the other archived pages (and the admin panel), defeating most of the browser’s usual CORS/CSRF security protections and leading to this issue. A patch is being developed in https://github.com/ArchiveBox/ArchiveBox/issues/239. As a mitigation, disable the wget extractor by setting archivebox config --set SAVE_WGET=False, ensure you are always logged out, or serve only a static HTML version of your archive.
Impact
This issue affects any users who use the wget extractor and view the content it outputs.
The impact is potentially severe if you are logged in to the ArchiveBox admin site in the same browser session and view an archived malicious page designed to target your ArchiveBox instance. Malicious JS could potentially act using your logged-in admin credentials and add/remove/modify snapshots, add/remove/modify ArchiveBox users, and generally do anything an admin user could do.
The impact is less severe for non-logged-in users, as malicious JS cannot modify any archives, but it can still read all the other archived content by fetching the snapshot index and iterating through it.
Because all of ArchiveBox’s archived content is served from the same host and port as the admin panel, when archived pages are viewed the JS executes in the same context as all the other archived pages (and the admin panel), defeating most of the browser’s usual CORS/CSRF security protections and leading to this issue.
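The "same host and port" point can be illustrated with a minimal sketch of how a browser-style origin comparison works. The URLs below are hypothetical examples (a local instance on port 8000, an illustrative snapshot path, and an illustrative separate hostname), and the helper names `origin` and `same_origin` are made up for this illustration; they are not part of ArchiveBox.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not specify one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


# Hypothetical URLs, assumed only for illustration.
admin_url = "http://localhost:8000/admin/"
snapshot_url = "http://localhost:8000/archive/1700000000.0/example.com/index.html"
separate_url = "http://content.localhost:8000/archive/1700000000.0/example.com/index.html"

# Archived page and admin panel share an origin, so JS in the archived
# page runs with the same privileges as scripts from the admin panel.
print(same_origin(admin_url, snapshot_url))

# A different hostname is a different origin, even on the same port.
print(same_origin(admin_url, separate_url))
```

The path plays no role in the comparison, which is why an archived page under a different path on the same host and port is not isolated from the admin panel.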
Patches
Follow progress on mitigating this issue in #239: https://github.com/ArchiveBox/ArchiveBox/issues/239
Workarounds
Disable the wget extractor by setting archivebox config --set SAVE_WGET=False, ensure you are always logged out, or serve only a static HTML version of your archive.
References
- https://en.wikipedia.org/wiki/Cross-site_request_forgery
- https://github.com/ArchiveBox/ArchiveBox#caveats
- https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview
- https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive#security-concerns