StepForge-DataCollection

A collection of modules in Ruby (and possibly other languages) for gathering and building file metadata, and for tracking changes and chain of custody. This covers:
- Metadata that cannot be extracted from the file alone, such as personal comments.
- Metadata maintained so that it can be persisted across devices.
- Verification and checksum information.
- Sources.
- Any instructions or follow-ups that may be required later.
- Backlinks (if possible) to places where the files may be referenced from.
- Any contact or follow-up info related to cases, or needs for additional info.
- Any directions and dates that people need to be reminded of or remember to follow up on.
Updated 2026-06-14 01:49:13 -07:00
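As a rough illustration of the kind of record described above, the following Ruby sketch pairs a file's checksum with metadata that cannot be derived from the file itself. It is not taken from the repository: `build_metadata`, `verify_metadata`, and all field names are hypothetical, and SHA-256 is an assumed choice of digest.

```ruby
require "digest"
require "time"

# Hypothetical helper: build a metadata record for one file.
# The field names are assumptions made for illustration only.
def build_metadata(path, comment: nil, source: nil)
  {
    "path" => File.expand_path(path),
    "size_bytes" => File.size(path),
    # SHA-256 of the file contents, used later for verification.
    "sha256" => Digest::SHA256.file(path).hexdigest,
    # Metadata that cannot be extracted from the file alone.
    "comment" => comment,
    "source" => source,
    "recorded_at" => Time.now.utc.iso8601
  }
end

# Hypothetical helper: re-hash the file and compare it with the
# stored checksum. Returns true when the contents are unchanged.
def verify_metadata(record, path = record["path"])
  Digest::SHA256.file(path).hexdigest == record["sha256"]
end
```

Because the record is a plain hash of strings and numbers, it could be serialized (for example as JSON) to persist across devices, with the checksum acting as the link back to the exact file contents.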
Ruby code that transforms data from sources such as call logs, chat logs, and forms into a unified form for reconstructing events and timelines.
Updated 2026-06-14 01:05:40 -07:00
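One way such a unification could look is sketched below in Ruby. This is not the repository's code: the `Event` struct, the input field names (`:started_at`, `:duration_s`, `:ts`, `:sender`, and so on), and the assumption that calls carry ISO 8601 timestamps while chats carry Unix epoch seconds are all hypothetical.

```ruby
require "time"

# Hypothetical common shape that every source record is mapped into.
Event = Struct.new(:at, :kind, :who, :summary, keyword_init: true)

# Assumed call-log row: ISO 8601 start time, number, duration in seconds.
def from_call(row)
  Event.new(
    at: Time.iso8601(row[:started_at]),
    kind: :call,
    who: row[:number],
    summary: "call lasting #{row[:duration_s]}s"
  )
end

# Assumed chat-log row: Unix epoch seconds, sender, message text.
def from_chat(row)
  Event.new(
    at: Time.at(row[:ts]).utc,
    kind: :chat,
    who: row[:sender],
    summary: row[:text]
  )
end

# Normalize both sources and sort them into a single timeline.
def build_timeline(calls:, chats:)
  events = calls.map { |r| from_call(r) } + chats.map { |r| from_chat(r) }
  events.sort_by(&:at)
end
```

Converting every timestamp to a `Time` object before sorting is what lets records from differently formatted sources interleave correctly; further sources such as forms would only need their own `from_*` mapper.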
Web Scraper in Go, similar to BeautifulSoup
Updated 2026-06-13 23:22:33 -07:00
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
Updated 2025-12-30 22:50:12 -08:00
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
Updated 2023-10-30 04:46:27 -07:00
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
Updated 2022-10-02 20:04:10 -07:00
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Updated 2021-06-28 12:56:23 -07:00