DARPA has released Memex, a suite of dark web search tools, as open source. The suite is a collaborative project that includes components built by numerous universities, small and large companies, NASA, and R&D firms.
The release contains 40 tools; the following are some of the more notable components of Memex.
- ArrayFire – A high-performance software library that makes parallel GPU processing easy through its API.
- Dossier Stack – A framework of library components for building active search applications that offer results based on the user’s actions.
- Autologin – A utility that allows a web crawler to find a website's login page and log in with the provided credentials.
- HG Profiler – A tool that allows users to take a list of entities from one source and search for them across a set of predefined sources.
- Splash – A lightweight, scriptable browser as a service with an HTTP API.
- ImageCat – A tool that analyzes images, extracting their metadata and any text they contain via OCR, and that can handle millions of images.
- DeepDive – A knowledge base construction system that extracts data from millions of resources and incorporates domain-specific knowledge and user feedback to improve the content it returns.
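
As an illustration of what "browser as a service with an HTTP API" means in practice, here is a minimal sketch of asking Splash to render a page. It assumes a Splash instance is already running locally on port 8050 and uses Splash's `render.html` endpoint with its `url` and `wait` parameters. The helper names `splash_render_url` and `fetch_rendered_html` are hypothetical and chosen only for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splash_render_url(page_url, splash_base="http://localhost:8050", wait=0.5):
    """Build the request URL for Splash's render.html endpoint.

    splash_base is an assumption: a Splash instance reachable locally.
    wait is the number of seconds Splash waits after page load before
    returning the rendered HTML.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, **kwargs):
    """Fetch the JavaScript-rendered HTML of page_url through Splash.

    Requires a running Splash instance; raises URLError otherwise.
    """
    with urlopen(splash_render_url(page_url, **kwargs), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With a Splash instance available, `fetch_rendered_html("https://example.com")` would return the page's HTML after its scripts have run, which a crawler could then parse like any static page.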