Yahoo has announced the open-source release of Vespa, its big data processing and serving engine. Vespa is a critical component of Yahoo's search engine, and the company hopes that making it open source will let more developers “build applications that can compute responses to user requests, over large datasets, at real time and at internet scale.”
Yahoo uses Vespa to process 90,000 data requests every second with millisecond response times. Until now, this kind of capability has been limited primarily to large corporations.
From the announcement:
With Vespa, our teams build applications that:
- Select content items using SQL-like queries and text search
- Organize all matches to generate data-driven pages
- Rank matches by handwritten or machine-learned relevance models
- Serve results with response times in the low milliseconds
- Write data in real-time, thousands of times per second per node
- Grow, shrink, and re-configure clusters while serving and writing data
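
To make the first three items in that list concrete, here is a minimal sketch in plain Python of the select, rank, and organize pattern. It does not use Vespa or its API; the `Item` type, its fields, the `score` formula, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical content item; these fields are assumptions, not Vespa's schema.
    title: str
    category: str
    clicks: int


def select(items, category, term):
    """Selection: a structured condition combined with a simple text match."""
    term = term.lower()
    return [it for it in items
            if it.category == category and term in it.title.lower()]


def score(item, term):
    """Handwritten relevance model: term count in the title plus a popularity boost."""
    term_count = item.title.lower().count(term.lower())
    return term_count + 0.001 * item.clicks


def search(items, category, term, limit=10):
    """Select matching items, then return the top `limit` by descending score."""
    matches = select(items, category, term)
    ranked = sorted(matches, key=lambda it: score(it, term), reverse=True)
    return ranked[:limit]


def group_by_category(items):
    """Organization: bucket items by category, e.g. to lay out sections of a page."""
    groups = defaultdict(list)
    for it in items:
        groups[it.category].append(it)
    return dict(groups)


items = [
    Item("Vespa goes open source", "tech", 500),
    Item("Open source search engines compared", "tech", 1200),
    Item("Open kitchens", "food", 900),
]

results = search(items, category="tech", term="open")
for it in results:
    print(it.title)
```

In a real serving engine the machine-learned case would replace `score` with a trained model, and selection and ranking would run in parallel across the nodes holding the data rather than over an in-memory list.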