What is GOSTCrawl?
by Kristina Drye, on Sep 28, 2022 8:10:23 AM
One of the most consistent features of the internet is that it is always changing. Everyone who uses it knows this: we post something and later delete it, we clean out our old files, and we frequently add and remove pictures, content, and statuses. What exists one minute can be deleted the next, including information that could help a risk analyst make a more informed decision. A truly effective risk tool would capture some of this information and use it to make more precise assessments.
The GOST Fall 2022 Release introduces one of Giant Oak’s most advanced capabilities: GOSTCrawl. GOSTCrawl is a proprietary capability that finds and preserves web pages on the internet, giving GOST’s algorithms more data to use. By “crawling” the internet and preserving information from websites, GOSTCrawl gives clients a robust data source. To date, GOSTCrawl has indexed over a billion web pages, and it continues to grow by eight million web pages a day.
What this means for risk analysts is that, in addition to screening the live version of the internet through your respective derogatory model, GOST also screens against the data and archived information of more than a billion web pages. This ensures that you don’t miss any derogatory information or negative news when making threat assessments. GOSTCrawl’s higher-quality entity extraction also improves the precision of your risk practice’s results.