Posts

Showing posts from May, 2019

SIPping and TIPping for faster search over sorted arrays

Image
Can you do better than binary search? TL;DR: Yes you can with our new algorithms SIP and TIP ;  and, not everything at SIGMOD/PVLDB has to be called ML-in-Systems! That was the question that two of my students Peter Van Sandt (a recent graduate from our BS program and the winner of David DeWitt Undergraduate Scholarship) and Yannis Chronis worked on over the last few years. It is a central question as searching over sorted in-memory arrays is a fundamental operation used in nearly every modern data platform. And, most of these platforms use some form of Binary Search ( BS ).  Interestingly, Interpolation Search ( IS ) has a better average-case complexity of O(log log N), but this result assumes that the data is uniformly distributed. Past attempts to use IS in any of less-than-ideal situation has generally not worked, leaving BS as the undisputed search method over sorted arrays.   We took a deeper look at this search problem in a paper that Peter and Yannis will co