On YouTube, you might learn that finding outliers is as simple as flagging any data point more than two standard deviations from the mean. At Squark we wish that were true, but however helpful that guideline is, it isn’t really accurate. It’s a dull data blade when you need a sharp data scalpel. Finding outliers with just 2x stddev is simply insufficient. Why?
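Here is one concrete reason, as a minimal Python sketch. The readings are made-up numbers for illustration, and the median-based check (a modified z-score using the conventional 0.6745 constant and a 3.5 cutoff) is just one common robust alternative, not a description of what Squark runs. The function names are hypothetical.

```python
import statistics

def two_sigma_outliers(values, k=2.0):
    """Flag points more than k sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * sd]

def mad_outliers(values, k=3.5):
    """Flag points whose modified z-score (median and MAD based) exceeds k."""
    med = statistics.median(values)
    # Median absolute deviation: a spread estimate that extreme values barely move.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, so this check cannot score anything
    return [v for v in values if 0.6745 * abs(v - med) / mad > k]

# Made-up readings: mostly around 10, plus two suspicious values.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 40, 1000]

print(two_sigma_outliers(readings))
print(mad_outliers(readings))
```

On this toy data the 2x stddev rule flags only 1000. That one extreme value inflates the standard deviation so much that 40 slips under the threshold. The median-based check flags both 40 and 1000. Same data, different rule, different answer.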
Outlier and anomaly detection is all about the threshold of what makes an outlier, well, an outlier. That threshold depends on the data, not just on some rule in a book or a video. This is especially true for high-dimensional data and big data, where it can be challenging, if not nearly impossible, to visualize the data cost-effectively. When the tools don’t support your EDA, what are you going to do? Write some Python, rack some servers, and hope?
Use Squark! We run 9 different anomaly and outlier detection algorithms on your time series data. These are new and powerful algos that your team may not be using. Each one may find something different. We then combine the multi-algorithm results into one proprietary “outlier score” per outlier. We created this fast, automated approach to help you determine whether a data point is really an outlier. Next, either let us handle the outliers automatically as part of our automated AI process, or choose which ones we should use.
With Squark, outlier and anomaly detection is easier than ever. Some might even say it is 2-times standard deviation easier too, but we think the threshold we beat is even bigger than that! See it all in a hot minute, by getting a demo here (<—click. this. link. You know you want to).
Judah Phillips