Factoring and Time Series How date and time features are important in models. Many data sets contain date-time fields which we hope will provide predictive value in our models. But date-time fields in the form of MM-DD-YYYY HH:SS are essentially unique
What Is Feature Engineering? Data sets can be made more predictive with a little help. “Features” are the properties or characteristics of something you want to predict. Machine learning predictions can often be improved by “engineering”— adjusting features that are
There Is this “F1” Thing Why 50% probability isn’t always the prediction cut-off. Say you are classifying 100 examples of fruit and there are 99 oranges and one lime. If your model predicted all 100 are oranges, then it
Do your models seem too accurate? They might be. Feature leakage, a.k.a. data leakage or target leakage, causes predictive models to appear more accurate than they really are, ranging from overly optimistic to completely invalid. The cause is highly correlated
Your Data Does Not Have to Be Big In fact, certain algorithms work well with smaller datasets. Some models do require big datasets to deliver significant predictive power. But don’t assume that you need hundreds of feature columns or millions of