Table of Contents
- Don’t waste days, weeks, months like I did
- Stay away from unsupervised learning
- Skip neural networks
- Binarize and classify all problems
- Tune your hyperparameters
- Set a window of time to try, not the result
- ALWAYS ALWAYS ALWAYS RECORD EXPERIMENTS
Don’t waste days, weeks, months like I did
I worked in machine learning for over 3 years at a startup. We raised money and built some cool tech products. However, I wasted a lot of time in the process.
Learning machine learning when your company’s future is at stake isn’t the most relaxed path to mastering it.
In the course of my work, I have made many mistakes. Despite the failures, there were also some successes.
This article is anecdotal advice drawn from those failures and successes.
It is best suited to people who want to solve real-world problems, but it is also aimed at machine learning beginners in general.
Let’s get into it.
Stay away from unsupervised learning
Stay away from unsupervised learning. It was a big waste of time.
My many attempts at unsupervised learning yielded zero value, even though every AI PhD I spoke to recommended it. The only way I can explain this paradox is that the area attracts a lot of academic research.
Unsupervised learning means training a model on unlabeled data, usually by clustering. In theory, it can discover previously unknown patterns.
Supervised learning, on the other hand, learns the relationship between inputs and labeled outputs by learning which features are associated with which outputs.
In my case, unsupervised learning fell short of human intuition every time.
So while there may be cool applications in this space, they don’t easily beat human intuition. Get experience elsewhere first, then come back to unsupervised learning.
Skip neural networks
I’ve seen neural networks outperform traditional models, but the gains are small and the effort required to build them is high.
Neural networks present several challenges, especially for AI engineers in the early stages of their careers:
- Slow iterations. How fast you learn depends on how fast you can try new things. Neural networks generally take longer to train than traditional models, so you get fewer iterations in the same amount of time.
- A large amount of data is required to avoid overfitting. Collecting that much data usually takes a long time after a business starts, and most companies have no pre-labeled data.
- Too many choices. While logistic regression has a limited set of hyperparameters, neural networks can be set up in an effectively unlimited number of ways. Neural networks are a rabbit hole, and you’re more likely to get lost and frustrated than to come up with a solution.
- Traditional machine learning models often perform well enough. For an MVP, an off-the-shelf sklearn model is often sufficient. A few weeks of tuning a neural net might raise the F1 score by a few points, but that is not worth it initially.
- Finding a mentor is hard. Neural networks are strange creatures. Everyone knows roughly how they work, but few people have experience using them to solve real problems, so you often have to figure things out by yourself.
In conclusion, I’m not against neural networks. But use them to go from 90 to 100, not to go from 0 to 1.
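To illustrate the point about ready-made sklearn models, here is a minimal sketch of a baseline. The synthetic dataset and the variable names are assumptions made for illustration; in practice you would substitute your own labeled data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; replace with your own labeled data).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# An off-the-shelf model with near-default settings.
baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train, y_train)

# F1 on held-out data is the number a later neural network would have to beat.
baseline_f1 = f1_score(y_test, baseline.predict(X_test))
print(f"Baseline F1: {baseline_f1:.3f}")
```

A baseline like this gives you a number to compare against before you decide whether weeks of neural network tuning are justified.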
Binarize and classify all problems
Make training the model as easy as possible. The simplest problem is binary classification.
A binary classification model outputs 1 or 0. For example: is there a dog in the photo or not?
Multiclass classification, on the other hand, returns 0, 1, 2, or 3 depending on whether the photo contains a dog, cat, parrot, or emu.
In my experience, running multiple binary classifiers in parallel often worked better than a single multiclass model that handled all cases.
The greatest benefit comes not from choosing the right model, but from framing the problem in the right way.
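As a rough sketch of what "multiple binary classifiers in parallel" can look like, the example below trains one yes/no model per animal instead of a single four-way model. The synthetic features, the `LABELS` list, and the `detect` helper are all hypothetical stand-ins, not the setup I actually used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

LABELS = ["dog", "cat", "parrot", "emu"]  # illustrative class names

# Synthetic features standing in for real photo features (an assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# One independent binary model per animal: "is this a dog or not?", and so on.
detectors = {}
for class_id, name in enumerate(LABELS):
    is_this_class = (y == class_id).astype(int)
    detectors[name] = LogisticRegression(max_iter=1000).fit(X, is_this_class)

def detect(sample):
    """Return each detector's 0/1 answer for a single feature vector."""
    row = np.asarray(sample).reshape(1, -1)
    return {name: int(model.predict(row)[0]) for name, model in detectors.items()}

answers = detect(X[0])
print(answers)
```

Because each detector is separate, it can be tuned, evaluated, and retrained on its own, and more than one detector can answer 1 for the same input.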
Tune your hyperparameters
This makes a big difference.
Hyperparameters are settings of the model itself, chosen before training rather than learned from the data. The learning rate is one example.
Use automated tools to tune hyperparameters. There are several such tools (e.g. GridSearchCV, TPOT…).
You don’t have to spend time manually tuning the model. Set your tuning limits and run your experiments in the cloud.
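As a minimal sketch of automated tuning with GridSearchCV, the example below searches over the regularization strength `C` of a logistic regression. The synthetic data and the particular grid values are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption; replace with your own labeled data).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The tuning limits: every combination in this grid is tried.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="f1",  # rank candidates by F1 score
    cv=5,          # 5-fold cross-validation per candidate
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The size of the grid is what bounds the cost of the search, so setting your tuning limits amounts to choosing how many values to list per hyperparameter.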
A tip from experience: write error-recovery code and save the results regularly. I’ve had many cloud experiments crash on day 3 and lose everything because progress was not being saved.
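One way to do this, sketched with only the standard library: write results to disk after every trial and skip finished trials on restart. The `run_trial` function is a hypothetical placeholder for a real training run, and the JSON checkpoint format is just one possible choice.

```python
import json
import os
import tempfile

def run_trial(params):
    """Hypothetical placeholder for a real training run; returns a toy score."""
    return 1.0 - abs(params["learning_rate"] - 0.1)

def run_all(trials, checkpoint_path):
    """Run each trial once, saving all results to disk after every trial."""
    results = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            results = json.load(f)  # resume from a previous run

    for params in trials:
        key = json.dumps(params, sort_keys=True)
        if key in results:
            continue  # already finished before the last crash
        try:
            results[key] = {"score": run_trial(params)}
        except Exception as exc:
            # Record the failure and move on instead of losing the whole run.
            results[key] = {"error": repr(exc)}
        # Write to a temporary file, then rename, so a crash mid-write
        # cannot corrupt the existing checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(results, f)
        os.replace(tmp_path, checkpoint_path)
    return results

trials = [{"learning_rate": lr} for lr in (0.01, 0.1, 1.0)]
checkpoint = os.path.join(tempfile.mkdtemp(), "results.json")
results = run_all(trials, checkpoint)
print(results)
```

Calling `run_all` again with the same checkpoint path does no new work, which is the behavior you want after a crash. Note that failed trials are also recorded and skipped on restart; delete their entries if you want them retried.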
Default hyperparameters are rarely optimal. Tune them.
Set a window of time to try, not the result
Machine learning is not like software engineering.
You cannot predict how long it will take to solve a problem, or even whether it can be solved. But you can predict how long an experiment will take.
Trying to estimate how long a machine learning problem will take to solve will eventually get you into trouble. For the business side of a company, there is nothing more annoying than an engineer who underestimates the man-hours required.
This is a simple point, but an important one when you are learning machine learning on the job.
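The same idea can be applied inside a single experiment. Below is a rough sketch of a search that stops when a time budget runs out rather than when a target score is reached. The `time_boxed_search` function, the toy `evaluate` function, and the 0.05-second budget are all illustrative assumptions.

```python
import random
import time

def time_boxed_search(evaluate, sample_params, budget_seconds):
    """Try random configurations until the time budget is used up."""
    deadline = time.monotonic() + budget_seconds
    best_score, best_params, trials = float("-inf"), None, 0
    # The deadline is checked between trials, so the final trial may overrun.
    while time.monotonic() < deadline:
        params = sample_params()
        score = evaluate(params)
        trials += 1
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params, trials

rng = random.Random(0)

def sample_params():
    # Draw a learning rate between 0.001 and 1.0 on a log scale.
    return {"learning_rate": 10 ** rng.uniform(-3, 0)}

def evaluate(params):
    # Toy objective standing in for "train a model and return its score".
    return -(params["learning_rate"] - 0.1) ** 2

best_score, best_params, trials = time_boxed_search(
    evaluate, sample_params, budget_seconds=0.05
)
print(trials, best_params)
```

What you commit to is the budget, not the outcome: when the time is up, you report the best result so far and decide whether another window is worth it.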
ALWAYS ALWAYS ALWAYS RECORD EXPERIMENTS
Record your experiments and you’ll thank yourself six months later.
I recommend recording the following items:
- Choice of model/architecture
- A rough description of the data (source, size, date, features…)
- Results (e.g. precision, recall, F1 score…)
- A link to a snapshot of the data (if available)
- Comments and lessons learned from the experiment
Don’t overthink what you record. A spreadsheet works well for this.
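If you would rather log from code than type into a spreadsheet by hand, a CSV file that opens in any spreadsheet tool is enough. This is a minimal sketch; the column names in `FIELDS`, the `log_experiment` helper, and the example row are assumptions, and the metric values are placeholders rather than real results.

```python
import csv
import os
import tempfile
from datetime import date

# Columns mirroring the items listed above (names are illustrative).
FIELDS = ["date", "model", "data_description", "precision", "recall", "f1",
          "data_snapshot", "notes"]

def log_experiment(path, **record):
    """Append one experiment as a row, writing the header on first use."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        # Fields that are not supplied are left blank in the row.
        writer.writerow({"date": date.today().isoformat(), **record})

log_path = os.path.join(tempfile.mkdtemp(), "experiments.csv")
log_experiment(
    log_path,
    model="logistic regression",
    data_description="placeholder: source, size, date, features",
    precision=0.0, recall=0.0, f1=0.0,  # placeholder values, not real results
    notes="placeholder: what was learned",
)
```

Appending a row takes seconds, which matters because a logging habit only survives if it is nearly free.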
Over time, the president or a new advisor may ask you to try something you’ve already tried, and you won’t remember why it failed last time. If you can look up and present the past results, you will save a great deal of time and effort.
Also, writing post-mortems (and occasionally reporting successes) reinforces learning. Keeping records helps you identify patterns and develop intuition. In the long run, this habit is part of what makes someone “senior”.
These are some of the things I’ve learned from building machine learning-powered apps over the years.
My experience is mostly limited to natural language processing, but that doesn’t mean these lessons can’t be applied to other fields.
What I would like to convey to readers is this: state-of-the-art technology can bring great results, but you may not be ready for it yet. So proceed by trial and error, and push the limits only where necessary.
So let’s get down to creating useful technology.