We all want to be future-proof: not just prepared for unforeseen developments but positioned well to take advantage of them. A flexible, adaptable, and scalable technology stack is a great way to achieve that goal when it comes to leveraging data science effectively. Here are five ideas we think are crucial to keep in mind when building out your own functionality:
1. Your pipeline is only as good as its weakest link.
It’s great that your predictive modelers have come up with a thousand new features to incorporate, but have you asked your data engineers how that will affect the performance of backend queries? What about your data collection and ingestion flow? Maybe your team is frothing at the mouth for an upgrade to Spark Streaming to run their clustering algorithms in real time, but your frontend will lose responsiveness if you try to display the results as fast as they come in. The key here is not to get sucked into the hype of “scaling up” without fully recognizing the implications across your entire organization and what new demands will be placed on all those moving parts.
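For example, one cheap way to keep a fast stream from overwhelming a slow frontend is to micro-batch: buffer results as they arrive and repaint the display on a fixed cadence. Here’s a minimal sketch of that idea; the names are hypothetical, not from any particular framework:

```python
import queue
import time

def render(batch):
    """Stand-in for your real UI update; here we just print a summary."""
    print(f"showing {len(batch)} new results")

def throttled_display(results: queue.Queue, refresh_secs: float = 1.0) -> None:
    """Drain a fast stream of model outputs, but repaint the UI on a
    fixed cadence so display cost stays flat no matter how quickly
    results arrive upstream."""
    buffer = []
    while True:
        deadline = time.monotonic() + refresh_secs
        # Collect everything that arrives during this window.
        while time.monotonic() < deadline:
            try:
                buffer.append(results.get(timeout=0.05))
            except queue.Empty:
                pass
        if buffer:
            render(buffer)  # one UI update per window, not per result
            buffer.clear()
```

The point isn’t this particular pattern; it’s that every stage, including the last mile to the user, has a throughput budget you need to respect.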
2. Figure out the reasons for the tech before buying it.
Speaking of hype, it’s easy to get googly-eyed over fancy new technology, and if you’re like me, you’ll immediately start throwing projects at it to see how it performs. I think the smarter sequence is to think carefully about the kinds of projects you expect to have, then evaluate the technology capable of implementing them. One of the more underrated advantages of older technology is its established knowledge base and the fact that other people have probably seen that error message before. Investing in the cutting edge can certainly make sense if you know how you want to use it, but be aware that it’ll come with growing pains that manifest as tangible productivity losses. If you’ve done your research and can point to a positive cost-benefit analysis, these losses will be easier to swallow.
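A back-of-the-envelope calculation is often enough to make that analysis concrete before you commit. Every number below is hypothetical; the point is to write your estimates down:

```python
# Back-of-the-envelope cost-benefit for adopting a new tool.
# All figures are hypothetical; substitute your own estimates.
migration_cost = 40_000   # licenses, consulting, setup ($)
ramp_up_loss   = 15_000   # productivity lost while the team learns ($)
monthly_saving = 6_000    # engineer-hours saved, in dollars per month

months_to_breakeven = (migration_cost + ramp_up_loss) / monthly_saving
print(f"Breakeven in {months_to_breakeven:.1f} months")  # ~9.2 months
```

If the breakeven horizon is longer than the technology’s likely shelf life, that’s your answer.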
3. Be methodical: make sure new solutions won’t cause problems/breaches, especially with core functionality.
These days, our systems are more interconnected than ever. While this brings a lot of conveniences, it also creates the potential for strange and unforeseen interactions when you decide to upgrade one of those systems. Especially when dealing with the fundamental pieces of your business model, or with sensitive information, it’s critical to have systems in place to test the integrations you rely on. Since tests usually check one thing at a time, it pays to deliberately change only one thing at a time as well, so you can be sure your network of interacting systems remains robust.
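As a sketch of what such a test might look like: a small contract test that pins down one invariant shared between two systems, so an upgrade to either one fails loudly instead of silently. The clients below are hypothetical inline stubs; in practice they’d be thin wrappers around staging instances of your real services. Runnable under pytest:

```python
# Hypothetical stand-ins for two interconnected systems; in practice
# these would wrap staging instances of your real services.
class BillingClient:
    def create_invoice(self, customer_id, amount):
        return {"id": "inv-1", "customer_id": customer_id, "amount": amount}

class WarehouseClient:
    def lookup_invoice(self, invoice_id):
        # Stubbed response; a real client would query the warehouse.
        return {"id": invoice_id, "amount": 100.0}

def test_invoice_totals_agree():
    """Contract test: upgrading either system must not silently change
    this shared invariant."""
    billing, warehouse = BillingClient(), WarehouseClient()
    invoice = billing.create_invoice(customer_id="TEST-001", amount=100.0)
    row = warehouse.lookup_invoice(invoice["id"])
    assert row["amount"] == invoice["amount"]
```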
4. Establish a feedback loop to monitor and deal with issues.
If you’re investing in new technology for a reason (and you should be!), it only makes sense to measure how closely it meets your expectations and how well it actually achieves your goals. Not only can this support the employees on the front lines defining new workflows, but it can also inform your higher-level decision making when it comes to identifying areas to improve down the line. I happen to think that it’s important not to over-invest in ironing out every single wrinkle (remember the 80% rule?), but to keep in mind that some of the most pernicious problems with technology spawn the ideas for the next generation of tools. I do think that in the early stages of adoption, it’s important to be aggressive about establishing the new status quo. After things have settled and you’ve (hopefully) reaped the rewards of your investment, you’ll be in a good position to decide where to go next.
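Even a lightweight feedback loop beats none. One approach is to record each expectation as a concrete target before the rollout and check observed values against it; the metric and numbers below are hypothetical:

```python
import statistics
from dataclasses import dataclass, field

@dataclass
class RolloutMetric:
    """One expectation recorded before adopting the new tool."""
    name: str
    target: float            # lower is better in this sketch
    samples: list = field(default_factory=list)

    def status(self) -> str:
        observed = statistics.mean(self.samples)
        verdict = "OK" if observed <= self.target else "investigate"
        return f"{self.name}: target {self.target}, observed {observed:.1f} -> {verdict}"

# Hypothetical expectation: the new pipeline should cut the nightly
# batch below 30 minutes.
batch_time = RolloutMetric("nightly_batch_minutes", 30.0, [28.5, 33.1, 29.0])
print(batch_time.status())  # -> investigate (mean is 30.2)
```

Writing the target down before the rollout, rather than after, is what keeps the loop honest.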
5. Hire and build a data-literate workforce.
People are systems too, and their integration can be just as influential on productivity as your server hardware. As a data scientist, my ideal work environment starts with being able to communicate with everyone from business analysts to the Chief Marketing Officer. Informal training is great for getting everyone on the same page, but taking the time to formally teach people new skills builds buy-in and an understanding of the nuances. It’s also empowering when everyone can contribute to the discussion instead of sitting through a one-way information dump. Ultimately, the best way to future-proof your company is to make sure its employees feel valued and heard. If data science is a part of your company’s future, then make sure everyone can participate in that future.