- 59% of all large enterprises are deploying data science (DS) and machine learning (ML) today.
- Nearly 50% of all organizations have up to 25 or more ML models in use today.
- 29% of enterprises are refreshing their data science and machine learning models every day.
- The higher the data literacy an enterprise can achieve before launching Data Science & Machine Learning initiatives, the higher the probability of success.
These and many other insights defining the state of the data science and machine learning market in 2021 are from Dresner Advisory Services’ 2021 Data Science and Machine Learning Market Study. The seventh annual report is noteworthy for its depth of analysis and insight into how data science and machine learning adoption is growing stronger in enterprises. In addition, the study explains which factors drive adoption and which success factors matter most when deploying data science and machine learning techniques. The methodology uses crowdsourcing techniques to recruit respondents from over 6,000 organizations and vendors’ customer communities. Of the respondents, 52% are from North America and 34% from EMEA, with the balance from Asia-Pacific and Latin America.
“The perceived importance of data science and machine learning correlates with organizational success with BI, with users that self-report as completely successful with BI almost twice as likely to rate data science as critical,” said Jim Ericson, vice president and research director at Dresner Advisory. “The perceived level of data literacy also correlates directly and positively with the current or likely future use of data science and machine learning in 2021.”
Key insights from the study include the following:
- 59% of large enterprises are deploying data science and machine learning in production today. Enterprises with 10K employees or more lead all others in adopting and using DS and ML techniques, most often in R&D and Business Intelligence Competency Center (BICC)-related work. Large-scale enterprises often rely on DS and ML to identify how internal processes and workflows can be streamlined and made more cost-efficient. For example, the CEO of a manufacturing company explained on a recent conference call that DS and ML pilots bring much-needed visibility and control across multiple plants and help troubleshoot inventory management and supply chain allocation problems.
- The importance of data science and ML to enterprises has more than doubled since 2014, rising from 25% in 2014 to 70% in 2021. The Dresner study notes that a record share of enterprises see data science and ML as critically important to their business in 2021. Furthermore, 90% of enterprises consider these technologies essential to their operations, rating them critically important or very important. Successful projects in Business Intelligence Competency Centers (BICC) and R&D helped data science and ML gain broad adoption across all organizations. Larger-scale enterprises with over 10K employees are successfully scaling data science and ML to improve visibility, control, and profitability.
- Enterprises dominate the recruiting and retention of data science and machine learning talent. Large-scale enterprises with over 10K employees are the most likely to have BI experts and data scientists/statisticians on staff. In addition, large-scale enterprises lead hiring and retention in seven of the nine roles included in the survey. The Business Intelligence (BI) expertise of professionals in these roles is helping remove the roadblocks to getting more business value from data science and machine learning. Enterprises are learning how to scale data science and ML models to take on problems that were too complex to solve with analytics or BI alone.
- 80% of DS and ML respondents want, at a minimum, model lifecycle management, model performance monitoring, model version control, and model lineage and history. Keeping track of the state of each model, including version control, is a challenge for nearly all organizations adopting ML today. Enterprises reach ML scale when they can manage ML models across their lifecycles using an automated system. The next four most popular features (model rollback, a searchable model repository, collaborative model co-creation tools, and model registration and certification) are consistent with the feedback from data science teams on what they need most in an ML platform.
- Financial Services prioritizes model lifecycle management and model performance monitoring to achieve greater scale from the tens of thousands of models it is using today. Consistent with other research that tracks ML adoption by industry, the Dresner study found that Financial Services leads all other industries in its need for the two most valuable features of ML platforms: model lifecycle management and model performance monitoring. Retail and Wholesale are reinventing their business models in real time to become more virtual while also providing greater real-time visibility across supply chains. ML models in these two industries need automated model version control, model lineage and history, model rollback, collaborative model co-creation tools, and model registration and certification. In addition, retailers and wholesalers are doubling down on data science and machine learning to support new digital businesses, improve supply chain performance, and increase productivity.
- Enterprises need support for their expanding range of regression models, text analytics functions, and ensemble learning. Over the last seven years, the popularity of text analytics functions and sentiment analysis has grown steadily. Martech vendors and the marketing technologists driving the market are making sentiment analysis more practical and more important. Recommendation engines and geospatial analysis are also seeing greater adoption as martech changes the nature of customer- and market-driven analysis and predictive modeling.
- R, TensorFlow, and PyTorch are considered the three most critical open-source statistical and machine learning frameworks in 2021. Nearly 70% of respondents consider R important to getting work done in data science and ML. The R language has established itself as an industry standard and is well respected across DevOps and IT teams in financial services, professional services, consulting, process manufacturing, and discrete manufacturing. TensorFlow and PyTorch are considered important by the majority of organizations Dresner’s research team interviewed. They’re also among the most in-demand ML frameworks, with applicants who have experience in all three being actively recruited today.
- Data literacy predicts DS and ML program success rates. 64% of organizations say they have extremely high literacy rates, implying that DS and ML have reached mainstream adoption thanks partly to earlier gains in BI literacy. Enterprises that prioritize data literacy by providing training, certification, and ongoing education increase their odds of success with ML. A bonus is that employees have a chance to learn marketable skills they can use in their current and future positions. Investing in training to improve data literacy is a win/win.
- On-database analytics and in-memory analytics (both 91%) and multi-tenant cloud services (88%) are the three most popular technologies enterprises rely on for greater scalability. Dresner’s research team observes that scaling data science and machine learning often involves multiple, different requirements: handling high data volumes, large numbers of users, and data variety, all while supporting analytic throughput. Apache Spark support continues to grow in enterprises and is the fourth most relied-on technology for ML scalability.