To make an impact at the enterprise level, the data science group can’t work in isolation, said Ian Swanson, Oracle vice president of machine learning and artificial intelligence product development, during a presentation at the recent Oracle OpenWorld conference. “In order to do data science right, it has to be a team sport,” said Swanson, former CEO of DataScience.com, which Oracle acquired earlier this year.
One of the data science group’s most valuable teammates is the IT organization, for multiple reasons, he said. The data science (DS) group relies on IT to manage and secure the data it uses; support the needed analytics tools; and deliver ready access to scalable bandwidth, compute, and storage capacity to build and train production-oriented analytic models.
Another important ally is the application development team. Developers must incorporate the models DS builds into their “ecosystem” as regular features among the many they use to build production applications, Swanson said.
That points to a significant attribute of production-oriented models: reusability. An ecommerce recommendation engine, for instance, might be reused for forecasting an item’s revenue stream, he said. A key performance indicator for one technology company Swanson worked with on a DS project was “how often that model was used by other parts of the business,” he said.
Line-of-business managers are a valuable constituency as well, because they’re the ones who must act on, and get results from, applications that use analytic models. An underestimated advantage that line-of-business managers bring to the model-building process, Swanson said, is their domain expertise: their experience working with customers.
As for the top brass, they don’t need “to be involved in every step of the model, but they need to understand how it will be used, the opportunities it offers, the things it can achieve,” Swanson said. “If you’re not involving the top, if they’re not part of the team, data science is not affecting the heart of the business.”
Awash in Tools
Because data science is the new darling of the technology marketplace, the number and variety of analytics tools are staggering. Swanson said he worked with a company whose DS team had accumulated 682 different tools. “How is IT managing 682 different tools?” he wondered.
Still, building predictive analytics models is complicated, requiring a “full stack” of tools, libraries, and languages, preferably open source, which encourages standards and self-service, Swanson said. As DS matures, its practitioners will have to comply with enterprise programming standards, in particular version control. “If you’re writing production code, you should be using some sort of system that encourages working together to follow best engineering practices, such as checking in code and making sure it’s reproducible,” he said.
But enterprise data science goes beyond programming. “It requires a platform that removes barriers to production, improves collaboration, manages the tool sprawl, provides self-service access to data, and helps with model planning and retention,” Swanson said.
Calling data scientists “the architects and engineers of digital transformation,” Swanson noted that there are DS use cases “in every industry and function,” providing the means to generate “new business channels and new business models.” But achieving those goals requires the will—and a strategy—for extending the work data scientists can do as widely across the enterprise as resources will allow.
“It’s about creating a process that delivers reliable outputs to drive business outcomes,” Swanson said. “You need to put it into action—that’s real DS.”