AI is the goal for many enterprises. But an organization needs machine learning in order to do AI. Machine learning is not possible without analytics, and analytics is not possible without simple, elegant data infrastructure. Simply put, there is no AI without IA (information architecture). Success in both areas is more often about culture than technology.
***
Culture is either one of a company’s most powerful assets or an obstruction. Most enterprises do not have a data culture. Many do not even know they need one. Ironically, the current culture of an organization often prevents it from seeing that it needs a new one (i.e., it can’t see the forest for the trees).
I believe that the biggest obstacle to a data culture is the fear of complexity. Ben Thompson once wrote, “Culture is not something that begets success, rather, it is a product of it.” If an enterprise has not had visible, material success with data, how could anyone expect it to form a data culture?
Our mission is to make data simple and accessible to the world. We are enabling companies to sow the seeds of a data culture through a practical approach that leads to successful outcomes. Said another way, we are enabling organizations to do data science faster.
***
I once heard that the difference between a data science project and a software engineering project is that with the former, you have no idea if it will actually work. Even if you are a staunch ‘fail fast’ supporter, that is too much of an unknown for many. Most organizations that make an investment want some understanding of how they will generate a return on it. I understand that is not the Silicon Valley mantra, but most enterprises are held to a different standard of ROI than Silicon Valley. It’s not right or wrong; it’s just different. Many enterprises prefer high certainty and modest returns over a more aggressive approach. In economics, we call this a tolerance for risk-adjusted returns.
My observation is that most of the time invested in building and deploying machine learning is not spent on algorithms and models. Instead, it is spent on the most mundane of tasks: data preparation, data movement, feature extraction, and so on. These tasks are a necessary evil, and they are where most of the risk in a data science project resides: garbage in, garbage out leads to low certainty.
With the newly announced IBM Integrated Analytics System, we aspire to solve two problems that I see in every organization:
1) A desire to apply data science and machine learning at scale. Now, and with certainty.
2) The intent to move to cloud, to accelerate digital transformation.
We started with the assumption of large data sets. Whether the data sits in a public cloud, a private cloud, Hadoop, a data warehouse, or elsewhere, the ideal solution enables federation across all data types and locations. With our common SQL engine, this is easy.
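To make the idea of federation concrete, here is a minimal sketch of a single SQL statement that joins tables living in two separate stores. It uses SQLite's `ATTACH DATABASE` from Python's standard library purely as a stand-in; it is not the common SQL engine described above, and the store names (`main`, `lake`), table names, and sample rows are all hypothetical.

```python
import sqlite3

# Stand-in for a federated SQL engine: one connection, two separate stores.
# Assumption: SQLite is used only for illustration, not the product's engine.
conn = sqlite3.connect(":memory:")                   # plays the "data warehouse"
conn.execute("ATTACH DATABASE ':memory:' AS lake")   # plays a second store

# Hypothetical tables and rows, one table per store.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.events (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO lake.events VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)


def spend_by_customer(connection):
    """Run one query that joins tables held in two different stores."""
    query = """
        SELECT c.name, SUM(e.amount) AS total
        FROM main.customers AS c
        JOIN lake.events AS e ON e.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


totals = spend_by_customer(conn)
print(totals)
```

The point of the sketch is that the person writing the query does not need to know, or care, where each table physically lives.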
Once you can easily access all data, the next challenge is to apply data science and machine learning: building, training, and deploying models. Then, via feedback loops, those models are used to make predictions and automate previously manual tasks. Fundamentally, those are the two reasons to focus on data science: predictions and automation.
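As a rough illustration of that cycle (train, predict, observe the real outcome, retrain), here is a minimal sketch in plain Python. The model is deliberately trivial: a one-parameter least-squares fit through the origin. The function names and the numbers are hypothetical and do not come from any particular product.

```python
def train(history):
    """Fit y = w * x by least squares through the origin."""
    numerator = sum(x * y for x, y in history)
    denominator = sum(x * x for x, _ in history)
    return numerator / denominator if denominator else 0.0


def predict(weight, x):
    """Apply the trained model to a new input."""
    return weight * x


# Build and train on the data available today (hypothetical values).
history = [(1.0, 2.0), (2.0, 4.0)]
weight = train(history)

# Deploy: make a prediction for a new input.
first_prediction = predict(weight, 3.0)

# Feedback loop: the real outcome arrives, is recorded, and the model retrains.
observed_outcome = 9.0
history.append((3.0, observed_outcome))
weight = train(history)
second_prediction = predict(weight, 3.0)

print(first_prediction, second_prediction)
```

After the feedback step, the second prediction moves toward the observed outcome. That is the whole idea of a feedback loop: every real result becomes training data for the next prediction.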
Lastly, moving to the cloud is now as simple as a click of a button. With a common codebase across private and public cloud, it is easy to move data and run applications wherever you prefer. With Data Science Experience, you build models where you want (private or public cloud) and deploy to either environment. Your data is limited by your imagination, not your firewall.
***
AI is fundamentally about using machine learning and deep learning techniques to enable applications that are built on data. Every organization that aspires to a data culture has to pick a place to start. Deep learning will make previously inaccessible data accessible; if that creates momentum with a high chance of success, that is where you should start. For other organizations, better predictions and automation will beget a data culture. Regardless of which path you choose, the objective is the same: do data science faster.