I recently spoke with Dr. Eric Daimler about how we can build on
the framework he and his colleagues established while he contributed
to AI policy in the White House during the Obama administration.
Eric is the CEO of the MIT-spinout Conexus.com and holds a PhD in Computer
Science from Carnegie Mellon University. Here are the interesting results of my
interview with him. His ideas are important as part of the basis for ACM SIGAI
Public Policy recommendations.
LRM: What are the main ways we should be addressing this issue of
data for AI?
EAD: To me there is one big reframing from which we can approach this
collection of issues: prioritizing data interoperability within a larger frame
of AI as a total system. Strictly defined, AI is a learning
algorithm. Most people know of subsets such as Machine Learning, and a subset of
that called Deep Learning. That definition doesn’t help the 99% who are not AI
researchers. When I speak to non-researchers, or to researchers who want
to better appreciate the sensibilities of those who need to adopt their
technology, I describe AI in terms of the interactions it has. There is the
collection of the data, the transportation of the data, the analysis or
planning (the traditional domain in which the strict definition fits best),
and the acting on the conclusions. That sense, plan, act framework works pretty
well for most people.
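To make the sense, plan, act framing concrete, here is a minimal Python sketch of the four stages as separate functions. It is an illustration only and not part of the interview: the `Reading` type, the sensor names, the values, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    """One raw observation (hypothetical shape, for illustration only)."""
    source: str
    value: float


def collect() -> List[Reading]:
    """Sense: gather raw readings (hard-coded here instead of real sensors)."""
    return [Reading("sensor_a", 21.5), Reading("sensor_b", 35.0)]


def transport(readings: List[Reading]) -> List[dict]:
    """Transport: move data between systems; a plain dict stands in for a wire format."""
    return [{"source": r.source, "value": r.value} for r in readings]


def plan(records: List[dict], threshold: float = 30.0) -> List[str]:
    """Plan: analyze the data and decide which sources need attention."""
    return [rec["source"] for rec in records if rec["value"] > threshold]


def act(flagged: List[str]) -> List[str]:
    """Act: turn the conclusions into actions (here, just alert messages)."""
    return [f"alert: {name} exceeded threshold" for name in flagged]


def run_pipeline() -> List[str]:
    """Chain the four stages in order."""
    return act(plan(transport(collect())))


if __name__ == "__main__":
    for message in run_pipeline():
        print(message)
```

Only `plan` corresponds to the strict, learning-algorithm sense of AI; the other three stages are the surrounding system that the framing asks us to consider.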
LRM: Before you explain just how we can do that, can you go ahead
and define some of your important terms for our readers?
EAD: AI is often described as the economic engine of
the future. But to realize that growth, we must think beyond AI to the whole
system of data, and the rules and context that surround it: our data
infrastructure (DI). Our DI supports not only our AI technology, but also
our technical leadership more generally; it underpins COVID reporting, airline
ticket bookings, social networking, and most if not all activity on the
internet. From the unsuccessful launch of
healthcare.gov, to the recent failure of Haven, to the months-long hack into
hundreds of government databases, we have seen the consequences faulty DI can
have. More data does not lead to better outcomes; improved DI does.
Fortunately, we have the technology and foresight to prevent
future disasters, if we act now. Because AI is fundamentally limited by the
data that feeds it, to win the AI race, we must build the best DI. The new
presidential administration can play a helpful role here, by defining standards
and funding research into data technologies. Attention to the need for better
DI will speed responsiveness to future crises (consider COVID data delays) and
establish global technology leadership via standards and commerce. Investing in
more robust DI will ensure that anomalies, like ones that would have helped us
identify the Russia hack much sooner, will be evident, so we can prevent future
malfeasance by foreign actors. The US needs to build better data infrastructure
to remain competitive in AI.
LRM: So how might we go about prioritizing data interoperability?
EAD: In 2016, the Department of Commerce (DOC) discovered that on
average, it took six months to onboard new suppliers to a midsize trucking
company—because of issues with data interoperability. The entire American
economy would benefit from encouraging more companies to establish semantic
standards, internally and between companies, so that data can speak to other
data. According to a DOC report in early 2020, the technology now exists
for mismatched data to communicate more easily and data integrity to be
guaranteed, thanks to a new area of math called Applied Category Theory (ACT).
This should be made widely available.
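As a rough illustration of what a shared semantic standard between companies could look like, the Python sketch below maps two differently shaped shipment records onto one agreed vocabulary. It is plain field mapping, not Applied Category Theory, and every name in it (`SHARED_FIELDS`, `CARRIER_A`, `CARRIER_B`, the field names) is a hypothetical example rather than anything from the interview or a real standard.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical shared vocabulary that both companies have agreed on.
SHARED_FIELDS = {"supplier_id", "weight_kg", "ship_date"}

# Hypothetical mappings: local field name -> (shared field name, converter).
Mapping = Dict[str, Tuple[str, Callable[[Any], Any]]]

CARRIER_A: Mapping = {
    "vendor_no": ("supplier_id", str),
    # Carrier A records pounds; convert to kilograms for the shared schema.
    "weight_lb": ("weight_kg", lambda lb: round(float(lb) * 0.45359237, 3)),
    "shipped": ("ship_date", str),
}

CARRIER_B: Mapping = {
    "supplierId": ("supplier_id", str),
    "massKg": ("weight_kg", float),
    "shipDate": ("ship_date", str),
}


def to_shared(record: Dict[str, Any], mapping: Mapping) -> Dict[str, Any]:
    """Translate a company-specific record into the shared vocabulary.

    Raises instead of guessing when a field is unmapped or missing, so that
    mismatches surface at onboarding time rather than months later.
    """
    out: Dict[str, Any] = {}
    for local_name, value in record.items():
        if local_name not in mapping:
            raise KeyError(f"no mapping for field {local_name!r}")
        shared_name, convert = mapping[local_name]
        out[shared_name] = convert(value)
    missing = SHARED_FIELDS - set(out)
    if missing:
        raise ValueError(f"record is missing shared fields: {sorted(missing)}")
    return out


if __name__ == "__main__":
    print(to_shared({"vendor_no": 17, "weight_lb": 100, "shipped": "2021-01-15"}, CARRIER_A))
    print(to_shared({"supplierId": "17", "massKg": 45.359, "shipDate": "2021-01-15"}, CARRIER_B))
```

Once both records are expressed in the shared vocabulary, they can be compared or merged directly, which is the practical meaning of data being able to speak to other data.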
LRM: And what about enforcing data provenance?
EAD: As data is transformed across platforms—including trendy cloud migrations—its
lineage often gets lost. A decision denying your small business loan can and
should be traceable back to the precise data the loan officer had at that time.
There are traceability laws on the books, but they have rarely been enforced
because, up until now, the technology hasn’t been available to comply. That’s no
longer an excuse. The fidelity of data, and of the models built on top of it, should be proven, down
to the level of math, to have maintained integrity.
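One simple way to keep lineage from getting lost is to record a hash-chained log of every transformation a dataset goes through. The Python sketch below is a minimal illustration under that assumption. It provides tamper evidence only, which is weaker than the mathematical proof of integrity described above, and the step names and loan data are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def fingerprint(payload: Any) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_step(lineage: List[Dict[str, str]], step_name: str, data: Any) -> List[Dict[str, str]]:
    """Return a new lineage with one more entry, linked to the previous entry's hash."""
    previous = lineage[-1]["entry_hash"] if lineage else ""
    entry = {"step": step_name, "data_hash": fingerprint(data), "previous": previous}
    # The entry hash covers the step name, the data hash, and the link to the prior entry.
    entry["entry_hash"] = fingerprint(entry)
    return lineage + [entry]


def verify(lineage: List[Dict[str, str]]) -> bool:
    """Check that every entry is intact and correctly linked to the one before it."""
    previous = ""
    for entry in lineage:
        body = {key: entry[key] for key in ("step", "data_hash", "previous")}
        if entry["previous"] != previous or entry["entry_hash"] != fingerprint(body):
            return False
        previous = entry["entry_hash"]
    return True


if __name__ == "__main__":
    application = {"applicant": "example-llc", "annual_revenue": 120000}
    lineage = append_step([], "application_received", application)
    lineage = append_step(lineage, "cloud_migration", application)
    print(verify(lineage))  # the untouched chain verifies
```

With a log like this, a loan decision can cite the `data_hash` of the exact record it was based on, and any later change to an entry makes `verify` fail.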
LRM: Speaking more generally, how can we start to lay the groundwork to reap the benefits of these advancements in data infrastructure?
EAD: We need to formalize. When we built 20th century assembly lines,
we established in advance where and how screws would be made; we did not
ask the village blacksmith to fashion custom screws for every home repair. With
AI, once we know what we want to have automated (and there are good reasons
not to automate everything!), we should then define in advance how we
want it to behave. As you read this, 18 million programmers are already
formalizing rules across every aspect of technology. As an automated car
approaches a crosswalk, should it slow down every time, or only if it senses a
pedestrian? Questions like this one—across the whole economy—are best answered
in a uniform way across manufacturers, based on standardized, formal, and
socially accepted definitions of risk.
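The crosswalk question can be written down as an explicit rule, which is the kind of formalization being described. The Python sketch below is a hypothetical illustration: the `Policy` and `Situation` names are invented for this example and do not come from any manufacturer or standard.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Two candidate answers to the crosswalk question (hypothetical names)."""
    ALWAYS_SLOW = "always_slow"
    SLOW_IF_PEDESTRIAN = "slow_if_pedestrian"


@dataclass(frozen=True)
class Situation:
    approaching_crosswalk: bool
    pedestrian_detected: bool


def should_slow_down(situation: Situation, policy: Policy) -> bool:
    """Apply the agreed policy to the current situation."""
    if not situation.approaching_crosswalk:
        return False
    if policy is Policy.ALWAYS_SLOW:
        return True
    return situation.pedestrian_detected


if __name__ == "__main__":
    empty_crosswalk = Situation(approaching_crosswalk=True, pedestrian_detected=False)
    for policy in Policy:
        print(policy.value, should_slow_down(empty_crosswalk, policy))
```

The two policies differ only when the crosswalk is empty. Writing the rule this way makes that difference explicit, so the choice between them can be debated and standardized rather than left to each manufacturer.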
LRM: In previous posts, I have discussed roles and
responsibilities for change in the use of AI. Government regulation is of
course important, but what roles do you see for AI tech companies, professional
societies, and other entities in making the changes you recommend for DI and
other aspects of data for AI?
EAD: What is different this time is the abruptness of change. When automation technologies work, they can be wildly disruptive. Sometimes this is very healthy (see: Schumpeter). I find that the “go fast and…” framework has its place, but in AI it can be destructive and invite resistance. That is what we have to watch out for. Only with responsible, coordinated action do we encourage adoption of these fantastic and magical technologies. Automation in software can be powerful, but processes need not be linked into sequences just because they can be. That is, just because some system can be automated does not mean that it should be. Too often there is absolutism in AI deployments when what is called for in these discussions is nuance and context. For example, in digital advertising my concerns are around privacy, not physical safety. When I am subject to a plane’s autopilot, my priorities are reversed.
With my work
in the US federal government, my bias remains against regulation as a
first step. Shortly after my time with the Obama White House, I was grateful to
participate with a diverse group for a couple of days at the Halcyon
House in Washington, D.C. We created some principles for deploying AI to
maximize adoption. We can build on these and rally around a sort of LEED-like
standard for AI deployment.
Dr. Eric Daimler is CEO & Founder of Conexus and Board Member of Petuum and WelWaze. He was a Presidential Innovation Fellow, Artificial Intelligence and Robotics. Eric is a leading authority in robotics and artificial intelligence with over 20 years of experience as an entrepreneur, investor, technologist, and policymaker. Eric served under the Obama Administration as a Presidential Innovation Fellow for AI and Robotics in the Executive Office of the President, as the sole authority driving the agenda for U.S. leadership in research, commercialization, and public adoption of AI & Robotics. Eric has incubated, built, and led several technology companies recognized as pioneers in their fields, ranging from software systems to statistical arbitrage. Currently, he serves on the boards of WelWaze and Petuum, the largest AI investment by Softbank’s Vision Fund. His newest venture, Conexus, is a groundbreaking solution for what is perhaps today’s biggest information technology problem: data deluge. Eric’s extensive career across business, academics, and policy gives him a rare perspective on the next generation of AI. Eric believes information technology can dramatically improve our world. However, it demands our engagement. Neither a utopia nor a dystopia is inevitable. What matters is how we shape, and react to, its development. As a successful entrepreneur, Eric is looking towards the next generation of AI as a system that creates a multi-tiered platform for fueling the development and adoption of emerging technology for industries that have traditionally been slow to adapt. As founder and CEO of Conexus, Eric is leading CQL, a patent-pending platform founded upon category theory (a revolution in mathematics), to help companies manage the overwhelming challenge of data integration and migration. A frequent speaker, lecturer, and commentator, Eric works to empower communities and citizens to leverage robotics and AI to build a more sustainable, secure, and prosperous future.
His academic research has been at the intersection of AI, Computational Linguistics, and Network Science (Graph Theory). His work has expanded to include economics and public policy. He served as Assistant Professor and Assistant Dean at Carnegie Mellon’s School of Computer Science where he founded the university’s Entrepreneurial Management program and helped to launch Carnegie Mellon’s Silicon Valley Campus. He has studied at the University of Washington-Seattle, Stanford University, and Carnegie Mellon University, where he earned his Ph.D. in Computer Science.