Algorithms and Algorithmic Transparency

Our July 15th post summarized the USACM-EUACM joint statement on Algorithmic Transparency and Accountability (ATA) and introduced the ATA FAQ project by the USACM Algorithms Working Group. Their goal is “to take the lead addressing the technical aspects of algorithms and to have content prepared for media inquiries and policymakers.” SIGAI has been asked to contribute expertise in developing content for the FAQ. Please comment on this post so we can collect and share insights with USACM. You can also send your ideas and suggestions directly to Cynthia Florentino, ACM Policy Analyst, at cflorentino@acm.org.

The focus of this post is the discussion of “algorithms” in the FAQ. Your feedback will be appreciated. Some of the input we received is as follows:
“Q: What is an algorithm?
A: An algorithm is a set of well-defined steps that leads from inputs (data) to outputs (results). Today, algorithms are used in decision-making in education, access to credit, employment, and in the criminal justice system.  An algorithm can be compared to a recipe that runs in the same way each time, automatically using the given input data. The input data is combined and placed through the same set of steps, and the output is dependent on the input data and the set of steps that comprise the algorithm.”
and
“Q: Can algorithms be explained? Why or why not?  What are the challenges?
A: It is not always possible to interpret machine learning and algorithmic models. This is because a model may use an enormous volume of data in the process of figuring out the ideal approach. This in turn, makes it hard to go back and trace how the algorithm arrived at a certain decision.”

This post raises an issue with the use of the term “algorithm” in the era of Big Data, in which the term “machine learning” has been incorporated into the fields of data analytics and data science. For ATA issues, the AI community needs to give careful attention to definitions and concepts that enable a clear discourse on ATA policy.

A case in point, on which we welcome input from SIGAI members, is the central role of artificial neural networks (NNs) in machine learning and deep learning. In what sense is an NN algorithmic? Toward the goal of algorithmic transparency, what needs to be explained about how an NN works? From a policy perspective, what are the challenges in addressing the transparency of an NN component of machine learning frameworks with audiences of varying technical backgrounds?

The mechanisms for training neural networks are algorithmic in the traditional sense of the word: they apply a series of steps repeatedly to adjust parameters, as in multilayer perceptron learning. NN training algorithms operate the same way in every application in which input data is mapped to output results. Only a high-level discussion and simplified diagrams are practical for “explaining” these NN algorithms to policymakers and end users of systems involving machine learning.
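To make “a series of steps repeated to adjust parameters” concrete, here is a minimal sketch in Python. It is an illustration only, not any particular system's training code. It uses a single perceptron, a deliberate simplification of the multilayer case mentioned above, and the function names, the toy logical-AND data, and the default settings are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Apply the same update rule to every example, for a fixed number of passes."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            # Step 1: compute the current output for this example.
            error = target - predict(weights, bias, x)
            # Step 2: nudge each parameter in proportion to the error.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy data: the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

The loop itself is identical for any data set passed in. What changes from one application to another is the data and the settings supplied to it.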

On the other hand, the design and implementation of applications involving NN-based machine learning are surely the real points of concern for issues of “algorithmic transparency”. In that regard, the “explanation” of a particular application could include a careful description of the problem to be solved and the NN design model chosen to solve it. Further, humans (for now) make choices about the number and types of input items, the numbers of nodes and layers, the method for cleaning and normalizing input data, the choice of an appropriate error measure and number of training cycles, the procedure for independent testing, and the interpretation of results with realistic uncertainty estimates. The application development procedure is algorithmic in a general sense, but the more important point is that assumptions and biases are involved in the design and implementation of the NN. The choice of data, along with its relevance and quality, is critically important in understanding the validity of a system involving machine learning. Thus, the transparency of NN algorithms, in the technical sense, might well be explained, but the transparency and biases of the model and implementation process are the aspects with serious policy consequences.
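One way to picture what “explaining” an application could involve is a simple record of the human choices listed above. The Python sketch below is hypothetical: the class name `DesignRecord`, its field names, and the example values are assumptions made for illustration, not an established standard or anything proposed in the FAQ.

```python
from dataclasses import dataclass, fields


@dataclass
class DesignRecord:
    """Hypothetical record of the human choices behind an NN application."""
    problem_description: str = ""
    input_features: tuple = ()
    hidden_layer_sizes: tuple = ()
    data_cleaning_method: str = ""
    error_measure: str = ""
    training_cycles: int = 0
    testing_procedure: str = ""
    uncertainty_estimate: str = ""

    def undocumented(self):
        """Return the names of choices that have been left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up example values; two choices are deliberately left unrecorded.
record = DesignRecord(
    problem_description="Rank loan applications by estimated default risk",
    input_features=("income", "loan_amount", "years_employed"),
    hidden_layer_sizes=(8, 4),
    data_cleaning_method="drop incomplete rows, scale inputs to [0, 1]",
    error_measure="mean squared error",
    training_cycles=200,
)
print(record.undocumented())
```

A record like this says nothing about how the training algorithm works. It only makes visible which design decisions were written down and which were not, which is where the assumptions and biases discussed above enter.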

We welcome your feedback!

6 thoughts on “Algorithms and Algorithmic Transparency”

  1. Do we really want to classify machine learning in the general case as algorithmic? While there are some algorithms involved, machine learning often contains a great deal of inference. I would much rather see machine learning processes categorized as a form of inference engine.

    Inference requires justification; correct algorithms do not. The conclusions of an inference process require both explanation and justification. If we shift the paradigm of machine learning away from algorithms to inference, then we can apply justification methods to both the machine learning process and the conclusions. In one interpretation, machine learning is a form of knowledge acquisition. From a simplified perspective of epistemology, knowledge should be justified and true.

    Machine learning in the general case is at best a hybrid computation process that involves algorithms, inference, and uncertainty. If we don’t make clear that machine learning is more than a set of algorithms then we risk giving more weight or different weight to machine learning conclusions than what they deserve.

  2. Robert R. Hoffman, Institute for Human and Machine Cognition, and Gary Klein, MacroCognition, LLC,
    raise some relevant discussion in their “Explaining Explanation Theoretical Foundations” in the May/June 2017 (Vol. 32, No. 3) issue of IEEE Intelligent Systems, where they begin a three-part discussion that includes the question of how intelligent systems might explain their inner workings.

    Whereas the term algorithm has been well defined and is owned by the mathematics and computer science communities, it's not so clear whether that is the case for the phrase “machine learning”. There is a denotation for the phrase “machine learning”, but I suspect the connotation of the phrase is shifting almost daily and has become the more readily used notion.

    It is certainly not the case that “algorithm” and “machine learning” are synonymous.
    Some algorithms may use or be the result of machine learning, and some machine learning may use or be the result of algorithms, but they are very different animals. As a community we have somewhat of a responsibility to properly differentiate these concepts. There is the set of things that we call “algorithms” and there is the set of things that we call “machine learning”, and there are at times intersections between these two sets. We should never let the set difference between the set of machine learning things and this intersection be given the status of algorithm.

    Machine Learning is gaining a pop culture set of definitions that may over time obscure the denotation of the phrase; we would not want the term algorithm to be taken down with it.

  3. All algorithms are sets of instructions, but not all sets of instructions are algorithms. When an algorithm is only contingent on mathematical computation, then the validity of that algorithm is limited by the validity of the underlying mathematical computation. When an algorithm is contingent on deductive, inductive, or abductive computation, we find ourselves in a different space. Uncertainty, plausibility, defeasibility, truth, and I would dare say epistemology kick in. A set of computer instructions that involves deduction, induction, or abduction and accomplishes what we call “machine learning” must be held to a different standard for correctness than an algorithm that is only contingent on mathematical computation.

    Of course, the question is how do we know or can we know whether a conclusion or a result is correct when the set of computer instructions that produces that result is deductive, inductive or abductive in nature?

    In simple terms, most algorithms can be shown to be correct using deductive argument (mathematical proof). On the other hand, “machine learning processes” are not so lucky: the validity and correctness of their results are not easily subject to mathematical proof but rather must be scrutinized by examining justification, explanation, and interpretation for the data involved (usually big data), in addition to proving correct the set of instructions that processes that data.

    Put another way: “An Agent using a machine learning method has acquired enough knowledge to make a decision and has accordingly made the decision”. Before we can even talk about the correctness of the decision, the knowledge acquired must be challenged. To equate the agent loop with an algorithm is a mistake. And to assume the agent’s knowledge based on machine learning methods begs the question.

    While Algorithm Transparency for Machine Learning is necessary, it is not sufficient because the inferential knowledge acquired by the Machine Learning process must undergo further justification.

  4. Our community needs to do a lot of work on clarifying these terms. Because for many out there the following is true:

    Artificial Intelligence = Machine Learning

    and now if we are equating Machine Learning and the term “Algorithm” wtf?

  5. Thanks, Cameron.
    As a longtime AI person who is now the director of a Data Science program, the “evolution” of language is hard to take. “Algorithm” describes most everything now, and you can find “debates” about which came first, machine learning or AI. Fortunately my data science students asked me to teach a comprehensive AI course so they could learn about its role and historical context.
    Of course in the history of computer science, we saw that when a new technical term arose, everyone suddenly had one, like spreadsheet = relational database. I think this is in part because of commercial products wanting to be advertised as the latest thing. Then, we have the desire to sound more intellectual by saying “utilize”. So, does some machine learning use, or utilize, algorithms?

    1. This evolution of terminology, language, vernacular: oh so true 🙁 but alas we have to be keepers of the gate here. I haven’t quite kept score, but do the various other areas of mathematics or science undergo the same kind of mutation of language at the hands of pop culture and the marketplace?

      Yes, it is hard to keep the distinction between computer applications, computer technology and computer science untangled, but the issues that arise that require algorithm transparency, or the slippery slope awaiting us in predictive policing, demand that our community hold the line, or at the very least define the line. The stakes are too high.

      The notion and definition of algorithm is so fundamental to mathematics and computer science that we cannot let it be redefined, any more than biology would allow cell or DNA to be hijacked. Imagine what physics would look like if we could redefine and bend fundamental concepts like force, electron, or mass to mean whatever we wanted them to.

      Our community needs to get a handle on the definitions of things fundamental to computer science and artificial intelligence; otherwise it will be difficult to help inform policy or even communicate with lawmakers. Not to mention, how do you transfer knowledge to students? What words will we use?

      Of course this opens a can of worms for the ever evolving phrase “Artificial Intelligence” whose definition has not only been dramatically influenced by the market place and pop culture, but is also subject to the whims of the politics of research funding where we must disguise the vernacular used in research proposals every 3-5 years.

      In order for us to have a legitimate discussion about, or contribute to public policy on Artificial Intelligence or Machine Learning or Data Mining, we have to all be using the same dictionary. LOL

      Yea, for some reason I don’t think the other areas of mathematics, or the sciences suffer from shifting and evolving lexicons, at least not to the degree we do. We need to be able to define all of this, while we still can….
