Artificial Intelligence and Open Source – Can Machines Think? It Depends.

Alan Turing posed the question: Can machines think? For a lawyer, of course, the answer is: it depends. Turing went on to state that “artificial intelligence is based on the assumption that the process of human thought can be mechanized.”[1] That assumption has driven the development of artificial intelligence (AI) both before and after Turing’s statement. This article explains what AI is, gives a brief history of the field, and provides an overview of the technology. It then discusses the role of open source software in AI development, patenting AI inventions, AI and California privacy law, and the payoffs and perils of using AI in a law practice.

AI isn’t one monolithic idea; rather, it exists on several levels. The first is narrow AI, a system that can complete a single task or a narrow set of tasks at a level equal to, or better than, human intelligence. Narrow AI includes such capabilities as speech recognition (more specifically, text or aural communication with a person) and image processing, including facial recognition and object detection. The second is general AI, a system able to complete human tasks in general at a level equal to, or better than, human intelligence.[2] A system with general AI should also be able to adapt to situations that were unforeseen when the AI was created. This implies that the system is capable of machine learning and has all of the sensory and decision-making capabilities of a human.

AI includes a number of base capabilities, which are found in most systems in use or under development. The first capability is machine learning, in which the system learns and adapts without explicit instruction by using algorithms and statistical models. In addition, most AI systems rely on neural networks. Neural networks are at the heart of deep learning and rely on training data to learn and improve their accuracy over time. Once fine-tuned, these learning algorithms become powerful tools that classify and cluster data at high speed. An example of a neural network is Google’s search algorithm. Deep learning takes neural networks a step further: it is a type of machine learning based on artificial neural networks in which multiple layers of processing extract progressively higher-level features from data.[3] AI also works with computer vision, which enables computers to derive information from images, videos, and other similar inputs.[4] Finally, AI uses natural language processing to understand human language, regardless of whether the language is written, spoken, or even scribbled.[5]
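To make the idea of “multiple layers of processing” concrete, the following is a minimal sketch in Python of data passing through a small layered network. It is illustrative only: the layer sizes, the tanh activation, and the function names (dense, random_layer, forward) are assumptions chosen for this example, and the weights are random rather than learned from training data.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias, then tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Create untrained (random) weights and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Pass the inputs through each layer in turn; each layer consumes the previous layer's output."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical shape: 4 input features -> 3 -> 2 -> 1 output value.
layers = [random_layer(4, 3, rng), random_layer(3, 2, rng), random_layer(2, 1, rng)]
output = forward([0.5, -0.2, 0.1, 0.9], layers)
print(output)
```

In a real system, training would repeatedly adjust the weights so that the output moves closer to known correct answers in the training data; that step is omitted here.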

AI, in some forms, has been around longer than is commonly realized. The field was founded at a workshop held at Dartmouth College in 1956, known as the “Dartmouth Summer Research Project on Artificial Intelligence.”[6] Most of us have used AI through some application, though we may not have realized it. Voice assistants, text-to-speech programs, and those annoying speech-enabled call handling systems all use simple forms of AI.

The following chart illustrates the growth and progressive development of AI systems.

[Figure: the growth and progressive development of AI systems]

While funding and growth have generally been steady, there were periods when funding was lacking and progress was slow. Many of the problems arose because computing was in its infancy, and neither processor power nor memory could cope with the demands. The 1980s were a boom time, with the rise of “expert systems” that entered the mainstream, and robotics developers began to incorporate AI into early robots. A renaissance in AI growth was spurred by improvements in computer processing power and memory, including the development of the supercomputer. Development in various areas has led to multiple competing subfields and areas of AI use. The combinatorial difficulties of solving huge tasks have been greatly eased by these improvements in computer technology. Today, a high-end desktop computer can process at speeds of 400 GFLOPS (400 × 10^9 floating-point operations per second), which is enough to run visual recognition software. Other applications, such as voice recognition and speech processing, can readily be run on a desktop computer or even a high-end laptop. With the rise of cloud computing and the development of current AI tools, it is now possible to use AI for many tasks.

Open source software is software that is free to access, use, and change without restrictions. More specifically, open source software is software that is distributed with its source code, making it available for use and distribution with its original rights. Source code is the part of the software that computer users don’t see; it is the code that programmers manipulate to control how a program or application operates.[7] Open source software plays a central role in the development and use of AI. It affects nearly every issue in AI policy, yet it has not been part of AI policy discussions.[8] Open source tools are especially important to deep learning, a subfield in which the open source tools are the best tools available.[9]

Using open source tools can speed up AI development and avoid the “reinventing the wheel” problem that arises when using proprietary code. Collaborative development, combined with an engaged developer community that is both cooperative and competitive, results in accessible, tested, robust, and high-quality software code.[10] A further advantage of open source AI tools is that they enable and increase the use of ethical AI. Open source tools like IBM’s AI Fairness 360, Microsoft’s Fairlearn, and the University of Chicago’s Aequitas facilitate the detection and mitigation of AI bias, such as the bias found in many AI-assisted employment and other screening tools.[11] In addition, open source AI provides tools for interpreting and explaining AI, such as IBM’s AI Explainability 360, which make it easier for data scientists to interrogate and understand the inner workings of their models. In effect, open source software has created default AI standards.
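As a rough illustration of what bias detection in a screening tool can involve, the sketch below compares selection rates across two groups in plain Python. It does not use or reproduce the API of AI Fairness 360, Fairlearn, or Aequitas; the function names, group labels, and outcome data are hypothetical and invented for this example. The 0.8 threshold reflects the “four-fifths” benchmark commonly cited in U.S. employment selection guidance, used here only as an assumed cutoff.

```python
def selection_rate(decisions):
    """Fraction of candidates who received a favorable decision (1 = advanced, 0 = rejected)."""
    return sum(decisions) / len(decisions)


def disparate_impact_ratio(decisions_by_group):
    """Return (lowest selection rate / highest selection rate, per-group rates)."""
    rates = {group: selection_rate(d) for group, d in decisions_by_group.items()}
    highest = max(rates.values())
    if highest == 0:
        # No group had any favorable outcome, so there is no disparity to measure.
        return 1.0, rates
    return min(rates.values()) / highest, rates


# Hypothetical screening outcomes for two applicant groups.
outcomes = {
    "group_a": [1, 1, 0, 1, 0, 1, 1, 0, 1, 1],  # 7 of 10 advanced
    "group_b": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0],  # 3 of 10 advanced
}

ratio, rates = disparate_impact_ratio(outcomes)
flagged = ratio < 0.8  # assumed benchmark; a flag calls for closer review, not a legal conclusion
print(rates, round(ratio, 3), flagged)
```

A ratio well below the benchmark, as in this invented data, signals only that the tool’s outcomes deserve closer review. The open source toolkits named above offer a much wider range of metrics and mitigation techniques.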

AI can both help and hinder competition in the technology sector. Both Google and Facebook have open-sourced their deep learning tools (TensorFlow and PyTorch, respectively). The tools may help competition in that more AI development occurs and the better systems see wider use. Alternatively, open source AI code released by major technology companies may entrench those companies even further in their dominant positions. While the open source code is freely available, in actuality, neither Google nor Facebook is relinquishing any control over the development of these deep learning tools.[12]

With the growth and use of open source AI, intellectual property issues have arisen. Generative AI can produce remarkable visuals as well as essays, poems, and legal documents. AI is an excellent mimic, but it can and does take creative license with facts.[13] The intellectual property issues include infringement, rights of use, and the use of unlicensed content in training data.[14] They also include whether users should be able to prompt AI tools with direct references to other creators’ copyrighted and trademarked works by name and without their permission.[15] Lawsuits have already been filed, including Andersen et al. v. Stability AI Ltd. et al.[16] In Andersen, three artists formed a class to sue multiple generative AI platforms on the basis that the platforms used their original works to train AI without permission. In addition, Getty, an image licensing service, has filed a lawsuit against the creators of Stable Diffusion, alleging improper use of its photos in violation of both its copyright and its trademark rights in its watermarked photograph collection.[17]

The fair use doctrine may also pose a problem: case outcomes will hinge on it, and different courts may interpret it differently. Both companies and individuals can mitigate the risk of copyright and trademark issues by ensuring compliance with the laws governing the acquisition of data. In addition, the provenance of AI-generated work should be maintained, and greater transparency should be provided about the works included in training datasets. Creators should actively take steps to protect their intellectual property and should monitor digital and social channels for the appearance of works that may be derived from their own. Companies should also closely examine the style of potentially derivative works and adapt their trademark and trade dress monitoring accordingly.

Patenting AI inventions may be particularly problematic, and new AI computing technology may face patent hurdles, including the current difficulties with patentable subject matter. As is well known, software is difficult to patent, and adding an AI component may only complicate matters. Open source material may not be patent-eligible, and open source code must be shared with the developer community. Training on a dataset and making determinations based on the learned data may give rise to § 101 issues, as examiners are likely to reject such claims on the grounds that the method “can be performed in the mind,” a particularly difficult rejection to overcome. Another challenge is that it may be difficult to recruit examiners with the education and skills to examine applications in this area. Good drafting practices for software applications, such as claiming graphical user interfaces and actions that produce measurable physical outcomes or output, may prove effective.

AI applications may also face issues under the California Consumer Privacy Act (CCPA), which was passed in June of 2018.[18] The CCPA allows any California consumer to demand to see all of the information a company has saved about them, as well as a list of all third parties with which the data is shared. In addition, the CCPA takes a broad view of what constitutes private data. The CCPA will apply if a company serves California residents and has at least $25 million in annual revenue.[19] This may pose a problem for companies that “scrape” data, as private data can easily be collected and make its way into the training datasets used in AI. Information is often copied from the Internet or social media without users’ explicit consent.

Despite all of these concerns, AI is still poised to be a valuable and timesaving tool for legal work. However, a number of key ethics points should be kept in mind. Many are familiar with the attorney who was sanctioned for using ChatGPT to write a brief. The now-infamous brief cited six non-existent court decisions.[20] The ABA Model Rules do not explicitly address AI, but several Model Rules are implicated in AI use, including the duty of competence and the duty to preserve client confidences. In the not-too-distant future, attorneys will likely be required to demonstrate competency and proficiency in AI technology, just as they have been required to be proficient in computer usage.

For lawyers looking to use AI, there are some steps to take to avoid problems. The hallucination errors that the attorney above experienced will continue, and it will be incumbent on attorneys to diligently check arguments and supporting citations, just as many did cite-checks on law review. If this work is delegated to an assistant or to the AI itself, the AI will likewise be considered an assistant for purposes of the Model Rules. This will require that both the AI program and any human assistants be supervised. Lawyers must also take steps to ensure that a client’s data is not taken by the AI program and inserted into a training dataset, and that the client’s data is not used to drive the AI program toward a legal solution for others. A digital firewall must be maintained to safeguard the client’s data.

AI, and in particular open source AI, has the potential to radically reshape how legal work is performed and who performs that work. Some legal tasks may become easier and much time may be saved, but the need for vigilant and thorough review will increase. Hallucinations were formerly the subject of ghost and horror movies; now they may become a part of law practice.

[1] A. M. Turing, Computing Machinery and Intelligence, Mind 59: 433-460 (1950).

[2] Naveen Joshi, 7 Types of Artificial Intelligence, Cognitive World (June 2019).

[3] Google Dictionary, Deep Learning.

[4] Google Dictionary, Computer Vision.

[5] https://www.coursera.org/articles/natural-language-processing

[6] Grace Solomonoff, The Meeting of the Minds That Launched AI (May 2023).

[7] https://www.synopsys.com/glossary/what-is-open-source-software.html

[8] Alex Engler, How open-source software shapes AI policy, Brookings (August 2021).

[9] Id.

[10] Id.

[11] Id.

[12] Id.

[13] Gil Appel, Juliana Neelbauer, and David A. Schweidel, Generative AI Has an Intellectual Property Problem, Harvard Business Review (April 7, 2023).

[14] Id.

[15] Id.

[16] Andersen et al. v. Stability AI Ltd. et al., District Ct. N.D. California, 2023.

[17] Appel, Neelbauer, and Schweidel, Generative AI Has an Intellectual Property Problem, supra note 13.

[18] California Consumer Privacy Act § 1798.100

[19] Id.

[20] Karen Sloan, A Lawyer Used ChatGPT to cite bogus cases, what are the ethics issues, Reuters (May 30, 2023).