Generative artificial intelligence

Generative artificial intelligence (AI) is artificial intelligence capable of generating text, images, or other media, using generative models.[1][2][3] Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.[4][5]

A detailed oil painting of figures in a futuristic opera scene
Théâtre d'Opéra Spatial, an image generated by Midjourney

In the early 2020s, advances in transformer-based deep neural networks enabled a number of generative AI systems notable for accepting natural language prompts as input. These include large language model chatbots such as ChatGPT, Bing Chat, Bard, and LLaMA, and text-to-image artificial intelligence art systems such as Stable Diffusion, Midjourney, and DALL-E.[6][7][8]

Generative AI uses neural networks to produce novel content, with applications in fields beyond art such as drug discovery and sustainable design. It has uses across a wide range of industries, including art, writing, software development, product design, healthcare, finance, gaming, marketing, and fashion.[9][10][11] Investment in generative AI surged during the early 2020s, with large companies such as Microsoft, Google, and Baidu as well as numerous smaller firms developing generative AI models.[1][12][13] However, there are also concerns about the potential misuse of generative AI, including cybercrime and the creation of fake news or deepfakes, which can be used to deceive or manipulate people.[14]

History

The academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956, and has experienced several waves of advancement and optimism in the decades since.[15] Since its founding, researchers in the field have raised philosophical and ethical arguments about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues have been explored in myth, fiction, and philosophy since antiquity.[16] The concept of automated art dates back at least to the automata of ancient Greek civilization, where inventors such as Daedalus and Hero of Alexandria were described as having designed machines capable of writing text, generating sounds, and playing music.[17][18]

Since the founding of AI in the 1950s, artists and researchers have used generative artificial intelligence to create new works. By the early 1970s, Harold Cohen was creating and exhibiting works generated by AARON, a computer program he wrote to produce paintings.[19]

The field of machine learning often uses statistical models, including generative models, to model and predict data. Beginning in the late 2000s, the emergence of deep learning drove progress and research in image classification, speech recognition, natural language processing and other tasks. Neural networks in this era were typically trained as discriminative models, due to the difficulty of generative modeling.[20]

In 2014, advancements such as the variational autoencoder and generative adversarial network produced the first practical deep neural networks capable of learning generative, rather than discriminative, models of complex data such as images. These deep generative models were the first able to output not only class labels for images but entire images.
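
The distinction between discriminative and generative models can be illustrated with a minimal sketch. This is a toy example, not how deep generative models are built: the one-dimensional data, the class names, and the choice of a Gaussian fit are all assumptions made for illustration. The discriminative rule learns only a boundary between classes, while the generative model estimates the distribution of the data and can therefore produce new samples.

```python
import random
import statistics

# Toy labelled data: one feature value per example (invented for illustration).
data = {
    "cat": [4.1, 3.9, 4.3, 4.0, 3.7],
    "dog": [6.8, 7.1, 7.4, 6.9, 7.3],
}

# Discriminative view: learn only a decision boundary between the classes.
# Here the boundary is the midpoint of the gap separating the two classes.
threshold = (max(data["cat"]) + min(data["dog"])) / 2

def classify(x):
    """Map an input to a class label; this rule cannot produce new data."""
    return "cat" if x < threshold else "dog"

# Generative view: estimate a distribution over the data for each class
# (assumed Gaussian here), which can then be sampled.
params = {
    label: (statistics.mean(xs), statistics.stdev(xs))
    for label, xs in data.items()
}

def sample(label, rng):
    """Draw a new feature value resembling the training data for a class."""
    mean, std = params[label]
    return rng.gauss(mean, std)

rng = random.Random(0)
new_points = [sample("dog", rng) for _ in range(3)]
print([classify(x) for x in (3.8, 7.0)])
print(new_points)
```

The models introduced in 2014 apply the same idea to far richer data: instead of two numbers per class, they learn distributions over entire images.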

In 2017, the Transformer network enabled advancements in generative models, leading to the first generative pre-trained transformer (GPT) in 2018.[21] This was followed in 2019 by GPT-2, which demonstrated the ability to generalize unsupervised to many different tasks as a foundation model.[22]

In 2021, the release of DALL-E, a transformer-based pixel generative model, followed by Midjourney and Stable Diffusion, marked the emergence of practical high-quality artificial intelligence art from natural language prompts.

In March 2023, GPT-4 was released. A team from Microsoft Research argued that "it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system".[23]

Modalities

A generative AI system is constructed by applying unsupervised or self-supervised machine learning to a data set. The capabilities of a generative AI system depend on the modality or type of the data set used.

Generative AI can be either unimodal or multimodal; unimodal systems take only one type of input, whereas multimodal systems can take more than one type of input.[24] For example, one version of OpenAI's GPT-4 accepts both text and image inputs.[25]

Text

World knowledge in hand,
Infinite pages unfold,
Wisdom's vast, free land.

GPT-4, prompt a haiku about Wikipedia

Generative AI systems trained on words or word tokens include GPT-3, LaMDA, LLaMA, BLOOM, GPT-4, and others (see List of large language models). They are capable of natural language processing, machine translation, and natural language generation and can be used as foundation models for other tasks.[26] Data sets include BookCorpus, Wikipedia, and others (see List of text corpora).
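
The idea of training on word tokens and then generating new text can be sketched with a bigram model, which is far simpler than the transformer-based systems named above. The corpus, the whitespace tokenizer, and the function name `generate` are illustrative assumptions; real systems use subword tokenizers and neural networks trained on corpora such as those listed.

```python
import random
from collections import defaultdict

# A tiny stand-in corpus (invented for illustration).
corpus = "the cat sat on the mat . the dog sat on the rug ."

# Tokenize by splitting on whitespace; each word is one token.
tokens = corpus.split()

# "Training": record, for every token, the tokens observed to follow it.
follows = defaultdict(list)
for current, nxt in zip(tokens, tokens[1:]):
    follows[current].append(nxt)

def generate(start, length, rng):
    """Generate up to `length` tokens by repeatedly sampling a next token."""
    out = [start]
    for _ in range(length - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # no observed continuation for this token
        out.append(rng.choice(candidates))
    return " ".join(out)

print(generate("the", 8, random.Random(0)))
```

Every adjacent pair of words in the output occurs somewhere in the corpus, yet the sentence as a whole may be new. Large language models follow the same predict-the-next-token loop, but condition on much longer contexts.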

Code

In addition to natural language text, large language models can be trained on programming language text, allowing them to generate source code for new computer programs.[27] Examples include OpenAI Codex.

Images

Stable Diffusion, prompt Cinematic photo of a dog on the Internet editing Wikipedia

Generative AI systems trained on sets of images with text captions include Imagen, DALL-E, Midjourney, Adobe Firefly, Stable Diffusion and others (see Artificial intelligence art, Generative art, and Synthetic media). They are commonly used for text-to-image generation and neural style transfer.[28] Datasets include LAION-5B and others (see Datasets in computer vision).

Music

MusicGen, prompt encyclopedic synth pop track with bassy drums and neutral point of view

Generative AI systems such as MusicLM[29] and MusicGen[30] can be trained on the audio waveforms of recorded music along with text annotations, in order to generate new musical samples based on text descriptions such as "a calming violin melody backed by a distorted guitar riff".

Video

Runway Gen2, prompt A golden retriever in a suit sitting at a podium giving a speech to the white house press corps

Generative AI trained on annotated video can generate temporally coherent video clips. Examples include Gen1 and Gen2 by RunwayML[31] and Make-A-Video by Meta Platforms.[32]

Molecules

Generative AI systems can be trained on sequences of amino acids or molecular representations such as SMILES representing DNA or proteins. These systems, such as AlphaFold, are used for protein structure prediction and drug discovery.[33] Datasets include various biological datasets.

Robot actions

Generative AI trained on the motions of a robotic system can generate new trajectories for motion planning or navigation. For example, UniPi from Google Research uses prompts like "pick up blue bowl" or "wipe plate with yellow sponge" to control movements of a robot arm.[34] Multimodal "vision-language-action" models such as Google's RT-2 can perform rudimentary reasoning in response to user prompts and visual input, such as picking up a toy dinosaur when given the prompt "pick up the extinct animal" at a table filled with toy animals and other objects.[35]

Software and hardware

Generative AI models are used to power chatbot products such as ChatGPT, programming tools such as GitHub Copilot,[36] text-to-image products such as Midjourney, and text-to-video products such as Runway Gen-2.[37] Generative AI features have been integrated into a variety of existing commercially available products such as Microsoft Office,[38] Google Photos,[39] and Adobe Photoshop.[40] Many generative AI models are also available as open-source software, including Stable Diffusion and the LLaMA[41] language model.

Smaller generative AI models with up to a few billion parameters can run on smartphones, embedded devices, and personal computers. For example, LLaMA-7B (a version with 7 billion parameters) can run on a Raspberry Pi 4[42] and one version of Stable Diffusion can run on an iPhone 11.[43]

Larger models with tens of billions of parameters can run on laptop or desktop computers. To achieve an acceptable speed, models of this size may require accelerators such as the GPU chips produced by Nvidia and AMD or the Neural Engine included in Apple silicon products. For example, the 65 billion parameter version of LLaMA can be configured to run on a desktop PC.[44]

Language models with hundreds of billions of parameters, such as GPT-4 or PaLM, typically run on datacenter computers equipped with arrays of GPUs (such as Nvidia's H100) or AI accelerator chips (such as Google's TPU). These very large models are typically accessed as cloud services over the Internet.
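
The relationship between parameter count and hardware requirements can be approximated with simple arithmetic: the memory needed to hold a model's weights is roughly the number of parameters multiplied by the bytes used to store each one. The sketch below rests on stated assumptions. The precision table lists common storage formats, the function name `weight_memory_gb` is hypothetical, and the estimate covers weights only, ignoring activations and other runtime overhead. The sources cited above do not state which precision was used in each case.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params, precision):
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Parameter counts taken from the LLaMA versions mentioned in the text.
for name, num_params in [("LLaMA-7B", 7e9), ("LLaMA-65B", 65e9)]:
    for precision in ("fp16", "int4"):
        gb = weight_memory_gb(num_params, precision)
        print(f"{name} at {precision}: about {gb:.1f} GB")
```

Under these assumptions, a 7-billion-parameter model needs about 14 GB at 16-bit precision but about 3.5 GB at 4-bit precision, which suggests why reduced-precision storage matters for running such models on small devices.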

In 2022, the United States New Export Controls on Advanced Computing and Semiconductors to China imposed restrictions on exports to China of GPU and AI accelerator chips used for generative AI.[45] Chips such as the Nvidia A800[46] and the Biren Technology BR104[47] were developed to meet the requirements of the sanctions.

Concerns

The development of generative AI has raised concerns from governments, businesses, and individuals, resulting in protests, legal actions, calls to pause AI experiments, and actions by multiple governments. In a July 2023 briefing of the United Nations Security Council, Secretary-General António Guterres stated that "Generative AI has enormous potential for good and evil at scale", that AI may "turbocharge global development" and contribute between $10 and $15 trillion to the global economy by 2030, but that its malicious use "could cause horrific levels of death and destruction, widespread trauma, and deep psychological damage on an unimaginable scale".[48]

Controversies

A picketer at the 2023 Writers Guild of America strike. While not a top priority, one of the WGA's 2023 requests was "regulations around the use of (generative) AI".[49]

In January 2023, Futurism.com broke the story that CNET had been using an undisclosed internal AI tool to write at least 77 of its stories; after the news broke, CNET posted corrections to 41 of the stories.[50]

In April 2023, German tabloid Die Aktuelle published a fake AI-generated interview with former racing driver Michael Schumacher, who had not made any public appearances since 2013 after sustaining a brain injury in a skiing accident. The story included two possible disclosures: the cover included the line "deceptively real", and the interview included an acknowledgement at the end that it was AI-generated. The editor-in-chief was fired shortly thereafter amid the controversy.[51]

In July 2023, developments in generative AI contributed to the 2023 Hollywood labor disputes. Fran Drescher, president of the Screen Actors Guild, declared that "artificial intelligence poses an existential threat to creative professions" during the 2023 SAG-AFTRA strike.[52]

Regulation

In the European Union, the proposed Artificial Intelligence Act includes requirements to disclose copyrighted material used to train generative AI systems, and to label any AI-generated output as such.[53]

In the United States, a group of companies including OpenAI, Alphabet, and Meta signed a voluntary agreement with the White House in July 2023 to watermark AI-generated content.[54]

In China, the Interim Measures for the Management of Generative AI Services introduced by the Cyberspace Administration of China regulates any public-facing generative AI. It includes requirements to watermark generated images or videos, regulations on training data and label quality, restrictions on personal data collection, and a guideline that generative AI must "adhere to socialist core values".[55][56]

Cybercrime

Generative AI's ability to create realistic fake content has been exploited in numerous types of cybercrime, including phishing scams.[57] Deepfake video and audio have been used to create disinformation and fraud. Former Google fraud czar Shuman Ghosemajumder has predicted that while deepfake videos initially created a stir in the media, they would soon become commonplace, and as a result, more dangerous.[58] Cybercriminals have created large language models focused on fraud, including WormGPT and FraudGPT.[59]

Job losses

In April 2023, it was reported that image generation AI had resulted in the loss of 70% of the jobs for video game illustrators in China.[60][61]

References

  1. ^ a b Griffith, Erin; Metz, Cade (2023-01-27). "Anthropic Said to Be Closing In on $300 Million in New A.I. Funding". The New York Times. Retrieved 2023-03-14.
  2. ^ Lanxon, Nate; Bass, Dina; Davalos, Jackie (March 10, 2023). "A Cheat Sheet to AI Buzzwords and Their Meanings". Bloomberg News. Retrieved March 14, 2023.
  3. ^ Pinaya, Walter H. L.; Graham, Mark S.; Kerfoot, Eric; Tudosiu, Petru-Daniel; Dafflon, Jessica; Fernandez, Virginia; Sanchez, Pedro; Wolleb, Julia; da Costa, Pedro F.; Patel, Ashay (2023). "Generative AI for Medical Imaging: extending the MONAI Framework". arXiv:2307.15208.
  4. ^ Pasick, Adam (2023-03-27). "Artificial Intelligence Glossary: Neural Networks and Other Terms Explained". The New York Times (in American English). ISSN 0362-4331. Retrieved 2023-04-22.
  5. ^ Andrej Karpathy; Pieter Abbeel; Greg Brockman; Peter Chen; Vicki Cheung; Yan Duan; Ian Goodfellow; Durk Kingma; Jonathan Ho; Rein Houthooft; Tim Salimans; John Schulman; Ilya Sutskever; Wojciech Zaremba (2016-06-16). "Generative models". OpenAI.
  6. ^ Metz, Cade (2023-03-14). "OpenAI Plans to Up the Ante in Tech's A.I. Race". The New York Times (in American English). ISSN 0362-4331. Retrieved 2023-03-31.
  7. ^ Thoppilan, Romal; De Freitas, Daniel; Hall, Jamie; Shazeer, Noam; Kulshreshtha, Apoorv (January 20, 2022). "LaMDA: Language Models for Dialog Applications". arXiv:2201.08239 [cs.CL].
  8. ^ Roose, Kevin (2022-10-21). "A Coming-Out Party for Generative A.I., Silicon Valley's New Craze". The New York Times. Retrieved 2023-03-14.
  9. ^ "Don't fear an AI-induced jobs apocalypse just yet". The Economist. 2023-03-06. Retrieved 2023-03-14.
  10. ^ Harreis, H.; Koullias; Roberts, Roger. "Generative AI: Unlocking the future of fashion".
  11. ^ "How Generative AI Can Augment Human Creativity". Harvard Business Review. 2023-06-16. ISSN 0017-8012. Retrieved 2023-06-20.
  12. ^ "The race of the AI labs heats up". The Economist. 2023-01-30. Retrieved 2023-03-14.
  13. ^ Yang, June; Gokturk, Burak (2023-03-14). "Google Cloud brings generative AI to developers, businesses, and governments".
  14. ^ Justin Hendrix (May 16, 2023). "Transcript: Senate Judiciary Subcommittee Hearing on Oversight of AI". techpolicy.press. Retrieved May 19, 2023.
  15. ^ Crevier, Daniel (1993). AI: The Tumultuous Search for Artificial Intelligence. New York, NY: BasicBooks. p. 109. ISBN 0-465-02997-3.
  16. ^ Newquist, HP (1994). The Brain Makers: Genius, Ego, And Greed In The Quest For Machines That Think. New York: Macmillan/SAMS. pp. 45–53. ISBN 978-0-672-30412-5.
  17. ^ Noel Sharkey (July 4, 2007), A programmable robot from 60 AD, 2611, New Scientist, https://www.newscientist.com/blog/technology/2007/07/programmable-robot-from-60ad.html, retrieved on October 22, 2019 
  18. ^ Brett, Gerard (July 1954), "The Automata in the Byzantine "Throne of Solomon"", Speculum 29 (3): 477–487, doi:10.2307/2846790, ISSN 0038-7134. 
  19. ^ Bergen, Nathan; Huang, Angela (2023). "A BRIEF HISTORY OF GENERATIVE AI" (PDF). Dichotomies: Generative AI: Navigating Towards a Better Future (2): 4.
  20. ^ Tony Jebara (2012). Machine learning: discriminative and generative. Vol. 755. Springer Science & Business Media.
  21. ^ "finetune-transformer-lm". GitHub. Retrieved 2023-05-19.
  22. ^ Radford, Alec; Wu, Jeffrey; Child, Rewon; Luan, David; Amodei, Dario; Sutskever, Ilya; others (2019). "Language models are unsupervised multitask learners". OpenAI Blog. 1 (8): 9.
  23. ^ Bubeck, Sébastien; Chandrasekaran, Varun; Eldan, Ronen; Gehrke, Johannes; Horvitz, Eric; Kamar, Ece; Lee, Peter; Lee, Yin Tat; Li, Yuanzhi; Lundberg, Scott; Nori, Harsha; Palangi, Hamid; Ribeiro, Marco Tulio; Zhang, Yi (March 22, 2023). "Sparks of Artificial General Intelligence: Early experiments with GPT-4". arXiv:2303.12712 [cs.CL].
  24. ^ "A History of Generative AI: From GAN to GPT-4". 21 March 2023.
  25. ^ "Explainer: What is Generative AI, the technology behind OpenAI's ChatGPT?". Reuters. March 17, 2023. Retrieved March 17, 2023.
  26. ^ Bommasani, R; Hudson, DA; Adeli, E; Altman, R; Arora, S; von Arx, S; Bernstein, MS; Bohg, J; Bosselut, A; Brunskill, E; Brynjolfsson, E (2021-08-16). "On the opportunities and risks of foundation models". arXiv:2108.07258 [cs.LG].
  27. ^ Chen, Ming; Tworek, Jakub; Jun, Hongyu; Yuan, Qinyuan; Pinto, Hanyu Philippe De Oliveira; Kaplan, Jerry; Edwards, Haley; Burda, Yannick; Joseph, Nicholas; Brockman, Greg; Ray, Alvin (2021-07-06). "Evaluating Large Language Models Trained on Code". arXiv:2107.03374 [cs.LG].
  28. ^ "Zero-shot text-to-image generation" (2021). PMLR. pp. 8821–8831.
  29. ^ Agostinelli, Andrea; Denk, Timo I.; Borsos, Zalán; Engel, Jesse; Verzetti, Mauro; Caillon, Antoine; Huang, Qingqing; Jansen, Aren; Roberts, Adam; Tagliasacchi, Marco; Sharifi, Matt; Zeghidour, Neil; Frank, Christian (26 January 2023). "MusicLM: Generating Music From Text". arXiv:2301.11325 [cs.SD].
  30. ^ Dalugdug, Mandy (August 3, 2023). "Meta in June said that it used 20,000 hours of licensed music to train MusicGen, which included 10,000 "high-quality" licensed music tracks. At the time, Meta's researchers outlined in a paper the ethical challenges that they encountered around the development of generative AI models like MusicGen".
  31. ^ Metz, Cade (April 4, 2023). "Instant Videos Could Represent the Next Leap in A.I. Technology". The New York Times (in English).
  32. ^ Queenie Wong (Sep 29, 2022). "Facebook Parent Meta's AI Tool Can Create Artsy Videos From Text". cnet.com. Retrieved Apr 4, 2023.
  33. ^ Heaven, Will Douglas (2023-02-15). "AI is dreaming up drugs that no one has ever seen. Now we've got to see if they work". MIT Technology Review. Massachusetts Institute of Technology. Retrieved 2023-03-15.
  34. ^ Sherry Yang, Yilun Du (2023-04-12). "UniPi: Learning universal policies via text-guided video generation". Google Research, Brain Team. Google AI Blog.
  35. ^ Brohan, Anthony (2023). "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control". arXiv:2307.15818.
  36. ^ Sabin, Sam (2023-06-30). "GitHub has a vision to make code more secure by design". Axios Codebook. Retrieved 2023-08-15.
  37. ^ James Vincent (Mar 20, 2023). "Text-to-video AI inches closer as startup Runway announces new model". The Verge. Retrieved 2023-08-15. Text-to-video is the next frontier for generative AI, though current output is rudimentary. Runway says it'll be making its new generative video model, Gen-2, available to users in 'the coming weeks.'
  38. ^ Vanian, Jonathan (2023-03-16). "Microsoft adds OpenAI technology to Word and Excel". CNBC. Retrieved 2023-08-15. Microsoft is bringing generative artificial intelligence technologies such as the popular ChatGPT chatting app to its Microsoft 365 suite of business software....the new A.I. features, dubbed Copilot, will be available in some of the company's most popular business apps, including Word, PowerPoint and Excel.
  39. ^ Mark Wilson (2023-08-15). "The app's Memories feature just got a big upgrade". TechRadar. The Google Photos app is getting a redesigned, AI-powered Memories feature...you'll be able to use generative AI to come up with some suggested names like "a desert adventure".
  40. ^ Sullivan, Laurie (May 23, 2023). "Adobe Adds Generative AI To Photoshop". MediaPost. Retrieved 2023-08-15. Generative artificial intelligence (AI) will become one of the most important features for creative designers and marketers. Adobe on Tuesday unveiled a Generative Fill feature in Photoshop to bring Firefly's AI capabilities into design.
  41. ^ Michael Nuñez (July 19, 2023). "LLaMA 2: How to access and use Meta's versatile open-source chatbot right now". VentureBeat. Retrieved 2023-08-15. If you want to run LLaMA 2 on your own machine or modify the code, you can download it directly from Hugging Face, a leading platform for sharing AI models.
  42. ^ Pounder, Les (2023-03-25). "How To Create Your Own AI Chatbot Server With Raspberry Pi 4". Retrieved 2023-08-15. Using a Pi 4 with 8GB of RAM, you can create a ChatGPT-like server based on LLaMA.
  43. ^ Kemper, Jonathan (Nov 10, 2022). ""Draw Things" App brings Stable Diffusion to the iPhone". The Decoder. Retrieved 2023-08-15. Draw Things is an app that brings Stable Diffusion to the iPhone. The AI images are generated locally, so you don't need an Internet connection.
  44. ^ Allan Witt (2023-07-07). "Best Computer to Run LLaMA AI Model at Home (GPU, CPU, RAM, SSD)". To run LLaMA model at home, you will need a computer build with a powerful GPU that can handle the large amount of data and computation required for inferencing.
  45. ^ Nellis, Stephen; Lee, Jane (September 1, 2022). "U.S. officials order Nvidia to halt sales of top AI chips to China". Reuters. Retrieved 2023-08-15.
  46. ^ Shilov, Anton (2023-05-07). "Nvidia's Chinese A800 GPU's Performance Revealed". Tom's Hardware. Retrieved 2023-08-15. the A800 operates at 70% of the speed of A100 GPUs while complying with strict U.S. export standards that limit how much processing power Nvidia can sell.
  47. ^ Dylan Patel (October 24, 2022). "How China's Biren Is Attempting To Evade US Sanctions". Retrieved August 15, 2023.
  48. ^ "Secretary-General's remarks to the Security Council on Artificial Intelligence". un.org. 18 July 2023. Retrieved 27 July 2023.
  49. ^ "The Writers Strike Is Taking a Stand on AI". Time (in English). 4 May 2023. Retrieved 11 June 2023.
  50. ^ Roth, Emma (25 January 2023). "CNET found errors in more than half of its AI-written stories". The Verge. Retrieved 17 June 2023.
  51. ^ "A magazine touted Michael Schumacher's first interview in years. It was actually AI". NPR. 28 April 2023. Retrieved 17 June 2023.
  52. ^ Collier, Kevin (July 14, 2023). "Actors vs. AI: Strike brings focus to emerging use of advanced tech". NBC News. SAG-AFTRA has joined the Writer's [sic] Guild of America in demanding a contract that explicitly demands AI regulations to protect writers and the works they create. ... The future of generative artificial intelligence in Hollywood — and how it can be used to replace labor — has become a crucial sticking point for actors going on strike. In a news conference Thursday, Fran Drescher, president of the Screen Actors Guild-American Federation of Television and Radio Artists (more commonly known as SAG-AFTRA), declared that 'artificial intelligence poses an existential threat to creative professions, and all actors and performers deserve contract language that protects them from having their identity and talent exploited without consent and pay.'
  53. ^ Chee, Foo Yun; Mukherjee, Supantha. "EU lawmakers vote for tougher AI rules as draft moves to final stage". Reuters (in English). Retrieved July 26, 2023.
  54. ^ Bartz, Diane; Hu, Krystal. "OpenAI, Google, others pledge to watermark AI content for safety, White House says". Reuters.
  55. ^ Ye, Josh (2023-07-13). "China says generative AI rules to apply only to products for the public". Reuters. Retrieved 2023-07-13.
  56. ^ "生成式人工智能服务管理暂行办法". 2023-07-13.
  57. ^ Sjouwerman, Stu (2022-12-26). "Deepfakes: Get ready for phishing 2.0". Fast Company. Retrieved 2023-07-31.
  58. ^ Sonnemaker, Tyler. "As social media platforms brace for the incoming wave of deepfakes, Google's former 'fraud czar' predicts the biggest danger is that deepfakes will eventually become boring". Business Insider (in American English). Retrieved 2023-07-31.
  59. ^ "After WormGPT, FraudGPT Emerges to Help Scammers Steal Your Data". PCMAG (in English). Retrieved 2023-07-31.
  60. ^ Zhou, Viola (2023-04-11). "AI is already taking video game illustrators' jobs in China". Rest of World (in American English). Retrieved 2023-08-17.
  61. ^ Carter, Justin (2023-04-11). "China's game art industry reportedly decimated by growing AI use". Game Developer (in English). Retrieved 2023-08-17.